Is there an accurate way in Haskell to encode tuple of Foo, where some certain combinations are forb-CodePudding

(I'm sorry in advance that I don't know how to phrase this problem better)

Say I have a datatype like this:

data Foo = A | B

Now I want a pair of Foo, with a constraint that (A, A) is forbidden.

I could go with the straightforward way of just listing them all like this:

data Foo2 = AB | BA | BB

But as you can see, this gets out of hand quickly: what if we want n-tuples of Foo? or what if there are more alternatives of Foo?

An alternative is of course with newtype and smart constructors

newtype Foo2 = Foo2 (Foo, Foo)

mkFoo2 xy = Foo2 xy <$ guard (xy /= (A,A))

But this is in some sense "inaccurate", in that when we destruct Foo2, we always have to deal with cases that is in fact unreachable, yet compiler have no such knowledge:

...
case v :: Foo2 of
  ...
  Foo2 (A, A) -> error "unreachable"
...

My question is, is there a better way to have this idea of "n-tuple of Foo, where certain combinations, like (A,A), or (A,B,C) (when n=3) are impossible" represented accurately?

Side question: is subtraction / negation a thing in algebraic data type? I think what I need is basically a type that is isomorphic to Foo^n - (forbidden combinations) for n-tuple.

CodePudding user response：

There is no simple way to forbid "all As" in Haskell.

In dependently typed languages like Agda/Coq, we can put arbitrary constraints using sigma types. This however would require the programmer to write a mathematical proof each time the constructor is used, proving that we are not in fact trying to construct one of the "forbidden" values.

In Haskell, instead, we have no such option. One alternative could be to define a bunch of types.

data NotA = B | C
data Any  = A | NA NotA

-- 1-tuple, not all As
data NotAllAs1 
  = N1 NotA
-- 2-tuple, not all As
data NotAllAs2 
  = N2 NotA Any
  | N2a NotAllAs1          -- first A implicit
-- 3-tuple, not all As
data NotAllAs3
  = N3 NotA (Any, Any)
  | N3a NotAllAs1          -- first A implicit

And so on. It's not convenient at all, since we need to use a large number of constructors. Even if the end result is isomorphic to what we want, it's too cumbersome.

It could be improved using some type families, but it still looks rather inconvenient.

Another option could be to also exploit GADTs.

{-# LANGUAGE GADTs, DataKinds, TypeFamilies #-}

-- We define some tags for being A and not A
data IsA = IsA | NotA

-- Type T is indexed with the proper tag
data T (a :: IsA) where
   A :: T 'IsA
   B :: T 'NotA
   C :: T 'NotA

-- We want "at least one non-A" so we define an "or"
-- operation between two tags.
type family Or (a1 :: IsA) (a2 :: IsA) :: IsA where
   Or 'IsA a2 = a2
   Or 'NotA _ = 'NotA

-- Peano naturals to encode tuple length
data Nat = Z | S Nat

-- The wanted tuple type
type NotAllAs (n :: Nat) = NA n 'NotA

-- NA n t is the type for an n-tuple having either all As
-- (if t ~ IsA) or some non-A (if t ~ NotA)
data NA (n :: Nat) (t :: IsA) where
   Nil  :: NA 'Z 'IsA
   Cons :: T a1 -> NA n a2 -> NA ('S n) (Or a1 a2)

Finally, a few tests, uncomment one to try.

test :: NotAllAs ('S ('S ('S 'Z)))
test = 
   -- Cons A (Cons A (Cons A Nil))      -- Couldn't match type 'IsA with 'NotA
   -- Cons A (Cons A (Cons B Nil))      -- OK
   -- Cons A (Cons B (Cons A Nil))      -- OK
   -- Cons B (Cons A (Cons A Nil))      -- OK

The test below tests elimination (pattern matching). It does not trigger a warning for the impossible case A,A,A: the match is considered to be exhaustive.

elim :: NotAllAs ('S ('S ('S 'Z))) -> Int
elim (Cons A (Cons A (Cons B Nil))) = 1
elim (Cons A (Cons A (Cons C Nil))) = 2
elim (Cons A (Cons B _           )) = 3
elim (Cons A (Cons C _           )) = 4
elim (Cons B _                    ) = 5
elim (Cons C _                    ) = 6

No warnings as well: A,A is impossible.

elim2 :: NotAllAs ('S ('S 'Z)) -> Int
elim2 (Cons x (Cons A Nil)) = case x of B -> 1 ; C -> 2
elim2 (Cons _ (Cons B Nil)) = 3
elim2 (Cons _ (Cons C Nil)) = 4

In a dependently types language, it would not be that easy to perform elimination, since we would need to prove that the match is indeed exhaustive, usually by performing a dependent match on all cases including A,A and then reaching a contradiction. By comparison, here's how elimination would look like in Coq:

Inductive T: Set := A | B | C .

(* The constraint is trivial to specify. *)
Definition NotAllA2 := { p: T*T | p <> (A,A) } .

(* We will need this trivial lemma later *)
Lemma lem: forall x, (x,A) <> (A,A) -> x<>A .
Proof.
intros x h h2.
subst.
apply h.
reflexivity.
Qed.

Definition elim2 (v: NotAllA2): nat :=
   match v with
   | exist _ p h => (* h is the proof that our constraint holds *)
      match p return p<>(A,A) -> nat with
      | (x,A) => fun h2: (x,A)<>(A,A) => 
        match x return x<>A -> nat with
          (* We need to prove that A is impossible here *)
        | A => fun h3 => match h3 eq_refl with end
        | B => fun _ => 1
        | C => fun _ => 2
        end (lem x h2)
      | (_,B) => fun _ => 3
      | (_,C) => fun _ => 4
      end h
   end.

(There likely is a shorter/simpler Coq solution, but this is the first I could manage to produce.)

CodePudding user response：

Here's how I would express it with a GADT:

{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, GADTs, TypeOperators #-}
{-# LANGUAGE StandaloneDeriving #-} -- for `Show`

import Data.Type.Bool

data Foo = A | B                      -- see singletons
data TypedFoo x where                 -- version belowbe
  TypedA :: TypedFoo 'A               -- for how to avoid
  TypedB :: TypedFoo 'B               -- boilerplate
deriving instance Show (TypedFoo x)   --  / duplication

type family IsA x where
  IsA 'A = 'True
  IsA 'B = 'False

data Foo2 where
  Foo2 :: ((IsA x && IsA y) ~ 'False)
    => TypedFoo x -> TypedFoo y -> Foo2
deriving instance Show Foo2

Now

ghci> Foo2 TypedA TypedB
Foo2 TypedA TypedB
ghci> Foo2 TypedA TypedA

<interactive>:6:1: error:
    • Couldn't match type ‘'True’ with ‘'False’
        arising from a use of ‘Foo2’
    • In the expression: Foo2 TypedA TypedA
      In an equation for ‘it’: it = Foo2 TypedA TypedA

and the following does not give incompleteness warnings:

elim2 :: Foo2 -> Int
elim2 (Foo2 TypedA TypedB) = 0
elim2 (Foo2 TypedB TypedA) = 1
elim2 (Foo2 TypedB TypedB) = 2

Same solution using the singletons library:

{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, GADTs, TypeOperators #-}
{-# LANGUAGE StandaloneDeriving, StandaloneKindSignatures, TemplateHaskell #-}
{-# LANGUAGE TypeApplications                                              #-}

import Data.Type.Bool
import Data.Singletons.TH

$(singletons [d|
  data Foo = A | B
  isA :: Foo -> Bool
  isA A = True
  isA B = False
 |])
deriving instance Show Foo
deriving instance Show (SFoo x)

data Foo2 where
  Foo2 :: ((IsA x && IsA y) ~ 'False)
    => SFoo x -> SFoo y -> Foo2
deriving instance Show Foo2

I'm not sure it's really that useful to encode this sort of constraint with Haskell-data techniques. IMO a simple smart constructor is fine, just don't export it then so people also can't run into issues with cases that shouldn't be there. Instead, try to think of a higher-level reason why the case of two As does not make sense, and design the public interface to the type accordingly. Then it doesn't matter that the internal representation permits an illegal combination.