Re: [Haskell-cafe] Manual Type-Checking to provide Read instances for GADTs. (was Re: [Haskell-cafe] Read instance for GATD)

I've obtained permission to repost Gershom's slides on how to deserialize
GADTs by typechecking them yourself, which are actually a literate haskell
source file, that he was rendering to slides. Consequently, I've just pasted
the content below as a literate email.
-Edward Kmett
-
Deserializing strongly typed values
(four easy pieces about typechecking)
Gershom Bazerman
-
prior art:
Oleg (of course)
http://okmij.org/ftp/tagless-final/course/course.html
...but also
Stephanie Weirich
http://www.comlab.ox.ac.uk/projects/gip/school/tc.hs
=
Ahem...
\begin{code}
{-# LANGUAGE DeriveDataTypeable,
ExistentialQuantification,
FlexibleContexts,
FlexibleInstances,
FunctionalDependencies,
GADTs,
RankNTypes,
ScopedTypeVariables
#-}
\end{code}
=
ahem.
\begin{code}
import Data.Typeable
import Data.Maybe
import Control.Monad.Error
import Control.Applicative
import qualified Data.Map as M
import Unsafe.Coerce
\end{code}
=
A simple ADT.
\begin{code}
data SimpleExpr = SOpBi String SimpleExpr SimpleExpr
| SOpUn String SimpleExpr
| SDbl Double
| SBool Bool deriving (Read, Show, Typeable)
\end{code}
Yawn.
=
An awesome GADT!
\begin{code}
data Expr a where
EDbl :: Double -> Expr Double
EBool :: Bool -> Expr Bool
EBoolOpBi :: BoolOpBi -> Expr Bool -> Expr Bool -> Expr Bool
ENumOpBi :: NumOpBi -> Expr Double -> Expr Double -> Expr Double
ENumOpUn :: NumOpUn -> Expr Double -> Expr Double
deriving Typeable
data NumOpBi = Add | Sub deriving (Eq, Show)
data NumOpUn = Log | Exp deriving (Eq, Show)
data BoolOpBi = And | Or deriving (Eq, Show)
\end{code}
The GADT is well typed. It cannot go wrong.
-
It also cannot derive show.
=
But we can write show...
\begin{code}
showIt :: Expr a -> String
showIt (EDbl d) = show d
showIt (EBool b) = show b
showIt (EBoolOpBi op x y) = "(" ++ show op
++ " " ++ showIt x
++ " " ++ showIt y ++ ")"
showIt (ENumOpBi op x y) = "(" ++ show op
++ " " ++ showIt x
++ " " ++ showIt y ++ ")"
showIt (ENumOpUn op x) = show op ++ "(" ++ showIt x ++ ")"
\end{code}
=
And eval is *much nicer*.
It cannot go wrong --> no runtime typechecks.
\begin{code}
evalIt :: Expr a -> a
evalIt (EDbl x) = x
evalIt (EBool x) = x
evalIt (EBoolOpBi op expr1 expr2)
| op == And = evalIt expr1 && evalIt expr2
| op == Or = evalIt expr2 || evalIt expr2
evalIt (ENumOpBi op expr1 expr2)
| op == Add = evalIt expr1 + evalIt expr2
| op == Sub = evalIt expr1 - evalIt expr2
\end{code}
=
But how do we write read!?
read "EBool False" = Expr Bool
read "EDbl 12" = Expr Double
The type being read depends on the content of the string.
Even worse, we want to read not from a string that looks obvious
to Haskell (i.e. a standard showlike instance) but from
something that looks pretty to the user -- we want to *parse*.
So we parse into our simple ADT.
Then we turn our simple ADT into our nice GADT.
-
But how?
How do we go from untyped... to typed?
[And in general -- not just into an arbitrary GADT,
but an arbitrary inhabitant of a typeclass.]
[i.e. tagless final, etc]
=
Take 1:
Even if we do not know what type we are creating,
we eventually will do something with it.
So we paramaterize our typechecking function over
an arbitrary continuation.
\begin{code}
mkExpr :: (forall a. (Show a, Typeable a) => Expr a -> r) -> SimpleExpr -> r
mkExpr k expr = case expr of
SDbl d -> k $ EDbl d
SBool b -> k $ EBool b
SOpUn op expr1 -> case op of
"log" -> k $ mkExpr' (ENumOpUn Log) expr1
"exp" -> k $ mkExpr' (ENumOpUn Exp) expr1
_ -> error "bad unary op"
SOpBi op expr1 expr2 -> case op of
"add" -> k $ mkExprBi (ENumOpBi Add) expr1 expr2
"sub" -> k $ mkExprBi (ENumOpBi Sub) expr1 expr2
\end{code}
=
Where's the typechecking?
\begin{code}
mkExpr' k expr = mkExpr (appCast $ k) expr
mkExprBi k expr1 expr2 = mkExpr' (mkExpr' k expr1) expr2
appCast :: forall a b c r. (Typeable a, Typeable b) => (c a -> r) -> c b ->
r
appCast f x = maybe err f $ gcast x
where err = error $ "Type error. Expected: " ++ show (typeOf
(undefined::a))
++ ", Inferred: " ++ show (typeOf (undefined::b))
\end{code}
... We let Haskell do all the work!
=
Hmmm... the continuation can be anything.
So let's just make it an existential constructor.
\begin{code}
data ExprBox = forall a. Typeable a => ExprBox (Expr a)
appExprBox :: (forall a. Expr a -> res) -> ExprBox -> res
appExprBox f (ExprBox x) = f x
tcCast :: forall a b c. (Typeable a, Typeable b) => Expr a -> Either String
(Expr b)
tcCast x = maybe err Right $ gcast x
where err = Left $ "Type error. Expected: " ++ show (typeOf
(undefined::a))
++ ", Inferred: " ++ show (typeOf (undefined::b))
\end{code}
Now we can delay deciding what to apply until later.
Typecheck once, execute repeatedly!
=
One more trick -- monadic notation lets us
extend the context of unpacked existentials
to the end of the do block
\begin{code}
retBox x = return (ExprBox $ x, typeOf x)
typeCheck :: SimpleExpr -> Either String (ExprBox, TypeRep)
typeCheck (SDbl d) = retBox (EDbl d)
typeCheck (SBool b) = retBox (EBool b)
typeCheck (SOpBi op s1 s2) = case op of
"add" -> tcBiOp (ENumOpBi Add)
"sub" -> tcBiOp (ENumOpBi Sub)
"and" -> tcBiOp (EBoolOpBi And)
"or" -> tcBiOp (EBoolOpBi Or)
where
tcBiOp constr = do
(ExprBox e1, _) <- typeCheck s1
(ExprBox e2, _) <- typeCheck s2
e1' <- tcCast e1
e2' <- tcCast e2
retBox $ constr e1' e2'
\end{code}
=
So that's fine for *very* simple expressions.
How does it work for interesting GADTs?
(like, for example, HOAS)?
(The prior art doesn't demonstrate HOAS --
it uses DeBruijn.)
=
Our simple world
\begin{code}
type Ident = String
type TypeStr = String
data STerm = SNum Double
| SApp STerm STerm
| SVar Ident
| SLam Ident TypeRep STerm
\end{code}
Note.. terms are Church style -- each var introduced
has a definite type.
Determining this type is left as an exercise.
=
Over the rainbow in well-typed land.
\begin{code}
data Term a where
TNum :: Double -> Term Double
TApp :: Term (a -> b) -> Term a -> Term b
TLam :: Typeable a => (Term a -> Term b) -> Term (a -> b)
TVar :: Typeable a => Int -> Term a
deriving Typeable
\end{code}
Wait! DeBrujin (TVar) *and* HOAS (TLam)! The worst of both worlds.
Don't worry. In the final product all TVars are eliminated by construction.
Exercise to audience: rewrite the code so that TVar can be eliminated
from the Term type.
=
Show and eval...
\begin{code}
showT :: Int -> Term a -> String
showT c (TNum d) = show d
showT c (TApp f x) = "App (" ++ showT c f
++ ") (" ++ showT c x ++ ")"
showT c (TLam f) = "Lam " ++ ("a"++show c)
++ " -> " ++ (showT (succ c) $ f (TVar c))
showT c (TVar i) = "a"++show i
runT :: Term a -> Term a
runT (TNum d) = (TNum d)
runT (TLam f) = (TLam f)
runT (TApp f x) = case runT f of TLam f' -> runT (f' x)
runT (TVar i) = error (show i)
\end{code}
=
The existential
\begin{code}
data TermBox = forall a. Typeable a => TermBox (Term a)
appTermBox :: (forall a. Typeable a => Term a -> res) -> TermBox -> res
appTermBox f (TermBox x) = f x
\end{code}
=
The typechecker returns a box *and* a typeRep.
Cast is the usual trick.
\begin{code}
retTBox :: Typeable a => Term a -> Either String (TermBox, TypeRep)
retTBox x = return (TermBox $ x, typeOf x)
type Env = M.Map Ident (TermBox, TypeRep)
trmCast :: forall a b c. (Typeable a, Typeable b) =>
Term a -> Either String (Term b)
trmCast x = maybe err Right $ gcast x
where err = Left $ "Type error. Expected: " ++
show (typeOf (undefined::a))
++ ", Inferred: " ++
show (typeOf (undefined::b))
\end{code}
=
\begin{code}
typeCheck' :: STerm -> Env -> Either String (TermBox, TypeRep)
typeCheck' t env = go t env 0
where
go (SNum d) _ idx = retTBox (TNum d)
go (SVar i) env idx = do
(TermBox t, _) <- maybe (fail $ "not in scope: " ++ i)
return $ M.lookup i env
retTBox $ t
\end{code}
Nums and vars are easy.
=
App and Lam... less so.
\begin{code}
go (SApp s1 s2) env idx = do
(TermBox e1, tr1) <- go s1 env idx
(TermBox e2, _) <- go s2 env idx
TermBox rt <- return $ mkTerm $ head $ tail $
typeRepArgs $ head $ typeRepArgs $ tr1
-- TypeReps have their... drawbacks.
e1' <- trmCast e1
retTBox $ TApp e1' e2 `asTypeOf` rt
go (SLam i tr s) env idx = do
TermBox et <- return $ mkTerm tr
(TermBox e, _) <- go s
(M.insert i
(TermBox (TVar idx `asTypeOf` et),tr)
env
)
(idx + 1)
retTBox $ TLam (\x -> subst (x `asTypeOf` et) idx e)
\end{code}
=
How does mkTerm work?
\begin{code}
mkTerm :: TypeRep -> TermBox
mkTerm tr = go tr TermBox
where
go :: TypeRep -> (forall a. (Typeable a) => Term a -> res) -> res
go tr k
| tr == typeOf (0::Double) = k (TNum 0)
| typeRepTyCon tr == arrCon =
go (head $ typeRepArgs tr)
$ \xt -> go (head $ tail $ typeRepArgs tr)
$ \y -> k (TLam $ \x -> const y (x `asTypeOf` xt))
arrCon = typeRepTyCon $ typeOf (undefined::Int -> String)
\end{code}
Same principle -- but can build arrows directly.
Doing so requires staying cps... I think.
=
And this is how we get rid of the DeBruijn terms.
\begin{code}
subst :: (Typeable a) => Term a -> Int -> Term b -> Term b
subst t i trm = go trm
where
go :: Term c -> Term c
go (TNum d) = (TNum d)
go (TApp f x) = TApp (go f) (go x)
go (TLam f) = TLam (\a -> go (f a))
go (TVar i')
| i == i' = either error id $ trmCast t
| otherwise = (TVar i')
\end{code}
Q: Now you see why DeBruijn is handy -- substitution
is otherwise a pain.
=
But note -- all functions must be monotyped.
This is the simply typed lambda calculus.
How do we represent TLam (\a -> a)?
The masses demand HM polymorphism.
-
Take 4:
A damn dirty hack.
=
All hacks begin with Nats.
\begin{code}
data Z = Z deriving (Show, Typeable)
data S a = S a deriving (Show, Typeable)
\end{code}
=
typeCheck is almost the same.
\begin{code}
typeCheck'' :: STerm -> Env -> Either String (TermBox, TypeRep)
typeCheck'' t env = go t env 0
where
go :: STerm -> Env -> Int -> Either String (TermBox, TypeRep)
go (SNum d) _ idx = retTBox (TNum d)
go (SVar i) env idx = do
(TermBox t, _) <- maybe (fail $ "not in scope: " ++ i)
return $ M.lookup i env
retTBox $ t
\end{code}
=
\begin{code}
go (SApp s1 s2) env idx = do
(TermBox e1, tr1) <- go s1 env idx
(TermBox e2, _) <- go s2 env idx
TermBox rt <- unifyAppRet e1 e2
e1' <- unifyAppFun e1 e2
retTBox $ TApp e1' e2 `asTypeOf` rt
go (SLam i tr s) env idx = do
TermBox et <- return $ mkTerm' $ tr
(TermBox e, _) <- go s (M.insert i
(TermBox (TVar idx `asTypeOf` et),tr)
env)
(idx + 1)
retTBox $ TLam (\x -> subst (x `asTypeOf` et) idx e)
\end{code}
It looks like we just factored on the nasty arrow code.
=
mkTerm is almost the same... we just extended it to deal with Nats.
\begin{code}
mkTerm' :: TypeRep -> TermBox
mkTerm' tr = go tr TermBox
where
go :: TypeRep -> (forall a. (Typeable a) => Term a -> res) -> res
go tr k
| tr == typeOf (0::Double) = k (TNum 0)
| tr == typeOf Z = k (TVar 0 :: Term Z)
| typeRepTyCon tr == succCon = go (head $ typeRepArgs tr)
$ \t -> k $ succTerm t
| isArr tr =
go (head $ typeRepArgs tr)
$ \xt -> go (head $ tail $ typeRepArgs tr)
$ \y -> k (TLam $ \x -> const y (x `asTypeOf` xt))
| otherwise = error $ show tr
succCon = typeRepTyCon $ typeOf (S Z)
succTerm :: Typeable b => Term b -> Term (S b)
succTerm _ = TVar 0
\end{code}
=
Some utilities
\begin{code}
isArr :: TypeRep -> Bool
isArr x = typeRepTyCon x == (typeRepTyCon $ typeOf (undefined::Int ->
String))
splitArrCon :: TypeRep -> Either String (TypeRep, TypeRep)
splitArrCon x
| isArr x = case typeRepArgs x of
[a,b] -> Right (a,b)
_ -> Left $ "Expected function, inferred: " ++ show x
| otherwise = Left $ "Expected function, inferred: " ++ show x
\end{code}
=
Give an arrow term we unify it with its argument...
and return a witness.
\begin{code}
unifyAppRet :: forall a b. (Typeable a, Typeable b) =>
Term a -> Term b -> Either String TermBox
unifyAppRet x y = do
tr <- unifyAppTyps
(head $ typeRepArgs $ typeOf x)
(head $ typeRepArgs $ typeOf y)
return $ mkTerm' tr
\end{code}
=
Yes. Actual unification.
(although this is unification of a type template,
so at least it is local)
\begin{code}
unifyAppTyps :: TypeRep -> TypeRep -> Either String TypeRep
unifyAppTyps trf trx = do
(fl,fr) <- splitArrCon trf
env <- go M.empty fl trx
subIt env fr
where
-- go yields a substitution environment.
go :: M.Map String TypeRep -> TypeRep ->
TypeRep -> Either String (M.Map String TypeRep)
go env x y
| isFree x = case M.lookup (show x) env of
Just x' -> if x' == y then return env else Left
(error "a")
Nothing -> return $ M.insert (show x) y env
| isArr x = do
(lh,rh) <- splitArrCon x
(lh',rh') <- splitArrCon y
env' <- go env lh lh'
go env' rh rh'
| otherwise = if x == y then return env else Left (error "b")
\end{code}
=
\begin{code}
-- subIt applies it
subIt :: M.Map String TypeRep -> TypeRep -> Either String TypeRep
subIt env x
| isFree x = case M.lookup (show x) env of
Just x' -> return x'
Nothing -> Left (error "c")
| isArr x = do
(lh,rh) <- splitArrCon x
lh' <- subIt env lh
rh' <- subIt env rh
return $ mkTyConApp arrCon [lh',rh']
| otherwise = return x
succCon = typeRepTyCon $ typeOf (S Z)
zCon = typeRepTyCon $ typeOf Z
isFree x = typeRepTyCon x `elem` [zCon, succCon]
arrCon = (typeRepTyCon $ typeOf (undefined::Int -> String))
\end{code}
=
And now, we just have to convince GHC that they unify!
\begin{code}
unifyAppFun :: forall a b c. (Typeable a, Typeable b, Typeable c)
=> Term a -> Term b -> Either String (Term c)
unifyAppFun x y = do
unifyAppTyps (head $ typeRepArgs $ typeOf x) (head $ typeRepArgs $ typeOf
y)
return $ unsafeCoerce x
\end{code}
=
Problem solved.
On Fri, Jun 25, 2010 at 2:03 PM, Edward Kmett
It turns out that defining Read is somewhat tricky to do for a GADT.
Gershom Bazerman has some very nice slides on how to survive the process by manual typechecking (much in the spirit of Oleg's meta-typechecking code referenced by Stephen's follow up below)
He presented them at hac-phi this time around.
I will check with him to see if I can get permission to host them somewhere and post a link to them here.
-Edward Kmett
On Fri, Jun 25, 2010 at 5:04 AM,
wrote: Hello Haskellers,
I'm having trouble writing a Read Instance for my GATD. Arg this GATD!! It causes me more problems that it solves ;) Especially with no automatic deriving, it adds a lot of burden to my code.
data Obs a where ProposedBy :: Obs Int -- The player that proposed the tested rule Turn :: Obs Turn -- The current turn Official :: Obs Bool -- whereas the tested rule is official Equ :: (Eq a, Show a, Typeable a) => Obs a -> Obs a -> Obs Bool Plus :: (Num a) => Obs a -> Obs a -> Obs a Time :: (Num a) => Obs a -> Obs a -> Obs a Minus :: (Num a) => Obs a -> Obs a -> Obs a And :: Obs Bool -> Obs Bool -> Obs Bool Or :: Obs Bool -> Obs Bool -> Obs Bool Not :: Obs Bool -> Obs Bool Konst :: a -> Obs a
instance Read a => Read (Obs a) where readPrec = (prec 10 $ do Ident "ProposedBy" <- lexP return (ProposedBy)) +++ (prec 10 $ do Ident "Official" <- lexP return (Official)) (etc...)
Observable.lhs:120:8: Couldn't match expected type `Int' against inferred type `Bool' Expected type: ReadPrec (Obs Int) Inferred type: ReadPrec (Obs Bool)
Indeed "ProposedBy" does not have the same type that "Official". Mmh how to make it all gently mix altogether?
Best, Corentin
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

That's really nice. Very interesting. Thank you.
On Fri, Jun 25, 2010 at 9:55 PM, Edward Kmett
I've obtained permission to repost Gershom's slides on how to deserialize GADTs by typechecking them yourself, which are actually a literate haskell source file, that he was rendering to slides. Consequently, I've just pasted the content below as a literate email.
-Edward Kmett
- Deserializing strongly typed values
(four easy pieces about typechecking)
Gershom Bazerman
- prior art: Oleg (of course) http://okmij.org/ftp/tagless-final/course/course.html
...but also Stephanie Weirich http://www.comlab.ox.ac.uk/projects/gip/school/tc.hs
= Ahem... \begin{code} {-# LANGUAGE DeriveDataTypeable, ExistentialQuantification, FlexibleContexts, FlexibleInstances, FunctionalDependencies, GADTs, RankNTypes, ScopedTypeVariables #-} \end{code} = ahem. \begin{code} import Data.Typeable import Data.Maybe import Control.Monad.Error import Control.Applicative import qualified Data.Map as M import Unsafe.Coerce \end{code} = A simple ADT.
\begin{code} data SimpleExpr = SOpBi String SimpleExpr SimpleExpr | SOpUn String SimpleExpr | SDbl Double | SBool Bool deriving (Read, Show, Typeable)
\end{code} Yawn. = An awesome GADT!
\begin{code} data Expr a where EDbl :: Double -> Expr Double EBool :: Bool -> Expr Bool EBoolOpBi :: BoolOpBi -> Expr Bool -> Expr Bool -> Expr Bool ENumOpBi :: NumOpBi -> Expr Double -> Expr Double -> Expr Double ENumOpUn :: NumOpUn -> Expr Double -> Expr Double deriving Typeable
data NumOpBi = Add | Sub deriving (Eq, Show) data NumOpUn = Log | Exp deriving (Eq, Show) data BoolOpBi = And | Or deriving (Eq, Show) \end{code}
The GADT is well typed. It cannot go wrong. - It also cannot derive show. = But we can write show...
\begin{code} showIt :: Expr a -> String showIt (EDbl d) = show d showIt (EBool b) = show b showIt (EBoolOpBi op x y) = "(" ++ show op ++ " " ++ showIt x ++ " " ++ showIt y ++ ")" showIt (ENumOpBi op x y) = "(" ++ show op ++ " " ++ showIt x ++ " " ++ showIt y ++ ")" showIt (ENumOpUn op x) = show op ++ "(" ++ showIt x ++ ")" \end{code} = And eval is *much nicer*. It cannot go wrong --> no runtime typechecks.
\begin{code} evalIt :: Expr a -> a evalIt (EDbl x) = x evalIt (EBool x) = x evalIt (EBoolOpBi op expr1 expr2) | op == And = evalIt expr1 && evalIt expr2 | op == Or = evalIt expr2 || evalIt expr2
evalIt (ENumOpBi op expr1 expr2) | op == Add = evalIt expr1 + evalIt expr2 | op == Sub = evalIt expr1 - evalIt expr2 \end{code} = But how do we write read!?
read "EBool False" = Expr Bool read "EDbl 12" = Expr Double
The type being read depends on the content of the string.
Even worse, we want to read not from a string that looks obvious to Haskell (i.e. a standard showlike instance) but from something that looks pretty to the user -- we want to *parse*.
So we parse into our simple ADT.
Then we turn our simple ADT into our nice GADT. - But how?
How do we go from untyped... to typed?
[And in general -- not just into an arbitrary GADT, but an arbitrary inhabitant of a typeclass.]
[i.e. tagless final, etc]
= Take 1: Even if we do not know what type we are creating, we eventually will do something with it.
So we paramaterize our typechecking function over an arbitrary continuation.
\begin{code} mkExpr :: (forall a. (Show a, Typeable a) => Expr a -> r) -> SimpleExpr -> r mkExpr k expr = case expr of SDbl d -> k $ EDbl d SBool b -> k $ EBool b SOpUn op expr1 -> case op of "log" -> k $ mkExpr' (ENumOpUn Log) expr1 "exp" -> k $ mkExpr' (ENumOpUn Exp) expr1 _ -> error "bad unary op" SOpBi op expr1 expr2 -> case op of "add" -> k $ mkExprBi (ENumOpBi Add) expr1 expr2 "sub" -> k $ mkExprBi (ENumOpBi Sub) expr1 expr2 \end{code} = Where's the typechecking?
\begin{code} mkExpr' k expr = mkExpr (appCast $ k) expr
mkExprBi k expr1 expr2 = mkExpr' (mkExpr' k expr1) expr2
appCast :: forall a b c r. (Typeable a, Typeable b) => (c a -> r) -> c b -> r appCast f x = maybe err f $ gcast x where err = error $ "Type error. Expected: " ++ show (typeOf (undefined::a)) ++ ", Inferred: " ++ show (typeOf (undefined::b)) \end{code}
... We let Haskell do all the work! = Hmmm... the continuation can be anything. So let's just make it an existential constructor.
\begin{code} data ExprBox = forall a. Typeable a => ExprBox (Expr a)
appExprBox :: (forall a. Expr a -> res) -> ExprBox -> res appExprBox f (ExprBox x) = f x
tcCast :: forall a b c. (Typeable a, Typeable b) => Expr a -> Either String (Expr b) tcCast x = maybe err Right $ gcast x where err = Left $ "Type error. Expected: " ++ show (typeOf (undefined::a)) ++ ", Inferred: " ++ show (typeOf (undefined::b)) \end{code}
Now we can delay deciding what to apply until later. Typecheck once, execute repeatedly!
= One more trick -- monadic notation lets us extend the context of unpacked existentials to the end of the do block
\begin{code} retBox x = return (ExprBox $ x, typeOf x)
typeCheck :: SimpleExpr -> Either String (ExprBox, TypeRep) typeCheck (SDbl d) = retBox (EDbl d) typeCheck (SBool b) = retBox (EBool b) typeCheck (SOpBi op s1 s2) = case op of "add" -> tcBiOp (ENumOpBi Add) "sub" -> tcBiOp (ENumOpBi Sub) "and" -> tcBiOp (EBoolOpBi And) "or" -> tcBiOp (EBoolOpBi Or) where tcBiOp constr = do (ExprBox e1, _) <- typeCheck s1 (ExprBox e2, _) <- typeCheck s2 e1' <- tcCast e1 e2' <- tcCast e2 retBox $ constr e1' e2' \end{code} = So that's fine for *very* simple expressions. How does it work for interesting GADTs? (like, for example, HOAS)?
(The prior art doesn't demonstrate HOAS -- it uses DeBruijn.) = Our simple world
\begin{code} type Ident = String type TypeStr = String
data STerm = SNum Double | SApp STerm STerm | SVar Ident | SLam Ident TypeRep STerm \end{code}
Note.. terms are Church style -- each var introduced has a definite type.
Determining this type is left as an exercise.
= Over the rainbow in well-typed land.
\begin{code} data Term a where TNum :: Double -> Term Double TApp :: Term (a -> b) -> Term a -> Term b TLam :: Typeable a => (Term a -> Term b) -> Term (a -> b) TVar :: Typeable a => Int -> Term a deriving Typeable \end{code}
Wait! DeBrujin (TVar) *and* HOAS (TLam)! The worst of both worlds.
Don't worry. In the final product all TVars are eliminated by construction.
Exercise to audience: rewrite the code so that TVar can be eliminated from the Term type. = Show and eval...
\begin{code} showT :: Int -> Term a -> String showT c (TNum d) = show d showT c (TApp f x) = "App (" ++ showT c f ++ ") (" ++ showT c x ++ ")" showT c (TLam f) = "Lam " ++ ("a"++show c) ++ " -> " ++ (showT (succ c) $ f (TVar c)) showT c (TVar i) = "a"++show i
runT :: Term a -> Term a runT (TNum d) = (TNum d) runT (TLam f) = (TLam f) runT (TApp f x) = case runT f of TLam f' -> runT (f' x) runT (TVar i) = error (show i) \end{code} = The existential
\begin{code} data TermBox = forall a. Typeable a => TermBox (Term a) appTermBox :: (forall a. Typeable a => Term a -> res) -> TermBox -> res appTermBox f (TermBox x) = f x \end{code} = The typechecker returns a box *and* a typeRep.
Cast is the usual trick.
\begin{code} retTBox :: Typeable a => Term a -> Either String (TermBox, TypeRep) retTBox x = return (TermBox $ x, typeOf x)
type Env = M.Map Ident (TermBox, TypeRep)
trmCast :: forall a b c. (Typeable a, Typeable b) => Term a -> Either String (Term b) trmCast x = maybe err Right $ gcast x where err = Left $ "Type error. Expected: " ++ show (typeOf (undefined::a)) ++ ", Inferred: " ++ show (typeOf (undefined::b)) \end{code}
= \begin{code} typeCheck' :: STerm -> Env -> Either String (TermBox, TypeRep) typeCheck' t env = go t env 0 where go (SNum d) _ idx = retTBox (TNum d) go (SVar i) env idx = do (TermBox t, _) <- maybe (fail $ "not in scope: " ++ i) return $ M.lookup i env retTBox $ t \end{code}
Nums and vars are easy. = App and Lam... less so.
\begin{code} go (SApp s1 s2) env idx = do (TermBox e1, tr1) <- go s1 env idx (TermBox e2, _) <- go s2 env idx TermBox rt <- return $ mkTerm $ head $ tail $ typeRepArgs $ head $ typeRepArgs $ tr1 -- TypeReps have their... drawbacks. e1' <- trmCast e1 retTBox $ TApp e1' e2 `asTypeOf` rt go (SLam i tr s) env idx = do TermBox et <- return $ mkTerm tr (TermBox e, _) <- go s (M.insert i (TermBox (TVar idx `asTypeOf` et),tr) env ) (idx + 1) retTBox $ TLam (\x -> subst (x `asTypeOf` et) idx e) \end{code} = How does mkTerm work?
\begin{code} mkTerm :: TypeRep -> TermBox mkTerm tr = go tr TermBox where go :: TypeRep -> (forall a. (Typeable a) => Term a -> res) -> res go tr k | tr == typeOf (0::Double) = k (TNum 0) | typeRepTyCon tr == arrCon = go (head $ typeRepArgs tr) $ \xt -> go (head $ tail $ typeRepArgs tr) $ \y -> k (TLam $ \x -> const y (x `asTypeOf` xt))
arrCon = typeRepTyCon $ typeOf (undefined::Int -> String) \end{code}
Same principle -- but can build arrows directly.
Doing so requires staying cps... I think. = And this is how we get rid of the DeBruijn terms.
\begin{code} subst :: (Typeable a) => Term a -> Int -> Term b -> Term b subst t i trm = go trm where go :: Term c -> Term c go (TNum d) = (TNum d) go (TApp f x) = TApp (go f) (go x) go (TLam f) = TLam (\a -> go (f a)) go (TVar i') | i == i' = either error id $ trmCast t | otherwise = (TVar i') \end{code}
Q: Now you see why DeBruijn is handy -- substitution is otherwise a pain.
= But note -- all functions must be monotyped. This is the simply typed lambda calculus.
How do we represent TLam (\a -> a)?
The masses demand HM polymorphism. -
Take 4:
A damn dirty hack.
= All hacks begin with Nats.
\begin{code} data Z = Z deriving (Show, Typeable) data S a = S a deriving (Show, Typeable) \end{code} = typeCheck is almost the same.
\begin{code} typeCheck'' :: STerm -> Env -> Either String (TermBox, TypeRep) typeCheck'' t env = go t env 0 where go :: STerm -> Env -> Int -> Either String (TermBox, TypeRep) go (SNum d) _ idx = retTBox (TNum d) go (SVar i) env idx = do (TermBox t, _) <- maybe (fail $ "not in scope: " ++ i) return $ M.lookup i env retTBox $ t \end{code} = \begin{code} go (SApp s1 s2) env idx = do (TermBox e1, tr1) <- go s1 env idx (TermBox e2, _) <- go s2 env idx TermBox rt <- unifyAppRet e1 e2 e1' <- unifyAppFun e1 e2 retTBox $ TApp e1' e2 `asTypeOf` rt go (SLam i tr s) env idx = do TermBox et <- return $ mkTerm' $ tr (TermBox e, _) <- go s (M.insert i (TermBox (TVar idx `asTypeOf` et),tr) env) (idx + 1) retTBox $ TLam (\x -> subst (x `asTypeOf` et) idx e) \end{code}
It looks like we just factored on the nasty arrow code. = mkTerm is almost the same... we just extended it to deal with Nats. \begin{code} mkTerm' :: TypeRep -> TermBox mkTerm' tr = go tr TermBox where go :: TypeRep -> (forall a. (Typeable a) => Term a -> res) -> res go tr k | tr == typeOf (0::Double) = k (TNum 0) | tr == typeOf Z = k (TVar 0 :: Term Z) | typeRepTyCon tr == succCon = go (head $ typeRepArgs tr) $ \t -> k $ succTerm t | isArr tr = go (head $ typeRepArgs tr) $ \xt -> go (head $ tail $ typeRepArgs tr) $ \y -> k (TLam $ \x -> const y (x `asTypeOf` xt)) | otherwise = error $ show tr
succCon = typeRepTyCon $ typeOf (S Z) succTerm :: Typeable b => Term b -> Term (S b) succTerm _ = TVar 0 \end{code} = Some utilities \begin{code} isArr :: TypeRep -> Bool isArr x = typeRepTyCon x == (typeRepTyCon $ typeOf (undefined::Int -> String))
splitArrCon :: TypeRep -> Either String (TypeRep, TypeRep) splitArrCon x | isArr x = case typeRepArgs x of [a,b] -> Right (a,b) _ -> Left $ "Expected function, inferred: " ++ show x | otherwise = Left $ "Expected function, inferred: " ++ show x
\end{code} = Give an arrow term we unify it with its argument... and return a witness. \begin{code} unifyAppRet :: forall a b. (Typeable a, Typeable b) => Term a -> Term b -> Either String TermBox unifyAppRet x y = do tr <- unifyAppTyps (head $ typeRepArgs $ typeOf x) (head $ typeRepArgs $ typeOf y) return $ mkTerm' tr
\end{code} = Yes. Actual unification. (although this is unification of a type template, so at least it is local) \begin{code} unifyAppTyps :: TypeRep -> TypeRep -> Either String TypeRep unifyAppTyps trf trx = do (fl,fr) <- splitArrCon trf env <- go M.empty fl trx subIt env fr where -- go yields a substitution environment. go :: M.Map String TypeRep -> TypeRep -> TypeRep -> Either String (M.Map String TypeRep) go env x y | isFree x = case M.lookup (show x) env of Just x' -> if x' == y then return env else Left (error "a") Nothing -> return $ M.insert (show x) y env | isArr x = do (lh,rh) <- splitArrCon x (lh',rh') <- splitArrCon y env' <- go env lh lh' go env' rh rh' | otherwise = if x == y then return env else Left (error "b") \end{code} = \begin{code} -- subIt applies it subIt :: M.Map String TypeRep -> TypeRep -> Either String TypeRep subIt env x | isFree x = case M.lookup (show x) env of Just x' -> return x' Nothing -> Left (error "c") | isArr x = do (lh,rh) <- splitArrCon x lh' <- subIt env lh rh' <- subIt env rh return $ mkTyConApp arrCon [lh',rh'] | otherwise = return x succCon = typeRepTyCon $ typeOf (S Z) zCon = typeRepTyCon $ typeOf Z isFree x = typeRepTyCon x `elem` [zCon, succCon] arrCon = (typeRepTyCon $ typeOf (undefined::Int -> String)) \end{code}
= And now, we just have to convince GHC that they unify!
\begin{code} unifyAppFun :: forall a b c. (Typeable a, Typeable b, Typeable c) => Term a -> Term b -> Either String (Term c) unifyAppFun x y = do unifyAppTyps (head $ typeRepArgs $ typeOf x) (head $ typeRepArgs $ typeOf y) return $ unsafeCoerce x \end{code} = Problem solved.
On Fri, Jun 25, 2010 at 2:03 PM, Edward Kmett
wrote: It turns out that defining Read is somewhat tricky to do for a GADT.
Gershom Bazerman has some very nice slides on how to survive the process by manual typechecking (much in the spirit of Oleg's meta-typechecking code referenced by Stephen's follow up below)
He presented them at hac-phi this time around.
I will check with him to see if I can get permission to host them somewhere and post a link to them here.
-Edward Kmett
On Fri, Jun 25, 2010 at 5:04 AM,
wrote: Hello Haskellers,
I'm having trouble writing a Read Instance for my GATD. Arg this GATD!! It causes me more problems that it solves ;) Especially with no automatic deriving, it adds a lot of burden to my code.
data Obs a where ProposedBy :: Obs Int -- The player that proposed the tested rule Turn :: Obs Turn -- The current turn Official :: Obs Bool -- whereas the tested rule is official Equ :: (Eq a, Show a, Typeable a) => Obs a -> Obs a -> Obs Bool Plus :: (Num a) => Obs a -> Obs a -> Obs a Time :: (Num a) => Obs a -> Obs a -> Obs a Minus :: (Num a) => Obs a -> Obs a -> Obs a And :: Obs Bool -> Obs Bool -> Obs Bool Or :: Obs Bool -> Obs Bool -> Obs Bool Not :: Obs Bool -> Obs Bool Konst :: a -> Obs a
instance Read a => Read (Obs a) where readPrec = (prec 10 $ do Ident "ProposedBy" <- lexP return (ProposedBy)) +++ (prec 10 $ do Ident "Official" <- lexP return (Official)) (etc...)
Observable.lhs:120:8: Couldn't match expected type `Int' against inferred type `Bool' Expected type: ReadPrec (Obs Int) Inferred type: ReadPrec (Obs Bool)
Indeed "ProposedBy" does not have the same type that "Official". Mmh how to make it all gently mix altogether?
Best, Corentin
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (2)
-
Dupont Corentin
-
Edward Kmett