Finally tagless - stuck with implementation of "lam"

Hi all, I'm playing around with finally tagless. Here is the class for my Syntax: class HOAS repr where lam :: (repr a -> repr b) -> repr (a -> b) app :: repr (a -> b) -> repr a -> repr b fix :: (repr a -> repr a) -> repr a let_ :: repr a -> (repr a -> repr b) -> repr b int :: Int -> repr Int add :: repr Int -> repr Int -> repr Int sub :: repr Int -> repr Int -> repr Int mul :: repr Int -> repr Int -> repr Int and here is one instance of that class for semantics. newtype I a = I { unI :: IO a } instance HOAS I where app e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 return $ e1' e2') int i = I (putStrLn ("setting an integer: " ++ show i) >> return i) add e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "adding %d with %d" e1' e2') return $ e1' + e2') sub e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "subtracting %d from %d" e1' e2') return $ e1' - e2') mul e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "multiplying %d with %d" e1' e2') return $ e1' * e2') I'm stuck with the "lam" method, for 2 days now I've been trying to get it right, but failed. Is there a possibility that it isn't even possible? Günther

It's possible, but it's not nice. You need to be able to "get out of the monad" to make the types match, i.e. lam f = I (return $ \x -> let y = I (return x) in unsafePerformIO $ unI (f y)) The use of IO 'forces' lam to transform its effectful input into an even more effectful result. Actually, same goes for any monad used inside a 'repr'. let_ and fix follow a similar pattern (although you can hide that somewhat by re-using lam if you wish). Jacques Günther Schmidt wrote:
Hi all,
I'm playing around with finally tagless.
Here is the class for my Syntax:
class HOAS repr where lam :: (repr a -> repr b) -> repr (a -> b) app :: repr (a -> b) -> repr a -> repr b fix :: (repr a -> repr a) -> repr a let_ :: repr a -> (repr a -> repr b) -> repr b
int :: Int -> repr Int add :: repr Int -> repr Int -> repr Int sub :: repr Int -> repr Int -> repr Int mul :: repr Int -> repr Int -> repr Int
and here is one instance of that class for semantics.
newtype I a = I { unI :: IO a }
instance HOAS I where app e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 return $ e1' e2') int i = I (putStrLn ("setting an integer: " ++ show i) >> return i) add e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "adding %d with %d" e1' e2') return $ e1' + e2') sub e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "subtracting %d from %d" e1' e2') return $ e1' - e2') mul e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "multiplying %d with %d" e1' e2') return $ e1' * e2')
I'm stuck with the "lam" method, for 2 days now I've been trying to get it right, but failed.
Is there a possibility that it isn't even possible?
Günther
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hello Jacques,
thanks, that is disappointing in some way, guess I still have a lot to
learn.
Günther
Am 05.10.2009, 18:06 Uhr, schrieb Jacques Carette
It's possible, but it's not nice. You need to be able to "get out of the monad" to make the types match, i.e. lam f = I (return $ \x -> let y = I (return x) in unsafePerformIO $ unI (f y))
The use of IO 'forces' lam to transform its effectful input into an even more effectful result. Actually, same goes for any monad used inside a 'repr'.
let_ and fix follow a similar pattern (although you can hide that somewhat by re-using lam if you wish).
Jacques
Günther Schmidt wrote:
Hi all,
I'm playing around with finally tagless.
Here is the class for my Syntax:
class HOAS repr where lam :: (repr a -> repr b) -> repr (a -> b) app :: repr (a -> b) -> repr a -> repr b fix :: (repr a -> repr a) -> repr a let_ :: repr a -> (repr a -> repr b) -> repr b
int :: Int -> repr Int add :: repr Int -> repr Int -> repr Int sub :: repr Int -> repr Int -> repr Int mul :: repr Int -> repr Int -> repr Int
and here is one instance of that class for semantics.
newtype I a = I { unI :: IO a }
instance HOAS I where app e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 return $ e1' e2') int i = I (putStrLn ("setting an integer: " ++ show i) >> return i) add e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "adding %d with %d" e1' e2') return $ e1' + e2') sub e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "subtracting %d from %d" e1' e2') return $ e1' - e2') mul e1 e2 = I (do e1' <- unI e1 e2' <- unI e2 putStrLn (printf "multiplying %d with %d" e1' e2') return $ e1' * e2')
I'm stuck with the "lam" method, for 2 days now I've been trying to get it right, but failed.
Is there a possibility that it isn't even possible?
Günther
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

BTW, here is a more symmetric version of the same code:
lam f = I . return $ unsafePerformIO . unI . f . I . return
which has a very clear pattern of
lam f = embed $ extract . f. embed
where 'embed' is often called "return". Implementations of lam in
"tagless final" style tend to follow the above pattern of embed/extract,
each specialized to the particular repr implementation.
In some cases, one only needs a "context-sensitive" extract, i.e. an
extract which is only valid in the context of an outer 'embed'. The
most important example is code-generation in metaocaml where the
'compiler' version of lam reads
let lam f = .
Hello Jacques,
thanks, that is disappointing in some way, guess I still have a lot to learn.
Günther
Am 05.10.2009, 18:06 Uhr, schrieb Jacques Carette
: It's possible, but it's not nice. You need to be able to "get out of the monad" to make the types match, i.e. lam f = I (return $ \x -> let y = I (return x) in unsafePerformIO $ unI (f y))
The use of IO 'forces' lam to transform its effectful input into an even more effectful result. Actually, same goes for any monad used inside a 'repr'.
let_ and fix follow a similar pattern (although you can hide that somewhat by re-using lam if you wish).
Jacques

Hi Günther, The underlying problem with the implementation of 'lam' is that you have to pick an evaluation order for the side effects you want in the semantics of your embedded language. The two obvious options are call-by-name and call-by-value. The tricky point is that the types of the embedded language cannot be interpreted directly as Haskell types wrapped in a monad. You need to either take the call-by-name or the call-by-value interpretation of types: For CBN: [[ Int ]] = IO Int [[ A -> B ]] = [[ A ]] -> [[ B ]] For CBV: [[ Int ]] = Int [[ A -> B ]] = [[ A ]] -> IO [[ B ]] (obviously, there is nothing special about the IO monad here, or in the rest of this email: we could replace it with any other monad). To implement the translation of embedded language types to Haskell types in Haskell we use type families. Here is the code for call-by-name:
module CBN where
First of all, we define some dummy types so that we have a different representation of types in the embedded language than those in the meta-language, and we don't get confused:
data IntT data a :-> b infixr 5 :->
Now we give the definition of the higher-order abstract syntax of our embedded language:
class HOAS exp where lam :: (exp a -> exp b) -> exp (a :-> b) app :: exp (a :-> b) -> exp a -> exp b
int :: Int -> exp IntT add :: exp IntT -> exp IntT -> exp IntT
This is as in your definition, except I have used the different type names to highlight the distinction between embedded and meta-language types. Next, we give the CBN semantics for the types, using the definition above, with a type family.
type family Sem a :: * type instance Sem IntT = IO Int type instance Sem (a :-> b) = Sem a -> Sem b
Now we can give an instance of the syntax's type class, that translates a piece of embedded code into its denotational semantics.
newtype S a = S { unS :: Sem a }
instance HOAS S where int = S . return add (S x) (S y) = S $ do a <- x b <- y putStrLn "Adding" return (a + b)
lam f = S (unS . f . S) app (S x) (S y) = S (x y)
As an example, we can give a term and see what happens when we evaluate it:
let_ :: HOAS exp => exp a -> (exp a -> exp b) -> exp b let_ x y = (lam y) `app` x
t :: HOAS exp => exp IntT t = (lam $ \x -> let_ (x `add` x) $ \y -> y `add` y) `app` int 10
Evaluating 'unS t' in GHCi gives the output: Adding Adding Adding 40 Note that the add operation is invoked three times: the "x `add` x" that is let-bound is evaluated twice. The second option is call-by-value. This is a bit more interesting because we have to change the syntax of the embedded language to deal with the different substitution rules for CBV.
module CBV where
Again, we use a different representation for embedded-language types:
data IntT data a :-> b infixr 5 :->
There is a difference in the syntax between CBN and CBV that is not always obvious from the usual paper presentations. There is a split between pieces of syntax that are "values" and those that are "computations". Values do not have side-effects, while computations may. In this presentation, I have chosen that the only values are variables, while everything else is a computation. To represent this split, we use a multi-parameter type class. Note that the technique of using multi-parameter type classes is generally useful for representing abstract syntaxes in the HOAS style with multiple mutually defined syntactic categories.
class HOAS exp val | exp -> val, val -> exp where val :: val a -> exp a
lam :: (val a -> exp b) -> exp (a :-> b) app :: exp (a :-> b) -> exp a -> exp b
int :: Int -> exp IntT add :: exp IntT -> exp IntT -> exp IntT
The 'val' operation injects values into computations: this will be interpreted by the 'return' of the monad. We now give the semantics of types for CBV, following the definition above:
type family Sem a :: * type instance Sem IntT = Int type instance Sem (a :-> b) = Sem a -> IO (Sem b)
We need two newtypes for the two different syntactic categories. Values are interpeted by 'SV', and are just pure values; computations are interpreted by wrapping them in the monad.
newtype SV a = SV { unSV :: Sem a } newtype SC a = SC { unSC :: IO (Sem a) }
Now we can define the semantics of the syntax in terms of these types:
instance HOAS SC SV where val = SC . return . unSV
lam f = SC $ return (unSC . f . SV) app (SC x) (SC y) = SC $ do f <- x; a <- y; f a
int i = SC $ return i add (SC x) (SC y) = SC $ do a <- x b <- y putStrLn "Adding" return (a + b)
The same example term is expressed in CBV as follows:
let_ :: HOAS exp val => exp a -> (val a -> exp b) -> exp b let_ x y = (lam y) `app` x
t :: HOAS exp val => exp IntT t = (lam $ \x -> let_ (val x `add` val x) $ \y -> val y `add` val y) `app` int 10
Running 'unSC t' in GHCi gives the output: Adding Adding 40 Note that the add operation is only invoked twice this time: the let-bound expression is evaluated to a value, and then this value is used twice in the body of the let. The other obvious evaluation strategy is call-by-need, as used by Haskell itself, but I don't know if it is possible to implement this. I guess it may be possible using the ST monad as the mutable storage of thunks. Hope this was helpful. Bob -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

2009/10/5 Robert Atkey
Hi Günther,
The underlying problem with the implementation of 'lam' is that you have to pick an evaluation order for the side effects you want in the semantics of your embedded language. The two obvious options are call-by-name and call-by-value.
I wonder how easily one can provide both, like in Algol.
The other obvious evaluation strategy is call-by-need, as used by Haskell itself, but I don't know if it is possible to implement this. I guess it may be possible using the ST monad as the mutable storage of thunks.
Perhaps something along the lines of "Purely Functional Lazy
Non-deterministic Programming"?
http://okmij.org/ftp/Computation/monads.html#lazy-sharing-nondet
Obviously, doing a deterministic version is simpler. You can probably
get away with representing values as simple self-evaluating thunks.
data Thunk s a = STRef s (Either a (ST s a))
evalThunk :: Thunk s a -> ST s a
evalThunk r = readSTRef r >>= either return update
where
update m = m >>= \x -> writeSTRef r (Left x) >> return x
--
Dave Menendez

On Mon, 2009-10-05 at 19:22 -0400, David Menendez wrote:
The two obvious options are call-by-name and call-by-value.
I wonder how easily one can provide both, like in Algol.
Fairly easy, you can either do a language that has an explicit monad (a bit like Haskell with only the IO monad), or you can go the whole hog and embed Paul Levy's Call-by-push-value. This requires a bit more cleverness, and an explicit representation of the embedded types in order to define the algebra maps on the computation types.
Obviously, doing a deterministic version is simpler. You can probably get away with representing values as simple self-evaluating thunks.
data Thunk s a = STRef s (Either a (ST s a))
evalThunk :: Thunk s a -> ST s a evalThunk r = readSTRef r >>= either return update where update m = m >>= \x -> writeSTRef r (Left x) >> return x
I had a go at this and it seems to work. Happily, it is a tiny tweak to the CBV implementation I gave: you just replace "return" with your "evalThunk" (or something like it) and wrap the argument in the 'app' case with a function that creates a thunk. Bob -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

Robert Atkey
To implement the translation of embedded language types to Haskell types in Haskell we use type families.
This type-to-type translation is indeed the crux of the trickiness. By the way, Section 4.3 of our (JFP) paper describes how to follow such a translation without type families. If I were to avoid type families in Haskell, I would make Symantics into a multiparameter type class class HOAS repr arrow int where lam :: (repr a -> repr b) -> repr (arrow a b) app :: repr (arrow a b) -> repr a -> repr b fix :: (repr a -> repr a) -> repr a let_ :: repr a -> (repr a -> repr b) -> repr b int :: int -> repr int add :: repr int -> repr int -> repr int sub :: repr int -> repr int -> repr int mul :: repr int -> repr int -> repr int
The underlying problem with the implementation of 'lam' is that you have to pick an evaluation order for the side effects you want in the semantics of your embedded language. The two obvious options are call-by-name and call-by-value.
(Section 5 of our (JFP) paper addresses both CBN and CBV.) -- Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig Here's a souvenir of it. He made me write this down so I'd remember to get this book and read it. Transcendent Speculations on Apparent Design in the Fate of the Individual, that's a mouthful isn't it. I wrote this down at gun-point. - Gaddis

ccshan:
class HOAS repr arrow int where
lam :: (repr a -> repr b) -> repr (arrow a b) app :: repr (arrow a b) -> repr a -> repr b fix :: (repr a -> repr a) -> repr a let_ :: repr a -> (repr a -> repr b) -> repr b
int :: int -> repr int add :: repr int -> repr int -> repr int sub :: repr int -> repr int -> repr int mul :: repr int -> repr int -> repr int
The underlying problem with the implementation of 'lam' is that you have to pick an evaluation order for the side effects you want in the semantics of your embedded language. The two obvious options are call-by-name and call-by-value.
(Section 5 of our (JFP) paper addresses both CBN and CBV.)
Do you have a link to the paper? -- Don

Don Stewart
ccshan:
(Section 5 of our (JFP) paper addresses both CBN and CBV.) Do you have a link to the paper?
Sorry, here it is. http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf -- Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig Here's a souvenir of it. He made me write this down so I'd remember to get this book and read it. Transcendent Speculations on Apparent Design in the Fate of the Individual, that's a mouthful isn't it. I wrote this down at gun-point. - Gaddis

On Mon, 2009-10-05 at 22:06 -0400, Chung-chieh Shan wrote:
Robert Atkey
wrote in article <1254778973.3675.42.camel@bismuth> in gmane.comp.lang.haskell.cafe: To implement the translation of embedded language types to Haskell types in Haskell we use type families.
This type-to-type translation is indeed the crux of the trickiness. By the way, Section 4.3 of our (JFP) paper describes how to follow such a translation without type families. If I were to avoid type families in Haskell, I would make Symantics into a multiparameter type class
Yes, this is another way to do it. I prefer having an explicit representation of the types of the embedded language though. For one, the multi-parameter type classes get a bit unwieldy if you have lots of types in your embedded language. Also, sometimes you might want to do special things with the denotations of each type. For instance, I did an embedding of Levy's CBPV, where the type system is split into value types and computation types. For the computation types X there is always an algebra map of type m [[ X ]] -> [[ X ]]. To define this I needed a term level representative of the type, and also the type family to give the semantics of the embedded type. I don't know if this is easily done using multi-parameter type classes.
The underlying problem with the implementation of 'lam' is that you have to pick an evaluation order for the side effects you want in the semantics of your embedded language. The two obvious options are call-by-name and call-by-value.
(Section 5 of our (JFP) paper addresses both CBN and CBV.)
Yes, thanks for reminding me. I vaguely remembered when I went to bed after posting that you had done something via Plotkin's CPS transformations. I rather like the direct approach though; sometimes it is nice not to have to solve every problem by hitting it with the continuations hammer ;). Bob -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

On Mon, 2009-10-05 at 22:42 +0100, Robert Atkey wrote:
There is a difference in the syntax between CBN and CBV that is not always obvious from the usual paper presentations. There is a split between pieces of syntax that are "values" and those that are "computations". Values do not have side-effects, while computations may. In this presentation, I have chosen that the only values are variables, while everything else is a computation.
I should correct this: it is possible to give the CBV semantics to the original syntax definition (with the single parameter in the type class), you just have to move the "return" of the monad into the "lam" case. I implied in the paragraph above that the split between values and computations was required for CBV case; this is not true. Bob -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
participants (6)
-
Chung-chieh Shan
-
David Menendez
-
Don Stewart
-
Günther Schmidt
-
Jacques Carette
-
Robert Atkey