
Hi Günther, The underlying problem with the implementation of 'lam' is that you have to pick an evaluation order for the side effects you want in the semantics of your embedded language. The two obvious options are call-by-name and call-by-value. The tricky point is that the types of the embedded language cannot be interpreted directly as Haskell types wrapped in a monad. You need to either take the call-by-name or the call-by-value interpretation of types: For CBN: [[ Int ]] = IO Int [[ A -> B ]] = [[ A ]] -> [[ B ]] For CBV: [[ Int ]] = Int [[ A -> B ]] = [[ A ]] -> IO [[ B ]] (obviously, there is nothing special about the IO monad here, or in the rest of this email: we could replace it with any other monad). To implement the translation of embedded language types to Haskell types in Haskell we use type families. Here is the code for call-by-name:
module CBN where
First of all, we define some dummy types so that we have a different representation of types in the embedded language than those in the meta-language, and we don't get confused:
data IntT data a :-> b infixr 5 :->
Now we give the definition of the higher-order abstract syntax of our embedded language:
class HOAS exp where lam :: (exp a -> exp b) -> exp (a :-> b) app :: exp (a :-> b) -> exp a -> exp b
int :: Int -> exp IntT add :: exp IntT -> exp IntT -> exp IntT
This is as in your definition, except I have used the different type names to highlight the distinction between embedded and meta-language types. Next, we give the CBN semantics for the types, using the definition above, with a type family.
type family Sem a :: * type instance Sem IntT = IO Int type instance Sem (a :-> b) = Sem a -> Sem b
Now we can give an instance of the syntax's type class, that translates a piece of embedded code into its denotational semantics.
newtype S a = S { unS :: Sem a }
instance HOAS S where int = S . return add (S x) (S y) = S $ do a <- x b <- y putStrLn "Adding" return (a + b)
lam f = S (unS . f . S) app (S x) (S y) = S (x y)
As an example, we can give a term and see what happens when we evaluate it:
let_ :: HOAS exp => exp a -> (exp a -> exp b) -> exp b let_ x y = (lam y) `app` x
t :: HOAS exp => exp IntT t = (lam $ \x -> let_ (x `add` x) $ \y -> y `add` y) `app` int 10
Evaluating 'unS t' in GHCi gives the output: Adding Adding Adding 40 Note that the add operation is invoked three times: the "x `add` x" that is let-bound is evaluated twice. The second option is call-by-value. This is a bit more interesting because we have to change the syntax of the embedded language to deal with the different substitution rules for CBV.
module CBV where
Again, we use a different representation for embedded-language types:
data IntT data a :-> b infixr 5 :->
There is a difference in the syntax between CBN and CBV that is not always obvious from the usual paper presentations. There is a split between pieces of syntax that are "values" and those that are "computations". Values do not have side-effects, while computations may. In this presentation, I have chosen that the only values are variables, while everything else is a computation. To represent this split, we use a multi-parameter type class. Note that the technique of using multi-parameter type classes is generally useful for representing abstract syntaxes in the HOAS style with multiple mutually defined syntactic categories.
class HOAS exp val | exp -> val, val -> exp where val :: val a -> exp a
lam :: (val a -> exp b) -> exp (a :-> b) app :: exp (a :-> b) -> exp a -> exp b
int :: Int -> exp IntT add :: exp IntT -> exp IntT -> exp IntT
The 'val' operation injects values into computations: this will be interpreted by the 'return' of the monad. We now give the semantics of types for CBV, following the definition above:
type family Sem a :: * type instance Sem IntT = Int type instance Sem (a :-> b) = Sem a -> IO (Sem b)
We need two newtypes for the two different syntactic categories. Values are interpeted by 'SV', and are just pure values; computations are interpreted by wrapping them in the monad.
newtype SV a = SV { unSV :: Sem a } newtype SC a = SC { unSC :: IO (Sem a) }
Now we can define the semantics of the syntax in terms of these types:
instance HOAS SC SV where val = SC . return . unSV
lam f = SC $ return (unSC . f . SV) app (SC x) (SC y) = SC $ do f <- x; a <- y; f a
int i = SC $ return i add (SC x) (SC y) = SC $ do a <- x b <- y putStrLn "Adding" return (a + b)
The same example term is expressed in CBV as follows:
let_ :: HOAS exp val => exp a -> (val a -> exp b) -> exp b let_ x y = (lam y) `app` x
t :: HOAS exp val => exp IntT t = (lam $ \x -> let_ (val x `add` val x) $ \y -> val y `add` val y) `app` int 10
Running 'unSC t' in GHCi gives the output: Adding Adding 40 Note that the add operation is only invoked twice this time: the let-bound expression is evaluated to a value, and then this value is used twice in the body of the let. The other obvious evaluation strategy is call-by-need, as used by Haskell itself, but I don't know if it is possible to implement this. I guess it may be possible using the ST monad as the mutable storage of thunks. Hope this was helpful. Bob -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.