Monadic bind with associated types + PHOAS?

This is an idea that has been bouncing around in my head for a while, having to do with embedded languages. A few of the talks at CUFP this year mentioned using Haskell to embed a DSL that gets compiled into the output of the program; the hydraulic engine talk embedded the code for the real-time safety computation in the trucks, and one of the finance talks embedded a language for designing Excel spreadsheets. One thing that often comes up is a desire to do a pass on the resultant code to optimize it, but it's pretty difficult with the standard monadic formulation because of embedded functions. You can't do introspection on functions in Haskell; they aren't elements of Eq or Show. This has caused, for example, some implementations of FRP to switch to using arrows instead. However, I find the arrow syntax cumbersome and I feel it tends to obfuscate what is actually going on in the code. An earlier talk at ICFP mentioned using PHOAS instead of HOAS to encode expressions. In HOAS, a lambda expression in the target language is represented by a function in the source language:
data ExpE t where ApE :: ExpE (a -> b) -> ExpE a -> ExpE b LamE :: (ExpE a -> ExpE b) -> ExpE (a -> b)
But without a way to inspect the function inside LamE, you can't get back to the "source code". You end up adding special constructors for "Primitive" or "Variable" to let you do something with the resultant structure, which leads to the expression language containing "unsafe" constructors which need to be hidden. PHOAS makes a small change to make many of these things possible without compromising safety:
data ExpP v t where VarP :: v t -> ExpP v t ApP :: ExpP v (a -> b) -> ExpP v a -> ExpP v b LamP :: (v a -> ExpP v b) -> ExpP v (a -> b)
newtype Exp t = Exp (forall v. ExpP v t)
Now, by parametricity, if you are constructing an Exp t, the only useful thing you can do with the variable passed to LamP is give it to VarP. This means the code using it is basically the same as the HOAS code, except a couple of extra "VarP" constructors inserted, but you gain the ability to inspect inside the lambda and extract the code it generates! An couple of examples, from a PHOAS test I wrote up: 1) An evaluator
eval :: Exp t -> t eval (Exp e) = evalP e
newtype Prim a = Prim a evalP :: ExpP Prim t -> t evalP (VarP (Prim a)) = a evalP (ApP e1 e2) = evalPrim e1 $ evalPrim e2 evalP (LamP f) = evalPrim . f . VarP . Prim
2) using "show" to peek inside functions! ] -- implementation is an exercise for the reader ] -- you'll learn a lot! :) ] ] newtype Var a = Var String ] showExp :: ExpP Var a -> ShowS ghci> let test = Exp $ Lam $ \a -> Lam $ \b -> Var a ghci> :t test test :: Exp (t -> t1 -> t) ghci> print test \x y -> x It seems to me this exact transformation could be useful to express "variable binding" in embedded languages!
type family Primitive m a type family BoundVar m a class PMonad m where preturn :: Primitive m a -> m a pbind :: m a -> (BoundVar m a -> m b) -> m b
Now, this strictly generalizes regular monads:
newtype Wrap m a = Wrap (m a) unwrap (Wrap m) = m type instance Primitive (Wrap m) a = a type instance BoundVar (Wrap m) a = a instance Monad m => PMonad (Wrap m) where preturn x = WM (return x) pbind m f = WM (unwrap m >>= unwrap . f)
So, it seems like the do-notation should work on this type, although you lose pattern matching ([x] <- getArgs). And you can use PHOAS techniques to build up a structure that you can introspect and optimize. Does anyone think this is a direction worth pursuing? It seems to me like a point where the current syntax of Haskell could mesh well with the new type-system technology that we keep getting from GHC HQ. -- ryan

Hi Ryan, On 19 Nov 2008, at 04:39, Ryan Ingram wrote:
In HOAS, a lambda expression in the target language is represented by a function in the source language:
data ExpE t where ApE :: ExpE (a -> b) -> ExpE a -> ExpE b LamE :: (ExpE a -> ExpE b) -> ExpE (a -> b)
But without a way to inspect the function inside LamE, you can't get back to the "source code". You end up adding special constructors for "Primitive" or "Variable" to let you do something with the resultant structure, which leads to the expression language containing "unsafe" constructors which need to be hidden.
There's a perfectly legitimate way to incorporate free variables in your expression data type, without sacrificing type safety. You just need to parametrise your expression type by the context:
data Exp g t where App :: Exp g (a -> b) -> Exp g a -> Exp g b Lam :: (Exp g a -> Exp g b) -> Exp g (a -> b) Var :: Var g a -> Exp g a
data Var g a where Top :: Var (a,g) a Pop :: Var a g -> Var a (b, g)
I wouldn't call this "unsafe" (or at least no more unsafe than HOAS). Note that in particular "Exp Empty a" correspond to closed terms, where Empty denotes the empty type. Secondly, you can always turn a HOAS Exp into non-HOAS expression. Essentially, you map applications to applications, etc. The only interesting case deal with lambda abstractions - there you need to generate a fresh variable name, apply the function to the fresh variable, and continue traversing the resulting expression. You might be interested in a tutorial Andres Loeh, Conor McBride, and I wrote about implementing a dependently typed lambda calculus: http://www.cs.nott.ac.uk/~wss/Publications/Tutorial.pdf The "quote" function in Figure 7 gives the code.
Does anyone think this is a direction worth pursuing?
I'm not convinced yet. The problem is that there's no best way to handle binding. HOAS is great for some things (you don't have to write substitution), but annoying for others (optimisations or showing code). From your message, I couldn't understand why you want to use monads/do-notation to handle binding. Care to elaborate? All the best, Wouter

On Wed, Nov 19, 2008 at 1:04 AM, Wouter Swierstra
I'm not convinced yet. The problem is that there's no best way to handle binding. HOAS is great for some things (you don't have to write substitution), but annoying for others (optimisations or showing code).
PHOAS solves "showing code" quite nicely, in my opinion. I haven't tried to write a real optimizer using it yet. It's probably better to convert to a better intermediate representation and then back when you're done. The nice thing about HOAS, in my opinion, is not that it's a good representation of a language; rather it is that the syntax for *writing* it in your code is much nicer than the explicit variable references needed otherwise, because you have lifted part of the syntax into the host language.
From your message, I couldn't understand why you want to use monads/do-notation to handle binding. Care to elaborate?
Sure; I guess I should explain the motivation a bit. I really like the syntax for do-notation. And I really like how great Haskell is as writing embedded languages, a lot of which comes from the "programmable semicolon" that monadic bind gives you. But there's no equivalent "programmable variable bind", or "programmable function application"; both are generalizations of do-notation that would be useful for embedded languages. So I have been thinking about what makes variable binding special. What is it that we get from a variable bind? How can we create a structure that allows one to program & control the results of the binding values to variables? This has many uses in embedded languages: 1) We can observe sharing; solving this in DSLs in Haskell usually leads to really ugly syntax. 2) We can write interpretation functions that can examine all possible execution paths; this is basically "introspection on functions". This lets us do whole program optimization or other transformations on the result of the monadic code. (2) is usually solved via arrows, although "arr" makes a true solution quite difficult. I don't know of an elegant solution for (1) at all. Do you have any ideas along these lines? -- ryan

Ryan Ingram
I really like the syntax for do-notation. And I really like how great Haskell is as writing embedded languages, a lot of which comes from the "programmable semicolon" that monadic bind gives you.
AFAICT, standard Haskell 98 already gives you the ability to express binding and sharing in PHOAS, except not using do-notation: class Symantics repr where lam :: (repr a -> repr b) -> repr (a -> b) app :: repr (a -> b) -> repr a -> repr b add :: repr Int -> repr Int -> repr Int int :: Int -> repr Int let_ :: Symantics repr => repr a -> (repr a -> repr b) -> repr b let_ e body = app (lam body) e testSharing :: Symantics repr => repr Int testSharing = let_ (add (int 3) (int 4)) (\x -> add x x) Oleg has already noted that you can nicely express the showing, optimization (such as partial evaluation), and transformation (such as to CPS) of embedded code in this framework. I agree with him and you that this representation is a promising direction to explore. All that's missing in standard Haskell 98, then, is the ability to use do-notation: (>>=) = let_ testSharing = do x <- add (int 3) (int 4) add x x But -XNoImplicitPrelude in GHC 6.10 is supposed to enable it, no? -- Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig 2008-11-20 Universal Children's Day http://unicef.org/ 1948-12-10 Universal Declaration of Human Rights http://everyhumanhasrights.org
participants (3)
-
Chung-chieh Shan
-
Ryan Ingram
-
Wouter Swierstra