
On Feb 9, 2008 12:28 AM, Tom Hawkins
5) Forget embedding the DSL, and write a direct compiler.
In addition to the sharing problem, another shortcoming of Haskell DSLs is they can not fully exploit the benefits of algebraic datatypes. Specifically, pattern matching ADTs can only be used to control the compile-time configuration of the target, it can't be used to describe the target's behavior -- at least for DSLs that generate code that executes outside of Haskell's runtime.
Only partly true. Probably you are not aware of them (I myself learned about its existence a few days ago) but pattern quasiquoting (available in GHC's HEAD) can be used for that. http://www.haskell.org/ghc/dist/current/docs/users_guide/template-haskell.ht...
Writing a real compiler would solve both of these problems. Is there any Haskell implementation that has a clean cut-point, from which I can start from a fully type-checked, type-annotated intermediate representation?
If you have to write a compiler why not define a language which fits better with the semantics of the embedded language instead of using plain Haskell? The approach you propose has the disadvantages of both the embedded and the standalone languages. On one hand you have to stick with the syntax of the host language which may not fit with your exact semantical requirements and, on the other hand, you cannot take advantage of all the existing machinery around the host language (you have to code your own compiler). Furthermore, the first citizen status of functions make it impossible (or really difficult at least) to compile EDSL descriptions avoiding runtime and simply applying a static analysis approach (using Core or plain Haskell as input).
And thanks for the link to John's paper describing Hydra's use of Template Haskell. I will definiately consider TH.
Well, TH would be one of those static analysis approaches. Actually, O'Donell's implementation (which uses an outdated version of Template Haskell) only works with a small Haskell subset. So using TH you'd probably be changing the host language anyhow. Furthermore, the TH approach consists in adding node labels by preprocessing the EDSL description, making sharing observable. That makes the original EDSL description inpure. The only difference is that side effects are added by preprocessing instead of using runtime unsafe functions. ---- Some pointers covering the topic: [1] and [2] summarize what are the alternatives to observe sharing in Haskell whereas [3] compares the embedded approach vs standalone approach and advocates the last one. 1) http://www.imit.kth.se/~ingo/MasterThesis/ThesisAlfonsoAcosta2007.pdf (section 2.4.1 and 3.1) 2) http://www.cs.um.edu.mt/svrg/Papers/csaw2006-01.pdf (section 3) 3) http://web.cecs.pdx.edu/~sheard/papers/secondLook.ps