
I just remembered: Andy Gill has a new paper "Type Directed Observable Sharing" (http://www.ittc.ku.edu/~andygill/paper.php?label=DSLExtract09) that looks very relevant. Abstract: Haskell is a great language for writing and supporting embedded Domain
Specific Languages (DSLs). Some form of observable sharing is often a critical capability for allowing so-called deep DSLs to be compiled and processed. In this paper, we describe and explore uses of an IO function for reification which allows direct observation of sharing.
On Tue, May 26, 2009 at 4:49 PM, Conal Elliott
Hi Tom,
I've been working on another code-generating graphics compiler, generating GPU code. As always, I run into the problem of efficient common subexpression elimination. In Pan, Vertigo & Pajama, I used lazy memoization, using stable pointers and weak references, to avoid the worst-case-exponential behavior you mention below. I'm now using a bottom-up CSE method that's slower and more complicated than I'm going for.
What's your latest wisdom about CSE in DSELs?
Thanks, - Conal
On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins
wrote: I've been programming with Haskell for a few years and love it. One of my favorite applications of Haskell is using for domain specific languages. However, after designing a handful of DSLs, I continue to hit what appears to be a fundamental hurdle -- or at least I have yet to find an adequate solution.
My DSLs invariably define a datatype to capture expressions; something like this:
data Expression = Add Expression Expression | Sub Expression Expression | Variable String | Constant Int deriving Eq
Using the datatype Expression, it is easy to mass a collections of functions to help assemble complex expressions, which leads to very concise programs in the DSL.
The problem comes when I want to generate efficient code from an Expression (ie. to C or some other target language). The method I use invovles converting the tree of subexpressions into an acyclic graphic to eliminate common subexpressions. The nodes are then topologically ordered and assigned an instruction, or statement for each node. For example:
let a = Add (Constant 10) (Variable "i1") b = Sub (Variable "i2") (Constant 2) c = Add a b
would compile to a C program that may look like this:
a = 10 + i1; b = i2 - 2; c = a + b;
The process of converting an expression tree to a graph uses either Eq or Ord (either derived or a custom instance) to search and build a set of unique nodes to be ordered for execution. In this case "a", then "b", then "c". The problem is expressions often have shared, equivalent subnodes, which dramatically grows the size of the tree. For example:
let d = Add c c e = Add d d -- "e" now as 16 leaf nodes.
As these trees grow in size, the equality comparison in graph construction quickly becomes the bottleneck for DSL compilation. What's worse, the phase transition from tractable to intractable is very sharp. In one of my DSL programs, I made a seemingly small change, and compilation time went from milliseconds to not-in-a-million-years.
Prior to Haskell, I wrote a few DSLs in OCaml. I didn't have this problem in OCaml because each "let" expression was mutable, and I could use the physical equality operator to perform fast comparisons. Unfortunately, I have grown to love Haskell's type system and its lack of side effects, and could never go back.
Is there anything that can be done to dramatically speed up comparisons, or is there a better approach I can take to extract common subexpressions? I should point out I have an opportunity to get Haskell on a real industrial application. But if I can't solve this problem, I may have to resort to far less eloquent languages. :-(
Thanks for any and all help.
-Tom _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe