Re: [Haskell-cafe] I love purity, but it's killing me.

27 May 2009


      I just remembered: Andy Gill has a new paper "Type Directed Observable
Sharing" (http://www.ittc.ku.edu/~andygill/paper.php?label=DSLExtract09)
that looks very relevant.

Abstract:

Haskell is a great language for writing and supporting embedded Domain
...
Specific Languages (DSLs). Some form of observable sharing is often a
critical capability for allowing so-called deep DSLs to be compiled and
processed. In this paper, we describe and explore uses of an IO function for
reification which allows direct observation of sharing.
On Tue, May 26, 2009 at 4:49 PM, Conal Elliott  wrote:
...
Hi Tom,
I've been working on another code-generating graphics compiler, generating
GPU code.  As always, I run into the problem of efficient common
subexpression elimination.  In Pan, Vertigo & Pajama, I used lazy
memoization, using stable pointers and weak references, to avoid the
worst-case-exponential behavior you mention below.  I'm now using a
bottom-up CSE method that's slower and more complicated than I'm going for.
What's your latest wisdom about CSE in DSELs?
Thanks,  - Conal
On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins wrote:
...
I've been programming with Haskell for a few years and love it.  One
of my favorite applications of Haskell is using for domain specific
languages.  However, after designing a handful of DSLs, I continue to
hit what appears to be a fundamental hurdle -- or at least I have yet
to find an adequate solution.
My DSLs invariably define a datatype to capture expressions; something
like this:
data Expression
 = Add Expression Expression
 | Sub Expression Expression
 | Variable String
 | Constant Int
 deriving Eq
Using the datatype Expression, it is easy to mass a collections of
functions to help assemble complex expressions, which leads to very
concise programs in the DSL.
The problem comes when I want to generate efficient code from an
Expression (ie. to C or some other target language).  The method I use
invovles converting the tree of subexpressions into an acyclic graphic
to eliminate common subexpressions.  The nodes are then topologically
ordered and assigned an instruction, or statement for each node.  For
example:
let a = Add (Constant 10) (Variable "i1")
   b = Sub (Variable "i2") (Constant 2)
   c = Add a b
would compile to a C program that may look like this:
a = 10 + i1;
 b = i2 - 2;
 c = a + b;
The process of converting an expression tree to a graph uses either Eq
or Ord (either derived or a custom instance) to search and build a set
of unique nodes to be ordered for execution.  In this case "a", then
"b", then "c".  The problem is expressions often have shared,
equivalent subnodes, which dramatically grows the size of the tree.
For example:
let d = Add c c
   e = Add d d    -- "e" now as 16 leaf nodes.
As these trees grow in size, the equality comparison in graph
construction quickly becomes the bottleneck for DSL compilation.
What's worse, the phase transition from tractable to intractable is
very sharp.  In one of my DSL programs, I made a seemingly small
change, and compilation time went from milliseconds to
not-in-a-million-years.
Prior to Haskell, I wrote a few DSLs in OCaml.  I didn't have this
problem in OCaml because each "let" expression was mutable, and I
could use the physical equality operator to perform fast comparisons.
Unfortunately, I have grown to love Haskell's type system and its lack
of side effects, and could never go back.
Is there anything that can be done to dramatically speed up
comparisons, or is there a better approach I can take to extract
common subexpressions?  I should point out I have an opportunity to
get Haskell on a real industrial application.  But if I can't solve
this problem, I may have to resort to far less eloquent languages.
:-(
Thanks for any and all help.
-Tom
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe