Re: DData revision

21 Mar 2004

      ...
I belive that using an accumulator to avoid ++'s
bad behaviour is the second most used optimisation in functional programs.
[...] Wadler [...]
If Haskell implementations would use this, programmers could really spare
some efforts and programs would be clearer.
GHC does implement this kind of optimization to remove ++ and :.  They work 
pretty well (see Simon Peyton Jones' page for papers about it and follow 
citations to find other recent work on the topic).

In practice though, it is useful to have data structures, algorithms, etc. 
that already contain the optimization though.  Some of the major reasons are:

1) The analyses that drive the optimizations are fragile and hard to predict.  
When the optimization triggers, you can get a large speedup or dramatic 
reduction in memory consumption but then you make a small, insignificant 
change to your program, the optimization can't be applied an you get a large 
change in performance.  Lesson: predictable performance is an important 
property.

2) This particular class of optimizations contain a space-time tradeoff.  If 
the data structure is shared, we can save time by building one copy and using 
it multiple times or we can save space by building a fresh copy each time.  
How is the compiler to know which you want to apply?  For some programs, you 
want to save space, for others, you want to save time.  Indeed, for small 
datasets, you want to save time while for bigger datasets you often have to 
choose to save space at the cost of time.  Lesson: some programming decisions 
can't be automated.

--
Alastair Reid      www.haskell-consulting.com