
On Thu, Nov 25, 2010 at 11:32 AM, Joachim Breitner wrote: Hi, although semantically it could, ghc does not do common subexpression
elimination (CSE), at least not in every case. The canonical example
where it would be bad is the function (with div to avoid numerical
casting): avg :: [Int] -> Int
avg n = (sum [0..n] `div` length [0..n]) The reason why [0..n] is not shared is because then it would be kept in
memory completely after the evaluation of sum, because length will still
need it, possibly overflowing the memory. In the non-shared form,
though, the garbage collector can clean up directly behind sum and
length, and the program runs in constant space. Note that the shared expression would be very large compared to the
thunks. Now consider the program: avg' :: [Int] -> (Int, Int)
avg' n = (sum [0..n] `div` length [0..n], length [0..n]) avg n = (sum [0..n] `div` l, l)
where l = length [0..n]
is safe, because the shared expression (an Int) is smaller than the I’d say that CSE to
thunks and the compiler can tell. So I wonder:
* Is sharing values of type Int (and Bool and similar small values)
always safe?
* If so: does GHC already do that?
* Would it be technically possible?
* Is there an established theory that can tell, for a sharing
candidate, whether it is safe to share it? I'm just going to answer your last question. Jörgen Gustavsson and David Sands developed the theory of space improvement
for call-by-need. That can help you answer the other questions you have. But
that being said, it's fairly complicated stuff, and it's not a very easy to
use theory. I think it's inherent in the problem though, the
space behavior of lazy programs is just weird. If you're curious to read
about space improvement see papers [GS01] and [GS99] on the following page:
http://www.cse.chalmers.se/~dave/davewww.html
Cheers,
Josef