
On Saturday 16 October 2010 00:03:42, Jordan Ellis wrote:
I'm trying (really trying!) to get my head around Haskell, but one (among many) significant stumbling blocks for me so far has been trying to figure out how my code is actually going to perform. Here's a contrived example of the sort of thing that confuses me.

Let's say I have a function which produces a list of odd integers from 1 to n:

    f n = filter (odd) [1..n]

...and I have another function that makes a list of all sums of pairs of odd numbers from 1 to n:

    g n = [a + b | a <- (f n), b <- (f n)]

I assume that in most languages, (f n) would get called twice (once for a, and once for b), but what happens in Haskell? Does GHC recognize that, since f is a pure function, it only has to be evaluated once for a given n?
That depends. If you compile without optimisations, the list f n won't be shared (that may of course change in future versions); if you compile with optimisations, it will be shared.

GHC does very little common subexpression elimination, and CSE is not always a good thing. In the case above it's fine for small n, but consider what happens for large n. If the list is not shared, every a gets a new application of f (at least, that's GHC's current behaviour), and the algorithm runs in constant space. If the list is shared, the entire list remains in memory after it has been constructed for the first a, so the algorithm runs in O(n) space (though it runs faster, unless n is really large). In such cases, you can usually suppress undesired sharing with -fno-cse.
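For concreteness, here is a minimal sketch of the unshared shape being discussed (the module wrapper, type signatures and main are mine, just to make it a compilable example):

    module Main where

    f :: Int -> [Int]
    f n = filter odd [1..n]

    -- f n appears twice: compiled without -O, the inner list is typically
    -- rebuilt for every a (constant space); compiled with -O, CSE may share
    -- one list across both generators, keeping it in memory (O(n) space).
    g :: Int -> [Int]
    g n = [a + b | a <- f n, b <- f n]

    main :: IO ()
    main = print (sum (g 1001))

Comparing a -O0 build with a -O2 build of something like this (e.g. looking at the maximum residency reported by +RTS -s) is an easy way to see the space difference for yourself.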
If not, would a 'where clause' make any difference? E.g.,

    g n = [a + b | a <- h, b <- h]
      where h = f n
Yes when compiled without optimisations, no when compiled with. In general, when you want a value to be shared, give it a name. It may be shared even if you don't, but with a name, a decent implementation will share it.
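Spelled out in full, that is essentially the version from your question, dropped into the same module as above (the type signatures are mine):

    f :: Int -> [Int]
    f n = filter odd [1..n]

    -- Naming the list makes the sharing explicit in the source itself,
    -- so it no longer depends on what CSE happens to do at a given
    -- optimisation level.
    g :: Int -> [Int]
    g n = [a + b | a <- h, b <- h]
      where
        h = f n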
Thanks.