Re: [Haskell-cafe] Fusing foldr's

29 Oct 2007

      On 10/28/07, Isaac Dupree  wrote:
...
Josef Svenningsson wrote:
...
Less bogus timing:
avg4: 18.0s
avgS: 2.2s
avgP: 17.4s
OK, so these figures make an even stronger case for my conclusion :-)
Single traversal can be much faster than multiple traversals *when
done right*.
Did you use +RTS -N2 on your program (or whatever it is that makes GHC
actually use multiple threads? or is that not necessary?)  Anyway I
assume you wouldn't get better than 9.0s, which is still much worse than
2.2s.
Oh, this is getting embarrassing.. Indeed, I forgot to use +RTS -N2.
But using those flags yielded a very interesting result:

avgP: 4.3s

Superlinear speedup!? As you say, I would have expected something
slightly larger than 9s. I think what happens here is that for avg4
the entire list has to be kept in memory between the two traversals
whereas for avgP the beginning of the list can be garbage collected
incrementally as the two threads traverse it. This could mean that the
list never moves to the second generation in the memory manager and
that can maybe account for the additional time savings. I'm not sure
how to verify that this is the case though.

Cheers,

/Josef