
Shin-Cheng Mu wrote:
Hi,
I am trying to fix a space leak which I suspected to be caused by an old problem involving lazy pattern matching for pairs. P. Wadler and J. Sparud each suggested fixes to the compiler to resolve the space leak. I was wondering whether they are implemented in GHC. In the GHC Users mailing list archive I found a message in 2004 from Simon Peyton Jones saying that GHC does have the fix, but its effects are fragile when mixed with other optimisations:
http://www.haskell.org/pipermail/glasgow-haskell-users/2004-August/ 007023.html
I am wondering what the situation is now, and whether there is a way to make the fix work for my code.
The situation is still the same. The optimisation is implemented, but doesn't work in all cases. Fixing the general case seems difficult at least - we don't have a clear idea for how to do it.
* * *
The problematic code looks like:
idX (StartEvent a : strm) = let (ts, strm') = idX strm (us, strm'') = idX strm' in (StartEvent a : ts ++ EndEvent a : us, strm'')
The program processes a stream of XML events, and in this case it simply returns the stream itself with a minimal amount of validation. The intention of the code was to thread "strm" through the recursive calls and free the items in strm as soon as they are processed. So the program should use constant space in the heap (and stack space proportional to the depth of the XML tree). However, with this definition I needed about 8Mb of heap for a 1Mb input file.
The situation is the same even if I do not use -O or -O2. So it seems that the GC fix did not work for this case. Is there a way to make it work?
If I do have to alter the code to avoid the space leak, what is the best way to do it?
I'm sorry I can't help with specifics. This kind of space leak usually requires deep insight and a lot of time staring at the code and heap profiles. It's a big problem, I know. Cheers, Simon