
On Mon, 2005-06-06 at 13:15 +0200, Gracjan Polak wrote:
Hello,
My space problems continued...
I have foldl that produces list, some combining function and quite large source list:
let xyz = foldl f state myBigList
This setting should lazyli consume myBigList when next elements of xyz are demanded. Except that it seems that myBigList is held by state to the end of computation :(
Question: is there any way to see what is holding my source list? I did try to guess, but without results as of now:(
foldl suffers from a fairly notorious space leak when used under lazy evaluation. Here is foldl: foldl f acc [] = acc foldl f acc (x:xs) = foldl f (f acc x) xs Here is a "symbolic" computation using it: foo = foldl g init [a,b,c] = foldl g (g init a) [b,c] = foldl g (g (g init a) b) [c] = foldl g (g (g (g init a) b) c) [] = (g (g (g init a) b) c) Notice that the "accumulator" argument grows with size proportional to the amount of list consumed. I would guess that your program is suffering from this problem. The solution? One theoretical solution is to avoid lazy evaluation in the language implementation. For instance an "optimistic" evaluator might avoid the space leak. GHC has an experimental branch that supports this, but as far as I know it has not seen an official release. A more practical solution is to force the compiler to generate more strict code. Data.List provides a function called foldl' which has the same type as foldl, but has different strictness. In particular it forces the accumulator argument to be "evaluated" before each recursive call to foldl. Unfortunately foldl' is not always as strict as you want, because it only forces the accumulator to be evaluated to what is called Weak Head Normal Form. If your accumulated value (state) has a lazy data constructor, such as the tuple constructor, you might find that the space usage remains very high. Exercise for the reader: why is this so? The solution in that case might be to add strictness flags to the arguments of the state constructors, though this may have adverse effects elsewhere in the program.
How do I debug and/or reason about such situation?
Very good question. One solution is to practice your term re-writing skills and try to reason about the size of the intermediate terms that are generated. You might also find GHood useful: http://www.cs.kent.ac.uk/people/staff/cr3/toolbox/haskell/GHood/ Cheers, Bernie.