foldl and space problems

Hello, My space problems continued... I have foldl that produces list, some combining function and quite large source list: let xyz = foldl f state myBigList This setting should lazyli consume myBigList when next elements of xyz are demanded. Except that it seems that myBigList is held by state to the end of computation :( Question: is there any way to see what is holding my source list? I did try to guess, but without results as of now:( How do I debug and/or reason about such situation? -- Gracjan

On Mon, 2005-06-06 at 13:15 +0200, Gracjan Polak wrote:
Hello,
My space problems continued...
I have foldl that produces list, some combining function and quite large source list:
let xyz = foldl f state myBigList
This setting should lazyli consume myBigList when next elements of xyz are demanded. Except that it seems that myBigList is held by state to the end of computation :(
Question: is there any way to see what is holding my source list? I did try to guess, but without results as of now:(
foldl suffers from a fairly notorious space leak when used under lazy evaluation. Here is foldl: foldl f acc [] = acc foldl f acc (x:xs) = foldl f (f acc x) xs Here is a "symbolic" computation using it: foo = foldl g init [a,b,c] = foldl g (g init a) [b,c] = foldl g (g (g init a) b) [c] = foldl g (g (g (g init a) b) c) [] = (g (g (g init a) b) c) Notice that the "accumulator" argument grows with size proportional to the amount of list consumed. I would guess that your program is suffering from this problem. The solution? One theoretical solution is to avoid lazy evaluation in the language implementation. For instance an "optimistic" evaluator might avoid the space leak. GHC has an experimental branch that supports this, but as far as I know it has not seen an official release. A more practical solution is to force the compiler to generate more strict code. Data.List provides a function called foldl' which has the same type as foldl, but has different strictness. In particular it forces the accumulator argument to be "evaluated" before each recursive call to foldl. Unfortunately foldl' is not always as strict as you want, because it only forces the accumulator to be evaluated to what is called Weak Head Normal Form. If your accumulated value (state) has a lazy data constructor, such as the tuple constructor, you might find that the space usage remains very high. Exercise for the reader: why is this so? The solution in that case might be to add strictness flags to the arguments of the state constructors, though this may have adverse effects elsewhere in the program.
How do I debug and/or reason about such situation?
Very good question. One solution is to practice your term re-writing skills and try to reason about the size of the intermediate terms that are generated. You might also find GHood useful: http://www.cs.kent.ac.uk/people/staff/cr3/toolbox/haskell/GHood/ Cheers, Bernie.

Bernard Pope wrote:
A more practical solution is to force the compiler to generate more strict code.
I tried to put strictness annotation in every place I could think of. Without result :(
You might also find GHood useful:
http://www.cs.kent.ac.uk/people/staff/cr3/toolbox/haskell/GHood/
Thanks for the pointer.
Cheers, Bernie.
-- Gracjan

On Tue, 2005-06-07 at 12:35 +0200, Gracjan Polak wrote:
Bernard Pope wrote:
A more practical solution is to force the compiler to generate more strict code.
I tried to put strictness annotation in every place I could think of. Without result :(
Did you try Data.List.foldl' ? Perhaps you could post the definition of the state type? Or even better, a small example of code that runs badly. Cheers, Bernie.

Bernard Pope wrote:
Perhaps you could post the definition of the state type? Or even better, a small example of code that runs badly.
I still don't know where old code had problems, but after rewriting everything it seems to run smoothly now :) Thanks for all ideas, btw. I "invented" something like this: type RestAndState = (MyState -> [MyData]) -> MyState -> [MyData] this is the type of funtions that take current state and continuation as parameters. Example: putValues v1 v2 v3 rest state = [v1,v2,v3] ++ rest state there is a bit of syntactic ugliness when state chages: putValueEx v1 rest state = let newstate = state { ... } in [v1] ++ rest nstate good thing is that function can be composed quite easily with ($): putV v1 v2 v3 v4 rest = -- state skipped here :) putValue v1 v2 v3 $ putValueEx v4 rest It works as I want it to. But I have strange feeling that there must be a better way to "compose" foldr and state... Does anybody have any idea how to put monad into this? -- Gracjan

On Mon, Jun 13, 2005 at 03:29:21PM +0200, Gracjan Polak wrote:
Bernard Pope wrote:
Perhaps you could post the definition of the state type? Or even better, a small example of code that runs badly.
I still don't know where old code had problems, but after rewriting everything it seems to run smoothly now :) Thanks for all ideas, btw.
I "invented" something like this:
type RestAndState = (MyState -> [MyData]) -> MyState -> [MyData]
this is the type of funtions that take current state and continuation as parameters. Example:
putValues v1 v2 v3 rest state = [v1,v2,v3] ++ rest state
there is a bit of syntactic ugliness when state chages:
putValueEx v1 rest state = let newstate = state { ... } in [v1] ++ rest nstate
good thing is that function can be composed quite easily with ($):
putV v1 v2 v3 v4 rest = -- state skipped here :) putValue v1 v2 v3 $ putValueEx v4 rest
It works as I want it to. But I have strange feeling that there must be a better way to "compose" foldr and state... Does anybody have any idea how to put monad into this?
I haven't followed the thread from the beginning, but here is one way of expressing your examples, using State and foldr1: putValues :: a -> a -> a -> State s ([a] -> [a]) putValues v1 v2 v3 = return ([v1,v2,v3] ++) putValueEx :: a -> State s ([a] -> [a]) putValueEx v1 = do modify (\s -> undefined) return (v1 :) putV :: a -> a -> a -> a -> State s ([a] -> [a]) putV v1 v2 v3 v4 = foldr1 (liftM2 (.)) [putValues v1 v2 v3, putValueEx v4] test = evalState (putV 1 2 3 4) undefined [] I'm not sure exactly what you're trying to accomplish, but maybe this will give you some ideas. Andrew

On 06/06/05, Gracjan Polak
Question: is there any way to see what is holding my source list? I did try to guess, but without results as of now:(
How do I debug and/or reason about such situation?
I heard that NHC has excellent heap profiling support, maybe it would be able to help?
participants (4)
-
Andrew Pimlott
-
Bernard Pope
-
Gracjan Polak
-
Samuel Bronson