How do I get a long iteration to run in constant space

Hi, I'm starting to learn haskell, and I was considering it for a simulation I've been designing, but I ran into this problem with a simple numerical calculation. The attached program is supposed to solve a stiff differential equation, and you have to use a very small stepsize and lots of steps to resolve the part of the solution near a singularity. It looks to me like this program should run in constant space because the iterate2 function is tail-recursive, but when I run it on hugs or ghc 2.6.1, I get an error that it runs out of stack space on the third calculation. (And the quick fix is to give it a flag that increases the stack, but if I'm to use haskell for a potentially large and long simulation, I can't afford to let it run for three days only to get an error and figure out that it needs more stack space than the computer can give.) I suppose in my ODE example it's building up expressions somewhere for lazy evaluation, so I've tried putting seq and $! in various places. I have another example where using "else seq x (iterate2 ...)" helps, but that doesn't solve my problem here. I've also tried -O but it doesn't help. Can anyone give me some idea as to what to do? Thanks, -wgm iterate2 f x n = if n == 0 then x else iterate2 f (f x) (n - 1) integrate step f xInit xFinal yInit nSteps = let h = (xFinal - xInit) / fromIntegral nSteps next = step f h in iterate2 next (xInit, yInit) nSteps rk4Next f h (x, y) = let a1 = h * f x y a2 = h * f (x + h/2) (y + a1/2) a3 = h * f (x + h/2) (y + a2/2) a4 = h * f (x + h) (y + a3) in (x + h, y + a1/6 + a2/3 + a3/3 + a4/6) rk4 = integrate rk4Next stiff x y = -( ((exp (x - 1)) + (x - 1) * y) / (1 - x - 0.0001 * y) ) main = do putStr "Begin\n" putStr (show (rk4 stiff 0 1 (exp (-1)) 1000)) putStr "\n" putStr (show (rk4 stiff 0 1 (exp (-1)) 10000)) putStr "\n" putStr (show (rk4 stiff 0 1 (exp (-1)) 100000)) putStr "\n" __________________________________ Do you Yahoo!? Yahoo! Mail - You care about security. So do we. http://promotions.yahoo.com/new_mail

W M
I suppose in my ODE example it's building up expressions somewhere for lazy evaluation,
Exactly right. The trick is spotting which expressions. Until you have some experience of likely causes, rather than guessing I can recommend studying some heap profiles to learn what kind of values are building up - e.g. are they closures for 'integrate' or for '+'? Here are a couple of simple changes that seem to fix your space leak.
iterate2 f x n = if n == 0 then x else iterate2 f (f x) (n - 1)
Try x `seq` iterate2 f (f x) (n - 1) to ensure that the accumulator is holding simple values rather than closures. Also, currently you are building pairs lazily. Making a strict pair type helps: data Pair a = P !a !a
integrate step f xInit xFinal yInit nSteps = let h = (xFinal - xInit) / fromIntegral nSteps next = step f h in iterate2 next (xInit, yInit) nSteps
iterate2 next (P xInit yInit) nSteps
rk4Next f h (x, y) = let a1 = h * f x y a2 = h * f (x + h/2) (y + a1/2) a3 = h * f (x + h/2) (y + a2/2) a4 = h * f (x + h) (y + a3) in (x + h, y + a1/6 + a2/3 + a3/3 + a4/6)
P (x + h) (y + a1/6 + a2/3 + a3/3 + a4/6)
rk4 = integrate rk4Next
stiff x y = -( ((exp (x - 1)) + (x - 1) * y) / (1 - x - 0.0001 * y) )
main = do putStr "Begin\n" putStr (show (rk4 stiff 0 1 (exp (-1)) 1000)) putStr "\n" putStr (show (rk4 stiff 0 1 (exp (-1)) 10000)) putStr "\n" putStr (show (rk4 stiff 0 1 (exp (-1)) 100000)) putStr "\n"

I added two lines to your code: iterate2 f x n | seq f $ seq x $ seq n $ False = undefined iterate2 f x n = --- as before rk4Next f h (x, y) | seq f $ seq h $ seq x $ seq y $ False = undefined rk4Next f h (x, y) = -- as before I also increased ten times the number of steps for the last iteration, to make the example more illustrative. putStr (show (rk4 stiff 0 1 (exp (-1)) 1000000)) The rest of the code is unchanged. The program now runs on GHCi *Foo> main Begin (1.0000000000000007,-6.503275017254139) (0.9999999999999062,-6.497717470015538) (1.000000000007918,-6.497716616653335) on on hugs Begin (1.0,-6.50327501725414) (0.999999999999906,-6.49771747001554) (1.00000000000792,-6.49771661665334) with the default stack size for both interpreters. It seems the code does run in constant space now. The added lines are meant to turn the relevant functions from lazy to strict. When you see something like '(n-1)' and 'y + a1/6', it is a red flag. These are exactly the kinds of expressions that lead to memory exhaustion. Perhaps it is because the size of an unevaluated thunk for (n-1) is so much bigger than the size of the evaluated thunk. It seems that arithmetic expressions are the best candidates for some opportunistic evaluation...

I've been starting to take note of discussion about space leaks in Haskell. So far, it's not a topic that's bothered me, other than obvious programming errors, and I've never found the need to force strict evaluation of my Haskell programs. I find myself wondering why this is. Your comment about arithmetic expressions is telling: the kind of program I have been writing (parsers, symbolic processing, etc.) performs almost no arithmetic. (So far, I've used lists instead of arrays, so a usual source of arithmetic functions is missing.) I've also worked with datasets that fit in memory, so failure to "stream" data hasn't been a problem. I suspect that's the more pernicious case for space leaks, since the causes aren't always so obvious. Are there any guidelines or warning signs to look out for that may be indicative of potential space-leak problems? So far, I can think of: - arithmetic results - multiple uses of a large data value #g -- At 00:44 05/10/04 -0700, oleg@pobox.com wrote:
I added two lines to your code:
iterate2 f x n | seq f $ seq x $ seq n $ False = undefined iterate2 f x n = --- as before
rk4Next f h (x, y) | seq f $ seq h $ seq x $ seq y $ False = undefined rk4Next f h (x, y) = -- as before
I also increased ten times the number of steps for the last iteration, to make the example more illustrative. putStr (show (rk4 stiff 0 1 (exp (-1)) 1000000))
The rest of the code is unchanged. The program now runs on GHCi
*Foo> main Begin (1.0000000000000007,-6.503275017254139) (0.9999999999999062,-6.497717470015538) (1.000000000007918,-6.497716616653335)
on on hugs
Begin (1.0,-6.50327501725414) (0.999999999999906,-6.49771747001554) (1.00000000000792,-6.49771661665334)
with the default stack size for both interpreters. It seems the code does run in constant space now.
The added lines are meant to turn the relevant functions from lazy to strict. When you see something like '(n-1)' and 'y + a1/6', it is a red flag. These are exactly the kinds of expressions that lead to memory exhaustion. Perhaps it is because the size of an unevaluated thunk for (n-1) is so much bigger than the size of the evaluated thunk. It seems that arithmetic expressions are the best candidates for some opportunistic evaluation... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

In my experience, any time your program makes use of some state-like structure which gets updated over a number of iterations, you're going to be in trouble with space leaks. There's a library function which summarises this nicely, mapAccumL :: (state -> x -> (state, y)) -> state -> [x] -> (state, [y]) The GHC library docs describe mapAccumL as "The mapAccumL function behaves like a combination of map and foldl; it applies a function to each element of a list, passing an accumulating parameter from left to right, and returning a final value of this accumulator together with the new list." Now imagine that state is some large structure. The [x] list is a couple of hundred elements long, and you print out some -> but not all <- of the [y] elements. While the [y] list remains live, a whole collection of half evaluated intermediate states also remain live - enter the massive space leak. Projects I have written which suffered massive space leaks include: -> A parallel lazy evaluator, http://cs.anu.edu.au/ample. The state: the machine state, threads, stacks, the heaps. The [x] list: machine instructions. The [y] list: profiling info. If you don't print out all possible profiling info then all those intermediate states remain live and you've got a massive space leak. -> An astroids game I wrote for X. The state: position, velocities of ship, astroids, missiles. The [x] list: ship control data, key codes. The [y] list: a list of graphics prims which get rendered to the screen. You would think that because the list of prims gets consumed by the renderer it wouldn't leak.. but it did, about 200k / sec. Then I changed some seemingly irrelevant part of the code and the space leak went away.. and I have no idea why. yay. -> My ICFP contest entry for this year, Not really a mapAccumL type problem, but still a space leak. At one stage I had a bug in the parser for my assembly language where it didn't handle comments properly. One of the entries I ran through my simulator had comments at the end of lines with Drop statements, but nowhere else. The simulator ran fine until an ant found some food, carried it back to the hill, then attempted to drop it.. parser error. My Haskell program hadn't fully evaluated the thunks to read / parse / assemble that line of code until the ant had tried to use that part of the program.. I laughed, and laughed.. :). ... I think the only way to totally slay bugs like these is to use some deepSeq'esque function on all your intermediate states, or any time when you reckon some evaluation should be 'finished'. Either that or explicly define intermediate structures to be fully strict. I see the GHC people have got a seqExpr function in compiler/coreSyn/CoreSyn.lhs, which I imagine would be applied to the expression tree after every compiler stage. Ben. Graham Klyne wrote:
I've been starting to take note of discussion about space leaks in Haskell. So far, it's not a topic that's bothered me, other than obvious programming errors, and I've never found the need to force strict evaluation of my Haskell programs. I find myself wondering why this is.
Your comment about arithmetic expressions is telling: the kind of program I have been writing (parsers, symbolic processing, etc.) performs almost no arithmetic. (So far, I've used lists instead of arrays, so a usual source of arithmetic functions is missing.)
I've also worked with datasets that fit in memory, so failure to "stream" data hasn't been a problem. I suspect that's the more pernicious case for space leaks, since the causes aren't always so obvious.
Are there any guidelines or warning signs to look out for that may be indicative of potential space-leak problems? So far, I can think of: - arithmetic results - multiple uses of a large data value

On Wed, Oct 06, 2004 at 12:27:35PM +1000, Ben Lippmeier wrote:
Now imagine that state is some large structure. The [x] list is a couple of hundred elements long, and you print out some -> but not all <- of the [y] elements. While the [y] list remains live, a whole collection of half evaluated intermediate states also remain live - enter the massive space leak. Projects I have written which suffered massive space leaks include: -> A parallel lazy evaluator, http://cs.anu.edu.au/ample. The state: the machine state, threads, stacks, the heaps. The [x] list: machine instructions. The [y] list: profiling info. If you don't print out all possible profiling info then all those intermediate states remain live and you've got a massive space leak. -> An astroids game I wrote for X. The state: position, velocities of ship, astroids, missiles. The [x] list: ship control data, key codes. The [y] list: a list of graphics prims which get rendered to the screen.
Know any tricks for 2D recurrences where only points along the i=j line (which depend on all points where i <= j) are used for the final result? -- wli

I take it that the extra case always fails but forces the arguments to be evaluated? Nice trick. Thanks! --wgm --- oleg@pobox.com wrote:
I added two lines to your code:
iterate2 f x n | seq f $ seq x $ seq n $ False = undefined iterate2 f x n = --- as before
rk4Next f h (x, y) | seq f $ seq h $ seq x $ seq y $ False = undefined rk4Next f h (x, y) = -- as before
I also increased ten times the number of steps for the last iteration, to make the example more illustrative. putStr (show (rk4 stiff 0 1 (exp (-1)) 1000000))
The rest of the code is unchanged. The program now runs on GHCi
*Foo> main Begin (1.0000000000000007,-6.503275017254139) (0.9999999999999062,-6.497717470015538) (1.000000000007918,-6.497716616653335)
on on hugs
Begin (1.0,-6.50327501725414) (0.999999999999906,-6.49771747001554) (1.00000000000792,-6.49771661665334)
with the default stack size for both interpreters. It seems the code does run in constant space now.
The added lines are meant to turn the relevant functions from lazy to strict. When you see something like '(n-1)' and 'y + a1/6', it is a red flag. These are exactly the kinds of expressions that lead to memory exhaustion. Perhaps it is because the size of an unevaluated thunk for (n-1) is so much bigger than the size of the evaluated thunk. It seems that arithmetic expressions are the best candidates for some opportunistic evaluation...
__________________________________ Do you Yahoo!? Yahoo! Mail - Helps protect you from nasty viruses. http://promotions.yahoo.com/new_mail
participants (6)
-
Ben Lippmeier
-
Graham Klyne
-
Malcolm Wallace
-
oleg@pobox.com
-
W M
-
William Lee Irwin III