
On Tuesday 05 October 2010 10:06:03, Jeroen van Maanen wrote:
A few weeks ago I was greatly helped by members of this list to expose an error that was hiding in my, by now quite extensive, Haskell program. I should still thank Daniel Fisher for his lucid explanation of how an overdose of strictness can prevent a try-expression from wrapping an exception.
Now that the error is gone, I face a new challenge: when I let my program run for a while (it is a computationally intensive task that never stops by design), after fifteen minutes or so the CPU usage drops from nearly 100% to around 15% and a few minutes later the process dies with the message:
"lexau: memory allocation failed (requested 2097152 bytes)"
The whole thing is a tail biting snail pit of threads that are communicating through MVars. Every thread runs a tail recursive function. (I read in another post that it is not a good idea to use explicit recursion, but when I compared alternatives the fold variants produced even worse stack/heap problems.)
That depends. Using a combinator (folds etc.) gives you more elegant, more general and usually more readable code. Using a low-level direct recursion however gives you more control over strictness, speed and allocation behaviour. Sometimes you have to stick your hands in the low-level details to get adequate performance.
The thread that is likely to cause the problem is an optimizer that tries many possible improvements of a complex data structure and incrementally applies the successful ones. I use a strict foldl style in an attempt to limit the memory used by the optimizer.
May be not strict enough (or too strict). What's the suspect function?
Frankly, however, I have no idea what eats up so much heap space.
Now I have always proudly labeled myself a 'printf programmer', but I am afraid that I am going to need some profiling tool to determine where the problem is. Any suggestions where I should start?
As a very first measure, run your programme with the "-hT" RTS-option ( $ ./lexau +RTS -hT -RTS args -- wait until it dies or ctrl-C it when CPU usage has dropped $ hp2ps -c lexau.hp -- look at the graph in lexau.ps ). If that doesn't reveal the source of the problem, compile for profiling, $ ghc -O2 -prof -auto-all -osuf p_o -hisuf p_hi --make lexau -o profLexau and run with some heap profiling options, http://darcs.haskell.org/download/docs/6.12.3/html/users_guide/profiling.htm... explains how.
Cheers, Jeroen
P.S. For an idea of what is living in the snail pit, have a look at:
http://lexau.svn.sourceforge.net/viewvc/lexau/branches/totem/src/LExAu/P ipeline/Concurrent.hs?view=markup http://lexau.svn.sourceforge.net/viewvc/lexau/branches/totem/src/LExAu/M odel/HistoryTree.hs?view=markup
Ewww.
(I know, HistoryTree.hs badly needs to be split up into smaller modules.)
+ 5