Debugging misbehaving multi-threaded programs

I've written a multi-threaded Haskell program that I'm trying to
debug. Basically what's happening is the program runs for a while, and
then at some point one of the threads goes crazy and spins the CPU
while allocating memory; this proceeds until the system runs out of
available memory. I can't fix this without figuring out what code is
being executed in the loop (or at least which thread is misbehaving,
which would narrow things down a lot).
I was hopeful that I could compile the program with profiling support
and then use +RTS -M100M along with some of the RTS profiling options
and get profiling information on CPU and memory usage at the time that
my program runs out of memory. The thinking here is that nearly all of
the CPU time and heap space will be from the misbehaving thread, so at
that point I could do more investigation into exactly what is
happening. Unfortunately, this doesn't seem to work; whenever the
program terminates due to running out of heap space, the generated
.prof file is empty.
Another strategy I tried was running the program in ghci and using
-fbreak-on-exception and :trace; by hitting Ctrl-C I was hopeful I'd
stop the program in whatever is looping (this is all described and
suggested in the ghc docs). Unfortunately, this also didn't seem to
work, because the Ctrl-C only stops the main thread.
Does anyone have any tips for dealing with this? Have other people run into
similar problems? I'm out of ideas, so I'm open to any suggestions.
--
Evan Klitzke

Evan Klitzke writes:
[...] Unfortunately, this doesn't seem to work; whenever the program terminates due to running out of heap space, the generated .prof file is empty.
Unless there's some specific problem with profiling in combination with threading, you can get heap profiling from a crashing program. Not what you wanted, but you might at least be able to see what is allocated. Previously, you'd have to edit the profiling output (.hp file) by hand to chop off the last and partial profiling record, but I think this might have been fixed in later GHC's.
-k
--
If I haven't seen further, it is by standing in the footprints of giants

On Thu, Jun 11, 2009 at 12:40 AM, Ketil Malde wrote:
Evan Klitzke writes:
[...] Unfortunately, this doesn't seem to work; whenever the program terminates due to running out of heap space, the generated .prof file is empty.
Unless there's some specific problem with profiling in combination with threading, you can get heap profiling from a crashing program. Not what you wanted, but you might at least be able to see what is allocated.
After fiddling around a bit, I found out that you can get a heap
profile from a program that dies from an error, but not one that dies
from actually running out of heap space. I was able to take advantage
of this by making the main thread do a threadDelay and then error,
with the threadDelay timed to occur during the program's misbehavior.
--
Evan Klitzke
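
The workaround described above could look roughly like the following. This is only a sketch under assumptions, not the code from the original program: `startWorkers` and `abortAfter` are hypothetical names, the worker is a harmless stand-in, and the 2-second delay is a placeholder for however long the real program takes to start misbehaving.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever)

-- Hypothetical stand-in for the real program's worker threads.
startWorkers :: IO ()
startWorkers = do
  _ <- forkIO (forever (threadDelay 100000))
  return ()

-- Sleep for the given number of seconds, then die via 'error', so the
-- program exits through the normal error path instead of by exhausting
-- the heap.
abortAfter :: Int -> IO a
abortAfter secs = do
  threadDelay (secs * 1000000)
  error ("watchdog: aborting after " ++ show secs ++ "s")

-- Assumed build and run commands (adjust to your GHC version):
--   ghc -prof -auto-all Main.hs     (newer GHCs: -fprof-auto)
--   ./Main +RTS -hc
main :: IO ()
main = do
  startWorkers
  abortAfter 2  -- placeholder: time this to fall inside the misbehavior
```

The delay has to be tuned by hand so that the error fires while the runaway thread is already allocating but before the heap limit is reached.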

I've written a multi-threaded Haskell program that I'm trying to debug. Basically what's happening is the program runs for a while, and then at some point one of the threads goes crazy and spins the CPU while allocating memory; this proceeds until the system runs out of available memory. I can't fix this without figuring out what code is being executed in the loop (or at least which thread is misbehaving, which would narrow things down a lot). .. Does anyone have any tips for dealing with this? Have other people run into similar problems? I'm out of ideas, so I'm open to any suggestions.
Don't know whether this still works, but there was a Concurrent Haskell Debugger here:
http://www.informatik.uni-kiel.de/~fhu/chd/
The idea being that you put an indirection module between your code and the Concurrent Haskell imports, and then instrument the indirections to give you more information (they had built more tools on top of that idea).
In a similar direction, I once suggested a shell-jobs-like thread interface for GHCi, in the context of this _|_-ed ticket:
http://hackage.haskell.org/trac/ghc/ticket/1399#comment:3
Claus
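
To make the indirection idea concrete, here is a minimal sketch of what such an instrumented wrapper might look like. It is not taken from the Concurrent Haskell Debugger; `forkNamed` and `logLine` are hypothetical names, and in a real program they would live in a separate module that the rest of the code imports in place of `Control.Concurrent`.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)
import GHC.Conc (labelThread)
import System.IO (hPutStrLn, stderr)

-- Hypothetical logging helper; writes to stderr so it does not mix with
-- the program's normal output.
logLine :: String -> IO ()
logLine = hPutStrLn stderr

-- Hypothetical instrumented replacement for 'forkIO': gives the thread a
-- label and logs when it starts and when it finishes (normally or by
-- exception).
forkNamed :: String -> IO () -> IO ThreadId
forkNamed name action = forkIO $ do
  tid <- myThreadId
  labelThread tid name
  logLine (name ++ ": started as " ++ show tid)
  action `finally` logLine (name ++ ": finished")

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- forkNamed "worker-1" (putMVar done ())
  takeMVar done
  threadDelay 10000  -- brief pause so the worker's last log line is written
```

With every fork going through a wrapper like this, a thread that never logs "finished" while the process is spinning is a natural first suspect.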

On 11/06/2009 05:40, Evan Klitzke wrote:
I've written a multi-threaded Haskell program that I'm trying to debug. Basically what's happening is the program runs for a while, and then at some point one of the threads goes crazy and spins the CPU while allocating memory; this proceeds until the system runs out of available memory. I can't fix this without figuring out what code is being executed in the loop (or at least which thread is misbehaving, which would narrow things down a lot).
I was hopeful that I could compile the program with profiling support and then use +RTS -M100M along with some of the RTS profiling options and get profiling information on CPU and memory usage at the time that my program runs out of memory. The thinking here is that nearly all of the CPU time and heap space will be from the misbehaving thread, so at that point I could do more investigation into exactly what is happening. Unfortunately, this doesn't seem to work; whenever the program terminates due to running out of heap space, the generated .prof file is empty.
We fixed this recently (GHC 6.10.2):
http://hackage.haskell.org/trac/ghc/ticket/2592
In 6.12.1 you'll be able to use ThreadScope, our parallel profiling tool. You could try it right now if you're brave enough to compile GHC (it needs GHC 6.11). The ThreadScope code is here:
http://code.haskell.org/ThreadScope/
and shortly the Haskell Symposium paper about it will be available (we're just making the final corrections now).
Cheers,
Simon
participants (4)
- Claus Reinke
- Evan Klitzke
- Ketil Malde
- Simon Marlow