
Thanks folks! Forward progress is made...
Unfortunately, programs don't seem to write out their threadscope event logs
until they terminate, and mine hangs until I kill it, so I can't get at the
event log.
Tracing has taught me that before whatever causes the hang, my program splits
its time between pthread_cond_wait in two different threads and select in a
third. After the hang, it no longer calls select, nor one of those two
pthread_cond_waits. The version built without -threaded, which doesn't hang,
never calls pthread_cond_wait and never misses the select.
Now to go figure out what impossible condition it's waiting on, I guess.
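
One way I might narrow that down, as a rough sketch rather than anything I've
confirmed against the real program: wrap each suspect STM wait in a timeout
and log the ones that never complete. The names atomicallyWithin and
waitForFlag and the labels are made up for illustration; only atomically,
retry and System.Timeout.timeout are real library functions, and the stm
package is assumed to be available.

```haskell
import Control.Concurrent.STM (STM, TVar, atomically, newTVarIO, readTVar, retry)
import System.IO (hPutStrLn, stderr)
import System.Timeout (timeout)

-- | Run an STM transaction, but give up after the given number of
-- microseconds and log the (hypothetical) label of the stuck wait.
atomicallyWithin :: String -> Int -> STM a -> IO (Maybe a)
atomicallyWithin label micros tx = do
  result <- timeout micros (atomically tx)
  case result of
    Nothing -> hPutStrLn stderr ("still blocked after timeout: " ++ label)
    Just _  -> pure ()
  pure result

-- | Illustrative transaction: blocks (via retry) until the flag is True.
waitForFlag :: TVar Bool -> STM ()
waitForFlag flag = do
  ready <- readTVar flag
  if ready then pure () else retry

main :: IO ()
main = do
  flag <- newTVarIO False
  -- Nobody ever writes to 'flag', so this wait cannot finish on its own;
  -- the timeout (0.5 s here) is what lets the program carry on.
  r <- atomicallyWithin "waitForFlag" 500000 (waitForFlag flag)
  print r
```

Replacing atomically with something like this at the suspect call sites
should at least say which wait is the stuck one, since the label gets written
to stderr instead of the thread sitting there silently.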
Aran
On Thu, May 13, 2010 at 2:13 AM, Ketil Malde wrote:
> Aran Donohue writes:
>
>> I have a program that I can reliably cause to hang. It's concurrent
>> using STM, so I think it could be a deadlock or related issue. I also
>> do some IO, so I think it could be blocking in a system call.
>
> If it's the latter, 'strace' might help you. Use 'strace -p PID' to
> attach to a running process. Similarly, 'ltrace' can trace library
> calls (but probably less useful in this context?)
>
> (This is on Linux, but other OSes are likely to have similar tools.)
>
> -k
> --
> If I haven't seen further, it is by standing in the footprints of giants