Thanks folks! Forward progress is made...
Unfortunately, programs don't seem to write out their threadscope event logs until they terminate, and mine hangs until I kill it, so I can't get at the event log.
Tracing has taught me that before the hang, my program splits its time between pthread_cond_wait in two different threads and select in a third. After the hang, it no longer calls select or one of those two pthread_cond_waits. In the version without -threaded, which doesn't hang, it never calls pthread_cond_wait and never misses the select.
Now to go figure out what impossible condition it's waiting on, I guess.
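One way I might narrow that down, as a rough sketch only: wrap each suspect STM wait in a timeout so the program reports which wait never completes instead of hanging. The helper name waitOrReport, the labels, and the 200 ms limit are made up for illustration, and the sketch assumes the stm package that ships with GHC. It can only interrupt waits in Haskell code, so it would not help if the thread is stuck inside a foreign call.

```haskell
-- Requires the stm package (assumed available; it ships with GHC).
import Control.Concurrent.STM (TVar, atomically, check, newTVarIO, readTVar)
import System.Timeout (timeout)

-- | Block until the flag becomes True, but give up after the given number
-- of microseconds and report which wait was stuck.
-- The helper name and the labels are hypothetical, for illustration only.
waitOrReport :: String -> Int -> TVar Bool -> IO Bool
waitOrReport label micros flag = do
  -- 'check' retries the transaction until the flag is True.
  result <- timeout micros (atomically (readTVar flag >>= check))
  case result of
    Just () -> pure True
    Nothing -> do
      putStrLn ("timed out waiting on: " ++ label)
      pure False

main :: IO ()
main = do
  ready <- newTVarIO True   -- condition already satisfied
  stuck <- newTVarIO False  -- nothing ever sets this one
  _ <- waitOrReport "ready" 200000 ready
  _ <- waitOrReport "stuck" 200000 stuck
  pure ()
```

Because the program then exits on its own rather than being killed, it should also get the chance to write out its event log.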
Aran
On Thu, May 13, 2010 at 2:13 AM, Ketil Malde <ketil@malde.org> wrote:
Aran Donohue <aran.donohue@gmail.com> writes:
> I have a program that I can reliably cause to hang. It's concurrent using
> STM, so I think it could be a deadlock or related issue. I also do some IO,
> so I think it could be blocking in a system call.
If it's the latter, 'strace' might help you. Use 'strace -p PID' to
attach to a running process. Similarly, 'ltrace' can trace library
calls (but probably less useful in this context?)
(This is on Linux, but other OSes are likely to have similar tools.)
-k
--
If I haven't seen further, it is by standing in the footprints of giants