
My understanding is that this error occurs when a thread is blocked waiting on an MVar and no other thread in the program holds a reference to that MVar (this can be detected during GC). Ergo, the blocked thread would wait forever, because no one can ever wake it up again.
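A minimal sketch of that rule, assuming a recent GHC where the exception is called `BlockedIndefinitelyOnMVar` (the name differs in older releases). The helper `waitOnOrphan` is hypothetical and only for illustration: it blocks on an MVar that nothing else references, so the runtime should raise the exception in the blocked thread once the GC notices.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import Control.Exception (BlockedIndefinitelyOnMVar (..), try)

-- Hypothetical helper: block on an MVar that no other thread can reach.
-- Nobody can ever fill it, so instead of hanging forever the runtime
-- throws BlockedIndefinitelyOnMVar into the blocked thread.
waitOnOrphan :: IO (Either BlockedIndefinitelyOnMVar ())
waitOnOrphan = do
  orphan <- newEmptyMVar :: IO (MVar ())
  try (takeMVar orphan)

main :: IO ()
main = do
  result <- waitOnOrphan
  case result of
    Left BlockedIndefinitelyOnMVar ->
      putStrLn "caught: blocked indefinitely on an MVar"
    Right () ->
      putStrLn "unexpectedly woke up"
```

Note that the detection is tied to reachability at GC time, so a stray reference kept alive anywhere (for example in a data structure held by another thread) is enough to suppress the exception and leave the thread blocked.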
That certainly seems a sensible rule - I'll see if that can help me debug my problem.
Do you actually use MVars directly in your program, or are they being used via a library? And do you at least know which thread is throwing this exception? It should be catchable, so you can probably wrap the arguments to your forkIO calls with a handler that indicates which thread blew up.
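One way the wrapping could look, as a rough sketch rather than a recommendation. The names `forkReporting` and `worker-1` are made up for illustration; the idea is just to catch whatever kills the thread, report it together with a label and the `ThreadId`, and rethrow it. The `performGC` call in `main` is only there to make the demo trigger promptly, on the assumption that a major GC is what detects the unreachable MVar.

```haskell
import Control.Concurrent
import Control.Exception (SomeException, handle, throwIO)
import System.IO (hPutStrLn, stderr)
import System.Mem (performGC)

-- Hypothetical helper: fork a labelled thread and pass a description of
-- any exception that kills it to the given reporting function.
-- Catching SomeException also sees asynchronous exceptions such as
-- ThreadKilled, so the exception is rethrown after reporting.
forkReporting :: (String -> IO ()) -> String -> IO () -> IO ThreadId
forkReporting report label action = forkIO (handle onDeath action)
  where
    onDeath :: SomeException -> IO ()
    onDeath e = do
      tid <- myThreadId
      report (label ++ " (" ++ show tid ++ ") died: " ++ show e)
      throwIO e

main :: IO ()
main = do
  -- The worker creates an MVar nobody else can see and blocks on it.
  _ <- forkReporting (hPutStrLn stderr) "worker-1" $ do
    orphan <- newEmptyMVar :: IO (MVar ())
    takeMVar orphan
  threadDelay 100000 -- let the worker reach takeMVar
  performGC          -- force a major GC so the dead MVar is noticed
  threadDelay 100000 -- give the handler time to report
```

With a thread pool where tasks migrate between threads, labelling the task rather than the thread is probably more useful; the same handler could be wrapped around each task instead of each `forkIO`.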
I use MVars directly, use Chan/QSem, and have about 5 concurrency data types built on top of MVars - they're everywhere. I also have a thread pool structure, so tasks move between threads regularly - knowing which thread got blocked isn't very interesting.
Thanks for the information,
Neil