Understanding behavior of BlockedIndefinitelyOnMVar exception

I'm trying to really understand how the BlockedIndefinitelyOnMVar exception works in concurrent code as I would like to rely on it as a useful runtime signal in a concurrency library I'm working on. Here is some code illustrating a function restoring an abandoned lock in a single-threaded program and works as I would expect: -------- START CODE -------- module Main where import Control.Concurrent import Control.Exception -- This raises the exception only once and the lock is successfully restored: main1 = do lock <- newMVar () lockPrint "good1" lock badLockPrint "bad" lock -- exception is raised and lock is restored here: lockPrint "good2" lock -- no exception raised: lockPrint "good3" lock readMVar lock lockPrint :: String -> MVar () -> IO () lockPrint name v = do e <- try $ takeMVar v :: IO (Either BlockedIndefinitelyOnMVar ()) -- either print exception, or print name: either print (const $ putStrLn name) e `finally` putMVar v () -- perhaps simulates an operation that died before it could return a lock: badLockPrint :: String -> MVar () -> IO () badLockPrint s v = do takeMVar v putStrLn s -- Forgot to return the lock here!: -------- END CODE -------- Now here is a variation of 'main' that forks the operations: -------- START CODE -------- main0 = do lock <- newMVar () forkIO $ lockPrint "good1" lock threadDelay 1000000 forkIO $ badLockPrint "bad" lock -- these both raise blocked indefinitely exception threadDelay 1000000 forkIO $ lockPrint "good2" lock threadDelay 1000000 forkIO $ lockPrint "good3" lock threadDelay 1000000 -------- END CODE -------- What I think I've learned here is that the BlockedIndefinitelyOnMVar exception is raised in all the blocked threads "at once" as it were. That despite the fact that the handler code in 'lockPrint' restores the lock for successive threads. This would also seem to imply that putMVar's in an exception handler don't stop the runtime from raising the BlockedIndefinitelyOnMVar. But that doesn't really seem right. Can anyone comment on the two conclusions above? FWIW, this was an interesting related thread: http://comments.gmane.org/gmane.comp.lang.haskell.glasgow.user/18667 Thanks, Brandon Simmons http://coder.bsimmons.name

On Sun, Jul 24, 2011 at 7:56 PM, Brandon Simmons
What I think I've learned here is that the BlockedIndefinitelyOnMVar exception is raised in all the blocked threads "at once" as it were. That despite the fact that the handler code in 'lockPrint' restores the lock for successive threads.
This would also seem to imply that putMVar's in an exception handler don't stop the runtime from raising the BlockedIndefinitelyOnMVar. But that doesn't really seem right.
Does anything change if you somehow force a GC sometime after "good2"? Perhaps with some calculation generating garbage, perhaps with performGC. IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC. But I'm probably wrong =). Cheers, -- Felipe.

Excerpts from Felipe Almeida Lessa's message of Sun Jul 24 22:02:36 -0400 2011:
Does anything change if you somehow force a GC sometime after "good2"? Perhaps with some calculation generating garbage, perhaps with performGC. IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC. But I'm probably wrong =).
That's correct. resurrectThreads is called after garbage collection on the list of threads found to be garbage. Each of these threads will be woken up and sent a signal: BlockedOnDeadMVar if the thread was blocked on an MVar, or NonTermination if the thread was blocked on a Black Hole. Cheers, Edward

On Sun, Jul 24, 2011 at 10:07 PM, Edward Z. Yang
Excerpts from Felipe Almeida Lessa's message of Sun Jul 24 22:02:36 -0400 2011:
Does anything change if you somehow force a GC sometime after "good2"? Perhaps with some calculation generating garbage, perhaps with performGC. IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC. But I'm probably wrong =).
That's correct.
resurrectThreads is called after garbage collection on the list of threads found to be garbage. Each of these threads will be woken up and sent a signal: BlockedOnDeadMVar if the thread was blocked on an MVar, or NonTermination if the thread was blocked on a Black Hole.
Cheers, Edward
Thanks, Edward. I'm going to take a look at the GHC source and see if I can grok any of it. Any comment on whether it is correct behavior to have the exception raised in all the threads attempting a readMVar at once (if that's actually what's happening), even though an exception handler will fill the MVar for subsequent threads? I think I'm not totally clear on what qualifies as "indefinitely" Thanks again, Brandon

On Sun, Jul 24, 2011 at 10:02 PM, Felipe Almeida Lessa
On Sun, Jul 24, 2011 at 7:56 PM, Brandon Simmons
wrote: What I think I've learned here is that the BlockedIndefinitelyOnMVar exception is raised in all the blocked threads "at once" as it were. That despite the fact that the handler code in 'lockPrint' restores the lock for successive threads.
This would also seem to imply that putMVar's in an exception handler don't stop the runtime from raising the BlockedIndefinitelyOnMVar. But that doesn't really seem right.
Does anything change if you somehow force a GC sometime after "good2"? Perhaps with some calculation generating garbage, perhaps with performGC. IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC. But I'm probably wrong =).
Here is a variation that calls 'performGC' after the first thread is forked. It prints the exception simultaneously right before the last 'threadDelay': main2 = do lock <- newMVar () forkIO $ lockPrint "good1" lock threadDelay 1000000 forkIO $ badLockPrint "bad" lock -- these both raise blocked indefinitely exception threadDelay 1000000 forkIO $ lockPrint "good2" lock performGC threadDelay 1000000 forkIO $ lockPrint "good3" lock threadDelay 1000000 Perhaps laziness is confusing the issue as well? Thanks and sorry for the delayed response, Brandon Simmons
Cheers,
-- Felipe.

Hello Brandon, The answer is subtle, and has to do with what references are kept in code, which make an object considered reachable. Essentially, the main thread itself keeps the MVar live while it still has forking to do, so that it cannot get garbage collected and trigger these errors. Here is a simple demonstrative program: main = do lock <- newMVar () forkIO (takeMVar lock) forkIO (takeMVar lock) forkIO (takeMVar lock) Consider what the underlying code needs to do after it has performed the first forkIO. 'lock' is a local variable that the code generator knows it's going to need later in the function body. So what does it do? It saves it on the stack. // R1 is a pointer to the MVar cqo: Hp = Hp + 8; if (Hp > HpLim) goto cqq; I32[Hp - 4] = spd_info; I32[Hp + 0] = R1; I32[Sp + 0] = R1; R1 = Hp - 3; I32[Sp - 4] = spe_info; Sp = Sp - 4; jump stg_forkzh (); (Ignore the Hp > HpLim; that's just the heap check.) This lives on until we continue executing the main thread at spe_info (at which point we may or may not deallocate the stack frame). But what happens instead? cqk: Hp = Hp + 8; if (Hp > HpLim) goto cqm; I32[Hp - 4] = sph_info; I32[Hp + 0] = I32[Sp + 4]; R1 = Hp - 3; I32[Sp + 0] = spi_info; jump stg_forkzh (); We keep the pointer to the MVar to the stack, because we know there is yet /another/ forkIO (takeMVar lock) coming up. (It's located at Sp + 4; you have to squint a little since Sp is being fiddled with, but it's still there, we just overwrite the infotable with a new one.) Finally, spi_info decides we don't need the contents of Sp + 4 anymore, and overwrites it accordingly: cqg: Hp = Hp + 8; if (Hp > HpLim) goto cqi; I32[Hp - 4] = spl_info; I32[Hp + 0] = I32[Sp + 4]; R1 = Hp - 3; I32[Sp + 4] = spm_info; Sp = Sp + 4; jump stg_forkzh (); But in the meantime (esp. between invocation 2 and 3), the MVar cannot be garbage collected, because it is live on the stack. Could GHC have been more clever in this case? Not in general, since deciding whether or not a reference will actually be used or not boils down to the halting problem. loop = threadDelay 100 >> loop -- prevent blackholing from discovering this main = do lock <- newEmptyMVar t1 <- newEmptyMVar forkIO (takeMVar lock >> putMVar t1 ()) forkIO (loop `finally` putMVar lock ()) takeMVar t1 Maybe we could do something where MVar references are known to be writer ends or read ends, and let the garbage collector know that an MVar with only read ends left is a deadlocked one. However, this would be a very imprecise analysis, and would not help in your original code (since all of your remaining threads had the possibility of writing to the MVar: it doesn't become clear that they can't until they all hit their takeMVar statements.) Cheers, Edward

On Tue, Jul 26, 2011 at 1:25 AM, Edward Z. Yang
Hello Brandon,
The answer is subtle, and has to do with what references are kept in code, which make an object considered reachable. Essentially, the main thread itself keeps the MVar live while it still has forking to do, so that it cannot get garbage collected and trigger these errors.
Ah, okay. That seems like an obvious explanation for the exceptions to be raised at the same time in the forked threads.
Here is a simple demonstrative program:
main = do lock <- newMVar () forkIO (takeMVar lock) forkIO (takeMVar lock) forkIO (takeMVar lock)
(snip)
But in the meantime (esp. between invocation 2 and 3), the MVar cannot be garbage collected, because it is live on the stack.
Could GHC have been more clever in this case? Not in general, since deciding whether or not a reference will actually be used or not boils down to the halting problem.
loop = threadDelay 100 >> loop -- prevent blackholing from discovering this main = do lock <- newEmptyMVar t1 <- newEmptyMVar forkIO (takeMVar lock >> putMVar t1 ()) forkIO (loop `finally` putMVar lock ()) takeMVar t1
Maybe we could do something where MVar references are known to be writer ends or read ends, and let the garbage collector know that an MVar with only read ends left is a deadlocked one. However, this would be a very imprecise analysis, and would not help in your original code (since all of your remaining threads had the possibility of writing to the MVar: it doesn't become clear that they can't until they all hit their takeMVar statements.)
I think this is the crux of what I was confused about. I had assumed read vs. write was being taken into account by the runtime in raising BlockedIndefinitelyOnMVar. This makes it obvious: loop = threadDelay 100 >> loop -- prevent blackholing from discovering this main = do lock <- newEmptyMVar forkIO (loop `finally` takeMVar lock) takeMVar lock Given that, I still can't say I understand what is happening in my original code. I'll try to work out an even simpler example on my own. Thanks for the thoughtful response, Brandon
Cheers, Edward
participants (3)
-
Brandon Simmons
-
Edward Z. Yang
-
Felipe Almeida Lessa