Re: [reactive] Re: black hole detection and concurrency

Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I
think getting this bug fixed will solve Reactive's mysterious bugs and
unblock its progress.
- Conal
On Sat, Jan 3, 2009 at 1:20 PM, Peter Verswyvelen
That is very good news! Let's hope it's a bug that is easy enough to fix, since I guess the RTS is very tricky.
Thanks for all this effort. It would explain a lot of strange behaviour.
Cheers, Peter Verswyvelen CTO - Anygma.com
Conal Elliott wrote:
Thanks very much for these ideas. Peter Verswyvelen suggested running
example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, *unless* it's compiled with -threaded and run with +RTS -N2. In
On Sat, Jan 3, 2009 at 4:48 PM, Bertram Felgenhauer
wrote: the the latter case, it locks up after a while.
It seems that we've found an RTS bug. If a thread is started with exceptions blocked, then throwTo might never deliver its exception and block forever, as can be seen with the following test program, which locks up after a while (immediately with the non-threaded RTS)
import Control.Exception import Control.Concurrent import Control.Monad
test n = do t <- block $ forkIO yield yield putStr $ show n ++ ": kill\n" killThread t
main = forM_ [1..] test
Or, even more convincing:
import Control.Exception import GHC.Conc
main = do t1 <- block $ forkIO yield t2 <- forkIO $ killThread t1 yield yield threadStatus t1 >>= print threadStatus t2 >>= print
prints (fairly reliably, it seems):
ThreadFinished ThreadBlocked BlockedOnException
(Trac is giving me errors right now. I'm planning to report this later.)
I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately.
-- This one locks up after a while even in ghci. When compiled -threaded -- and run +RTS -N2, it locks up almost immediately. a `race` b = do v <- newEmptyMVar let t x = x >>= putMVar v withThread (t a) $ withThread (t b) $ takeMVar v where withThread u v = brackAsync (forkIO u) killThread (const v)
At the point the 'forkIO' is run, exceptions are blocked, making the thread basically immortal. Using
withThread u v = brackAsync (forkIO $ unblock u) killThread (const v)
we get the same behaviour as with my 'race' - it works for a while, but locks up eventually.
I believe that the actual lockup is similar to the test programs above in all cases - what's different is just the probability of triggering it.
regards,
Bertram _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive
_______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too? Cheers, Simon

I don't know if the bug would explain <<loop>>. My guess is that the
black-hole-detection code is incorrectly concluding there is a black hole
because a thunk was marked as in the process of being evaluated, then the
evaluating thread is killed without unmarking the thunk, and then another
thread needs the whnf. This is a fairly naive guess. I don't know this
machinery well enough to have confidence.
What do you think?
- Conal
On Tue, Jan 6, 2009 at 6:27 AM, Simon Marlow
Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too?
Cheers, Simon

The bugs that we have identified result in deadlocks - the runtime doesn't notice that a thread blocked in throwTo can continue. I can't think of a way that this could lead to bogus <<loop>>, but I suppose I wouldn't be too surprised if it were possible. The best way forward is for you to test out a snapshot once these patches have made it into a build. Does that sound reasonable? I've been running your TestRace program for quite a while on 4 processors without any hangs now. Cheers, Simon Conal Elliott wrote:
I don't know if the bug would explain <<loop>>. My guess is that the black-hole-detection code is incorrectly concluding there is a black hole because a thunk was marked as in the process of being evaluated, then the evaluating thread is killed without unmarking the thunk, and then another thread needs the whnf. This is a fairly naive guess. I don't know this machinery well enough to have confidence.
What do you think?
- Conal
On Tue, Jan 6, 2009 at 6:27 AM, Simon Marlow
mailto:marlowsd@gmail.com> wrote: Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too?
Cheers, Simon

Hi Simon,
That's great news. Please let me know when there's a build I can grab, and
I'll try it out.
Regards, - Conal
On Wed, Jan 7, 2009 at 7:22 AM, Simon Marlow
The bugs that we have identified result in deadlocks - the runtime doesn't notice that a thread blocked in throwTo can continue. I can't think of a way that this could lead to bogus <<loop>>, but I suppose I wouldn't be too surprised if it were possible.
The best way forward is for you to test out a snapshot once these patches have made it into a build. Does that sound reasonable? I've been running your TestRace program for quite a while on 4 processors without any hangs now.
Cheers, Simon
Conal Elliott wrote:
I don't know if the bug would explain <<loop>>. My guess is that the black-hole-detection code is incorrectly concluding there is a black hole because a thunk was marked as in the process of being evaluated, then the evaluating thread is killed without unmarking the thunk, and then another thread needs the whnf. This is a fairly naive guess. I don't know this machinery well enough to have confidence.
What do you think?
- Conal
On Tue, Jan 6, 2009 at 6:27 AM, Simon Marlow
> wrote: Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too?
Cheers, Simon
participants (2)
-
Conal Elliott
-
Simon Marlow