Re: black hole detection and concurrency

On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
The key part here is 'myThreadId >>= killThread' which throws an asynchronous exception to the thread itself, causing the update frames to be saved on the heap.
Note that 'myThreadId >>= killThread' is not equivalent to 'throw ThreadKilled'; it is a synchronous exception and replaces thunks pointed to by the update frames by another call to the raise primitive - the result being that the exception gets rethrown whenever such a thunk is evaluated. This happens with 'finally' and 'bracket': they use 'throw' for re-throwing the exception.
See rts/RaiseAsync.c (raiseAsync() in particular) for the gory details for the first case, and rts/Schedule.c, raiseExceptionHelper() for the second case.
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us the behavior we'd prefer (speaking of which, I recall there's something in the works for 6.12 or so to improve rethrowing of asynchronous exceptions?) brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing This brackAsync just drops in to the previous code where bracket was and appears to perform correctly. Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however. Cheers, Sterl.

Sterling Clover wrote:
On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us the behavior we'd prefer (speaking of which, I recall there's something in the works for 6.12 or so to improve rethrowing of asynchronous exceptions?)
brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing
This code turns any exception into ThreadKilled further down the stack. (\e -> do after a myThreadId >>= flip throwTo (e :: SomeException) ... might do the trick. My assumption was that anything but 'ThreadKilled' would be a real error. This isn't really true, I guess - thanks to throwTo, any exception could be asynchronous. If an exception is thrown, 'after a' is run again after the computation has resumed. That's why I did the cleanup within the 'catch'. But there's no reason why you couldn't do that as well: brackAsync before after thing = block $ do a <- before catch (unblock (thing a) >>= \r -> after a >> return r) $ \e -> do after a myThreadId >>= flip throwTo (e :: SomeException) brackAsync before after thing )
This brackAsync just drops in to the previous code where bracket was and appears to perform correctly.
Right. 'race' should also unblock exceptions in the worker threads, withThread u v = brackAsync (forkIO (unblock u)) killThread (const v) but that's an independent change.
Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however.
At least the GHC RTS has support for the hard part - unwinding the stack so that computations can be resumed seamlessly. I'm not sure which of the approaches I like better - it seems that we have a choice between turning async exceptions into sync ones or vice versa, and neither choice is strictly superior to the other. Enjoy, Bertram 'race' update: - Bugfix: Previously, only AsyncException-s would be caught. Use 'fromException' to select the ThreadKilled exception. - I tried using a custom 'SuspendException' type, but this resulted in 'test: SuspendException' messages on the console, while ThreadKilled is silently ignored... as documented: http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurren... (http://tinyurl.com/9t5pxs) - Tweak: Block exceptions while running 'cleanup' to avoid killing threads twice. - Trick: takeMVar is a blocking operation, so exceptions can be delivered while it's waiting - there's no need to use 'unblock' for this. In other words, unblock (takeMVar v) and takeMVar v are essentially equivalent for our purposes. race :: IO a -> IO a -> IO a race a b = block $ do v <- newEmptyMVar let t x = unblock (x >>= putMVar v) ta <- forkIO (t a) tb <- forkIO (t b) let cleanup = killThread ta >> killThread tb (do r <- takeMVar v; cleanup; return r) `catch` \e -> cleanup >> case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) _ -> throwIO e

I fail to understand this part of the code: case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) So the current thread gets killed synchronously, then then the race function is evaluated again? The latter I don't get. On Sun, Dec 28, 2008 at 3:03 AM, Bertram Felgenhauer < bertram.felgenhauer@googlemail.com> wrote:
On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us the behavior we'd prefer (speaking of which, I recall there's something in
Sterling Clover wrote: the
works for 6.12 or so to improve rethrowing of asynchronous exceptions?)
brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing
This code turns any exception into ThreadKilled further down the stack.
(\e -> do after a myThreadId >>= flip throwTo (e :: SomeException) ...
might do the trick.
My assumption was that anything but 'ThreadKilled' would be a real error. This isn't really true, I guess - thanks to throwTo, any exception could be asynchronous.
If an exception is thrown, 'after a' is run again after the computation has resumed.
That's why I did the cleanup within the 'catch'.
But there's no reason why you couldn't do that as well:
brackAsync before after thing = block $ do a <- before catch (unblock (thing a) >>= \r -> after a >> return r) $ \e -> do after a myThreadId >>= flip throwTo (e :: SomeException) brackAsync before after thing )
This brackAsync just drops in to the previous code where bracket was and appears to perform correctly.
Right. 'race' should also unblock exceptions in the worker threads,
withThread u v = brackAsync (forkIO (unblock u)) killThread (const v)
but that's an independent change.
Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however.
At least the GHC RTS has support for the hard part - unwinding the stack so that computations can be resumed seamlessly.
I'm not sure which of the approaches I like better - it seems that we have a choice between turning async exceptions into sync ones or vice versa, and neither choice is strictly superior to the other.
Enjoy,
Bertram
'race' update: - Bugfix: Previously, only AsyncException-s would be caught. Use 'fromException' to select the ThreadKilled exception. - I tried using a custom 'SuspendException' type, but this resulted in 'test: SuspendException' messages on the console, while ThreadKilled is silently ignored... as documented:
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurren... (http://tinyurl.com/9t5pxs) - Tweak: Block exceptions while running 'cleanup' to avoid killing threads twice. - Trick: takeMVar is a blocking operation, so exceptions can be delivered while it's waiting - there's no need to use 'unblock' for this. In other words, unblock (takeMVar v) and takeMVar v are essentially equivalent for our purposes.
race :: IO a -> IO a -> IO a race a b = block $ do v <- newEmptyMVar let t x = unblock (x >>= putMVar v) ta <- forkIO (t a) tb <- forkIO (t b) let cleanup = killThread ta >> killThread tb (do r <- takeMVar v; cleanup; return r) `catch` \e -> cleanup >> case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) _ -> throwIO e _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

This is a neat trick indeed! I'd appreciate an explanation of killing one's
own thread and then continuing (with a restart in this case). How does the
post-kill resumption occur? That is, how does control pass to the
tail-recursive call after the self-kill?
- Conal
2008/12/28 Peter Verswyvelen
I fail to understand this part of the code: case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b)
So the current thread gets killed synchronously, then then the race function is evaluated again? The latter I don't get.
On Sun, Dec 28, 2008 at 3:03 AM, Bertram Felgenhauer < bertram.felgenhauer@googlemail.com> wrote:
On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us
behavior we'd prefer (speaking of which, I recall there's something in
Sterling Clover wrote: the the
works for 6.12 or so to improve rethrowing of asynchronous exceptions?)
brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing
This code turns any exception into ThreadKilled further down the stack.
(\e -> do after a myThreadId >>= flip throwTo (e :: SomeException) ...
might do the trick.
My assumption was that anything but 'ThreadKilled' would be a real error. This isn't really true, I guess - thanks to throwTo, any exception could be asynchronous.
If an exception is thrown, 'after a' is run again after the computation has resumed.
That's why I did the cleanup within the 'catch'.
But there's no reason why you couldn't do that as well:
brackAsync before after thing = block $ do a <- before catch (unblock (thing a) >>= \r -> after a >> return r) $ \e -> do after a myThreadId >>= flip throwTo (e :: SomeException) brackAsync before after thing )
This brackAsync just drops in to the previous code where bracket was and appears to perform correctly.
Right. 'race' should also unblock exceptions in the worker threads,
withThread u v = brackAsync (forkIO (unblock u)) killThread (const v)
but that's an independent change.
Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however.
At least the GHC RTS has support for the hard part - unwinding the stack so that computations can be resumed seamlessly.
I'm not sure which of the approaches I like better - it seems that we have a choice between turning async exceptions into sync ones or vice versa, and neither choice is strictly superior to the other.
Enjoy,
Bertram
'race' update: - Bugfix: Previously, only AsyncException-s would be caught. Use 'fromException' to select the ThreadKilled exception. - I tried using a custom 'SuspendException' type, but this resulted in 'test: SuspendException' messages on the console, while ThreadKilled is silently ignored... as documented:
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurren... (http://tinyurl.com/9t5pxs) - Tweak: Block exceptions while running 'cleanup' to avoid killing threads twice. - Trick: takeMVar is a blocking operation, so exceptions can be delivered while it's waiting - there's no need to use 'unblock' for this. In other words, unblock (takeMVar v) and takeMVar v are essentially equivalent for our purposes.
race :: IO a -> IO a -> IO a race a b = block $ do v <- newEmptyMVar let t x = unblock (x >>= putMVar v) ta <- forkIO (t a) tb <- forkIO (t b) let cleanup = killThread ta >> killThread tb (do r <- takeMVar v; cleanup; return r) `catch` \e -> cleanup >> case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) _ -> throwIO e _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive
_______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

Peter Verswyvelen wrote:
I fail to understand this part of the code: case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b)
So the current thread gets killed synchronously, then then the race function is evaluated again? The latter I don't get.
Let's look at what happens when an asynchronous exception arrives. The current thread gets killed. It gets killed asynchronously; as far as the RTS knows, the exceptionit might happen inside a pure computation which may be accessible to other threads. This means that the RTS has to patch things up so that reentering the corresponding thunks continues the computation - because another thread might need the value later. It does that by traversing the stack and turning update frames into ap nodes on the heap, and linking the entered thunks to those using an indirection (again, see RaiseAsync.c for details). Now in fact, IO actions are indistinguishable from pure computations by the RTS, so this mechanism also makes IO actions resumable, in principle, if you can access the corresponding thunk somehow. Normally you can't - there is no reference to that thunk - but unsafePerformIO gives you that reference. So in the above example, the current thread gets killed. However, the IO action (suspended right before the 'unblock (race a b)') is still accessible through the unsafePerformIO thunk. When another thread enters that thunk, execution resumes at that point. It may also be the same thread if it caught the exception further down the stack and later enters the unsafePerformIO thunk again. I don't understand all the interactions here - I don't know why the code is racy in the parallel RTS. Bertram

Bertram Felgenhauer wrote:
Now in fact, IO actions are indistinguishable from pure computations by the RTS, so this mechanism also makes IO actions resumable, in principle, if you can access the corresponding thunk somehow. Normally you can't - there is no reference to that thunk - but unsafePerformIO gives you that reference.
I wonder if other things break in the presence of resumable IO computations... the first thing that comes to mind is, inside a "block" or "unblock" (which have to initiate, take down and deal with all that infrastructure -- luckily you only use them inside a separate thread, a forkIO within the unsafePerformIO... Which admittedly makes things horribly confusing in its own way. Also does forkIO copy any state? it copies blocked status now, which the unamb-calling thread might have...). The state of the thread when entering the computation again, could be different than it was when the computation was first suspended, I'm guessing. (At least you need to be careful about what invariants you assume; I'm not yet sure if it's possible to be careful enough.) -Isaac

Isaac Dupree wrote:
Bertram Felgenhauer wrote:
Now in fact, IO actions are indistinguishable from pure computations by the RTS, so this mechanism also makes IO actions resumable, in principle, if you can access the corresponding thunk somehow. Normally you can't - there is no reference to that thunk - but unsafePerformIO gives you that reference.
I wonder if other things break in the presence of resumable IO computations... the first thing that comes to mind is, inside a "block" or "unblock" (which have to initiate, take down and deal with all that infrastructure -- luckily you only use them inside a separate thread, a forkIO within the unsafePerformIO... Which admittedly makes things horribly confusing in its own way. Also does forkIO copy any state? it copies blocked status now, which the unamb-calling thread might have...). The state of the thread when entering the computation again, could be different than it was when the computation was first suspended, I'm guessing. (At least you need to be careful about what invariants you assume; I'm not yet sure if it's possible to be careful enough.)
There's one other bug in this area that we've known about for a while. In the comments in Exception.cmm: NB. there's a bug in here. If a thread is inside an unsafePerformIO, and inside blockAsyncExceptions# (there is an unblockAsyncExceptions_ret on the stack), and it is blocked in an interruptible operation, and it receives an exception, then the unsafePerformIO thunk will be updated with a stack object containing the unblockAsyncExceptions_ret frame. Later, when someone else evaluates this thunk, the blocked exception state is not restored. this probably isn't affecting reactive, but I thought I'd mention it. Cheers, Simon

therefore mapException is equally buggy!
mapException :: (Exception e1, Exception e2) => (e1 -> e2) -> a -> a mapException f v = unsafePerformIO (catch (evaluate v) (\x -> throw (f x)))
If it maps an asynchronous exception.. and it's re-thrown as synchronous... the same non-resumability happens! mapException is a particular culprit because of the unsafePerformIO (so you had a good reason to expect resumability, since it's for a pure computation, not in IO) - does anyone use mapException? - is there some reason that we don't have all "throw"s (synch. or asynch.) "patch up" the thunks? (memory leaks or serious inefficiency or something?) if "yes", I don't think mapException can currently be implemented; we'd need some way in its "catch" to detect whether the thrown exception was asynchronous, and *iff* so, throw the new exception asynchronously (if synchronous, rethrow synchronously). Or some equivalent. Or maybe some add some magic to unsafePerformIO (probably a bad idea). -Isaac

Isaac Dupree wrote:
therefore mapException is equally buggy!
mapException :: (Exception e1, Exception e2) => (e1 -> e2) -> a -> a mapException f v = unsafePerformIO (catch (evaluate v) (\x -> throw (f x)))
If it maps an asynchronous exception.. and it's re-thrown as synchronous... the same non-resumability happens! mapException is a particular culprit because of the unsafePerformIO (so you had a good reason to expect resumability, since it's for a pure computation, not in IO)
- does anyone use mapException?
- is there some reason that we don't have all "throw"s (synch. or asynch.) "patch up" the thunks? (memory leaks or serious inefficiency or something?)
In theory you could get a nasty space leak, but it's quite unlikely. When an exception is thrown, instead of just updating each thunk with "raise# DivByZero" for example, we would save the current computation in the heap to be restarted if the thunk was ever evaluated again. If the current computation refers to a large amount of heap data, technically that's a space leak. So one way to work around this would be to do all rethrows using throwTo. In most cases this will have no effect, because the rethrows happen in IO code which normally has no enclosing thunks, but in the case of mapException and unsafePerformIO it will fix (or work around) the problems we have with re-throwing async exceptions. Cheers, Simon

Thanks very much for these ideas. Peter Verswyvelen suggested running the example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, *unless* it's compiled with -threaded and run with +RTS -N2. In the latter case, it locks up after a while. I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately. I've attached a module, TestRace.hs, containing these experiments. - Conal On Sat, Dec 27, 2008 at 6:03 PM, Bertram Felgenhauer < bertram.felgenhauer@googlemail.com> wrote:
On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us the behavior we'd prefer (speaking of which, I recall there's something in
Sterling Clover wrote: the
works for 6.12 or so to improve rethrowing of asynchronous exceptions?)
brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing
This code turns any exception into ThreadKilled further down the stack.
(\e -> do after a myThreadId >>= flip throwTo (e :: SomeException) ...
might do the trick.
My assumption was that anything but 'ThreadKilled' would be a real error. This isn't really true, I guess - thanks to throwTo, any exception could be asynchronous.
If an exception is thrown, 'after a' is run again after the computation has resumed.
That's why I did the cleanup within the 'catch'.
But there's no reason why you couldn't do that as well:
brackAsync before after thing = block $ do a <- before catch (unblock (thing a) >>= \r -> after a >> return r) $ \e -> do after a myThreadId >>= flip throwTo (e :: SomeException) brackAsync before after thing )
This brackAsync just drops in to the previous code where bracket was and appears to perform correctly.
Right. 'race' should also unblock exceptions in the worker threads,
withThread u v = brackAsync (forkIO (unblock u)) killThread (const v)
but that's an independent change.
Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however.
At least the GHC RTS has support for the hard part - unwinding the stack so that computations can be resumed seamlessly.
I'm not sure which of the approaches I like better - it seems that we have a choice between turning async exceptions into sync ones or vice versa, and neither choice is strictly superior to the other.
Enjoy,
Bertram
'race' update: - Bugfix: Previously, only AsyncException-s would be caught. Use 'fromException' to select the ThreadKilled exception. - I tried using a custom 'SuspendException' type, but this resulted in 'test: SuspendException' messages on the console, while ThreadKilled is silently ignored... as documented:
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurren... (http://tinyurl.com/9t5pxs) - Tweak: Block exceptions while running 'cleanup' to avoid killing threads twice. - Trick: takeMVar is a blocking operation, so exceptions can be delivered while it's waiting - there's no need to use 'unblock' for this. In other words, unblock (takeMVar v) and takeMVar v are essentially equivalent for our purposes.
race :: IO a -> IO a -> IO a race a b = block $ do v <- newEmptyMVar let t x = unblock (x >>= putMVar v) ta <- forkIO (t a) tb <- forkIO (t b) let cleanup = killThread ta >> killThread tb (do r <- takeMVar v; cleanup; return r) `catch` \e -> cleanup >> case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) _ -> throwIO e _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

Not sent to glasgow-haskell, because I feel it would be getting off topic: After playing around with the track we were on I got in a hopeless muddle, and somewhat doubt that the current ghc runtime makes it easy. So I started again from first principles and got the following, which seems to work. A casual test with the TestRace harness doesn't seem to show any obvious leaks. The ghci runtime *should* detect when sparks block indefinitely on putting into the blocked MVar (the semaphore, which goes out of scope), and kill them. The messiness of the story of unamb so far makes me doubt this solution, or wonder if trusting the runtime to eventually get around to it is insufficient, but nonetheless, it may just work. I've also tried a versuion with real threads instead of sparks, to the same effect, relying on the runtime to detect and kill indefinite blocking, rather than using explicit asynchronous exceptions. Both would need more stress testing for me to feel confident that they don't leak. unamb :: a -> a -> a a `unamb` b = unsafePerformIO (a `amb` b) amb a b = do v <- newEmptyMVar sem <- newEmptyMVar forkIO . evaluate $ unsafePerformIO (evaluate a >>= \x -> putMVar sem True >> putMVar v x) `par` unsafePerformIO (evaluate b
= \x -> putMVar sem True >> putMVar v x) takeMVar v
--This version tries the same thing, but with normal threads. it seems zippier at first, but also like it may have a leak. amb' a b = block $ do v <- newEmptyMVar sem <- newEmptyMVar forkIO $ evaluate a >>= \x -> putMVar sem True >> putMVar v x forkIO $ evaluate b >>= \x -> putMVar sem True >> putMVar v x takeMVar v Cheers, Sterl. On Dec 28, 2008, at 7:34 PM, Conal Elliott wrote:
Thanks very much for these ideas. Peter Verswyvelen suggested running the example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, unless it's compiled with -threaded and run with +RTS -N2. In the latter case, it locks up after a while.
I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately.
I've attached a module, TestRace.hs, containing these experiments.
- Conal
On Sat, Dec 27, 2008 at 6:03 PM, Bertram Felgenhauer
wrote: Sterling Clover wrote: On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us the behavior we'd prefer (speaking of which, I recall there's something in the works for 6.12 or so to improve rethrowing of asynchronous exceptions?)
brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing
This code turns any exception into ThreadKilled further down the stack.
(\e -> do after a myThreadId >>= flip throwTo (e :: SomeException) ...
might do the trick.
My assumption was that anything but 'ThreadKilled' would be a real error. This isn't really true, I guess - thanks to throwTo, any exception could be asynchronous.
If an exception is thrown, 'after a' is run again after the computation has resumed.
That's why I did the cleanup within the 'catch'.
But there's no reason why you couldn't do that as well:
brackAsync before after thing = block $ do a <- before catch (unblock (thing a) >>= \r -> after a >> return r) $ \e -> do after a myThreadId >>= flip throwTo (e :: SomeException) brackAsync before after thing )
This brackAsync just drops in to the previous code where bracket was and appears to perform correctly.
Right. 'race' should also unblock exceptions in the worker threads,
withThread u v = brackAsync (forkIO (unblock u)) killThread (const v)
but that's an independent change.
Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however.
At least the GHC RTS has support for the hard part - unwinding the stack so that computations can be resumed seamlessly.
I'm not sure which of the approaches I like better - it seems that we have a choice between turning async exceptions into sync ones or vice versa, and neither choice is strictly superior to the other.
Enjoy,
Bertram
'race' update: - Bugfix: Previously, only AsyncException-s would be caught. Use 'fromException' to select the ThreadKilled exception. - I tried using a custom 'SuspendException' type, but this resulted in 'test: SuspendException' messages on the console, while ThreadKilled is silently ignored... as documented: http://www.haskell.org/ghc/docs/latest/html/libraries/base/ Control-Concurrent.html#v%3AforkIO (http://tinyurl.com/9t5pxs) - Tweak: Block exceptions while running 'cleanup' to avoid killing threads twice. - Trick: takeMVar is a blocking operation, so exceptions can be delivered while it's waiting - there's no need to use 'unblock' for this. In other words, unblock (takeMVar v) and takeMVar v are essentially equivalent for our purposes.
race :: IO a -> IO a -> IO a race a b = block $ do v <- newEmptyMVar let t x = unblock (x >>= putMVar v) ta <- forkIO (t a) tb <- forkIO (t b) let cleanup = killThread ta >> killThread tb (do r <- takeMVar v; cleanup; return r) `catch` \e -> cleanup >> case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) _ -> throwIO e _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

What does 'putMVar sem True' accomplish that 'putMVar v x' doesn't alone?
If 'evaluate a' succeeds first (for instance), 'evaluate b' will keep on
working, won't it? I doubt the RTS is smart enough to notice that the
result of 'evaluate b' is going to be put into an MVar, let alone a full one
with no takeMVar, or that 'evaluate b' is followed by writing to a similarly
full & abandoned boolean MVar.
- Conal
On Sun, Dec 28, 2008 at 11:05 PM, Sterling Clover
Not sent to glasgow-haskell, because I feel it would be getting off topic:
After playing around with the track we were on I got in a hopeless muddle, and somewhat doubt that the current ghc runtime makes it easy. So I started again from first principles and got the following, which seems to work. A casual test with the TestRace harness doesn't seem to show any obvious leaks. The ghci runtime *should* detect when sparks block indefinitely on putting into the blocked MVar (the semaphore, which goes out of scope), and kill them. The messiness of the story of unamb so far makes me doubt this solution, or wonder if trusting the runtime to eventually get around to it is insufficient, but nonetheless, it may just work.
I've also tried a versuion with real threads instead of sparks, to the same effect, relying on the runtime to detect and kill indefinite blocking, rather than using explicit asynchronous exceptions.
Both would need more stress testing for me to feel confident that they don't leak.
unamb :: a -> a -> a a `unamb` b = unsafePerformIO (a `amb` b)
amb a b = do v <- newEmptyMVar sem <- newEmptyMVar forkIO . evaluate $ unsafePerformIO (evaluate a >>= \x -> putMVar sem True >> putMVar v x) `par` unsafePerformIO (evaluate b >>= \x -> putMVar sem True >> putMVar v x) takeMVar v
--This version tries the same thing, but with normal threads. it seems zippier at first, but also like it may have a leak. amb' a b = block $ do v <- newEmptyMVar sem <- newEmptyMVar forkIO $ evaluate a >>= \x -> putMVar sem True >> putMVar v x forkIO $ evaluate b >>= \x -> putMVar sem True >> putMVar v x takeMVar v
Cheers, Sterl.
On Dec 28, 2008, at 7:34 PM, Conal Elliott wrote:
Thanks very much for these ideas. Peter Verswyvelen suggested running the
example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, unless it's compiled with -threaded and run with +RTS -N2. In the latter case, it locks up after a while.
I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately.
I've attached a module, TestRace.hs, containing these experiments.
- Conal
On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us
behavior we'd prefer (speaking of which, I recall there's something in
On Sat, Dec 27, 2008 at 6:03 PM, Bertram Felgenhauer < bertram.felgenhauer@googlemail.com> wrote: Sterling Clover wrote: the the
works for 6.12 or so to improve rethrowing of asynchronous exceptions?)
brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing
This code turns any exception into ThreadKilled further down the stack.
(\e -> do after a myThreadId >>= flip throwTo (e :: SomeException) ...
might do the trick.
My assumption was that anything but 'ThreadKilled' would be a real error. This isn't really true, I guess - thanks to throwTo, any exception could be asynchronous.
If an exception is thrown, 'after a' is run again after the computation has resumed.
That's why I did the cleanup within the 'catch'.
But there's no reason why you couldn't do that as well:
brackAsync before after thing = block $ do a <- before catch (unblock (thing a) >>= \r -> after a >> return r) $ \e -> do after a myThreadId >>= flip throwTo (e :: SomeException) brackAsync before after thing )
This brackAsync just drops in to the previous code where bracket was and appears to perform correctly.
Right. 'race' should also unblock exceptions in the worker threads,
withThread u v = brackAsync (forkIO (unblock u)) killThread (const v)
but that's an independent change.
Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however.
At least the GHC RTS has support for the hard part - unwinding the stack so that computations can be resumed seamlessly.
I'm not sure which of the approaches I like better - it seems that we have a choice between turning async exceptions into sync ones or vice versa, and neither choice is strictly superior to the other.
Enjoy,
Bertram
'race' update: - Bugfix: Previously, only AsyncException-s would be caught. Use 'fromException' to select the ThreadKilled exception. - I tried using a custom 'SuspendException' type, but this resulted in 'test: SuspendException' messages on the console, while ThreadKilled is silently ignored... as documented: http://www.haskell.org/ghc/docs/latest/html/libraries/base/ Control-Concurrent.html#v%3AforkIO (http://tinyurl.com/9t5pxs) - Tweak: Block exceptions while running 'cleanup' to avoid killing threads twice. - Trick: takeMVar is a blocking operation, so exceptions can be delivered while it's waiting - there's no need to use 'unblock' for this. In other words, unblock (takeMVar v) and takeMVar v are essentially equivalent for our purposes.
race :: IO a -> IO a -> IO a race a b = block $ do v <- newEmptyMVar let t x = unblock (x >>= putMVar v) ta <- forkIO (t a) tb <- forkIO (t b) let cleanup = killThread ta >> killThread tb (do r <- takeMVar v; cleanup; return r) `catch` \e -> cleanup >> case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) _ -> throwIO e _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

You're absolutely right. The threads get killed when they hang on putting the full semaphore, not in the course of evaluation. I knew that there was going to be a fatal flaw, but I just didn't think it would be that obvious. :-) Cheers, Sterl. On Dec 29, 2008, at 2:23 AM, Conal Elliott wrote:
What does 'putMVar sem True' accomplish that 'putMVar v x' doesn't alone?
If 'evaluate a' succeeds first (for instance), 'evaluate b' will keep on working, won't it? I doubt the RTS is smart enough to notice that the result of 'evaluate b' is going to be put into an MVar, let alone a full one with no takeMVar, or that 'evaluate b' is followed by writing to a similarly full & abandoned boolean MVar.
- Conal
On Sun, Dec 28, 2008 at 11:05 PM, Sterling Clover
wrote: Not sent to glasgow-haskell, because I feel it would be getting off topic: After playing around with the track we were on I got in a hopeless muddle, and somewhat doubt that the current ghc runtime makes it easy. So I started again from first principles and got the following, which seems to work. A casual test with the TestRace harness doesn't seem to show any obvious leaks. The ghci runtime *should* detect when sparks block indefinitely on putting into the blocked MVar (the semaphore, which goes out of scope), and kill them. The messiness of the story of unamb so far makes me doubt this solution, or wonder if trusting the runtime to eventually get around to it is insufficient, but nonetheless, it may just work.
I've also tried a versuion with real threads instead of sparks, to the same effect, relying on the runtime to detect and kill indefinite blocking, rather than using explicit asynchronous exceptions.
Both would need more stress testing for me to feel confident that they don't leak.
unamb :: a -> a -> a a `unamb` b = unsafePerformIO (a `amb` b)
amb a b = do v <- newEmptyMVar sem <- newEmptyMVar forkIO . evaluate $ unsafePerformIO (evaluate a >>= \x -> putMVar sem True >> putMVar v x) `par` unsafePerformIO (evaluate b
= \x -> putMVar sem True >> putMVar v x) takeMVar v
--This version tries the same thing, but with normal threads. it seems zippier at first, but also like it may have a leak. amb' a b = block $ do v <- newEmptyMVar sem <- newEmptyMVar forkIO $ evaluate a >>= \x -> putMVar sem True >> putMVar v x forkIO $ evaluate b >>= \x -> putMVar sem True >> putMVar v x takeMVar v
Cheers, Sterl.
On Dec 28, 2008, at 7:34 PM, Conal Elliott wrote:
Thanks very much for these ideas. Peter Verswyvelen suggested running the example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, unless it's compiled with -threaded and run with +RTS -N2. In the latter case, it locks up after a while.
I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately.
I've attached a module, TestRace.hs, containing these experiments.
- Conal
On Sat, Dec 27, 2008 at 6:03 PM, Bertram Felgenhauer
wrote: Sterling Clover wrote: On Dec 27, 2008, at 9:02 AM, Bertram Felgenhauer wrote:
In the above code, there is a small window between catching the ThreadKilled exception and throwing it again though, where other exceptions may creep in. The only way I see of fixing that is to use 'block' and 'unblock' directly.
That certainly seems to do the trick for the simple example at least. One way to reason about it better would be, instead of folding everything into the race function, to simply modify ghc's bracket function to give us the behavior we'd prefer (speaking of which, I recall there's something in the works for 6.12 or so to improve rethrowing of asynchronous exceptions?)
brackAsync before after thing = block (do a <- before r <- catch (unblock (thing a)) (\_ -> after a >> myThreadId >>= killThread >> brackAsync before after thing ) after a return r ) where threadKilled ThreadKilled = Just () threadKilled _ = Nothing
This code turns any exception into ThreadKilled further down the stack.
(\e -> do after a myThreadId >>= flip throwTo (e :: SomeException) ...
might do the trick.
My assumption was that anything but 'ThreadKilled' would be a real error. This isn't really true, I guess - thanks to throwTo, any exception could be asynchronous.
If an exception is thrown, 'after a' is run again after the computation has resumed.
That's why I did the cleanup within the 'catch'.
But there's no reason why you couldn't do that as well:
brackAsync before after thing = block $ do a <- before catch (unblock (thing a) >>= \r -> after a >> return r) $ \e -> do after a myThreadId >>= flip throwTo (e :: SomeException) brackAsync before after thing )
This brackAsync just drops in to the previous code where bracket was and appears to perform correctly.
Right. 'race' should also unblock exceptions in the worker threads,
withThread u v = brackAsync (forkIO (unblock u)) killThread (const v)
but that's an independent change.
Further, if we place a trace after the killThread, we se it gets executed once when the example is read (i.e. a resumption) but it does not get executed if the (`seq` v) is removed from the example So this gives me some hope that this is actually doing what we'd like. I don't doubt it may have further kinks however.
At least the GHC RTS has support for the hard part - unwinding the stack so that computations can be resumed seamlessly.
I'm not sure which of the approaches I like better - it seems that we have a choice between turning async exceptions into sync ones or vice versa, and neither choice is strictly superior to the other.
Enjoy,
Bertram
'race' update: - Bugfix: Previously, only AsyncException-s would be caught. Use 'fromException' to select the ThreadKilled exception. - I tried using a custom 'SuspendException' type, but this resulted in 'test: SuspendException' messages on the console, while ThreadKilled is silently ignored... as documented: http://www.haskell.org/ghc/docs/latest/html/libraries/base/ Control-Concurrent.html#v%3AforkIO (http://tinyurl.com/9t5pxs) - Tweak: Block exceptions while running 'cleanup' to avoid killing threads twice. - Trick: takeMVar is a blocking operation, so exceptions can be delivered while it's waiting - there's no need to use 'unblock' for this. In other words, unblock (takeMVar v) and takeMVar v are essentially equivalent for our purposes.
race :: IO a -> IO a -> IO a race a b = block $ do v <- newEmptyMVar let t x = unblock (x >>= putMVar v) ta <- forkIO (t a) tb <- forkIO (t b) let cleanup = killThread ta >> killThread tb (do r <- takeMVar v; cleanup; return r) `catch` \e -> cleanup >> case fromException e of Just ThreadKilled -> do myThreadId >>= killThread unblock (race a b) _ -> throwIO e _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

Conal Elliott wrote:
Thanks very much for these ideas. Peter Verswyvelen suggested running the example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, *unless* it's compiled with -threaded and run with +RTS -N2. In the latter case, it locks up after a while.
It seems that we've found an RTS bug. If a thread is started with exceptions blocked, then throwTo might never deliver its exception and block forever, as can be seen with the following test program, which locks up after a while (immediately with the non-threaded RTS) import Control.Exception import Control.Concurrent import Control.Monad test n = do t <- block $ forkIO yield yield putStr $ show n ++ ": kill\n" killThread t main = forM_ [1..] test Or, even more convincing: import Control.Exception import GHC.Conc main = do t1 <- block $ forkIO yield t2 <- forkIO $ killThread t1 yield yield threadStatus t1 >>= print threadStatus t2 >>= print prints (fairly reliably, it seems): ThreadFinished ThreadBlocked BlockedOnException (Trac is giving me errors right now. I'm planning to report this later.)
I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately.
-- This one locks up after a while even in ghci. When compiled -threaded -- and run +RTS -N2, it locks up almost immediately. a `race` b = do v <- newEmptyMVar let t x = x >>= putMVar v withThread (t a) $ withThread (t b) $ takeMVar v where withThread u v = brackAsync (forkIO u) killThread (const v)
At the point the 'forkIO' is run, exceptions are blocked, making the thread basically immortal. Using
withThread u v = brackAsync (forkIO $ unblock u) killThread (const v)
we get the same behaviour as with my 'race' - it works for a while, but locks up eventually. I believe that the actual lockup is similar to the test programs above in all cases - what's different is just the probability of triggering it. regards, Bertram

That is very good news! Let's hope it's a bug that is easy enough to
fix, since I guess the RTS is very tricky.
Thanks for all this effort. It would explain a lot of strange behaviour.
Cheers,
Peter Verswyvelen
CTO - Anygma.com
On Sat, Jan 3, 2009 at 4:48 PM, Bertram Felgenhauer
Conal Elliott wrote:
Thanks very much for these ideas. Peter Verswyvelen suggested running the example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, *unless* it's compiled with -threaded and run with +RTS -N2. In the latter case, it locks up after a while.
It seems that we've found an RTS bug. If a thread is started with exceptions blocked, then throwTo might never deliver its exception and block forever, as can be seen with the following test program, which locks up after a while (immediately with the non-threaded RTS)
import Control.Exception import Control.Concurrent import Control.Monad
test n = do t <- block $ forkIO yield yield putStr $ show n ++ ": kill\n" killThread t
main = forM_ [1..] test
Or, even more convincing:
import Control.Exception import GHC.Conc
main = do t1 <- block $ forkIO yield t2 <- forkIO $ killThread t1 yield yield threadStatus t1 >>= print threadStatus t2 >>= print
prints (fairly reliably, it seems):
ThreadFinished ThreadBlocked BlockedOnException
(Trac is giving me errors right now. I'm planning to report this later.)
I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately.
-- This one locks up after a while even in ghci. When compiled -threaded -- and run +RTS -N2, it locks up almost immediately. a `race` b = do v <- newEmptyMVar let t x = x >>= putMVar v withThread (t a) $ withThread (t b) $ takeMVar v where withThread u v = brackAsync (forkIO u) killThread (const v)
At the point the 'forkIO' is run, exceptions are blocked, making the thread basically immortal. Using
withThread u v = brackAsync (forkIO $ unblock u) killThread (const v)
we get the same behaviour as with my 'race' - it works for a while, but locks up eventually.
I believe that the actual lockup is similar to the test programs above in all cases - what's different is just the probability of triggering it.
regards,
Bertram _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I
think getting this bug fixed will solve Reactive's mysterious bugs and
unblock its progress.
- Conal
On Sat, Jan 3, 2009 at 1:20 PM, Peter Verswyvelen
That is very good news! Let's hope it's a bug that is easy enough to fix, since I guess the RTS is very tricky.
Thanks for all this effort. It would explain a lot of strange behaviour.
Cheers, Peter Verswyvelen CTO - Anygma.com
Conal Elliott wrote:
Thanks very much for these ideas. Peter Verswyvelen suggested running
example repeatedly to see if it always runs correctly. He found, and I verified, that the example runs fine with Bertram's last version of unamb below, *unless* it's compiled with -threaded and run with +RTS -N2. In
On Sat, Jan 3, 2009 at 4:48 PM, Bertram Felgenhauer
wrote: the the latter case, it locks up after a while.
It seems that we've found an RTS bug. If a thread is started with exceptions blocked, then throwTo might never deliver its exception and block forever, as can be seen with the following test program, which locks up after a while (immediately with the non-threaded RTS)
import Control.Exception import Control.Concurrent import Control.Monad
test n = do t <- block $ forkIO yield yield putStr $ show n ++ ": kill\n" killThread t
main = forM_ [1..] test
Or, even more convincing:
import Control.Exception import GHC.Conc
main = do t1 <- block $ forkIO yield t2 <- forkIO $ killThread t1 yield yield threadStatus t1 >>= print threadStatus t2 >>= print
prints (fairly reliably, it seems):
ThreadFinished ThreadBlocked BlockedOnException
(Trac is giving me errors right now. I'm planning to report this later.)
I also tried a version with brackAsync and found that it eventually locks up even under ghci. When compiled & run multi-threaded, it locks up almost immediately.
-- This one locks up after a while even in ghci. When compiled -threaded -- and run +RTS -N2, it locks up almost immediately. a `race` b = do v <- newEmptyMVar let t x = x >>= putMVar v withThread (t a) $ withThread (t b) $ takeMVar v where withThread u v = brackAsync (forkIO u) killThread (const v)
At the point the 'forkIO' is run, exceptions are blocked, making the thread basically immortal. Using
withThread u v = brackAsync (forkIO $ unblock u) killThread (const v)
we get the same behaviour as with my 'race' - it works for a while, but locks up eventually.
I believe that the actual lockup is similar to the test programs above in all cases - what's different is just the probability of triggering it.
regards,
Bertram _______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive
_______________________________________________ Reactive mailing list Reactive@haskell.org http://www.haskell.org/mailman/listinfo/reactive

Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too? Cheers, Simon

I don't know if the bug would explain <<loop>>. My guess is that the
black-hole-detection code is incorrectly concluding there is a black hole
because a thunk was marked as in the process of being evaluated, then the
evaluating thread is killed without unmarking the thunk, and then another
thread needs the whnf. This is a fairly naive guess. I don't know this
machinery well enough to have confidence.
What do you think?
- Conal
On Tue, Jan 6, 2009 at 6:27 AM, Simon Marlow
Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too?
Cheers, Simon

The bugs that we have identified result in deadlocks - the runtime doesn't notice that a thread blocked in throwTo can continue. I can't think of a way that this could lead to bogus <<loop>>, but I suppose I wouldn't be too surprised if it were possible. The best way forward is for you to test out a snapshot once these patches have made it into a build. Does that sound reasonable? I've been running your TestRace program for quite a while on 4 processors without any hangs now. Cheers, Simon Conal Elliott wrote:
I don't know if the bug would explain <<loop>>. My guess is that the black-hole-detection code is incorrectly concluding there is a black hole because a thunk was marked as in the process of being evaluated, then the evaluating thread is killed without unmarking the thunk, and then another thread needs the whnf. This is a fairly naive guess. I don't know this machinery well enough to have confidence.
What do you think?
- Conal
On Tue, Jan 6, 2009 at 6:27 AM, Simon Marlow
mailto:marlowsd@gmail.com> wrote: Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too?
Cheers, Simon

Hi Simon,
That's great news. Please let me know when there's a build I can grab, and
I'll try it out.
Regards, - Conal
On Wed, Jan 7, 2009 at 7:22 AM, Simon Marlow
The bugs that we have identified result in deadlocks - the runtime doesn't notice that a thread blocked in throwTo can continue. I can't think of a way that this could lead to bogus <<loop>>, but I suppose I wouldn't be too surprised if it were possible.
The best way forward is for you to test out a snapshot once these patches have made it into a build. Does that sound reasonable? I've been running your TestRace program for quite a while on 4 processors without any hangs now.
Cheers, Simon
Conal Elliott wrote:
I don't know if the bug would explain <<loop>>. My guess is that the black-hole-detection code is incorrectly concluding there is a black hole because a thunk was marked as in the process of being evaluated, then the evaluating thread is killed without unmarking the thunk, and then another thread needs the whnf. This is a fairly naive guess. I don't know this machinery well enough to have confidence.
What do you think?
- Conal
On Tue, Jan 6, 2009 at 6:27 AM, Simon Marlow
> wrote: Conal Elliott wrote:
Indeed -- many thanks to Bertram, Sterling, Peter & others for the help! I think getting this bug fixed will solve Reactive's mysterious bugs and unblock its progress.
Ok, we can fix the fairly simple bug that a thread created in blocked mode blocks throwTos after the thread has finished. But I'm slightly suspicious of the <<loop>> results you were getting - does this bug explain those too?
Cheers, Simon
participants (7)
-
Bertram Felgenhauer
-
Conal Elliott
-
Isaac Dupree
-
Isaac Dupree
-
Peter Verswyvelen
-
Simon Marlow
-
Sterling Clover