ghc 7.10.1 hard lock on exit with shake, OS X 10.10

After I upgraded to 7.10.1 I started noticing that my shakefile would lock up on exit. It's after the 'main' function exits, and none of the shake tests have a problem, so presumably it's a GHC thing that shake somehow causes to happen. Only kill -9 gets it to quit. Here's a stack trace from the OS X sampler:

Call graph:
    2801 Thread_909901   DispatchQueue_2: com.apple.libdispatch-manager  (serial)
      2801 _dispatch_mgr_thread  (in libdispatch.dylib) + 52  [0x7fff8828ca6a]
        2801 kevent64  (in libsystem_kernel.dylib) + 10  [0x7fff90ec0232]

Total number in stack (recursive counted multiple, when >=5):

Sort by top of stack, same collapsed (when >= 5):
        kevent64  (in libsystem_kernel.dylib)        2801

I know there aren't a lot of details here, but does this sound familiar to anyone? I can't see anything on https://ghc.haskell.org/trac/ghc/wiki/Status/GHC-7.10.2 that looks like this. Is there any way I can get more information to report this?

It used to be frequent (once in 10 runs maybe), but later became quite infrequent (once in a couple hundred runs, maybe). I downgraded to 7.8.4 and it hasn't happened again.

could you share a minimal program that reproduces the problem?
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users

On Tue, Jun 2, 2015 at 7:20 PM, Carter Schonwald wrote:
could you share a minimal program that reproduces the problem?

That's the thing, it's a thousand line shakefile that builds a 100k line program, and it's happening only rarely now. Since it happens so rarely it seems really difficult to prune away bits to see if it still happens. I suppose since the building is all just running commands, the source it's building doesn't matter, but since it's a build, it runs a different sequence of commands every time. I suppose I could "stub out" the program by replacing ghc with a shell script that sleeps and touches the output files, but it feels like I could spend days on it because there are tons of little details.

I'm pretty sure it's related to the threaded runtime, because it doesn't happen without -threaded. I could try with -debug, but that probably turns off -threaded too, so no more problem. Shake is heavily threaded and nondeterministic. I haven't seen other shake users report it though.
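
For what it's worth, the "stub out" idea could be sketched roughly as below. This is only an illustration written in Python rather than shell, and it rests on assumptions: the name fake_compile, the 0.5 second sleep bound, and the rule that outputs are whatever follows an `-o` flag are all made up for the example. A real ghc invocation writes other files too (interface files, for instance), which is exactly the kind of little detail mentioned above.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for a compiler: sleep briefly, then touch the outputs."""
import random
import sys
import time
from pathlib import Path


def output_paths(argv):
    """Return the path that follows each '-o' flag in argv (assumed convention)."""
    return [Path(argv[i + 1]) for i, arg in enumerate(argv[:-1]) if arg == "-o"]


def fake_compile(argv, max_sleep=0.5):
    """Pretend to compile: sleep a random time, then create or touch each output."""
    time.sleep(random.uniform(0.0, max_sleep))
    outputs = output_paths(argv)
    for path in outputs:
        # Create missing directories so the touch cannot fail on a fresh tree.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return outputs


if __name__ == "__main__":
    fake_compile(sys.argv[1:])
```

Putting a script like this ahead of the real compiler on the PATH would let the build run its usual command sequence without compiling anything, assuming the build rules only check that the output files exist.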

Perhaps #10317 is related?
https://ghc.haskell.org/trac/ghc/ticket/10317
You might try building with the latest ghc-7.10 branch.
--
Regards,
Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

I've been trying the 7.10.2 testing release for the last few days, and
so far no lock-ups.
Maybe that was it!

Oops, no it wasn't. Still getting lock-ups with 7.10.1.20150612,
though they are rare.
But this report seems not so useful since I don't really know how to
make progress on reducing and reproducing. Maybe it's best to wait to
see if any other reports come in. A large company doing many builds a
day would see this a lot, so unless it's somehow specific to my
configuration, eventually more reports should come in.
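
One way to at least put a number on how rare the lock-up is would be to run the build in a loop with a timeout. The following is a minimal sketch, not something that was actually run for this report: count_hangs, the default of 200 runs, and the 600 second timeout are all assumptions, and a timeout alone cannot tell a real hang from a build that is merely slow, so it would need to be set well above the normal build time.

```python
import subprocess


def count_hangs(cmd, runs=200, timeout=600.0):
    """Run cmd `runs` times and return how many runs exceeded `timeout` seconds."""
    hangs = 0
    for _ in range(runs):
        try:
            # On timeout, subprocess.run kills the child and then raises.
            subprocess.run(
                cmd,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs
```

For example, count_hangs(["./build-stub"], runs=500, timeout=60) would report how many of 500 runs had to be killed, where "./build-stub" is a placeholder for whatever command reproduces the build.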

participants (3)
- Austin Seipp
- Carter Schonwald
- Evan Laforge