[GHC] #12019: Profiling option -hb is not thread safe

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: new
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: Runtime crash
Test Case: | Blocked By:
Blocking: | Related Tickets: #11978, #12009
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
This ticket is a continuation of #11978 and #12009. After fixing a couple
of issues in those two tickets I found that the profiling runtime is not
thread safe.

I have a trivial test program (written as one of the tests for #11978):
{{{
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Exception
import Control.Monad

main :: IO ()
main = do
    putStrLn "Start ..."
    mvar <- newMVar (0 :: Int)

    let count = 50

    forM_ [ 1 .. count ] $ const $ forkIO $ do
        threadDelay 100
        i <- takeMVar mvar
        putMVar mvar $! i + 1

    threadDelay 1000000
    end <- takeMVar mvar
    putStrLn $ "Final result " ++ show end
    assert (end == count) $ return ()
}}}

Compiling that with a compiler that has the bug fixes arising from #11978
and #12009 as:
{{{
inplace/bin/ghc-stage2 testsuite/tests/profiling/should_run/T11978b.hs \
    -fforce-recomp -rtsopts -fno-warn-tabs -O -prof -static -auto-all \
    -threaded -debug -o T11978b
}}}
and running it as:
{{{
./T11978b +RTS -p -hb -N10
}}}
crashes in a number of different ways. I've seen at least 3 different
assertion failures and numerous segfaults (in different `stg_ap_*`
functions).

Replacing `-hb` with other profiling options such as `-hr` does not seem
to cause crashes.

Looking at the code, one example of the lack of thread safety is the
function `LDV_recordDead`, which mutates the global variable `censuses`
without any locking around it.

I only figured this out because the following assert (in
`LDV_recordDead`) was being triggered occasionally.
{{{
ASSERT(censuses[t].void_total < censuses[t].not_used);
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
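
The kind of unsynchronized update to a shared global described in this report can be illustrated outside the RTS. The following is a minimal sketch in C and is not RTS code: `Census`, `record_dead`, `worker`, `run_workers` and `MAX_THREADS` are hypothetical names, and a POSIX mutex is assumed as the locking primitive. It shows only the general locking pattern and is not claimed to be a fix for this ticket.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 64

/* Hypothetical stand-in for a census record; not the RTS definition. */
typedef struct {
    long not_used;
} Census;

static Census census;
static pthread_mutex_t census_lock = PTHREAD_MUTEX_INITIALIZER;

/* Add `size` to the shared counter while holding the lock. */
static void record_dead(long size)
{
    pthread_mutex_lock(&census_lock);
    census.not_used += size;
    pthread_mutex_unlock(&census_lock);
}

/* Each worker records `*arg` units, one at a time. */
static void *worker(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++) {
        record_dead(1);
    }
    return NULL;
}

/* Run `n_threads` workers and return the final counter value. */
long run_workers(int n_threads, long per_thread)
{
    pthread_t tids[MAX_THREADS];
    int started = 0;

    assert(n_threads >= 0 && n_threads <= MAX_THREADS);
    census.not_used = 0;

    for (int i = 0; i < n_threads; i++) {
        if (pthread_create(&tids[started], NULL, worker, &per_thread) != 0) {
            break;  /* stop starting threads if creation fails */
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return census.not_used;
}
```

With the lock held around the update, `run_workers(8, 10000)` returns 80000 provided all eight threads start. If the lock and unlock calls are removed, concurrent increments can be lost and the total may come out lower.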

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: new
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonmar):

Yes, when I made the rest of profiling thread-safe I didn't look at
`+RTS -hb`. Perhaps we should disable it except at `-N1`, unless you want
to have a go at fixing it?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:1

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: new
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by erikd):

Yes, I'm going to try to fix it.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:2

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: new
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by erikd):

Fixing this is definitely not trivial. I've placed locks around some of
the shared mutable data structures, but I still get the same assertion
(in function `processHeapClosureForDead`) failing:
{{{
ASSERT(((LDVW(c) & LDV_CREATE_MASK) >> LDV_SHIFT) > 0);
}}}
because `overwritingClosure` has already been called on the closure.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:3

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: new
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by bgamari):

We should at least try to disable it except at `-N1`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:4
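
Disabling `-hb` except at `-N1` amounts to a start-up check on the requested profiling mode and the number of capabilities. The following is a minimal sketch in C of what such a check could look like. It is not taken from the RTS sources: the function `ldv_profiling_allowed`, its parameters, and the message text are all hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not RTS code.
 * Returns true if the requested combination may proceed, and false
 * (after printing a message) if biographical profiling was requested
 * together with more than one capability.
 */
bool ldv_profiling_allowed(bool ldv_requested, unsigned n_capabilities)
{
    if (ldv_requested && n_capabilities > 1) {
        fprintf(stderr,
                "-hb is not supported with more than one capability; "
                "use -N1\n");
        return false;
    }
    return true;
}
```

A caller would evaluate this once during initialisation, for example `ldv_profiling_allowed(true, 10)` for a run with `+RTS -hb -N10`, and refuse to continue when it returns false.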

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: patch
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

 * status: new => patch
 * differential: => Phab:D2516

@@ -7,1 +7,1 @@
- {{{
+ {{{#!hs

New description:

 This ticket is a continuation of #11978 and #12009. After fixing a couple
 of issues in those two tickets I found that the profiling runtime is not
 thread safe.

 I have a trivial test program (written as one of the tests for #11978):
 {{{#!hs
 import Control.Concurrent
 import Control.Concurrent.MVar
 import Control.Exception
 import Control.Monad

 main :: IO ()
 main = do
     putStrLn "Start ..."
     mvar <- newMVar (0 :: Int)

     let count = 50

     forM_ [ 1 .. count ] $ const $ forkIO $ do
         threadDelay 100
         i <- takeMVar mvar
         putMVar mvar $! i + 1

     threadDelay 1000000
     end <- takeMVar mvar
     putStrLn $ "Final result " ++ show end
     assert (end == count) $ return ()
 }}}

 Compiling that with a compiler that has the bug fixes arising from #11978
 and #12009 as:
 {{{
 inplace/bin/ghc-stage2 testsuite/tests/profiling/should_run/T11978b.hs \
     -fforce-recomp -rtsopts -fno-warn-tabs -O -prof -static -auto-all \
     -threaded -debug -o T11978b
 }}}
 and running it as:
 {{{
 ./T11978b +RTS -p -hb -N10
 }}}
 crashes in a number of different ways. I've seen at least 3 different
 assertion failures and numerous segfaults (in different `stg_ap_*`
 functions).

 Replacing `-hb` with other profiling options such as `-hr` does not seem
 to cause crashes.

 Looking at the code, one example of the lack of thread safety is the
 function `LDV_recordDead`, which mutates the global variable `censuses`
 without any locking around it.

 I only figured this out because the following assert (in
 `LDV_recordDead`) was being triggered occasionally.
 {{{
 ASSERT(censuses[t].void_total < censuses[t].not_used);
 }}}

--

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:5

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: patch
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: closed
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version: 8.1
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

 * status: patch => closed
 * resolution: => fixed

Comment:

 Merged to `ghc-8.0` as c51caafae7669d4246f4efd3d1a6858020780e02.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:7

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner: erikd
Type: bug | Status: closed
Priority: normal | Milestone: 8.0.2
Component: Runtime System | Version:
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

 * version: 8.1 =>

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:8

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version:
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

 * owner: erikd =>
 * status: closed => new
 * resolution: fixed =>
 * milestone: 8.0.2 =>

Comment:

 Reopening since this is still an issue; we just happen to fail more
 gracefully now.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:9

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version:
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by carter):

The fix on the 8.0 branch broke the build on my Mac. I'm reverting it for
now in my local build, but surely it has affected other folks.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:10

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version:
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by carter):

{{{
rts/ProfHeap.c: In function 'initHeapProfiling':
rts/ProfHeap.c:389:49: error:
     error: 'PAR_FLAGS {aka struct _PAR_FLAGS}' has no member named 'nCapabilities'
    if (doingLDVProfiling() && RtsFlags.ParFlags.nCapabilities > 1) {
                                                ^
}}}

I think the patch / CPP somehow doesn't quite work across the RTS build
ways matrix in the expected way. I'll try to poke at it myself if I have
time, but the build definitely died.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12019#comment:11

#12019: Profiling option -hb is not thread safe
-------------------------------------+-------------------------------------
Reporter: erikd | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version:
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #11978, #12009 | Differential Rev(s): Phab:D2516
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari