[GHC] #15427: Calling hs_try_putmvar from an unsafe foreign call can cause the RTS to hang

#15427: Calling hs_try_putmvar from an unsafe foreign call can cause the RTS to hang
-------------------------------------+-------------------------------------
           Reporter: syntheorem      |            Owner: (none)
               Type: bug             |           Status: new
           Priority: normal          |        Milestone: 8.6.1
          Component: Runtime System  |          Version: 8.4.3
           Keywords:                 | Operating System: Unknown/Multiple
       Architecture: Unknown/Multiple|  Type of failure: Runtime crash
          Test Case:                 |       Blocked By:
           Blocking:                 |  Related Tickets:
Differential Rev(s):                 |        Wiki Page:
-------------------------------------+-------------------------------------
An unsafe foreign call which calls `hs_try_putmvar` can cause the RTS to
hang, preventing any Haskell threads from making progress. However,
compiling with `-debug` causes it instead to fail an assertion in the
scheduler:

{{{
internal error: ASSERTION FAILED: file rts/Schedule.c, line 510

    (GHC version 8.4.3 for x86_64_apple_darwin)
}}}

Here is a minimal test case which reproduces the assertion. It needs to be
built with `-debug -threaded` and run with `+RTS -N2` or higher.

{{{#!hs
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import Control.Monad (forever)
import Foreign.C.Types (CInt(..))
import Foreign.StablePtr (StablePtr)
import GHC.Conc (PrimMVar, newStablePtrPrimMVar)

foreign import ccall unsafe hs_try_putmvar :: CInt -> StablePtr PrimMVar -> IO ()

main = do
  mvar <- newEmptyMVar
  forkIO $ forever $ do
    takeMVar mvar
  forkIO $ forever $ do
    sp <- newStablePtrPrimMVar mvar
    hs_try_putmvar (-1) sp
    threadDelay 1

  -- Let it spin a few times to trigger the bug
  threadDelay 500
}}}

I actually checked out GHC and added this as a test case and did some
debugging. The specific assertion that fails is `ASSERT(task->cap == cap)`.
This seems to happen because of this code in `hs_try_putmvar`:

{{{#!c
Task *task = getTask();
// ...
ACQUIRE_LOCK(&cap->lock);
// If the capability is free, we can perform the tryPutMVar immediately
if (cap->running_task == NULL) {
    cap->running_task = task;
    task->cap = cap;
    RELEASE_LOCK(&cap->lock);
    // ...
    releaseCapability(cap);
} else {
    // ...
}
}}}

Basically it assumes that the current thread's task isn't currently running
a capability, so it takes a new one and then releases it without restoring
the previous value of `task->cap`. Modifying the code to restore the value
of `task->cap` after releasing the capability fixes the assertion.

But I don't know enough about the RTS to be sure I'm not missing something
here. In particular, is there a problem with the task basically holding two
capabilities for a short time?

My other thought is that maybe it should check if its task is currently
running a capability, and in that case do something else. But I'm not sure
what.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15427
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
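
The fix described in the report (save `task->cap` before borrowing the free capability, restore it after the release) can be illustrated with a small self-contained C model. This is only a sketch under stated assumptions, not RTS code: the `Task` and `Capability` structs are stripped-down stand-ins holding just the two fields involved, locking is omitted, clearing `running_task` stands in for `releaseCapability`, and the names `try_putmvar_buggy`, `try_putmvar_restoring`, and `invariant_holds_after` are hypothetical. It does not settle the open question of whether a task may safely hold two capabilities for a short time.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the RTS types (assumption: the real structs
 * carry many more fields and are protected by locks). */
typedef struct Capability Capability;
typedef struct Task Task;

struct Task {
    Capability *cap;        /* capability this task believes it owns */
};

struct Capability {
    int no;                 /* capability number, for illustration only */
    Task *running_task;     /* NULL when the capability is free */
};

/* Models the fast path quoted in the report: borrow a free capability,
 * do the work, release it. task->cap is left pointing at the borrowed
 * capability. */
void try_putmvar_buggy(Task *task, Capability *cap)
{
    if (cap->running_task == NULL) {
        cap->running_task = task;
        task->cap = cap;            /* overwrites the capability already held */
        /* ... the tryPutMVar would happen here ... */
        cap->running_task = NULL;   /* stands in for releaseCapability(cap) */
    }
}

/* Same path, but with the previous task->cap saved and restored. */
void try_putmvar_restoring(Task *task, Capability *cap)
{
    if (cap->running_task == NULL) {
        Capability *saved = task->cap;  /* remember what the task held */
        cap->running_task = task;
        task->cap = cap;
        /* ... the tryPutMVar would happen here ... */
        cap->running_task = NULL;       /* stands in for releaseCapability(cap) */
        task->cap = saved;              /* restore the original capability */
    }
}

/* Sets up a task that already owns cap0 (as during an unsafe foreign
 * call), runs fn against the free cap1, and returns 1 if the task still
 * points at cap0 afterwards, 0 otherwise. */
int invariant_holds_after(void (*fn)(Task *, Capability *))
{
    Capability cap0 = {0, NULL};
    Capability cap1 = {1, NULL};
    Task task = {NULL};

    cap0.running_task = &task;
    task.cap = &cap0;

    fn(&task, &cap1);

    assert(cap1.running_task == NULL);  /* borrowed capability was released */
    return task.cap == &cap0;
}
```

In this model `invariant_holds_after(try_putmvar_buggy)` returns 0, mirroring the failed `ASSERT(task->cap == cap)` in the scheduler, while `invariant_holds_after(try_putmvar_restoring)` returns 1.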

#15427: Calling hs_try_putmvar from an unsafe foreign call can cause the RTS to hang

Changes (by simonpj):

 * cc: simonmar (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15427#comment:1

#15427: Calling hs_try_putmvar from an unsafe foreign call can cause the RTS to hang
-------------------------------------+-------------------------------------
           Reporter: syntheorem      |               Owner: (none)
               Type: bug             |              Status: new
           Priority: normal          |           Milestone: 8.8.1
          Component: Runtime System  |             Version: 8.4.3
         Resolution:                 |            Keywords:
   Operating System: Unknown/Multiple|        Architecture: Unknown/Multiple
    Type of failure: Runtime crash   |           Test Case:
         Blocked By:                 |            Blocking:
    Related Tickets:                 | Differential Rev(s):
          Wiki Page:                 |
-------------------------------------+-------------------------------------

Comment (by simonmar):

 `hs_try_putmvar` is designed to be called without a Capability; I hadn't
 anticipated that someone might want to call it from an unsafe FFI call.
 What's your use case?

 We could definitely make this clearer in the docs, and perhaps make it
 fail in a better way.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15427#comment:3