
#12607: Memory effects in doomed STM transactions -------------------------------------+------------------------------------- Reporter: fryguybob | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 8.0.1 System | Keywords: STM, allocate | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- == Problem == GHC's STM implementation does "lazy" validation meaning that speculative execution can continue even after observing an inconsistent view of memory. Transactions in the state are called "doomed transactions". While the runtime tries to avoids some effects in doomed transactions, such as entering an infinite loop, it does not successfully control all memory effects. In particular there are many (I do not have an exhaustive list) allocations that eventually end up in the `allocate` function which is nicely documented with the following:
`allocate()` never fails; if it returns, the returned value is a valid address. If the nursery is already full, then another block is allocated from the global block pool. If we need to get memory from the OS and that operation fails, then the whole process will be killed.
https://github.com/ghc/ghc/blob/master/rts/sm/Storage.c#L819 A doomed transaction that demands a large allocation that the OS cannot fulfill will kill the process. Here is a program that reliably fails for me: {{{#!hs -- oom.hs {-# LANGUAGE BangPatterns #-} import GHC.Conc import Control.Concurrent import qualified Data.ByteString as B forever act = act >> forever act check True = return () check False = retry main = do tx <- newTVarIO 0 ty <- newTVarIO 1 tz <- newTVarIO 0 done <- newTVarIO False _ <- forkIO $ forever $ atomically $ do -- Only read tx and ty x <- readTVar tx y <- readTVar ty if x > y -- This should always be False then do -- We only get here in a doomed transaction. Commenting out the next two lines -- but leaving the write to done and the program runs as expected because the -- doomed transaction is detected at commit time. let !big = B.length $ B.replicate maxBound 0 -- The big allocation! writeTVar tz big writeTVar done True else return () let mut = forever $ atomically $ do y <- readTVar ty x <- readTVar tx if x > 1000 then do -- When we get big enough, start over. writeTVar tx 0 writeTVar ty 1 -- Always holds semantically that tx < ty else do -- increment x and y both writeTVar ty (succ y) writeTVar tx (succ x) -- tx < ty -- Give lots of opportunities to witness inconsistent memory. _ <- forkIO mut _ <- forkIO mut _ <- forkIO mut atomically $ readTVar done >>= check putStrLn "Done" }}} Running leads to out of memory: {{{#!bash
ghc oom.hs -fno-omit-yields -threaded [1 of 1] Compiling Main ( oom.hs, oom.o ) Linking oom.exe ... ./oom.exe +RTS -N oom.exe: out of memory }}}
== Fixing == When a potentially bad memory effect is about to happen in some thread, we need to ensure that we validate any running transaction and if it fails have some way to back out of the operation and abort the transaction. This might be tricky on two fronts 1) finding all the critical allocations and 2) finding places where we can both detect (1) and abort the transaction. The places I'm concerned about most are array allocation such as the `ByteString` in the example and maybe `Integer` allocation. Another fix is a different STM implementation altogether that (at a performance cost or trade-off) doesn't allow doomed transactions (to appear Haskell Symposium 2016 :D). I think we can at least address this particular example without going so far. A related issue is more general memory leaks from doomed transactions. Consider a program with a memoized Fibonacci function. The semantics of the transactions written by the programmer may ensure that no value higher then 10 is ever demanded of that function, yet in a doomed transaction the invariant goes wrong and the program demands 100. The program will detect the doomed transaction eventually in this case, but not before it has allocated a large live blob never to be touched again. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12607 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler