[GHC] #10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>)

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Incorrect result Unknown/Multiple | at runtime Test Case: | Blocked By: Blocking: | Related Tickets: Differential Revisions: | -------------------------------------+------------------------------------- Compiling the test case with: ghc -O2 -threaded -eventlog -rtsopts ghc-bug.hs Now, trying with some inputs and -N2 $ ./ghc-bug 7 +RTS -N2 => ghc-bug: <<loop>> $ ./ghc-bug 6 +RTS -N2 => ghc-bug: <<loop>> $ ./ghc-bug 5 +RTS -N2 => 3125 $ ./ghc-bug 5 +RTS -N2 ghc-bug: <<loop>> Reducing the number of capabilities to 1, it works for those inputs $ ./ghc-bug 7 +RTS -N1 As a side-note, the problem only happens randomly with small inputs (on my hardware), and it seems to go away with bigger inputs (the [http://lpaste.net/132564/ original testcase] felt a bit more deterministic, but I think the testcase in the ticket is good enough) I only tested this with GHC 7.8.4 (on Debian), but people on IRC reported the same behavior with GHC 7.10.1 on OS X and Debian Similar bug: [10218] (-fno-cse and -flate-dmd-anal didn't help with this) {{{#!hs import Control.Applicative import Control.Monad import Control.Parallel.Strategies import System.Environment newtype ParList a = ParList { unParList :: [a] } nil :: ParList a nil = ParList [] cons :: a -> ParList a -> ParList a cons x (ParList xs) = ParList (x:xs) instance Functor ParList where fmap = liftM instance Applicative ParList where pure = return (<*>) = ap instance Monad ParList where return = ParList . return {- v code that doesn't work -} (ParList xs) >>= f = ParList (withStrategy (parListChunk 8 rseq) (xs
= unParList . f)) --(ParList xs) >>= f = ParList (concat (parMap rseq (unParList . f) xs)) {- ^ code that works -}
type Pair = (Int, [Int]) loop' :: Pair -> ParList Pair loop' (size,qns) = go 1 where go n | n > size = nil | otherwise = cons (size, n:qns) (go (n+1)) worker :: Int -> Pair -> [Pair] worker n = unParList . go n where go 1 = loop' go n = loop' >=> go (n-1) main :: IO () main = do [n] <- (read <$>) <$> getArgs print $ length (worker n (n,[])) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Description changed by rwbarton: Old description:
Compiling the test case with:
ghc -O2 -threaded -eventlog -rtsopts ghc-bug.hs
Now, trying with some inputs and -N2 $ ./ghc-bug 7 +RTS -N2 => ghc-bug: <<loop>> $ ./ghc-bug 6 +RTS -N2 => ghc-bug: <<loop>> $ ./ghc-bug 5 +RTS -N2 => 3125 $ ./ghc-bug 5 +RTS -N2 ghc-bug: <<loop>>
Reducing the number of capabilities to 1, it works for those inputs $ ./ghc-bug 7 +RTS -N1
As a side-note, the problem only happens randomly with small inputs (on my hardware), and it seems to go away with bigger inputs (the [http://lpaste.net/132564/ original testcase] felt a bit more deterministic, but I think the testcase in the ticket is good enough)
I only tested this with GHC 7.8.4 (on Debian), but people on IRC reported the same behavior with GHC 7.10.1 on OS X and Debian
Similar bug: [10218] (-fno-cse and -flate-dmd-anal didn't help with this)
{{{#!hs import Control.Applicative import Control.Monad
import Control.Parallel.Strategies
import System.Environment
newtype ParList a = ParList { unParList :: [a] }
nil :: ParList a nil = ParList [] cons :: a -> ParList a -> ParList a cons x (ParList xs) = ParList (x:xs)
instance Functor ParList where fmap = liftM
instance Applicative ParList where pure = return (<*>) = ap
instance Monad ParList where return = ParList . return {- v code that doesn't work -} (ParList xs) >>= f = ParList (withStrategy (parListChunk 8 rseq) (xs
= unParList . f)) --(ParList xs) >>= f = ParList (concat (parMap rseq (unParList . f) xs)) {- ^ code that works -}
type Pair = (Int, [Int])
loop' :: Pair -> ParList Pair loop' (size,qns) = go 1 where go n | n > size = nil | otherwise = cons (size, n:qns) (go (n+1))
worker :: Int -> Pair -> [Pair] worker n = unParList . go n where go 1 = loop' go n = loop' >=> go (n-1)
main :: IO () main = do [n] <- (read <$>) <$> getArgs print $ length (worker n (n,[]))
}}}
New description: Compiling the test case with: ghc -O2 -threaded -eventlog -rtsopts ghc-bug.hs Now, trying with some inputs and -N2 $ ./ghc-bug 7 +RTS -N2 => ghc-bug: <<loop>> $ ./ghc-bug 6 +RTS -N2 => ghc-bug: <<loop>> $ ./ghc-bug 5 +RTS -N2 => 3125 $ ./ghc-bug 5 +RTS -N2 ghc-bug: <<loop>> Reducing the number of capabilities to 1, it works for those inputs $ ./ghc-bug 7 +RTS -N1 As a side-note, the problem only happens randomly with small inputs (on my hardware), and it seems to go away with bigger inputs (the [http://lpaste.net/132564/ original testcase] felt a bit more deterministic, but I think the testcase in the ticket is good enough) I only tested this with GHC 7.8.4 (on Debian), but people on IRC reported the same behavior with GHC 7.10.1 on OS X and Debian Similar bug: #10218 (-fno-cse and -flate-dmd-anal didn't help with this) {{{#!hs import Control.Applicative import Control.Monad import Control.Parallel.Strategies import System.Environment newtype ParList a = ParList { unParList :: [a] } nil :: ParList a nil = ParList [] cons :: a -> ParList a -> ParList a cons x (ParList xs) = ParList (x:xs) instance Functor ParList where fmap = liftM instance Applicative ParList where pure = return (<*>) = ap instance Monad ParList where return = ParList . return {- v code that doesn't work -} (ParList xs) >>= f = ParList (withStrategy (parListChunk 8 rseq) (xs
= unParList . f)) --(ParList xs) >>= f = ParList (concat (parMap rseq (unParList . f) xs)) {- ^ code that works -}
type Pair = (Int, [Int]) loop' :: Pair -> ParList Pair loop' (size,qns) = go 1 where go n | n > size = nil | otherwise = cons (size, n:qns) (go (n+1)) worker :: Int -> Pair -> [Pair] worker n = unParList . go n where go 1 = loop' go n = loop' >=> go (n-1) main :: IO () main = do [n] <- (read <$>) <$> getArgs print $ length (worker n (n,[])) }}} -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by rwbarton): I get the same behavior with HEAD. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by AlexET): Compiling with `-debug` I sometimes get {{{ BrokenCode: internal error: ASSERTION FAILED: file rts/Schedule.c, line 570 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by michaelt): I simplified the problem a little. The first command line argument is for the chunk size given to `parListChunk`: {{{#!hs import Control.Parallel.Strategies import System.Environment type Pair = (Int, [Int]) loop' :: Pair -> [Pair] loop' (size,qns) = go 1 where go n | n > size = [] | otherwise = (size, n:qns) : (go (n+1)) worker :: Int -> Int -> Pair -> [Pair] worker chunksize = go where go 1 = loop' go n = withStrategy (parListChunk chunksize rseq) . concatMap (go (n-1)) . loop' main :: IO () main = do chunksize:n:_ <- fmap (map read) getArgs print $ length (worker chunksize n (n,[])) }}} With this i get, e.g.: {{{ $ ghc -O2 -threaded -rtsopts -fforce-recomp threads.hs $ time ./threads 8 7 +RTS -N1 823543 real 0m2.133s user 0m2.025s sys 0m0.097s $ time ./threads 8 7 +RTS -N2 threads: <<loop>> real 0m0.368s user 0m0.074s sys 0m0.016s }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by michaelt): This came into my head again, and can be simplified, and I think clarified, like so: {{{#!hs import Control.Parallel.Strategies import System.Environment parConcatMapN :: Int -> Int -> (a -> [a]) -> a -> [a] parConcatMapN chunksize depth step = go depth where go 0 = (:[]) go n = withStrategy (parListChunk chunksize rseq) . concatMap (go (n-1)) . step main :: IO () main = do depth:_ <- fmap (map read) getArgs print $ length (test depth) test depth = parConcatMapN 3 depth show 'x' -- i.e. iterate (concatMap show) 'x' !! depth }}} We keep re-chunking, but do not respect the chunking we already did, in the would-be progress through {{{ 'x' '\'''x''\'' '\'''\\''\'''\'''\'''x''\'''\'''\\''\'''\'' ... }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by yongqli): We've run into what seems to be the same issue. We've isolated a test case here: https://gist.github.com/j-e-k/bf8c5f027178ee20ac94 Could we get this fixed in `7.10.2`? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): I'd love it to be fixed in 7.10.2, but first someone needs to volunteer to look into what is wrong, and hopefully find out how to fix it. Is it a bug in a library? In the RTS? In the compiler? Help needed! Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by rwbarton): I inlined the relevant parts of the parallel package into michaelt's example for the sake of testing across multiple versions of GHC. In the process I discovered that building with `-feager-blackholing` is necessary to reproduce the bug. Well, not surprising. (The parallel package specifies this flag in its cabal file.) To be specific, I am building it as {{{ ghc -threaded -O -rtsopts par -fforce-recomp -feager-blackholing }}} and running as {{{ while ./par 8 +RTS -N4; do :; done }}} until it fails (which is usually immediately). I managed to bisect the failure down to these three commits between 7.6 and 7.8 which added cardinality analysis: https://github.com/ghc/ghc/compare/da4ff650ae77930a5a10d4886c8bc7d37f081db7.... Interestingly in the versions that have cardinality analysis, the program still loops even when built with `-fkill-absence`. Hopefully this provides some clues to someone... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): That's extremely helpful, thank you Reid. I'm guessing that the culprit, somehow, is this code in `CoreToStg`: {{{ upd_flag | isSingleUsed (idDemandInfo bndr) = SingleEntry | otherwise = Updatable }}} This arranges not to black-hole a thunk that is used at most once. Can you try with `-fkill-one-shot`? That switches off this optimisation, and I bet it'll make the program work. Assuming that does fix it, the next question is: is it just the `CoreToStg` `upd_flag`? (`-fkill-one-shot` affects more things than just that one spot.) The next thing I'd do would be just to comment out the `SingleEntry` case above, rebuild the compiler, and check that the bug is gone. I'm 95% sure that it'll will, but worth checking. Next. If some thunk is being marked `SingleEntry`, and that that is causing the bug: * Which thunk is it? * Is it really single-entry? (I.e. is the analysis right or not) If you `-ddump-stg` and look for thunks marked `\s`, those are the single- entry ones. There probably aren't very many. If it's very few, we could look at the "is the analysis right?" question. If it's many, we'll need to find a way to narrow it down somehow. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): BTW is is the overloading of `Traversable` important. Eliminate it from the single-module reproducer? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonmar): Lazy blackholing will still take place for thunks that are not blackholed by eager blackholing, because we have no way to distinguish between an eager-blackholed and a lazy-blackholed thunk in the runtime. We had bugs in this area in the past, see #5226. I'm not sure this helps, but it's possible that the cardinality analysis is correct and this is a runtime bug. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by rwbarton): Unfortunately `-fkill-one-shot` made no difference. My gut feeling is that this is a bug in the RTS that was uncovered (for this program) by the cardinality analysis patch, but I have no real evidence for this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by michaelt): I reduced this a little further so that it just uses the Monad instance and the two basic combinators: {{{#!hs rparWith s a = Eval $ \s0 -> spark# r s0 where r = case s a of Eval f -> case f realWorld# of (# _, a' #) -> a' runEval :: Eval a -> a runEval (Eval x) = case x realWorld# of (# _, a #) -> a }}} and a non-recursive but monotonously layered concrete function. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by michaelt): The mechanism for attaching source must be before my eyes, but here is the reduced module: {{{#!hs {-# LANGUAGE MagicHash, UnboxedTuples #-} import Control.Applicative import Control.Monad import GHC.Exts newtype Eval a = Eval (State# RealWorld -> (# State# RealWorld, a #)) instance Functor Eval where fmap = liftM instance Applicative Eval where (<*>) = ap; pure = return instance Monad Eval where return x = Eval $ \s -> (# s, x #) Eval x >>= k = Eval $ \s -> case x s of (# s', a #) -> case k a of Eval f -> f s' rparWith s a = Eval $ \s0 -> spark# r s0 where r = case s a of Eval f -> case f realWorld# of (# _, a' #) -> a' runEval :: Eval a -> a runEval (Eval x) = case x realWorld# of (# _, a #) -> a main :: IO () main = do -- print $ length (pf 'x') -- either statement works at least on and off print (program 'y') -- but I seem to lose the effect if I use both statements program = pchunk . concatMap (pchunk . concatMap (pchunk . concatMap (pchunk . show) . show) . show) . show where -- the effect seems to vanish if I eta expand pchunk pchunk = runEval . fmap concat . mapM (rparWith (mapM (\x -> Eval $ \s -> seq# x s) )) . chunk' -- the effect seems to disappear if I reject splitAt in favor -- of a pattern match chunk' (a:b:c:xs) = [a,b,c]: chunk' xs chunk' :: [a] -> [[a]] chunk' [] = [] chunk' xs = as : chunk' bs where (as,bs) = splitAt 3 xs }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by michaelt): And a bit more compressed, for what it may be worth: {{{#!hs {-# LANGUAGE MagicHash, UnboxedTuples #-} import GHC.Exts newtype Eval a = Eval {runEval :: State# RealWorld -> (# State# RealWorld, a #)} -- inline sequence :: [Eval a] -> Eval [a] well_sequenced :: [Eval a] -> Eval [a] well_sequenced = foldr op (Eval $ \s -> (# s, [] #)) where op e es = Eval $ \s -> case runEval e s of (# s', a #) -> case runEval es s' of (# s'', as #) -> (# s'', a : as #) -- seemingly demonic use of spark# ill_sequenced :: [Eval a] -> Eval [a] ill_sequenced as = Eval $ spark# (case well_sequenced as of Eval f -> case f realWorld# of (# _, a' #) -> a') main :: IO () main = print ((layer . layer . layer . layer . layer) show 'y') where layer :: (Char -> String) -> (Char -> String) layer f = (\(Eval x) -> case x realWorld# of (# _, as #) -> concat as) . well_sequenced . map ill_sequenced . map (map (\x -> Eval $ \s -> (# s, x #))) . chunk' . concatMap f . show }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by michaelt): Sorry, I'm spamming the trac a bit. Notice that in the ultra-simplified module, now attached properly, the wrapping with `Lift` that `parallel` uses for `rparWith` is no where to be found. If I wrap stuff in my `ill_sequenced` with `Lift`, I can't get the effect. If though, that use of `Lift` in the definition of `rparWith` is required by whatever is going on with `spark#` and some of these other opaque-to-me primitives, then there is a question whether it is used enough: the original program is doing an end-run around this. It is presumably obviously undesirable, but if in rbarton's par.hs I complicate the definition of `rpar` , which is {{{#!hs rpar :: a -> Eval a rpar x = Eval $ \s -> spark# x s }}} and use instead something like {{{#!hs rpar :: a -> Eval a rpar x = Eval $ \s -> case y of Eval f -> case f s of (# s1 , l #) -> case l of Lift w -> (# s1 , w #) where y = Eval $ \s -> spark# (Lift x) s }}} then it seems all is well again. That probably destroys all the desired effects; but if it doesn't, then the problem may just be that the library is letting the user get too close to `spark#` which is practically naked in `rpar`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into
<<loop>>)
-------------------------------------+-------------------------------------
Reporter: exio4 | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Revisions:
-------------------------------------+-------------------------------------
Comment (by rwbarton):
I took a look at the generated STG for par2.hs before and after the
cardinality analysis commit. Before, there are no `\s` thunks at all.
After, there are two. One is in
{{{
Main.main_go [Occ=LoopBreaker]
:: [[GHC.Types.Char]] -> [GHC.Types.Char]
[GblId,
Arity=1,
Caf=NoCafRefs,
Str=DmdType ,
Unf=OtherCon []] =
\r srt:SRT:[] [ds_s1nu]
case ds_s1nu of _ {
[] -> [] [];
: y_s1ny [Occ=Once] ys_s1nz [Occ=Once] ->
let {
sat_s1pj [Occ=Once, Dmd=]
Main.main_go =
\ (ds_XrA :: [[GHC.Types.Char]]) ->
case ds_XrA of _ {
[] -> GHC.Types.[] @ GHC.Types.Char;
: y_arg ys_arh ->
GHC.Base.++ @ GHC.Types.Char y_arg (Main.main_go ys_arh)
}
}}}
which presumably comes from the use of `concat` in `layer`. This happens
even when I build the program with `-fkill-absence -fkill-one-shot`. Could
that be because base was built with cardinality analysis enabled? I don't
entirely see how, but I can try rebuilding the libraries with `-fkill-
absence -fkill-one-shot`.
Anyways I guess the main question, which I'm not sure how to answer, is
whether the fact that this thunk is marked as single-entry is correct.
The other single-entry thunk is, I think, very similar and arises from
`concatMap`:
{{{
{- note
$wlayer_r1m6
:: (GHC.Types.Char -> GHC.Base.String)
-> GHC.Prim.Char# -> GHC.Base.String
[GblId, Caf=NoCafRefs, Str=DmdType, Unf=OtherCon []]
w3_r1ma :: GHC.Types.Char -> GHC.Base.String
[GblId, Arity=1, Str=DmdType, Unf=OtherCon []]
-}
Main.main_go2 [Occ=LoopBreaker]
:: [GHC.Types.Char] -> [GHC.Types.Char]
[GblId, Arity=1, Str=DmdType , Unf=OtherCon []] =
\r srt:SRT:[(r8, Main.main_go2), (r1m6, $wlayer_r1m6),
(r1ma, w3_r1ma)] [ds_s1oh]
case ds_s1oh of _ {
[] -> [] [];
: y_s1ol [Occ=Once!] ys_s1oq [Occ=Once] ->
case y_s1ol of _ {
GHC.Types.C# ww1_s1oo [Occ=Once] ->
let {
sat_s1pv [Occ=Once, Dmd=

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by rwbarton): Inlining `concat` and `concatMap` made no difference, the program still loops and a `\s` thunk is still generated inside "`main_mygo`". This happens for any combination of the `-fkill-absence` and `-fkill-one-shot` flags, and in every version I tested (the 7.7 commit adding cardinality analysis, 7.8.4, 7.10.1 and HEAD). I then tried removing the `SingleEntry` case as Simon suggested in comment:9 and that did generate a `\u` thunk instead and the program no longer <<loop>>s. So, one conclusion is that `-fkill-absence`/`-fkill-one-shot` don't fully disable cardinality analysis like they are expected to. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): Michael, Reid, that is super-helpful. The cardinality analysis is plain wrong, so now we know just what is happening. I'm working on a fix. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): Drat. On further reflection I the cardinality analysis is correct. So it looks as though there is a bug in the runtime system. '''Simon M''': might you find time to investigate? With only two single- entry thunks it can't be that hard! I have literally no hypothesis for why a single-entry thunk could cause `<loop>`. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by rwbarton): I think I've finally come up with at least part of a plausible explanation. Tell me if this seems right... Suppose I have a heap object whose initial value is {{{ x = \u [] concat [[1],[]] -- using "\u []" as syntax for "updatable thunk", -- otherwise normal Core syntax (not STG) }}} and suppose first some thread evaluates `x` to WHNF. Using the definition of `concat` as `main_go` from above, this will allocate a single-entry thunk `y = \s [] concat []`, and then calling `(++)` will lead to {{{ x = 1 : z y = \s [] concat [] z = \u [] (++) [] y }}} At this point the heap has the following property (*) The heap contains a single-entry thunk (`y`) and a regular thunk (`z`) such that entering the regular thunk will cause the single-entry thunk to be entered as well. (The key point is that the single-entry thunk has already been allocated on the heap, in contrast to a situation in which entering a regular thunk would cause a ''new'' single-entry thunk to be allocated, possibly entered, and then become garbage, all before evaluation of that regular thunk is complete.) Now, consider the following execution path of two threads A and B. * Both threads enter `z` simultaneously (before either manages to overwrite it with a black hole, if eager blackholing was enabled when we compiled `(++)`; otherwise before either manages to update it to an indirection). * Thread A does the case analysis inside `(++)` and enters `y`, and overwrites it with a black hole before thread B does anything else. * Now thread B does the case analysis inside `(++)` and enters `y`, but `y` has been overwritten with a black hole so thread B blocks. But `y` is never going to be updated, so thread B will block forever. This is bad! * Finally thread A finishes evaluating `y` (to `[]`) and then updates `z` accordingly. But thread B is still blocking on the black hole and even if it could get unblocked by some mechanism (say in the GC) there's no longer enough information on the heap to recover the correct value of `y` and allow thread B to continue. This doesn't exactly explain why the programs in this ticket <<loop>>, but a thread becoming permanently blocked is equally bad behavior I think. Some extra evidence that something like this is going on is that if I inline the definition of `(++)` into par2.hs as well, so that it is compiled with eager blackholing enabled, the program <<loop>>s much less frequently, just a few percent of the time as opposed to over half the time. That would match up with a smaller window of simultaneity in the first step of the execution trace above. If this analysis is correct, then assuming we want to continue to allow threads to enter thunks in an unsynchronized way (to avoid prohibitive locking costs), it seems we have to ensure that the condition (*) never holds, at least when eager blackholing is enabled. Generating single-entry thunks is still okay as long as they never survive as live heap objects after the thunk that allocated them has been reduced to WHNF. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by rwbarton): I also came across this comment in `rts/ThreadPaused.c` (I think it is concerning a different scenario): {{{ // NB. Blackholing is *compulsory*, we must either do lazy // blackholing, or eager blackholing consistently. See Note // [upd-black-hole] in sm/Scav.c. }}} Does this mean that every module in the entire program, including all the libraries the program links against like base, needs to be compiled with the same presence or absence of `-feager-blackholing`? The User's Guide doesn't mention anything about this and if that is the case, the flag seems essentially unusable except for those who are building their own GHC from source. I'm hoping the comment is just worded unclearly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): We need Simon Marlow's advice here. What you say does seem plausible: * With lazy blackholing, a single-entry thunk will never be black-holed. Lazy black-holing only black-holes thunks with an update frame on the stack, and single-entry thunks do not push an update frame. So that could explain why `-feager-blackholing` is required. * Rather to my surprise, a single-entry thunk is black-holed. I can't see any good reason why this happens. Look at `StgCmmClosure.blackHoleOnEntry`. * Re comment:22 we need Simon to say. It would be pretty bad if every module had to be compiled consistently. However I note that the comment in `Scav.c` is in code that messes with update frames, and so single-entry thunks probably don't matter. Even if all this is right, we still don't know why we get `<<loop>>`. I'm assuming that this means we have entered a black hole, ''that is under evaluation by this thread''; but suddenly I'm not sure. (For a black hole being evaluated by another thread, we'd block.) I suggest you try making `blackHoleOnEntry` return False for single-entry thunks. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:23 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonmar): Thanks for all the awesome debugging work here, I think we finally have all the pieces! @rwbarton's analysis seems plausible to me too. A single-entry thunk can still be entered multiple times in a parallel setting, and if eager blackholing is on then it is possible that a thread can get blocked indefinitely. Threads blocked indefinitely are detected by the GC as deadlocked, and receive an exception, which results in the `<loop>` message. The fix should be simple: just don't do eager blackholing for single-entry thunks. Regarding this comment: {{{ // NB. Blackholing is *compulsory*, we must either do lazy // blackholing, or eager blackholing consistently. See Note // [upd-black-hole] in sm/Scav.c. }}} It means every update frame must refer to a black hole by the time the GC runs, it's an invariant the GC relies on. It shouldn't be a problem here because it only applies to update frames. I can reword the comment so it's clearer. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by rwbarton): The actual cause of the <<loop>> here seems to be that two threads are each blocking on a black hole that is being evaluated, or more likely has been evaluated but not updated, by the other thread. I attached a complete `-Ds` log above, but the relevant lines are {{{ ... 0.011574 7ff6be7fc700: cap 1: thread 6 stopped (blocked on black hole owned by thread 5) ... 0.011808 7ff6c6700740: cap 0: thread 5 stopped (blocked on black hole owned by thread 6) ... }}} I didn't work out exactly how this can arise, but it probably involves two single-entry thunks and two ordinary thunks whose evaluations force both of the single-entry thunks, but in different orders. Changing `blackHoleOnEntry` for single-entry thunks as suggested did fix `par2.hs`. I'm going to test the other examples in this ticket now. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by michaelt): rwbarton, I think I have tested all of these examples on my machine, for it's worth. I rebuilt HEAD with what I took to be the patch described above: {{{ - LFThunk _ _no_fvs _updatable _ _ -> True + LFThunk _ _no_fvs _updatable _ _ -> _updatable }}} for https://github.com/ghc/ghc/blob/master/compiler/codeGen/StgCmmClosure.hs#L76... . Everything works fine, or seems to work fine. The complex program yongqli linked, with all the fancy imports, was a little erratic; I increased the size of the csv a lot, to make it more reliably bad somewhere along the way. I then just used the scheme of running ` ./bugcsv +RTS -N_ | grep loop ` 500 times each with -N5 and -N3 . (-N2 does't seem to make bring out the pathology with this program.) With ghc-7.10.1 I got `bugcsv: <<loop>>` about 200 times either way, but with the patched head, blessed silence. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:26 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: | Phab:D10414 -------------------------------------+------------------------------------- Changes (by rwbarton): * status: new => patch * differential: => Phab:D10414 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:27 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: Phab:D1040 -------------------------------------+------------------------------------- Changes (by bgamari): * differential: Phab:D10414 => Phab:D1040 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:28 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into
<<loop>>)
-------------------------------------+-------------------------------------
Reporter: exio4 | Owner:
Type: bug | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Revisions: Phab:D1040
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into
<<loop>>)
-------------------------------------+-------------------------------------
Reporter: exio4 | Owner:
Type: bug | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Revisions: Phab:D1040
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#10414: Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>) -------------------------------------+------------------------------------- Reporter: exio4 | Owner: Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: Phab:D1040 -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => closed * resolution: => fixed Comment: Merged. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10414#comment:31 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC