[GHC] #15300: Unboxed Sums Crash

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: task | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.4.3 Keywords: UnboxedSums | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- I've made it a little further in my experiments with unboxed tuples in the `packed` library. However, I've run into another issue that I strongly suspect is the result of bad behavior of unboxed tuples. To replicate this issue (with GHC 8.4.3), do the following: {{{ git clone https://github.com/andrewthad/packed cd packed cabal new-build }}} We use `cabal new-build` for its side effect of creating a `.ghc.environment.xyz` file. Now, create a minimal example in the directory called `eol.hs` with the following contents: {{{ import Packed.Bytes.Parser (Parser) import Data.Word import Packed.Bytes (Bytes) import GHC.Exts (RealWorld) import Packed.Bytes.Stream.IO (ByteStream) import qualified Packed.Bytes as B import qualified Data.Char import qualified Packed.Bytes.Parser as P import qualified Packed.Bytes.Stream.IO as Stream main :: IO () main = do r <- runExampleParser ( do P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume ) (foldMap Stream.singleton (map charToWord8 "the\nemporium\rhas\narrived")) print r runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a, Maybe String) runExampleParser parser stream = do P.Result mleftovers r _ <- P.parseStreamIO stream () parser mextra <- case mleftovers of Nothing -> return Nothing Just (P.Leftovers chunk remainingStream) -> do bs <- Stream.unpack remainingStream return (Just (map word8ToChar (B.unpack chunk ++ bs))) return (either (const Nothing) Just r,mextra) word8ToChar :: Word8 -> Char word8ToChar = Data.Char.chr . fromIntegral charToWord8 :: Char -> Word8 charToWord8 = fromIntegral . Data.Char.ord s2b :: String -> Bytes s2b = B.pack . map charToWord8 c2w :: Char -> Word8 c2w = charToWord8 }}} Finally, build this with `ghc -O2 eol.hs`, and then run the executable this produces to get the following: {{{ (Nothing,Just "\rhas\narrived") eol: internal error: stg_ap_n_ret (GHC version 8.4.3 for x86_64_unknown_linux) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug Aborted (core dumped) }}} Things worth noting: 1. I think the program fails in the final GC that runs right before the program terminates. We can see that it produces a correct result of `(Nothing,Just "\rhas\narrived")`, but something on the heap has definitely been corrupted. 2. This only happens with `-O2` turned on. 3. This only happens when the parser does not successfully parse its input. Here's some more context around this. I've been working on a parser that uses unboxed sums instead of continuations. After #15038 was fixed, everything had been going well. Then, I took the parser type and added two things to it: (1) context and (2) typed errors. Context is basically like throwing `StateT` on top and errors are like throwing `ExceptT` on top. After this, everything in my test suite kept working except for a single test, which now consistently crashes my test suite. So, I originally had this: {{{ type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Maybe# a #) newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# s r a #) } }}} But I changed it to this: {{{ type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Either# a (b :: TYPE r) = (# a | b #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# e c s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Either# (Maybe e) a, c #) newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s r a #) } }}} Specifically, the function causing trouble is (as currently defined): {{{ {-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-} takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes# takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #)) where go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s BytesRep Bytes# #) go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c #) #) go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #), !stream0@(ByteStream streamFunc) #) #) s0 = case BAW.findAnyByte2 (I# off0) (I# len0) 10 13 (ByteArray arr0) of Nothing -> case streamFunc s0 of (# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1 Just (I# ix, W8# theByte) -> case theByte of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) -- second case means it was 13 _ -> case ix <# (off0 +# len0 -# 1#) of 1# -> case indexWord8Array# arr0 (ix +# 1# ) of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0 #) #), (# Nothing | #), c #) #) _ -> case nextNonEmpty stream0 s0 of (# s1, m #) -> case m of (# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton 13), Stream.empty #) #), (# Nothing | #), c #) #) (# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case indexWord8Array# arr1 0# of 10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes bytes1)), stream1 #) #), (# Nothing | #), c #) #) }}} That's all I've got for now. If no one's able to make headway, I'll probably come back to this and try to make a more minimal example at some point. I won't have time to do this soon though. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

I think the program fails in the final GC that runs right before the
#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: task | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.4.3 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Thanks for reporting. I managed to reproduce this with GHC HEAD. program
terminates. We can see that it produces a correct result of (Nothing,Just "\rhas\narrived"), but something on the heap has definitely been corrupted.
This program does not do any GC at all. Try this: {{{ $ gdb --args ./eol Reading symbols from ./eol...done.
break GarbageCollect Breakpoint 1 at 0x8b3ca7: file rts/sm/GC.c, line 224. r }}}
you'll see that the breakpoint is never triggered. So this is definitely a code generation bug. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: task | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by osa1): * version: 8.4.3 => 8.5 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by osa1): * type: task => bug -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Both the library and the program passes Core, Stg and Cmm lints. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by andrewthad): * priority: normal => high Comment: Bumping this to high priority to increase make this visible in the list of important tickets for GHC 8.6. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by bgamari: Old description:
I've made it a little further in my experiments with unboxed tuples in the `packed` library. However, I've run into another issue that I strongly suspect is the result of bad behavior of unboxed tuples. To replicate this issue (with GHC 8.4.3), do the following:
{{{ git clone https://github.com/andrewthad/packed cd packed cabal new-build }}}
We use `cabal new-build` for its side effect of creating a `.ghc.environment.xyz` file. Now, create a minimal example in the directory called `eol.hs` with the following contents:
{{{ import Packed.Bytes.Parser (Parser) import Data.Word import Packed.Bytes (Bytes) import GHC.Exts (RealWorld) import Packed.Bytes.Stream.IO (ByteStream) import qualified Packed.Bytes as B import qualified Data.Char import qualified Packed.Bytes.Parser as P import qualified Packed.Bytes.Stream.IO as Stream
main :: IO () main = do r <- runExampleParser ( do P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume ) (foldMap Stream.singleton (map charToWord8 "the\nemporium\rhas\narrived")) print r
runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a, Maybe String) runExampleParser parser stream = do P.Result mleftovers r _ <- P.parseStreamIO stream () parser mextra <- case mleftovers of Nothing -> return Nothing Just (P.Leftovers chunk remainingStream) -> do bs <- Stream.unpack remainingStream return (Just (map word8ToChar (B.unpack chunk ++ bs))) return (either (const Nothing) Just r,mextra)
word8ToChar :: Word8 -> Char word8ToChar = Data.Char.chr . fromIntegral
charToWord8 :: Char -> Word8 charToWord8 = fromIntegral . Data.Char.ord
s2b :: String -> Bytes s2b = B.pack . map charToWord8
c2w :: Char -> Word8 c2w = charToWord8 }}}
Finally, build this with `ghc -O2 eol.hs`, and then run the executable this produces to get the following:
{{{ (Nothing,Just "\rhas\narrived") eol: internal error: stg_ap_n_ret (GHC version 8.4.3 for x86_64_unknown_linux) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug Aborted (core dumped) }}}
Things worth noting:
1. I think the program fails in the final GC that runs right before the program terminates. We can see that it produces a correct result of `(Nothing,Just "\rhas\narrived")`, but something on the heap has definitely been corrupted. 2. This only happens with `-O2` turned on. 3. This only happens when the parser does not successfully parse its input.
Here's some more context around this. I've been working on a parser that uses unboxed sums instead of continuations. After #15038 was fixed, everything had been going well. Then, I took the parser type and added two things to it: (1) context and (2) typed errors. Context is basically like throwing `StateT` on top and errors are like throwing `ExceptT` on top. After this, everything in my test suite kept working except for a single test, which now consistently crashes my test suite. So, I originally had this:
{{{ type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Maybe# a #) newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# s r a #) } }}}
But I changed it to this:
{{{ type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Either# a (b :: TYPE r) = (# a | b #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# e c s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Either# (Maybe e) a, c #)
newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s r a #) } }}}
Specifically, the function causing trouble is (as currently defined):
{{{ {-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-} takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes# takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #)) where go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s BytesRep Bytes# #) go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c #) #) go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #), !stream0@(ByteStream streamFunc) #) #) s0 = case BAW.findAnyByte2 (I# off0) (I# len0) 10 13 (ByteArray arr0) of Nothing -> case streamFunc s0 of (# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1 Just (I# ix, W8# theByte) -> case theByte of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) -- second case means it was 13 _ -> case ix <# (off0 +# len0 -# 1#) of 1# -> case indexWord8Array# arr0 (ix +# 1# ) of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0 #) #), (# Nothing | #), c #) #) _ -> case nextNonEmpty stream0 s0 of (# s1, m #) -> case m of (# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton 13), Stream.empty #) #), (# Nothing | #), c #) #) (# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case indexWord8Array# arr1 0# of 10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes bytes1)), stream1 #) #), (# Nothing | #), c #) #) }}}
That's all I've got for now. If no one's able to make headway, I'll probably come back to this and try to make a more minimal example at some point. I won't have time to do this soon though.
New description: I've made it a little further in my experiments with unboxed tuples in the `packed` library. However, I've run into another issue that I strongly suspect is the result of bad behavior of unboxed tuples. To replicate this issue (with GHC 8.4.3), do the following: {{{ git clone https://github.com/andrewthad/packed cd packed cabal new-build }}} We use `cabal new-build` for its side effect of creating a `.ghc.environment.xyz` file. Now, create a minimal example in the directory called `eol.hs` with the following contents: {{{#!hs import Packed.Bytes.Parser (Parser) import Data.Word import Packed.Bytes (Bytes) import GHC.Exts (RealWorld) import Packed.Bytes.Stream.IO (ByteStream) import qualified Packed.Bytes as B import qualified Data.Char import qualified Packed.Bytes.Parser as P import qualified Packed.Bytes.Stream.IO as Stream main :: IO () main = do r <- runExampleParser ( do P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume ) (foldMap Stream.singleton (map charToWord8 "the\nemporium\rhas\narrived")) print r runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a, Maybe String) runExampleParser parser stream = do P.Result mleftovers r _ <- P.parseStreamIO stream () parser mextra <- case mleftovers of Nothing -> return Nothing Just (P.Leftovers chunk remainingStream) -> do bs <- Stream.unpack remainingStream return (Just (map word8ToChar (B.unpack chunk ++ bs))) return (either (const Nothing) Just r,mextra) word8ToChar :: Word8 -> Char word8ToChar = Data.Char.chr . fromIntegral charToWord8 :: Char -> Word8 charToWord8 = fromIntegral . Data.Char.ord s2b :: String -> Bytes s2b = B.pack . map charToWord8 c2w :: Char -> Word8 c2w = charToWord8 }}} Finally, build this with `ghc -O2 eol.hs`, and then run the executable this produces to get the following: {{{ (Nothing,Just "\rhas\narrived") eol: internal error: stg_ap_n_ret (GHC version 8.4.3 for x86_64_unknown_linux) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug Aborted (core dumped) }}} Things worth noting: 1. I think the program fails in the final GC that runs right before the program terminates. We can see that it produces a correct result of `(Nothing,Just "\rhas\narrived")`, but something on the heap has definitely been corrupted. 2. This only happens with `-O2` turned on. 3. This only happens when the parser does not successfully parse its input. Here's some more context around this. I've been working on a parser that uses unboxed sums instead of continuations. After #15038 was fixed, everything had been going well. Then, I took the parser type and added two things to it: (1) context and (2) typed errors. Context is basically like throwing `StateT` on top and errors are like throwing `ExceptT` on top. After this, everything in my test suite kept working except for a single test, which now consistently crashes my test suite. So, I originally had this: {{{#!hs type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Maybe# a #) newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# s r a #) } }}} But I changed it to this: {{{#!hs type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Either# a (b :: TYPE r) = (# a | b #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# e c s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Either# (Maybe e) a, c #) newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s r a #) } }}} Specifically, the function causing trouble is (as currently defined): {{{#!hs {-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-} takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes# takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #)) where go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s BytesRep Bytes# #) go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c #) #) go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #), !stream0@(ByteStream streamFunc) #) #) s0 = case BAW.findAnyByte2 (I# off0) (I# len0) 10 13 (ByteArray arr0) of Nothing -> case streamFunc s0 of (# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1 Just (I# ix, W8# theByte) -> case theByte of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) -- second case means it was 13 _ -> case ix <# (off0 +# len0 -# 1#) of 1# -> case indexWord8Array# arr0 (ix +# 1# ) of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0 #) #), (# Nothing | #), c #) #) _ -> case nextNonEmpty stream0 s0 of (# s1, m #) -> case m of (# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton 13), Stream.empty #) #), (# Nothing | #), c #) #) (# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case indexWord8Array# arr1 0# of 10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes bytes1)), stream1 #) #), (# Nothing | #), c #) #) }}} That's all I've got for now. If no one's able to make headway, I'll probably come back to this and try to make a more minimal example at some point. I won't have time to do this soon though. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): andrewthad, why do you think `takeBytesUntilEndOfLineConsumeUnboxed` is the problematic function here? This is currently hard to debug as the reproducer is huge. A smaller reproducer would be very helpful. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): I don't have any suspicions about why that specific function is the only one that causes problems. I can put together a smaller reproducer though. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): For a smaller reproducer, I've reused https://github.com/andrewthad /corrupted-memory-example. On master, I've basically got a copy of `packed` with everything stripped out that isn't needed to demonstrate this particular problem. I'm going to try to golf it down a little more today. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): I just brought the size down by another 50 lines of code. It's still at 550 lines total. I'll keep coming back to this occasionally to see what else I can eliminate. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * priority: high => highest * owner: (none) => osa1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): This is probably a StgCSE bug as disabling it fixes this bug. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Looking at STG before and after CSE, the only difference is: {{{ case val_scuG of { (#_|#) err_scuJ [Occ=Once] -> case (#_|#) [err_scuJ] of sat_scuK [Occ=Once] { __DEFAULT -> Main.$w$j ipv_scuC leftovers1_scuF sat_scuK; }; ... }}} becomes: {{{ case val_scuG of { (#_|#) err_scuJ [Occ=Once] -> case wild2_scuI of sat_scuK [Occ=Once] { __DEFAULT -> Main.$w$j ipv_scuC leftovers1_scuF sat_scuK; }; ... }}} But note that wild2_scuI was dead before (dead binders are not printed by the STG printer so I'm not sure where it's bound). If the problem is reviving wild2_scuI then this is a duplicate of #14895. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Confirmed that the transformation above is the problem. With this transformation the unarised expression: {{{ Main.$w$j GHC.Prim.void# us_gcvd us_gcve us_gcvf us_gcvg us_gcvh 1# us_gcvj; }}} becomes {{{ Main.$w$j GHC.Prim.void# us_gcvd us_gcve us_gcvf us_gcvg us_gcvh us_gcvi us_gcvj us_gcvk us_gcvl; }}} We now pass more arguments to `$w$j`! I had spent some time debugging this in the assembly level and found out that `stg_ap_n` is trying to apply a non-pointer to a constructor. If we're passing more arguments to function or a constructor than this kind of error makes sense. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): I hacked the Id printer to print dead binders with their types. It turns out the dead-but-used binder is this: {{{ wild2_scuI [Occ=Dead] :: (# GHC.Maybe.Maybe GHC.Types.Any | Packed.Bytes.Parser.Bytes# #) }}} This kind of variables (dead case binders with unboxed sum/tuple types) are not supposed to be used! We even have a comment about this in `UnariseStg`: {{{ unariseExpr rho (StgCase scrut bndr alt_ty alts) ... -- general case | otherwise = do scrut' <- unariseExpr rho scrut alts' <- unariseAlts rho alt_ty bndr alts return (StgCase scrut' bndr alt_ty alts') -- bndr may have a unboxed sum/tuple type but it will be -- dead after unarise (checked in StgLint) }}} and we actually check this in StgLint: {{{ lintStgExpr (StgCase scrut bndr alts_type alts) = do ... -- Case binders of unboxed tuple or unboxed sum type always dead -- after the unariser has run. -- See Note [Post-unarisation invariants]. MultiValAlt _ -> not (lf_unarised lf) ... }}} I don't understand yet why this program passes StgLint. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Aha! The problem is StgCse's term equality check does not work on unboxed sums. Are these two equal? (and a binder to one could be substituted for the other?) - `(# 1# | #) :: (# Int# | Bool #)` - `(# 1# | #) :: (# Int# | Int# #)` Of course not, becuse first one unarises to `(# 1#, 1#, absentSumField #)` while the second one unarises to `(# 1#, 1# #)`. Not sure how to compare unboxed sums for equality in StgCse as we don't have enough type information at that point. However, it seems to me that CSE on unboxed sums is useless as unboxed sums are not allocated (so CSE doesn't buy us anything). Maybe we should just not do CSE on unboxed sum (and maybe even tuple) terms. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Good sleuthing! What about doing StgCse after unarising? Then the difference would be obvious! We do CSE in STG even though we've done it already in Core, because we can sometimes common-up things that have the same representation in STG even though they have different types in Core. So it's possible that as well as prevent bogus CSE in the case you describe, you might get extra CSE in some other case. But I'm not close enough to StgCSE to think of an example. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1):
What about doing StgCse after unarising?
I think this is a good idea, but it reveals another bug in StgCse.
Basically StgCse reverts unarise by using case binders when scrutinee is a
unboxed tuple, e.g.
{{{
case

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1):
Basically StgCse reverts unarise by using case binders when scrutinee is a unboxed tuple
I should add that this is only a bug when we run StgCse after unarise. Before unarise unboxed tuple terms can be replaced by case binders. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4962 Wiki Page: | -------------------------------------+------------------------------------- Changes (by osa1): * status: new => patch * differential: => Phab:D4962 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: merge Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4962 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => merge -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: merge Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4962 Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): When the next GHC alpha (or beta or release candidate) comes out, I'll give this a try. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash
-------------------------------------+-------------------------------------
Reporter: andrewthad | Owner: osa1
Type: bug | Status: merge
Priority: highest | Milestone: 8.6.1
Component: Compiler | Version: 8.5
Resolution: | Keywords: UnboxedSums
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D4962
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: osa1 Type: bug | Status: closed Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: fixed | Keywords: UnboxedSums Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4962 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: merge => closed * resolution: => fixed Comment: Merged to `ghc-8.6` in 72dc7989a25ed6ec4ab9d3adfeefc15425acbf57. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15300: Unboxed Sums Crash
-------------------------------------+-------------------------------------
Reporter: andrewthad | Owner: osa1
Type: bug | Status: closed
Priority: highest | Milestone: 8.6.1
Component: Compiler | Version: 8.5
Resolution: fixed | Keywords: UnboxedSums
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D4962
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Simon Peyton Jones
participants (1)
-
GHC