
#15300: Unboxed Sums Crash -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: task | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.4.3 Keywords: UnboxedSums | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- I've made it a little further in my experiments with unboxed tuples in the `packed` library. However, I've run into another issue that I strongly suspect is the result of bad behavior of unboxed tuples. To replicate this issue (with GHC 8.4.3), do the following: {{{ git clone https://github.com/andrewthad/packed cd packed cabal new-build }}} We use `cabal new-build` for its side effect of creating a `.ghc.environment.xyz` file. Now, create a minimal example in the directory called `eol.hs` with the following contents: {{{ import Packed.Bytes.Parser (Parser) import Data.Word import Packed.Bytes (Bytes) import GHC.Exts (RealWorld) import Packed.Bytes.Stream.IO (ByteStream) import qualified Packed.Bytes as B import qualified Data.Char import qualified Packed.Bytes.Parser as P import qualified Packed.Bytes.Stream.IO as Stream main :: IO () main = do r <- runExampleParser ( do P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume P.takeBytesUntilEndOfLineConsume ) (foldMap Stream.singleton (map charToWord8 "the\nemporium\rhas\narrived")) print r runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a, Maybe String) runExampleParser parser stream = do P.Result mleftovers r _ <- P.parseStreamIO stream () parser mextra <- case mleftovers of Nothing -> return Nothing Just (P.Leftovers chunk remainingStream) -> do bs <- Stream.unpack remainingStream return (Just (map word8ToChar (B.unpack chunk ++ bs))) return (either (const Nothing) Just r,mextra) word8ToChar :: Word8 -> Char word8ToChar = Data.Char.chr . fromIntegral charToWord8 :: Char -> Word8 charToWord8 = fromIntegral . Data.Char.ord s2b :: String -> Bytes s2b = B.pack . map charToWord8 c2w :: Char -> Word8 c2w = charToWord8 }}} Finally, build this with `ghc -O2 eol.hs`, and then run the executable this produces to get the following: {{{ (Nothing,Just "\rhas\narrived") eol: internal error: stg_ap_n_ret (GHC version 8.4.3 for x86_64_unknown_linux) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug Aborted (core dumped) }}} Things worth noting: 1. I think the program fails in the final GC that runs right before the program terminates. We can see that it produces a correct result of `(Nothing,Just "\rhas\narrived")`, but something on the heap has definitely been corrupted. 2. This only happens with `-O2` turned on. 3. This only happens when the parser does not successfully parse its input. Here's some more context around this. I've been working on a parser that uses unboxed sums instead of continuations. After #15038 was fixed, everything had been going well. Then, I took the parser type and added two things to it: (1) context and (2) typed errors. Context is basically like throwing `StateT` on top and errors are like throwing `ExceptT` on top. After this, everything in my test suite kept working except for a single test, which now consistently crashes my test suite. So, I originally had this: {{{ type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Maybe# a #) newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# s r a #) } }}} But I changed it to this: {{{ type Bytes# = (# ByteArray#, Int#, Int# #) type Maybe# (a :: TYPE r) = (# (# #) | a #) type Either# a (b :: TYPE r) = (# a | b #) type Leftovers# s = (# Bytes# , ByteStream s #) type Result# e c s (r :: RuntimeRep) (a :: TYPE r) = (# Maybe# (Leftovers# s), Either# (Maybe e) a, c #) newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity { getParserLevity :: forall s. c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s r a #) } }}} Specifically, the function causing trouble is (as currently defined): {{{ {-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-} takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes# takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #)) where go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (# State# s, Result# e c s BytesRep Bytes# #) go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c #) #) go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #), !stream0@(ByteStream streamFunc) #) #) s0 = case BAW.findAnyByte2 (I# off0) (I# len0) 10 13 (ByteArray arr0) of Nothing -> case streamFunc s0 of (# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1 Just (I# ix, W8# theByte) -> case theByte of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) -- second case means it was 13 _ -> case ix <# (off0 +# len0 -# 1#) of 1# -> case indexWord8Array# arr0 (ix +# 1# ) of 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# ) bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0 #) #), (# Nothing | #), c #) #) _ -> case nextNonEmpty stream0 s0 of (# s1, m #) -> case m of (# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton 13), Stream.empty #) #), (# Nothing | #), c #) #) (# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case indexWord8Array# arr1 0# of 10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #) _ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes bytes1)), stream1 #) #), (# Nothing | #), c #) #) }}} That's all I've got for now. If no one's able to make headway, I'll probably come back to this and try to make a more minimal example at some point. I won't have time to do this soon though. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15300 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler