
On March 4, 2018 5:19:26 AM EST, Ben Franksen wrote:
I had a program I was working on lately (darcs) crash with a segmentation fault after I made a seemingly harmless refactoring. The original code was
{-# INLINE linesPS #-}
linesPS :: B.ByteString -> [B.ByteString]
linesPS ps
  | B.null ps = [B.empty]
  | otherwise = BC.split '\n' ps
which I wanted to optimize, pasting code from the definition of Data.ByteString.lines, to
linesPS ps =
  case search ps of
    Nothing -> [ps]
    Just n -> B.take n ps : linesPS (B.drop (n + 1) ps)
  where
    search = BC.elemIndex '\n'
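For reference, the two definitions can be compared side by side. The following is only a minimal sketch: the names linesSplit, linesIndex and agreeOn are made up for illustration, and agreement on a few inputs says nothing about where the crash comes from.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- The original definition, renamed (hypothetical name) so both can coexist.
linesSplit :: B.ByteString -> [B.ByteString]
linesSplit ps
  | B.null ps = [B.empty]
  | otherwise = BC.split '\n' ps

-- The refactored definition, renamed (hypothetical name).
linesIndex :: B.ByteString -> [B.ByteString]
linesIndex ps =
  case BC.elemIndex '\n' ps of
    Nothing -> [ps]
    Just n -> B.take n ps : linesIndex (B.drop (n + 1) ps)

-- Hypothetical helper: True when both definitions agree on every sample.
agreeOn :: [String] -> Bool
agreeOn = all (\s -> let b = BC.pack s in linesSplit b == linesIndex b)
```

Evaluating agreeOn on inputs such as "", "\n", "a\n" and "a\n\nb" is a quick way to check that the refactoring preserves the result, including the trailing empty chunk after a final newline.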
So I looked at the bytestring library to see if there was something that could explain the crash. I found that it uses accursedUnutterablePerformIO all over the place.
The dire warnings accompanying this "function" (including the citation of a number of problem reports against commonly used libraries) made me think that it may be worthwhile to offer an opt-out for users of libraries like bytestring or text. (Note that I am not claiming my particular crash is due to a bug in bytestring, I merely want to exclude the possibility.)
For the libraries in question it would be simple to do this: just add a cabal flag to optionally disable use of accursedUnutterablePerformIO.
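A rough sketch of what such an opt-out might look like on the Haskell side, assuming a hypothetical cabal flag that adds -DSAFE_PERFORM_IO to cpp-options. Neither the flag nor the macro exists in bytestring today, and performIO is a name invented for this example.

```haskell
{-# LANGUAGE CPP #-}
import Data.ByteString.Internal (accursedUnutterablePerformIO)
import System.IO.Unsafe (unsafePerformIO)

-- SAFE_PERFORM_IO is a hypothetical macro that a cabal flag could define.
-- When it is set, the wrapper falls back to plain unsafePerformIO;
-- otherwise it uses the faster accursedUnutterablePerformIO.
performIO :: IO a -> a
#ifdef SAFE_PERFORM_IO
performIO = unsafePerformIO
#else
performIO = accursedUnutterablePerformIO
#endif
```

Library code would then call performIO everywhere instead of naming either function directly, so the choice is made in one place at build time.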
That still leaves the question of how users can make their project depend on a bytestring that has been built with this flag. I know this can be done by manually installing a new version of the library, but I would rather use cabal new-build (as I am used to) and let it figure out by itself that it has to attach a new hash to the variant built with the flag.
I'm afraid it's not possible to provide the interfaces exposed by bytestring without some form of unsafety. Lazy IO alone requires unsafeInterleaveIO, and the bytestring indexing operations require at the very least unsafePerformIO, since GHC treats access to foreign memory as an effect. accursedUnutterablePerformIO is an optimized form of unsafePerformIO which likely won't cause any issues that wouldn't otherwise manifest with plain unsafePerformIO. Consequently, I am not sure it's worth providing a means to disable its usage.

Cheers,

- Ben
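To illustrate the point about indexing, here is a minimal sketch of a pure index function over foreign memory. Buf, fromList and index are toy names invented for this example; this is not how bytestring itself is implemented, it only shows why a pure interface over a ForeignPtr needs something like unsafePerformIO.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)
import System.IO.Unsafe (unsafePerformIO)

-- Toy buffer (hypothetical type): a pointer to foreign memory plus a length.
data Buf = Buf (ForeignPtr Word8) Int

-- Allocate and fill the buffer. Allocation and writes are IO actions,
-- so a pure constructor has to hide them behind unsafePerformIO.
fromList :: [Word8] -> Buf
fromList ws = unsafePerformIO $ do
  let n = length ws
  fp <- mallocForeignPtrBytes n
  withForeignPtr fp $ \p ->
    mapM_ (\(i, w) -> pokeByteOff p i w) (zip [0 ..] ws)
  pure (Buf fp n)

-- Reading a byte is also an IO action (peekByteOff), so a pure,
-- bounds-checked index needs unsafePerformIO as well.
index :: Buf -> Int -> Word8
index (Buf fp n) i
  | i < 0 || i >= n = error "index: out of range"
  | otherwise = unsafePerformIO $ withForeignPtr fp $ \p -> peekByteOff p i
```

The read is safe to treat as pure here only because nothing mutates the buffer after fromList returns; that invariant is maintained by convention, not by the type system.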