See the comment for hGetContents:

"This function reads chunks at a time, doubling the chunksize on each read. The final buffer is then realloced to the appropriate size. For files > half of available memory, this may lead to memory exhaustion. Consider using readFile in this case."

http://hackage.haskell.org/packages/archive/bytestring/0.9.2.1/doc/html/Data-ByteString-Char8.html#g:31

Maybe try lazy bytestrings?

  - Clark

On Tuesday, April 16, 2013, Anatoly Yakovenko wrote:
-- So why does this code run out of memory?

import Control.DeepSeq
import System.IO
import qualified Data.ByteString.Char8 as BS

scanl' :: NFData a => (a -> b -> a) -> a -> [b] -> [a]
scanl' f q ls =  q : (case ls of
                        []   -> []
                        x:xs -> let q' = f q x
                                in q' `deepseq` scanl' f q' xs)


main = do
   file <- openBinaryFile "/dev/zero" ReadMode
   chars <- BS.hGetContents file
   let rv = drop 100000000000 $ scanl' (+) 0 $ map fromEnum $ BS.unpack chars
   print (head rv)

-- my scanl' implementation seems to do the right thing, because 

main = print $ last $ scanl' (+) (0::Int) [0..]

-- runs without blowing up.  so am i creating a some thunk here?  or is hGetContents storing values?  any way to get the exception handler to print a trace of what caused the allocation?