A problem with bytestring 0.9.1.4 "hGetBuf: invalid argument"

Hi all, I've recently came across a problem when processing a large text file (around 2G in size). I wrote a Haskell program to count the number of lines in the file. module Main where import System import qualified Data.ByteString.Char8 as S -- import Prelude as S main :: IO () main = do { args <- getArgs ; case args of { [ filename ] -> do { content <- S.readFile filename ; let lns = S.lines content ; putStrLn (show $ length lns) } ; _ -> error "Usage : Wc <file>" } } I get this error, if I use the ByteString module, ./Wc a.out Wc: {handle: a.out}: hGetBuf: invalid argument (illegal buffer size (-1909953139)) Otherwise, it returns me the result. Another observation is that if I reduce the size of the file, the ByteString version works too. Is it a known limitation? Regards, Kenny A generator program that generate large file. (Warning, it is very slow, I don't know how to speed it up) -- generate a file module Main where import System import qualified Data.ByteString.Char8 as S l :: S.ByteString l = S.pack "All work, no fun, make Kenny a dull boy. " main :: IO () main = do { args <- getArgs ; case args of { [ n, fn ] -> do { let i = read n ; mapM_ (\s -> S.appendFile fn s) (take i $ repeat l) } ; _ -> return () } }

haskellmail:
Hi all,
I've recently came across a problem when processing a large text file (around 2G in size).
I wrote a Haskell program to count the number of lines in the file.
module Main where
import System import qualified Data.ByteString.Char8 as S -- import Prelude as S
main :: IO () main = do { args <- getArgs ; case args of { [ filename ] -> do { content <- S.readFile filename ; let lns = S.lines content ; putStrLn (show $ length lns) } ; _ -> error "Usage : Wc <file>" } }
I get this error, if I use the ByteString module, ./Wc a.out Wc: {handle: a.out}: hGetBuf: invalid argument (illegal buffer size (-1909953139)) Otherwise, it returns me the result.
Another observation is that if I reduce the size of the file, the ByteString version works too.
Is it a known limitation?
Yes, you need to use Data.ByteString.Lazy.Char8 to process files larger than this on a 32 bit machine (you'll have more space on a 64 bit machine). -- Don

Oh right. Thanks for pointing out. :)
On Wed, Aug 5, 2009 at 10:06 AM, Don Stewart
haskellmail:
Hi all,
I've recently came across a problem when processing a large text file (around 2G in size).
I wrote a Haskell program to count the number of lines in the file.
module Main where
import System import qualified Data.ByteString.Char8 as S -- import Prelude as S
main :: IO () main = do { args <- getArgs ; case args of { [ filename ] -> do { content <- S.readFile filename ; let lns = S.lines content ; putStrLn (show $ length lns) } ; _ -> error "Usage : Wc <file>" } }
I get this error, if I use the ByteString module, ./Wc a.out Wc: {handle: a.out}: hGetBuf: invalid argument (illegal buffer size (-1909953139)) Otherwise, it returns me the result.
Another observation is that if I reduce the size of the file, the ByteString version works too.
Is it a known limitation?
Yes, you need to use Data.ByteString.Lazy.Char8 to process files larger than this on a 32 bit machine (you'll have more space on a 64 bit machine).
-- Don
participants (2)
-
Don Stewart
-
kenny lu