Hi all,

I've recently came across a problem when processing a large text file (around 2G in size).

I wrote a Haskell program to count the number of lines in the file.


module Main where

import System
import qualified Data.ByteString.Char8 as S
-- import Prelude as S

main :: IO ()
main = do { args <- getArgs
          ; case args of
              { [ filename ] ->
                    do { content <- S.readFile filename
                       ; let lns = S.lines content
                       ; putStrLn (show $ length lns)
                       }
              ; _ -> error "Usage : Wc <file>"
              }
          }
                   

I get this error, if I use the ByteString module,
./Wc a.out
Wc: {handle: a.out}: hGetBuf: invalid argument (illegal buffer size (-1909953139))
Otherwise, it returns me the result.

Another observation is that if I reduce the size of the file, the ByteString version works too.

Is it a known limitation?

Regards,
Kenny


A generator program that generate large file. (Warning, it is very slow, I don't know how to speed it up)

-- generate a file

module Main where

import System
import qualified Data.ByteString.Char8 as S


l :: S.ByteString
l = S.pack "All work, no fun, make Kenny a dull boy. "

main :: IO ()
main = do { args <- getArgs
          ; case args of
              { [ n, fn ] -> do { let i = read n
                                ; mapM_ (\s -> S.appendFile fn s) (take i $ repeat l)
                                }
              ; _ -> return ()
              }
          }