Hi Axel,
Take a look at the type of your function:
Prelude> :t (  show . sum . map read . words )
(  show . sum . map read . words ) :: String -> String
It takes a String and returns a String, so the whole file is processed as a String.

On Sat, Mar 23, 2013 at 6:12 PM, Axel Wegen <axel.wegen@gmail.com> wrote:
When I try to run the following code on a 50M file consisting of one
number per line I receive a `Stack space overflow' error and the program
seems to consume a lot of memory. Why does that happen?

The chapter you adapted this from already explains it: "A String is represented as a list of Char values; each element of a list is allocated individually, and has some book-keeping overhead. These factors affect the memory consumption and performance of a program that must read or write text or binary data. On simple benchmarks like this, even programs written in interpreted languages such as Python can outperform Haskell code that uses String by an order of magnitude".
 
And how can I
fix the problem without increasing the Stack with -Ksize hoping that it
will be big enough? Any general advice on processing big files with
Haskell?

Each ByteString type performs better under particular circumstances. For streaming a large quantity (hundreds of megabytes to terabytes) of data, the lazy ByteString type is usually best. Its chunk size is tuned to be friendly to a modern CPU's L1 cache, and a garbage collector can quickly discard chunks of streamed data that are no longer being used.
 

-- sumFile.hs
-- adapted from Real World Haskell's SumFile.hs at the beginning of
-- Chapter 8
main = interact sumFile
  where sumFile = show . sum . map read . words

ghc -o sumFile sumFile.hs
./sumFile < ./numbers

See if this code works for you (I haven't tested it on a big file):

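import qualified Data.ByteString.Lazy.Char8 as BS
import Data.Maybe ( fromJust )

-- Parse the leading integer of a token; fromJust fails on non-numeric input.
readI :: BS.ByteString -> Integer
readI = fst . fromJust . BS.readInteger

-- Read stdin as a lazy ByteString, sum the numbers, and write the total.
main :: IO ()
main = BS.interact sumFile
  where sumFile = BS.pack . show . sum . map readI . BS.words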
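
One more thing that might matter: switching to ByteString cuts the memory used for the input, but `sum` itself may still be the source of the stack overflow. On GHC versions where `sum` on lists is a lazy left fold, and when compiling without optimization as in your `ghc -o sumFile sumFile.hs` command, it can build a long chain of unevaluated additions before collapsing it. Below is a rough sketch of a variant that uses the strict `foldl'` from Data.List instead. The names `readOrZero` and `sumNumbers` are made up for this example, and treating non-numeric tokens as 0 is my assumption, not something your original code did. I have not measured it on a file of your size.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BS
import Data.List (foldl')

-- Hypothetical helper: parse the leading integer of a token.
-- Assumption for this sketch: tokens that do not start with an
-- integer count as 0 instead of crashing the program.
readOrZero :: BS.ByteString -> Integer
readOrZero s = case BS.readInteger s of
  Just (n, _) -> n
  Nothing     -> 0

-- Sum all whitespace-separated numbers. foldl' forces the
-- accumulator at every step, so no chain of pending additions
-- is built up while the input is consumed.
sumNumbers :: BS.ByteString -> Integer
sumNumbers = foldl' (+) 0 . map readOrZero . BS.words

main :: IO ()
main = BS.interact (BS.pack . show . sumNumbers)
```

Compiling with `ghc -O2 -o sumFile sumFile.hs` may also be enough to make the plain `sum` version behave, but the explicit strict fold does not depend on the optimizer.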
 

-Mukesh

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners