a slight modification to compile it :

change:
where sumFile = sum . map read . L.words
to :
where sumFile = sum . map (read . L.unpack) . L.words

but it's actually _slower_ than the non-bytestring version.

i did a little test, three versions of the same script and manufactured meself  a ~50 MB file containing 1M of  ints 0-65535. and replaced the sum with length for obvious reasons.

module Main where
import qualified Data.ByteString.Lazy.Char8 as L

main1 = do
        contents <- L.getContents
        print (sumFile contents)
            where sumFile = length . map L.readInt . L.words

main2 = do
        contents <- getContents
        print (sumFile contents)
            where sumFile = length . map (read :: String -> Int) . words

main3 = do
        contents <- L.getContents
        print (sumFile contents)
            where sumFile = length . map ((read :: String -> Int) . L.unpack) . L.words 

time main3 < nums
real    0m22.421s
user    0m0.031s
sys     0m0.000s

time main2 < nums
real    0m14.296s
user    0m0.015s
sys     0m0.016s

time main1 < nums
real    0m22.078s
user    0m0.015s
sys     0m0.015s

i expected the conversions (L.unpack in main3) to kill the performance a little, but not to make it nearly two times as slow.
and i certainly did not expect that even the version using the bytestring readInt to be as slow  ...

did I do something wrong ?


On Tue, Oct 7, 2008 at 4:06 AM, Mike Coleman <tutufan@gmail.com> wrote:
Hi,

I could use a little help.  I was looking through the Real World
Haskell book and came across a trivial program for summing numbers in
a file.  They mentioned that that implementation was very slow, as
it's based on String's, so I thought I'd try my hand at converting it
to use lazy ByteString's.  I've made some progress, but now I'm a
little stuck because that module doesn't seem to have a 'read' method.

There's a readInt method, which I guess I could use, but it returns a
Maybe, and I don't see how I can easily strip that off.

So:

1.  Is there an easy way to strip off the Maybe that would allow an
equivalently concise definition for sumFile?  I can probably figure
out how to do it with pattern matching and a separate function--I'm
just wondering if there's a more concise way.

2.  Why doesn't ByteString implement 'read'?  Is it just that this
function (like 'input' in Python) isn't really very useful for real
programs?

3.  Why doesn't ByteString implement 'readDouble', etc.?  That is, why
are Int and Integer treated specially?  Do I not need readDouble?

Thanks,
Mike


-- lazy version (doesn't compile)

-- file: ch08/SumFile.hs

import qualified Data.ByteString.Lazy.Char8 as L

main = do
        contents <- L.getContents
        print (sumFile contents)
            where sumFile = sum . map read . L.words
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe