
I can't find the error in your code (assuming there is one), so I went looking through the code you didn't write, and the only thing that set off an alarm was...

    getFloat64le :: Get Double
    getFloat64le = getFloat (ByteCount 8) $ splitBytes . reverse

    splitBytes :: [Word8] -> RawFloat
    ...

...that every chunk read in the Get monad is being reversed, so that you can take one float in little endian (and you are taking in over 26 million floats). I really don't know whether this hurts performance, but I assume the C equivalent would be reading an array in reverse order. I am more than willing to believe this is not the cause of such a performance loss, but I can't find another reason.

PS1: "(e == True)" == "e"

PS2: I know it's not important, but I can't help it: that is not an average you're computing...

On Thu, 29-04-2010 at 23:37 +0100, Philip Scott wrote:
Hello again folks,
Sorry to keep troubling you - I'm very appreciative of the help you've given so far. I've got one more for you that has me totally stumped. I'm writing a program which deals with largish files; the one I am using as a test case is not stupidly large, at about 200 MB. After three evenings, I have finally gotten rid of all the stack overflows, but I am unfortunately left with something that is rather unfeasibly slow. I was hoping someone with keener skills than mine could take a look. I've tried to distill it to the simplest case.
This program just reads in a file, interpreting each value as a double, and does a sort of running average on them. The actual function doesn't matter too much; I think it is the reading in that is the problem. Here's the code:

    import Control.Exception
    import qualified Data.ByteString.Lazy as BL
    import Data.Binary.Get
    import System.IO
    import Data.Binary.IEEE754

    myGetter acc = do
      e <- isEmpty
      if e == True
        then return acc
        else do
          t <- getFloat64le
          myGetter $! ((t+acc)/2)

    myReader file = do
      h <- openBinaryFile file ReadMode
      bs <- BL.hGetContents h
      return $ runGet (myGetter 0) bs

    main = do
      d <- myReader "data.bin"
      evaluate d

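A small aside on the accumulator in myGetter: the step (t+acc)/2 halves the weight of everything seen so far each time a new value arrives, so the result is dominated by the last few values and is not the arithmetic mean. The sketch below compares the two on a made-up four-element list standing in for the file. The names halving and mean are illustrative only and are not part of the original program.

```haskell
import Data.List (foldl')

-- The same recurrence as myGetter, applied to an in-memory list:
-- each new value is averaged with the accumulator, starting from 0.
halving :: [Double] -> Double
halving = foldl' (\acc t -> (t + acc) / 2) 0

-- Plain arithmetic mean, for comparison.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = do
  let xs = [1, 2, 3, 4]   -- made-up sample data, not from data.bin
  print (halving xs)      -- prints 3.0625
  print (mean xs)         -- prints 2.5
```

With n values the last one carries weight 1/2, the one before it 1/4, and so on, which is why the two results differ.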
The program takes about three minutes to run on my (fairly modern) laptop. The equivalent C program takes about 5 seconds.
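If the byte reversal in getFloat64le does turn out to matter (that is a guess, not something measured here), one thing to try is reading each value as a little-endian Word64 and reinterpreting its bits as a Double. This is only a sketch: it assumes a GHC whose base library provides GHC.Float.castWord64ToDouble, which is newer than this thread. The names getDoubleLE and foldDoubles are hypothetical, and the input is built in memory instead of being read from data.bin.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get (Get, getWord64le, isEmpty, runGet)
import Data.Binary.Put (putWord64le, runPut)
import GHC.Float (castDoubleToWord64, castWord64ToDouble)

-- Read 8 bytes as a little-endian Word64, then reinterpret the bits
-- as an IEEE 754 double. No intermediate [Word8] list is built.
getDoubleLE :: Get Double
getDoubleLE = castWord64ToDouble <$> getWord64le

-- Same loop shape as myGetter, with the accumulator forced at each step.
foldDoubles :: Double -> Get Double
foldDoubles acc = do
  e <- isEmpty
  if e
    then return acc
    else do
      t <- getDoubleLE
      foldDoubles $! (t + acc) / 2

main :: IO ()
main = do
  -- Made-up test input: four doubles encoded little-endian in memory.
  let input :: BL.ByteString
      input = runPut (mapM_ (putWord64le . castDoubleToWord64) [1, 2, 3, 4])
  print (runGet (foldDoubles 0) input)   -- prints 3.0625
```

To run it against the real file, the in-memory input would be replaced by BL.hGetContents on the opened handle, as in myReader.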
I'm sure I am doing something daft, but I can't for the life of me see what. Any hints about how to get the profiler to show me useful stuff would be much appreciated!
All the best,
Philip
PS: If, instead of computing a single value, I try to build a list of the values, the program ends up using over 2 GB of memory to read a 200 MB file. Any ideas on that one?
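A rough back-of-the-envelope estimate for that memory figure, as a sketch only. It assumes a 64-bit GHC in which a list cons cell takes three machine words and a boxed Double takes two, and it assumes the file is exactly 200 MiB. These sizes are assumptions about the runtime representation, not measurements of this program.

```haskell
main :: IO ()
main = do
  let fileBytes   = 200 * 1024 * 1024 :: Int  -- assumed file size: 200 MiB
      doubles     = fileBytes `div` 8         -- 8 bytes per double on disk
      wordSize    = 8                         -- assumed: 64-bit machine words
      consCell    = 3 * wordSize              -- header + head pointer + tail pointer
      boxedDouble = 2 * wordSize              -- header + 8-byte payload
      perElement  = consCell + boxedDouble    -- 40 bytes per list element
  print doubles                 -- prints 26214400
  print (doubles * perElement)  -- prints 1048576000
```

Under those assumptions the fully built list alone holds about 1 GB of live data, five times the size of the file, and a copying garbage collector can temporarily need a multiple of the live data while it runs. That would put the total in the region of the 2 GB reported, though this estimate has not been checked against the actual program.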
_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners