
Am Freitag 30 April 2010 23:06:07 schrieb Philip Scott:
Hi Daniel
Replace getFloat64le with e.g. getWord64le to confirm. The reading of IEEE754 floating point numbers seems rather complicated. Maybe doing it differently could speed it up, maybe not.
That speeds things up by a factor of about 100 :)
Yes, I too.
I think there must be some efficiency to be extracted from there somewhere.. Either the IEEE module
Look at the code. It does a lot of hard work. That is probably necessary to ensure correctness, but it's sloow. If you feel like playing with fire, {-# LANGUAGE BangPatterns, MagicHash #-} import qualified Data.ByteString.Lazy as BL import Data.Binary import Data.Binary.Get import GHC.Prim import Data.Word getFloat64le :: Get Double getFloat64le = fmap unsafeCoerce# getWord64le myGetter !acc = do e <- isEmpty if e then return acc else do !t <- getFloat64le myGetter ((t+acc)/2) may work on your system (no warranties, you know what 'unsafe' means, don't you?).
or the Data.Binary.Get.
Considering that it's quick enough getting Word64, you won't get much improvement there.
Is it possible to get the profiler to look deeper than the top level module?
Lots of {-# SCC "foo" #-} pragmas. Or create a local copy of the module and import that, then -prof -auto-all should give more info.
With all the options I could find, it only ever tells me about things in the file I am dealing with..Hm, 200MB file => ~25 million Doubles, such a list needs at least 400MB.
Still a long way to 2GB. I suspect you construct a list of thunks, not Doubles.
I think you are almost certainly right. Is there an easy way to see if/how/where this is happening?
Read the core, profile with all -hx flags and look at the profiles, show the code to more experienced Haskellers.
Thanks once again,
Philip