
jwlato:
On Tue, Oct 28, 2008 at 5:43 PM, Don Stewart wrote:
jwlato:
Hello,
I was experimenting with using ghc-6.10.0.20081007 on a project, and it seems that binary-0.4.3.1 has markedly worse performance in certain cases. With the following simple test:
import qualified Data.ByteString.Lazy as L
import Data.Binary
import Data.Binary.Get
import Control.Monad

main :: IO ()
main = do
  b <- L.readFile "some_binary_file"
  putStrLn $ show $ runGet getter b

getter :: Get [Word16]
getter = replicateM 1000000 getWord16le
running this program compiled with ghc-6.10 takes about 4 times as long (and consumes much more memory) as when compiled with ghc-6.8.3. The extra time appears to be proportional to the number of elements processed in the Get. Running the programs with -hT shows a clear memory difference, which I think is the source of the problem. I've placed pdfs of that output at https://webspace.utexas.edu/latojw/data/
The difference seems to show up only when the elements are actually processed; after changing "show $ runGet" to "show $ length $ runGet", the program is slightly faster in 6.10.
I was working on an Intel Mac with OS 10.4, binary-0.4.3.1, and bytestring-0.9.1.4. Can anyone confirm this, or suggest what might be the difference?
Is this the sole test case?
I can investigate. Though perhaps using a newer GHC release candidate is also a good idea.
-- Don
I'll try a newer release candidate, although at the time I first ran this it was the latest.
Last night I tried creating the bytestring using "repeat 1" to remove the file I/O, and the result was the same. There are a few other things I want to try as well (removing replicateM, folding the list to force values rather than printing them all), but my time is extremely limited at the moment.
Could you send me a minimal test case not involving IO?
-- Don
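For reference, a test case along the lines discussed above might look like the sketch below. It is not the test case that was actually sent. It assumes the input is built in memory with L.repeat 1 (so no file I/O), and that the list is consumed with a strict fold rather than printed in full. The names count, input and checksum are made up for illustration.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Binary.Get (Get, getWord16le, runGet)
import Data.List (foldl')
import Data.Word (Word16)
import Control.Monad (replicateM)

-- Number of Word16 values to decode, matching the original test.
count :: Int
count = 1000000

-- In-memory input (hypothetical helper): 2 bytes per Word16,
-- every byte set to 1, so no file is read.
input :: Int -> L.ByteString
input n = L.take (fromIntegral (2 * n)) (L.repeat 1)

-- Same getter as in the original report, with the count as a parameter.
getter :: Int -> Get [Word16]
getter n = replicateM n getWord16le

-- Strict left fold that forces every element without printing the list.
checksum :: [Word16] -> Int
checksum = foldl' (\acc w -> acc + fromIntegral w) 0

main :: IO ()
main = print (checksum (runGet (getter count) (input count)))
```

Compiling this with both ghc-6.8.3 and ghc-6.10 and running each with +RTS -hT should show whether the memory difference still appears without the file read and without showing the whole list.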