
Now attoparsec-text is more than twice as fast, allocates even less memory, and the total memory figures seem right.
Bottom line: I think this benchmark doesn't really represent the kind of workload your parser has. Can you reproduce these results on your system?
I spent quite a bit of time trying to reduce this to a minimal reproduction and getting confusing results. Then I found out that compiling with profiling enabled makes attoparsec slow and parsec fast. When I compile without any profiling, here's what I get, in CPU time:

parsec run 1000000 - time: 1.22s
atto bs run 1000000 - time: 0.38s
atto text run 1000000 - time: 0.78s

This looks more like what I expected. I didn't understand the parsec result at first... one of the first things I did was recompile and reinstall parsec2, making sure to pass -p to configure and verifying that there is a /usr/local/lib/parsec-2.1.0.1/ghc-6.12.3/libHSparsec-2.1.0.1_p.a. However, on closer inspection, I believe I've found the culprit. Compiling attoparsec with 'build -v' reveals a ghc command line containing '-prof -hisuf p_hi -osuf p_o -auto-all', while compiling parsec shows only '-prof -hisuf p_hi -osuf p_o'. And indeed, attoparsec's cabal file has 'ghc-prof-options: -auto-all', which parsec's does not. In fact, parsec3 also has this -auto-all, which explains both why the profile is full of internal functions and why parsec3 was so much slower than parsec2.

I'm glad to have finally tracked this down, but unhappy that I spent so much time on it. It seems like a trap waiting to be sprung when various libraries are compiled with their individually specified flags, which can have major effects on performance. Maybe I should have noticed, but it seems pretty subtle to me. GHC will refuse to link non-profiling libraries into a profiling build, but it doesn't go down to the level of individual flags.

My short-term solution is going to be to remove -auto-all from attoparsec's cabal file, since I'm not profiling attoparsec itself and I don't want my entire profile output to be internal attoparsec functions. But presumably the flag was added there for a reason, so maybe there are people who really want it. Is there a better solution? A GHC warning when linking against a profiling library compiled with different profiling flags? A separate .p_auto-all_o suffix? Removing ghc-prof-options from cabal? A consensus to standardize on a set of flags?

BTW, yes, my situation is a little different from your test: it's lots and lots of little expressions for a simple language, held in an in-memory structure, that get parsed individually. So I don't care about file-reading speed, but I do care about parser startup overhead, since there are lots and lots of small parses. The numbers above are how long it takes to parse "2.34" one million times.
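
For concreteness, here is a rough sketch of that kind of micro-benchmark. It is not the code from the test above: the module names follow current parsec and attoparsec releases (Text.ParserCombinators.Parsec, Data.Attoparsec.ByteString.Char8), the attoparsec-text case is omitted, and the parsers and timing harness are only illustrative.

-- A rough sketch of the workload described above: parse the literal
-- "2.34" one million times with parsec and with attoparsec, reporting
-- CPU time.  Everything here is illustrative, not the original test code.
module Main (main) where

import qualified Data.Attoparsec.ByteString.Char8 as A
import qualified Data.ByteString.Char8 as B
import qualified Text.ParserCombinators.Parsec as P
import Data.List (foldl')
import System.CPUTime (getCPUTime)
import Text.Printf (printf)

-- Minimal parsec parser for a decimal literal like "2.34".
parsecDouble :: P.Parser Double
parsecDouble = do
    whole <- P.many1 P.digit
    _ <- P.char '.'
    frac <- P.many1 P.digit
    return (read (whole ++ "." ++ frac))

runParsec :: Int -> Double
runParsec _ = case P.parse parsecDouble "" "2.34" of
    Right d -> d
    Left err -> error (show err)

runAtto :: Int -> Double
runAtto _ = case A.parseOnly A.double (B.pack "2.34") of
    Right d -> d
    Left err -> error err

-- Run a parser a million times and report CPU seconds.  The checksum is
-- printed only to force the parses to actually happen.  Note that with
-- optimization GHC may still float the constant parse out of the worker
-- functions and do it once; a more careful measurement would use
-- criterion or vary the input.
bench :: String -> (Int -> Double) -> IO ()
bench name run = do
    start <- getCPUTime
    let total = foldl' (\acc i -> acc + run i) 0 [1 .. 1000000 :: Int]
    end <- total `seq` getCPUTime
    printf "%s run 1000000 - time: %.2fs (checksum %.2f)\n"
        name (fromIntegral (end - start) / 1e12 :: Double) total

main :: IO ()
main = do
    bench "parsec" runParsec
    bench "atto bs" runAtto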