
Hi wren!
Yes, I noticed that attoparsec's numeric parsers are slow. I have a benchmark suite that compares attoparsec and binary-parsers on different sample JSON files; it's on GitHub: https://github.com/winterland1989/binary-parsers.
I'm pretty sure bytestring-lexing helped a lot. For example, the average decoding speed improvement is around 20%, but the numeric-only benchmarks (integers and numbers) improved by 30%!
Parsing is just one part of JSON decoding; lots of time is spent on unescaping, etc. So the parser's improvement is quite large IMHO.
BTW, can you provide a version of the lexer which doesn't check whether each input byte is a digit? In binary-parsers I use something like `takeWhile isDigit` to extract the input ByteString, so there's no need to verify this in the lexer again. That might give us another performance improvement.
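To make the request concrete, here is a rough sketch of the idea using only the bytestring package. The names `isDigitW8`, `uncheckedDecimal`, and `decimal` are made up for illustration; they are not part of bytestring-lexing or binary-parsers, and the real code may look quite different.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

-- | True for the ASCII bytes '0'..'9'. Bytes below 48 wrap around
-- to a large Word8, so a single comparison is enough.
isDigitW8 :: Word8 -> Bool
isDigitW8 w = w - 48 <= 9

-- | Hypothetical unchecked lexer: assumes every byte is an ASCII
-- digit and does no validation (and no overflow check) at all.
uncheckedDecimal :: B.ByteString -> Int
uncheckedDecimal = B.foldl' step 0
  where
    step acc w = acc * 10 + fromIntegral (w - 48)

-- | Parser side: split off the leading digits once, then convert
-- them without checking each byte a second time.
decimal :: B.ByteString -> Maybe (Int, B.ByteString)
decimal input
  | B.null digits = Nothing
  | otherwise     = Just (uncheckedDecimal digits, rest)
  where
    (digits, rest) = B.span isDigitW8 input

-- | Example input: the digits "123" followed by unparsed "abc".
example :: Maybe (Int, B.ByteString)
example = decimal (BC.pack "123abc")
```

The point is that the digit test runs only once, in `B.span`; the conversion loop then trusts its input. Passing non-digit bytes to `uncheckedDecimal` gives a meaningless result, so it is only safe behind a check like the one in `decimal`.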
Cheers!
Winterland
________________________________________
From: winterkoninkje@gmail.com
Hi all,
I am happy to announce binary-parsers, a ByteString parsing library built on binary. I borrowed lots of design, tests, and documentation from attoparsec so that I could build its shape very quickly; thank you, bos! And thanks to binary's excellent design, the codebase is very small (<500 LOC).
In my benchmarks it's insanely fast: it outperforms attoparsec by 10%~30% in the aeson benchmark. It's also slightly faster than scanner (a non-backtracking parser designed for speed) in the HTTP request benchmark. I'd like to ask you to give it a shot if you need super fast ByteString parsing.
Yay! More users of my bytestring-lexing package :) Since attoparsec's numeric parsers are dreadfully slow, can you tell how much of your speedup is due to bytestring-lexing vs. how much is due to other differences vs. aeson?
--
Live well,
~wren