
--------------------------------------------
-- bytestring-lexing 0.3.0
--------------------------------------------

The bytestring-lexing package offers efficient reading and packing of common types like Double and Integral types.

--------------------------------------------
-- Administrative Changes (since 0.2.1)
--------------------------------------------

* Change of maintainer. Don Stewart handed maintainership of the package over to myself when I voiced interest.

* Change of repo type. The old repo for the package used Darcs-1 style patches. I've converted the repository to Darcs-2 hashed. This means that the new repository cannot exchange patches with the old Darcs-1 repo (or any other Darcs-2 conversions that may be floating around out there). So anyone who's interested in contributing should scrap their local copies and get the new repo.

--------------------------------------------
-- Code Changes (since 0.2.1)
--------------------------------------------

* Added Data.ByteString.Lex.Integral which provides efficient implementations for reading and packing/showing integral types in ASCII-compatible formats including decimal, hexadecimal, and octal.

* The readDecimal function in particular has been highly optimized. The new version is wicked fast[1] and perfectly suitable for hot code locations like parsing headers for HTTP servers like Warp. In addition, attention has been paid to ensuring that parsing is efficient for larger than native types like Int64 on 32-bit systems (including 64-bit OS X), as well as Integer.

The optimization of this function was done in collaboration with Erik de Castro Lopo, Vincent Hanquez, and Christoph Breitkopf following a blog post by Erik[2] and ensuing discussion on Reddit[3].

[1] A Criterion report is available for 64-bit Intel OS X running 32-bit GHC 6.12.1:

    http://code.haskell.org/~wren/bytestring-lexing/test/bench/readDecimal.html

The benchmark is included in the repo and has also been run on 64-bit GHC 7 systems, which differ primarily in not showing slowdown for Int64 vs Int (naturally). If you're curious about the different implementations:

* readIntBS / readIntegerBS --- are the readInt and readInteger functions in Data.ByteString
* readDecimalOrig (correct) --- was my original implementation, prior to collaboration with Erik, Vincent, and Christoph.
* readIntegralMH (buggy) --- or rather a non-buggy version very much like it, is the implementation currently used in Warp.
* readDecimal (current) --- is the current implementation used in this package.

[2] http://www.mega-nerd.com/erikd/Blog/CodeHacking/Haskell/read_int.html

[3] http://www.reddit.com/r/haskell/comments/otwxe/benchmarking_and_quickcheckin...

--------------------------------------------
-- Links
--------------------------------------------

Homepage:
    http://code.haskell.org/~wren/

Hackage:
    http://hackage.haskell.org/package/bytestring-lexing

Darcs:
    http://community.haskell.org/~wren/bytestring-lexing

Haddock (Darcs version):
    http://community.haskell.org/~wren/bytestring-lexing/dist/doc/html/bytestrin...

-- 
Live well,
~wren

wren ng thornton wrote:
* The readDecimal function in particular has been highly optimized. The new version is wicked fast[1] and perfectly suitable for hot code locations like parsing headers for HTTP servers like Warp. In addition, attention has been paid to ensuring that parsing is efficient for larger than native types like Int64 on 32-bit systems (including 64-bit OS X), as well as Integer. The optimization of this function was done in collaboration with Erik de Castro Lopo, Vincent Hanquez, and Christoph Breitkopf following a blog post by Erik[2] and ensuing discussion on Reddit[3]
Thanks Wren. I'm pretty sure that Warp will swap over to use your new readDecimal function.

Cheers,
Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Wren,

I notice that readDecimal has a typesig:

    readDecimal :: Integral a => ByteString -> Maybe (a, ByteString)

which I would then use in Warp as:

    readInt64BSL :: ByteString -> Int64
    readInt64BSL bs = fst $ fromMaybe (0, "") $ BSL.readDecimal bs

However, this version with the fromMaybe and fst is a little slower than if these two extra bits weren't necessary.

Would you consider a function with the following signature in bytestring-lexing?

    readDecimalX :: Integral a => ByteString -> a

The idea is that it gives something faster for applications like Warp where reading a valid decimal should be as fast as possible, but if the string isn't valid it doesn't really matter what the result is.

Cheers,
Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
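For readers following along, here is a minimal sketch of the two shapes being compared above. It is not the package's code: readDecimalSketch is a hypothetical stand-in that only shares readDecimal's type, and the readDecimalX body is one guess at what such a function could do (returning 0 on invalid input is an assumption, since the message says the result doesn't matter in that case). Only base and bytestring are used, so it runs without bytestring-lexing installed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as BS8
import Data.ByteString (ByteString)
import Data.Char (isDigit, ord)
import Data.Int (Int64)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in with the same type as the package's readDecimal.
-- This is NOT the optimized implementation, just enough to type-check
-- and run the wrapper below.
readDecimalSketch :: Integral a => ByteString -> Maybe (a, ByteString)
readDecimalSketch bs
  | BS8.null digits = Nothing
  | otherwise       = Just (BS8.foldl' step 0 digits, rest)
  where
    (digits, rest) = BS8.span isDigit bs
    step acc c = acc * 10 + fromIntegral (ord c - ord '0')

-- The wrapper from the message: unwrap the Maybe, then drop the remainder.
readInt64BSL :: ByteString -> Int64
readInt64BSL bs = fst $ fromMaybe (0, "") $ readDecimalSketch bs

-- One possible reading of the proposed readDecimalX: no Maybe, no
-- remainder. Assumption: an input without leading digits yields 0.
readDecimalX :: Integral a => ByteString -> a
readDecimalX = BS8.foldl' step 0 . BS8.takeWhile isDigit
  where
    step acc c = acc * 10 + fromIntegral (ord c - ord '0')

main :: IO ()
main = do
  print (readInt64BSL "1234 rest")            -- 1234
  print (readDecimalX "1234 rest" :: Int64)   -- 1234
  print (readInt64BSL "oops")                 -- 0
```

Both functions agree on every input here; the difference under discussion is only how much work the caller does to discard the Maybe and the leftover ByteString.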

On 1/29/12 3:43 AM, Erik de Castro Lopo wrote:
Would you consider a function with the following signature in bytestring-lexing?
readDecimalX :: Integral a => ByteString -> a
The idea is that it gives something faster for applications like Warp where reading a valid decimal should be as fast as possible, but if the string isn't valid it doesn't really matter what the result is.
If I can figure out a way to do so without too much code duplication I will. Another option would be to use the trick that's used by the pack* functions which causes the initial checks to be inlined ---and hence easily optimized away when used with fst.fromMaybe(0,"")--- without inlining the whole thing. I'll keep you informed.

-- 
Live well,
~wren
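The thread does not show how the pack* functions achieve this, so the following is only a sketch of one common GHC idiom with the same effect: a small INLINE wrapper that performs the initial check, around a NOINLINE worker that holds the digit loop. All names here (readDecimalW, readDecimalLoop, readOrZero) are hypothetical, and the type is fixed to Int to keep the example short.

```haskell
module Main where

import qualified Data.ByteString.Char8 as BS8
import Data.ByteString (ByteString)
import Data.Char (isDigit, ord)
import Data.Maybe (fromMaybe)

-- Wrapper: only the cheap "is there a leading digit?" check lives here.
-- Because it is INLINE, the Just/Nothing it builds is visible at the
-- call site, where GHC can simplify it against fst . fromMaybe.
readDecimalW :: ByteString -> Maybe (Int, ByteString)
readDecimalW bs
  | BS8.null bs || not (isDigit (BS8.head bs)) = Nothing
  | otherwise = Just (readDecimalLoop bs)
{-# INLINE readDecimalW #-}

-- Worker: the digit loop itself, kept out of line so that call sites
-- do not each get a copy of the whole function.
readDecimalLoop :: ByteString -> (Int, ByteString)
readDecimalLoop bs = (BS8.foldl' step 0 digits, rest)
  where
    (digits, rest) = BS8.span isDigit bs
    step acc c = acc * 10 + (ord c - ord '0')
{-# NOINLINE readDecimalLoop #-}

-- A caller in the style of the Warp wrapper discussed above.
readOrZero :: ByteString -> Int
readOrZero = fst . fromMaybe (0, BS8.empty) . readDecimalW

main :: IO ()
main = do
  print (readOrZero (BS8.pack "42abc"))  -- 42
  print (readOrZero (BS8.pack "abc"))    -- 0
```

Whether the Maybe is actually eliminated depends on the optimization level and GHC version; inspecting the Core output (for example with -ddump-simpl) is the way to confirm it for a given build.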