
On Thu, Sep 20, 2012 at 11:41 AM, Kazu Yamamoto wrote:
Hello,
ByteString is an array of Word8, but it seems to me that people tend to use the Char interface in Data.ByteString.Char8 instead of the Word8 interface in Data.ByteString. Since the functions defined in Data.ByteString.Char8 convert Word8 to Char and Char to Word8, they carry unnecessary overhead. Yes, the overhead is negligible in many cases, but I would like to remove it for a high-performance server.
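To make the conversion concrete, here is a minimal sketch of the round trip. It assumes that mapping through Data.ByteString.Char8 behaves like mapping through Data.ByteString with each byte converted by w2c and c2w (both from Data.ByteString.Internal). The names mapViaChar and toUpperW8 are made up for illustration and are not part of any library. How much of the conversion the compiler removes depends on the function passed in and on the optimisation settings.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
import Data.ByteString.Internal (c2w, w2c)
import Data.Char (toUpper)
import Data.Word (Word8)

-- Hypothetical helper: apply a Char function to every byte by
-- converting Word8 -> Char, applying f, then converting Char -> Word8.
mapViaChar :: (Char -> Char) -> S.ByteString -> S.ByteString
mapViaChar f = S.map (c2w . f . w2c)

-- Hypothetical Word8-only equivalent of toUpper (ASCII letters only),
-- which never leaves the Word8 domain.
toUpperW8 :: Word8 -> Word8
toUpperW8 w
  | w >= 97 && w <= 122 = w - 32  -- 'a'..'z' -> 'A'..'Z'
  | otherwise           = w

main :: IO ()
main = do
  let input = S8.pack "hello, warp"
  -- The Char8 interface and the explicit round trip agree.
  print (S8.map toUpper input == mapViaChar toUpper input)
  -- For ASCII input, the Word8-only version gives the same bytes.
  print (S.map toUpperW8 input == S8.map toUpper input)
```

Both comparisons hold for ASCII input such as the string above; bytes outside the ASCII range are not covered by this sketch.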
Why do people use Data.ByteString.Char8? I guess that there are two reasons:
- There are no standard utility functions for Word8, such as "isUpper"
- Numeric literals (e.g. 72 for 'H') are not readable
To fix these problems, I implemented the Data.Word8 module and uploaded the word8 library to Hackage:
http://hackage.haskell.org/packages/archive/word8/0.0.0/doc/html/Data-Word8....
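As an illustration of how the two gaps listed above could be closed, here is a small standalone sketch that uses only Data.Word from base. It is not the source of the word8 library: the function names isUpperW8, isLowerW8 and toLowerW8 are hypothetical, the constants merely follow the leading-underscore naming mentioned in this message, and only ASCII letters are handled.

```haskell
import Data.Word (Word8)

-- Named byte constants, so that code can say _H instead of 72.
-- (Illustrative local definitions, ASCII values.)
_A, _H, _Z, _a, _z :: Word8
_A = 0x41
_H = 0x48
_Z = 0x5a
_a = 0x61
_z = 0x7a

-- Hypothetical Word8 counterparts of Data.Char.isUpper / isLower,
-- restricted to ASCII letters.
isUpperW8 :: Word8 -> Bool
isUpperW8 w = _A <= w && w <= _Z

isLowerW8 :: Word8 -> Bool
isLowerW8 w = _a <= w && w <= _z

-- Hypothetical Word8 counterpart of Data.Char.toLower (ASCII only).
toLowerW8 :: Word8 -> Word8
toLowerW8 w
  | isUpperW8 w = w + 0x20  -- upper and lower case differ by 0x20 in ASCII
  | otherwise   = w

main :: IO ()
main = do
  print (isUpperW8 _H)                -- 'H' is an upper-case letter
  print (isLowerW8 (toLowerW8 _H))    -- lowering 'H' yields a lower-case byte
  print (toLowerW8 _a == _a)          -- lower-case bytes are left unchanged
```

Because everything stays in Word8, such functions can be passed directly to Data.ByteString.map without any Char conversion.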
If Michael and Bas like this, I would like to modify warp and case-insensitive to use the word8 library. What do people think of this?
My concern is that the character names start with "_". Some people may dislike this convention, but I do not have a better idea at the moment. Suggestions are welcome.
--Kazu
_______________________________________________
web-devel mailing list
web-devel@haskell.org
http://www.haskell.org/mailman/listinfo/web-devel
Sounds good to me. I put together a simple benchmark to compare the performance of toLower, and the results are encouraging:

benchmarking Char8
mean: 38.04527 us, lb 37.94080 us, ub 38.12774 us, ci 0.950
std dev: 470.9770 ns, lb 364.8254 ns, ub 748.3015 ns, ci 0.950

benchmarking Word8
mean: 4.807265 us, lb 4.798199 us, ub 4.816563 us, ci 0.950
std dev: 47.20958 ns, lb 41.51181 ns, ub 55.07049 ns, ci 0.950

I want to try throwing one more idea into the mix, I'll post with updates when I have them. So to answer your question: I'd be happy to include word8 in warp :).

Michael

    {-# LANGUAGE OverloadedStrings #-}
    import Criterion.Main
    import qualified Data.ByteString as S
    import qualified Data.ByteString.Char8 as S8
    import qualified Data.Char
    import qualified Data.Word8

    main :: IO ()
    main = do
        input <- S.readFile "bench.hs"
        defaultMain
            [ bench "Char8" $ whnf (S.length . S8.map Data.Char.toLower) input
            , bench "Word8" $ whnf (S.length . S.map Data.Word8.toLower) input
            ]