Re: [web-devel] Data.Word8 (word8 library)

20 Sep 2012

      On Thu, Sep 20, 2012 at 7:25 PM, Gregory Collins wrote:
...
Hey Michael,
BTW -- you're getting crap performance here because of the fromEnum/toEnum
in toLowerC, which does checks. An updated version using unsafeChr is
faster than your bsToLower call: https://gist.github.com/3756876
master /home/gdc/tmp/haskell/chr/gist-3756876 ⚠ $ ./bench
warming up
estimating clock resolution...
mean is 1.290661 us (640001 iterations)
found 2935 outliers among 639999 samples (0.5%)
  2541 (0.4%) high severe
estimating cost of a clock call...
mean is 31.63774 ns (13 iterations)
found 2 outliers among 13 samples (15.4%)
  2 (15.4%) high mild
benchmarking Char8
mean: 145.1935 us, lb 141.4375 us, ub 149.8138 us, ci 0.950
std dev: 21.28567 us, lb 18.14038 us, ub 23.87625 us, ci 0.950
found 22 outliers among 100 samples (22.0%)
  22 (22.0%) high severe
variance introduced by outliers: 89.411%
variance is severely inflated by outliers
benchmarking Char8 toLowerC
mean: 12.74308 us, lb 12.24657 us, ub 13.31365 us, ci 0.950
std dev: 2.712828 us, lb 2.434402 us, ub 2.904232 us, ci 0.950
variance introduced by outliers: 94.689%
variance is severely inflated by outliers
benchmarking bsToLower
mean: 20.68829 us, lb 20.66869 us, ub 20.70941 us, ci 0.950
std dev: 104.5939 ns, lb 95.37577 ns, ub 120.6401 ns, ci 0.950
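For readers following along, the difference between a checked and an unchecked conversion can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the gist: the names toLowerChecked and toLowerUnchecked are hypothetical, and both functions fold ASCII letters only. toEnum on Char validates that the Int is a legal code point, while GHC.Base.unsafeChr skips that validation.

```haskell
import GHC.Base (ord, unsafeChr)

-- Hypothetical checked variant: toEnum verifies the code point range
-- on every call.
toLowerChecked :: Char -> Char
toLowerChecked c
  | c >= 'A' && c <= 'Z' = toEnum (fromEnum c + 32)
  | otherwise            = c

-- Hypothetical unchecked variant: unsafeChr performs no range check.
-- This is safe here only because the guard restricts c to 'A'..'Z'.
toLowerUnchecked :: Char -> Char
toLowerUnchecked c
  | c >= 'A' && c <= 'Z' = unsafeChr (ord c + 32)
  | otherwise            = c
```

Both functions return the same results; any speed difference between them would have to be measured, for instance with the criterion setup used in this thread.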
G.
On Thu, Sep 20, 2012 at 5:01 PM, Michael Snoyman wrote:
...
Well... let's test it out:
benchmarking Char8
mean: 333.0050 us, lb 329.2846 us, ub 336.2362 us, ci 0.950
std dev: 17.73400 us, lb 15.69876 us, ub 19.45947 us, ci 0.950
variance introduced by outliers: 51.452%
variance is severely inflated by outliers
benchmarking Char8 toLowerC
mean: 117.1571 us, lb 116.8739 us, ub 117.4219 us, ci 0.950
std dev: 1.394150 us, lb 1.189928 us, ub 1.649276 us, ci 0.950
benchmarking Word8
mean: 41.01667 us, lb 40.94708 us, ub 41.09468 us, ci 0.950
std dev: 378.4175 ns, lb 335.4655 ns, ub 462.6281 ns, ci 0.950
benchmarking bsToLower
mean: 37.37589 us, lb 37.24453 us, ub 37.48697 us, ci 0.950
std dev: 616.5653 ns, lb 513.7510 ns, ub 752.8996 ns, ci 0.950
found 9 outliers among 100 samples (9.0%)
  3 (3.0%) low severe
  4 (4.0%) low mild
  2 (2.0%) high mild
variance introduced by outliers: 9.426%
variance is slightly inflated by outliers
So a specialized `Char -> Char` function helps, but doesn't completely
close the performance gap. (Updates at the same gist[1].)
I disagree that an extra package is a problem: this is such a
low-level detail that average users don't need to really be aware of
the existence of the package, and I think the marginal increase in
compile times shouldn't cause any issues. I used to worry much more
about adding extra packages to the mix, but with the more recent
versions of cabal-install and the community's general improvement in
handling dependency hell, I see less of a reason to do so.
That said, I think having specialized toLower/toUpper in a central
place, perhaps even bytestring itself, would be a good thing.
Michael
[1] https://gist.github.com/3756212
...
This is, of course, not an apples-to-apples test:
Prelude Data.Char> toUpper 'χ'
'\935'
Prelude Data.Char> putStrLn ('\935':[])
Χ
...which I suppose is the point. I wonder whether a version of
toUpper/toLower on Char restricted to ASCII values would have the same
performance here.
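To make the apples-to-oranges point concrete, here is a small sketch contrasting Data.Char.toUpper with an ASCII-restricted variant. The helper asciiToUpper is a hypothetical name introduced for illustration, and nothing here says anything about relative speed.

```haskell
import Data.Char (chr, ord, toUpper)

-- Hypothetical helper: upper-cases only 'a'..'z' and leaves every
-- other character untouched.
asciiToUpper :: Char -> Char
asciiToUpper c
  | c >= 'a' && c <= 'z' = chr (ord c - 32)
  | otherwise            = c

-- '\967' is the Greek small letter chi from the GHCi session above.
unicodeResult :: Char
unicodeResult = toUpper '\967'       -- full Unicode case mapping

asciiResult :: Char
asciiResult = asciiToUpper '\967'    -- outside ASCII, so unchanged
```

The Unicode-aware function maps chi to its capital form, while the ASCII-only one returns it as-is, so the two do different amounts of work per character.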
We only call toLower explicitly in one place in snap-server, but where
...
would be nice to fix is for HTTP headers, where I think we are all using
case-insensitive (which just calls "map toLower"). Probably we should
send Bas a patch to optimize the FoldCase instance for ByteString.
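As a rough idea of what a byte-level fold could look like, here is a sketch that lower-cases ASCII letters directly on Word8. It is not the FoldCase instance from case-insensitive; the names lowerAscii, foldCaseAscii and sameHeaderName are hypothetical, and it assumes header names are plain ASCII.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
import Data.Word (Word8)

-- Lower-case a single byte if it is an ASCII capital (65..90).
lowerAscii :: Word8 -> Word8
lowerAscii w
  | w >= 65 && w <= 90 = w + 32
  | otherwise          = w

-- Hypothetical ASCII-only case fold; no Char round trip is involved.
foldCaseAscii :: S.ByteString -> S.ByteString
foldCaseAscii = S.map lowerAscii

-- Compare two header names without regard to ASCII case.
sameHeaderName :: S.ByteString -> S.ByteString -> Bool
sameHeaderName a b = foldCaseAscii a == foldCaseAscii b

-- Example use: evaluates to True.
exampleMatch :: Bool
exampleMatch = sameHeaderName (S8.pack "Content-Type") (S8.pack "content-type")
```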
Personally I would prefer not to have yet another tiny package here, as
...
package zoo has enough creatures in it as it is. Do we think we have a
real problem here beyond the toUpper/toLower case? I suspect that for
most other uses of Data.ByteString.Char8 the conversion is a no-op.
G
On Thu, Sep 20, 2012 at 4:17 PM, Michael Snoyman wrote:
...
On Thu, Sep 20, 2012 at 2:10 PM, Michael Snoyman wrote:
...
On Thu, Sep 20, 2012 at 11:41 AM, Kazu Yamamoto wrote:
...
...
...
Hello,
ByteString is an array of Word8, but it seems to me that people tend to
use the Char interface of Data.ByteString.Char8 instead of the Word8
interface of Data.ByteString. Since the functions defined in
Data.ByteString.Char8 convert Word8 to Char and Char to Word8, they
have unnecessary overhead. Yes, the overhead is negligible in many cases,
but I would like to remove it for a high-performance server.
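The conversion described here can be made visible with a short sketch. It assumes the c2w and w2c helpers exported from Data.ByteString.Internal; the names viaChar8 and viaWord8 are hypothetical. Mapping a Char function through the Char8 interface behaves like wrapping that function in a Word8-to-Char and Char-to-Word8 conversion for every byte.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
import Data.ByteString.Internal (c2w, w2c)
import Data.Char (toLower)

-- Char8 interface: the per-byte conversion is hidden inside S8.map.
viaChar8 :: S.ByteString -> S.ByteString
viaChar8 = S8.map toLower

-- The same transformation with the conversions written out:
-- Word8 -> Char, apply the Char function, Char -> Word8.
viaWord8 :: S.ByteString -> S.ByteString
viaWord8 = S.map (c2w . toLower . w2c)
```

A function defined directly on Word8 would skip both conversions, which is the overhead this message is about.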
Why do people use Data.ByteString.Char8? I guess that there are two
reasons:
- There are no standard utility functions for Word8, such as "isUpper"
- Numeric literals (e.g. 72 for 'H') are not readable
To fix these problems, I implemented the Data.Word8 module and
uploaded the word8 library to Hackage:
http://hackage.haskell.org/packages/archive/word8/0.0.0/doc/html/Data-Word8....
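As a minimal sketch of the idea, assuming ASCII only, named constants and predicates on Word8 might look like the following. These are local, hypothetical definitions written for illustration; the actual definitions in the word8 library may differ, for example in how they treat bytes outside ASCII.

```haskell
import Data.Word (Word8)

-- Named byte constants instead of bare numeric literals.
_A, _H, _Z :: Word8
_A = 65
_H = 72
_Z = 90

-- ASCII-only predicate working directly on bytes.
isUpper :: Word8 -> Bool
isUpper w = _A <= w && w <= _Z

-- ASCII-only lower-casing working directly on bytes.
toLower :: Word8 -> Word8
toLower w
  | isUpper w = w + 32
  | otherwise = w
```

With definitions like these, code can say `w == _H` rather than `w == 72`, and no Char conversion is needed.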
...
If Michael and Bas like this, I would like to modify warp and
case-insensitive to use the word8 library. What do people think of
this?
On Thu, Sep 20, 2012 at 5:47 PM, Gregory Collins wrote:
...
...
...
...
My concern is that the character names start with "_". Some people may
dislike this convention, but I do not have a better idea at the
moment.
...
Suggestions are welcome.
--Kazu
_______________________________________________
web-devel mailing list
web-devel@haskell.org
http://www.haskell.org/mailman/listinfo/web-devel
Sounds good to me. I put together a simple benchmark to compare the
performance of toLower, and the results are encouraging:
benchmarking Char8
mean: 38.04527 us, lb 37.94080 us, ub 38.12774 us, ci 0.950
std dev: 470.9770 ns, lb 364.8254 ns, ub 748.3015 ns, ci 0.950
benchmarking Word8
mean: 4.807265 us, lb 4.798199 us, ub 4.816563 us, ci 0.950
std dev: 47.20958 ns, lb 41.51181 ns, ub 55.07049 ns, ci 0.950
I want to try throwing one more idea into the mix; I'll post
updates when I have them.
So to answer your question: I'd be happy to include word8 in warp :).
Michael
{-# LANGUAGE OverloadedStrings #-}
import Criterion.Main
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
import qualified Data.Char
import qualified Data.Word8
main :: IO ()
main = do
    input <- S.readFile "bench.hs"
    defaultMain
        [ bench "Char8" $ whnf (S.length . S8.map Data.Char.toLower) input
        , bench "Word8" $ whnf (S.length . S.map Data.Word8.toLower) input
        ]
I tried implementing a more low-level approach to try and avoid the
Word8 boxing. The results improved a bit, but not significantly:
benchmarking Char8
mean: 318.2341 us, lb 314.5367 us, ub 320.4834 us, ci 0.950
std dev: 14.48230 us, lb 10.00946 us, ub 21.22126 us, ci 0.950
found 9 outliers among 100 samples (9.0%)
  8 (8.0%) low severe
variance introduced by outliers: 43.472%
variance is moderately inflated by outliers
benchmarking Word8
mean: 35.79037 us, lb 35.66547 us, ub 35.92601 us, ci 0.950
std dev: 665.5299 ns, lb 599.3413 ns, ub 741.6474 ns, ci 0.950
variance introduced by outliers: 11.349%
variance is moderately inflated by outliers
benchmarking bsToLower
mean: 31.49299 us, lb 31.32314 us, ub 31.65027 us, ci 0.950
std dev: 835.2251 ns, lb 744.4337 ns, ub 946.1789 ns, ci 0.950
variance introduced by outliers: 20.925%
variance is moderately inflated by outliers
Perhaps someone with more experience with this level of optimization
would be able to improve the algorithm:
https://gist.github.com/3756212
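For context, one possible shape of such a low-level loop is sketched below. This is not the code from the gist: the name asciiLowerLoop is hypothetical, it handles ASCII letters only, and whether GHC actually keeps the bytes unboxed depends on optimization settings. It assumes unsafeCreate from Data.ByteString.Internal and unsafeUseAsCStringLen from Data.ByteString.Unsafe.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
import Data.ByteString.Internal (unsafeCreate)
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Data.Word (Word8)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Lower-case a single byte if it is an ASCII capital.
lowerW8 :: Word8 -> Word8
lowerW8 w
  | w >= 65 && w <= 90 = w + 32
  | otherwise          = w

-- Hypothetical explicit loop: allocate the result buffer once, then
-- copy byte by byte from the source while lower-casing.
asciiLowerLoop :: S.ByteString -> S.ByteString
asciiLowerLoop bs =
    unsafeCreate len $ \dst ->
      unsafeUseAsCStringLen bs $ \(src, _) -> go (castPtr src) dst 0
  where
    len = S.length bs

    go :: Ptr Word8 -> Ptr Word8 -> Int -> IO ()
    go src dst i
      | i >= len  = return ()
      | otherwise = do
          w <- peekByteOff src i
          pokeByteOff dst i (lowerW8 w)
          go src dst (i + 1)

-- Example input for trying the loop out.
exampleLowered :: S.ByteString
exampleLowered = asciiLowerLoop (S8.pack "Hello, WORLD 123")
```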
Michael
--
Gregory Collins 
Hmm... I don't get your results.

benchmarking Char8
mean: 394.2475 us, lb 393.1611 us, ub 395.3824 us, ci 0.950
std dev: 5.674103 us, lb 4.574321 us, ub 7.548278 us, ci 0.950
found 16 outliers among 100 samples (16.0%)
  1 (1.0%) low severe
  4 (4.0%) low mild
  6 (6.0%) high mild
  5 (5.0%) high severe
variance introduced by outliers: 7.517%
variance is slightly inflated by outliers

benchmarking Char8 toLowerC
mean: 81.19748 us, lb 80.95403 us, ub 81.40814 us, ci 0.950
std dev: 1.154865 us, lb 977.5925 ns, ub 1.497224 us, ci 0.950
found 2 outliers among 100 samples (2.0%)
  1 (1.0%) low severe
variance introduced by outliers: 7.506%
variance is slightly inflated by outliers

benchmarking Word8
mean: 43.01692 us, lb 42.94030 us, ub 43.09647 us, ci 0.950
std dev: 401.2451 ns, lb 362.3989 ns, ub 458.7243 ns, ci 0.950

benchmarking bsToLower
mean: 36.61481 us, lb 36.46137 us, ub 36.79378 us, ci 0.950
std dev: 850.7579 ns, lb 717.1316 ns, ub 1.004895 us, ci 0.950
found 16 outliers among 100 samples (16.0%)
  2 (2.0%) low mild
  10 (10.0%) high mild
  4 (4.0%) high severe
variance introduced by outliers: 17.062%
variance is moderately inflated by outliers

I'm compiling with -O2 and running on GHC 7.4.1, 64-bit Linux. I'm uncertain
what would lead to such a significant difference in our runtimes. Any
chance you can include Word8 in your run?

Michael