On Fri, Sep 21, 2012 at 5:11 AM, Kazu Yamamoto <kazu@iij.ad.jp> wrote:
Hello,

> Ignoring this, I think I can summarize:
>
>    - Data.ByteString.Char8 has no performance penalty: "w2c" uses
>    "unsafeChr" which is a no-op, and GHC.Word has a RULES pragma making the
>    "fromIntegral" also free (it's a narrow8Word#)
>    - for some reason GHC is generating worse code for the Word8 versions of
>    toLower than it is for the Char equivalents
>    - consequently, there seems to be no reason to use the word8 library:
>    not only is it not faster, it's actually a pessimization.

My dictionary does not have the word "pessimization". Would you explain
what you mean in other words?

It's a play on "optimization" (compare "optimism" vs. "pessimism"); in other words, making things slower rather than faster.

Anyway, my understanding from this thread is:

- Data.ByteString.Char8 has no performance penalty.  So, we can
  use the character literal (e.g. 'H') with it for code readability.

- But the utility functions in Data.Char are slow because they handle
  Unicode. So, we need faster utility functions specialized to Char.
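
To illustrate the readability point, here is a minimal sketch comparing a
character literal with the equivalent numeric byte value. The function names
headerNameChar8 and headerNameWord8 are hypothetical and exist only for this
example; B8.takeWhile and B.takeWhile are the standard bytestring functions.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8

-- Hypothetical helper: take everything before the first ':' using the
-- Char8 interface, so the delimiter can be written as a character literal.
headerNameChar8 :: B8.ByteString -> B8.ByteString
headerNameChar8 = B8.takeWhile (/= ':')

-- The same operation through the Word8 interface; the delimiter has to be
-- spelled as its numeric byte value (58 is ASCII ':').
headerNameWord8 :: B.ByteString -> B.ByteString
headerNameWord8 = B.takeWhile (/= 58)
```

Both functions operate on the same underlying ByteString type; only the
element type seen by the predicate differs.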

Is this correct? Should I implement the char8 library (or include
Data.Char8 in the word8 library)?

I know that Greg dislikes having extra libraries but I think that
sharing utility functions is a good thing.

Sounds correct to me. As far as I can tell, the only thing needed is ASCII-specialized versions of toUpper and toLower.
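
Roughly what I have in mind, as a sketch only: the names asciiToLower,
asciiToUpper, and lowerBS are made up here and are not from any existing
library, and I have not benchmarked this version. It only touches the
ranges 'A'..'Z' and 'a'..'z' and leaves every other byte alone.

```haskell
import qualified Data.ByteString.Char8 as B8
import Data.Char (chr, ord)

-- Hypothetical ASCII-only lowercase: maps 'A'..'Z' to 'a'..'z' and
-- returns every other character unchanged (no Unicode case tables).
asciiToLower :: Char -> Char
asciiToLower c
  | c >= 'A' && c <= 'Z' = chr (ord c + 32)
  | otherwise            = c

-- Hypothetical ASCII-only uppercase: maps 'a'..'z' to 'A'..'Z' and
-- returns every other character unchanged.
asciiToUpper :: Char -> Char
asciiToUpper c
  | c >= 'a' && c <= 'z' = chr (ord c - 32)
  | otherwise            = c

-- Example use with the Char8 interface (hypothetical helper name).
lowerBS :: B8.ByteString -> B8.ByteString
lowerBS = B8.map asciiToLower
```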

G
--
Gregory Collins <greg@gregorycollins.net>