
On Fri, Sep 21, 2012 at 5:11 AM, Kazu Yamamoto
Hello,
Ignoring this, I think I can summarize:
- Data.ByteString.Char8 has no performance penalty: "w2c" uses "unsafeChr" which is a no-op, and GHC.Word has a RULES pragma making the "fromIntegral" also free (it's a narrow8Word#)
- for some reason GHC is generating worse code for the Word8 versions of toLower than it is for the Char equivalents
- consequently, there seems to be no reason to use the word8 library: not only is it not faster, it's actually a pessimization.
My dictionary does not have the word "pessimization". Would you explain what you want to say in other words?
It's a joke on "optimization" (see "optimism" vs "pessimism"), in other words making things slower rather than faster. Anyway, my understanding from this thread is:
- Data.ByteString.Char8 does not have a performance penalty. So, we can use character literals (e.g. 'H') with it for code readability.
- But the utility functions in Data.Char are slow because they handle Unicode. So, we need faster utility functions specialized to Char.
Is this correct? Should I implement the char8 library (or include Data.Char8 in the word8 library)?
I know that Greg dislikes having extra libraries but I think that sharing utility functions is a good thing.
Sounds correct to me. The only thing that's needed as far as I can tell is
specialized-to-ascii toUpper and toLower.
G
--
Gregory Collins
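
For illustration, the ASCII-specialized toUpper and toLower discussed above might look roughly like the sketch below. The names toLower8, toUpper8 and lowerAscii are hypothetical (they are not taken from the word8 library or any proposed char8 library), and no claim is made here about how this compares in speed to Data.Char; it only shows the idea of skipping Unicode handling and using character literals with Data.ByteString.Char8.

```haskell
import qualified Data.ByteString.Char8 as B8
import Data.Char (chr, ord)

-- Hypothetical name: lower-case an ASCII letter and leave every
-- other Char (including non-ASCII ones) unchanged.
toLower8 :: Char -> Char
toLower8 c
  | 'A' <= c && c <= 'Z' = chr (ord c + 32)  -- 'A'..'Z' and 'a'..'z' differ by 32
  | otherwise            = c

-- Hypothetical name: upper-case an ASCII letter and leave every
-- other Char unchanged.
toUpper8 :: Char -> Char
toUpper8 c
  | 'a' <= c && c <= 'z' = chr (ord c - 32)
  | otherwise            = c

-- Hypothetical helper: lower-case a whole ByteString through the
-- Char8 interface, so the code can be written in terms of Char.
lowerAscii :: B8.ByteString -> B8.ByteString
lowerAscii = B8.map toLower8
```

Unlike Data.Char.toLower, these functions deliberately ignore characters outside 'A'..'Z' and 'a'..'z', which is the "specialized-to-ascii" behaviour mentioned above.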