
When I simulate case-insensitive by adding a FoldCase class as in: import Criterion.Main (defaultMain, bench, nf) import qualified Data.ByteString as S import qualified Data.ByteString.Char8 as S8 import Data.Char (ord) import Data.Word8 (Word8) import GHC.Base (unsafeChr) main :: IO () main = do input <- S.readFile "bench.hs" defaultMain [ bench "Word8-local" $ nf (S.length . S.map toLower8) input , bench "Char8 toLowerC" $ nf (S.length . S8.map toLowerC) input , bench "foldCase" $ nf (S.length . foldCase) input ] toLower8 :: Word8 -> Word8 toLower8 w | isUpper8 w = w + 32 | otherwise = w -- {-# INLINE toLower8 #-} isUpper8 :: Word8 -> Bool isUpper8 w = 0x41 <= w && w <= 0x5a || 0xc0 <= w && w <= 0xd6 || 0xd8 <= w && w <= 0xde -- {-# INLINE isUpper8 #-} toLowerC :: Char -> Char toLowerC w | isUpperC w = unsafeChr $ ord w + 0x20 | otherwise = w -- {-# INLINE toLowerC #-} isUpperC :: Char -> Bool isUpperC w = '\x41' <= w && w <= '\x5a' || '\xc0' <= w && w <= '\xd6' || '\xd8' <= w && w <= '\xde' -- {-# INLINE isUpperC #-} class FoldCase a where foldCase :: a -> a instance FoldCase S.ByteString where foldCase = S.map toLower8' -- {-# INLINE foldCase #-} toLower8' :: Word8 -> Word8 toLower8' w | isUpper8' w = w + 32 | otherwise = w -- {-# INLINE toLower8' #-} isUpper8' :: Word8 -> Bool isUpper8' w = 0x41 <= w && w <= 0x5a || 0xc0 <= w && w <= 0xd6 || 0xd8 <= w && w <= 0xde -- {-# INLINE isUpper8' #-} The results are: benchmarking Word8-local mean: 3.141125 us benchmarking Char8 toLowerC mean: 3.086287 us benchmarking foldCase mean: 8.870402 us Any idea how to speed up foldCase? Adding INLINE pragma's doesn't help... Note that when I remove the FoldCase class and turn the foldCase method into a function it becomes as fast as the rest. Bas