Hello Haskellers,

I’ve been trying to squeeze as much performance out of my code as possible, and I’ve reached a point where I can’t figure out what more I can do.

Here is some example code:

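{-# LANGUAGE OverloadedStrings #-}

-- Imports assumed for conduit 1.x, bytestring and exceptions.
import           Control.Monad (when)
import           Control.Monad.Catch (MonadThrow)
import           Data.ByteString (ByteString, unfoldrN)
import qualified Data.ByteString as BS
import           Data.Conduit (Conduit, await, yield)
import           Data.Word (Word8)

-- Assumed values for the two helpers used below: ASCII backslash and underscore.
wBackslash, wUnderscore :: Word8
wBackslash  = 92
wUnderscore = 95

blankEscapedChars :: MonadThrow m => Conduit BS.ByteString m BS.ByteString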
blankEscapedChars = blankEscapedChars' ""

blankEscapedChars' :: MonadThrow m => BS.ByteString -> Conduit BS.ByteString m BS.ByteString
blankEscapedChars' rs = do
  mbs <- await
  case mbs of
    Just bs -> do
      let cs = if BS.length rs /= 0 then BS.concat [rs, bs] else bs
      let ds = fst (unfoldrN (BS.length cs) unescapeByteString (False, cs))
      yield ds
      blankEscapedChars' (BS.drop (BS.length ds) cs)
    Nothing -> when (BS.length rs > 0) (yield rs)
  where
    unescapeByteString :: (Bool, ByteString) -> Maybe (Word8, (Bool, ByteString))
    unescapeByteString (wasEscaped, bs) = case BS.uncons bs of
      Just (_, cs) | wasEscaped       -> Just (wUnderscore, (False, cs))
      Just (c, cs) | c /= wBackslash  -> Just (c, (False, cs))
      Just (c, cs)                    -> Just (c, (True, cs))
      Nothing                         -> Nothing

The function blankEscapedChars above finds every \ character and replaces the character that follows it with _. For a 1 MB in-memory JSON ByteString, it benchmarks at about 6.6 ms.
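
To make the intended transformation concrete, here is a minimal sketch of it as a pure function on a single chunk. The name blankChunk and the constants backslash and underscore are made up for illustration, and the sketch uses BS.mapAccumL instead of unfoldrN, so it is not the code that was benchmarked and I have not measured it. The Bool records whether the previous byte was a backslash still waiting for its escaped character, which is what has to be carried over when a chunk ends in a backslash.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- ASCII codes for '\' and '_' (illustrative names, assumed values).
backslash, underscore :: Word8
backslash  = 92
underscore = 95

-- Hypothetical helper: blank the byte that follows each backslash in one
-- chunk. The incoming Bool is True when the previous chunk ended in an
-- unconsumed backslash; the returned Bool is the flag for the next chunk.
blankChunk :: Bool -> BS.ByteString -> (Bool, BS.ByteString)
blankChunk = BS.mapAccumL step
  where
    step True  _ = (False, underscore)   -- escaped byte: replace with '_'
    step False c = (c == backslash, c)   -- ordinary byte: keep, note a '\'
```

For example, blankChunk False applied to the four bytes a \ n b yields a \ _ b with a final flag of False, and a chunk ending in \ returns True so that the first byte of the next chunk gets blanked.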

All of my code follows the same basic strategy: await the next ByteString, then use unfoldrN to produce a new ByteString to yield.
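
Stripped of the conduit plumbing, that strategy can be sketched as a small pure combinator. The names unfoldChunk and blankEscaped are hypothetical and only there for illustration; the sketch assumes each step emits exactly one output byte per input byte, as the code above does, and it starts from a fresh state on every chunk.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Hypothetical combinator: rewrite one chunk byte by byte with unfoldrN.
-- The seed pairs a user state with the input that is still unread; the
-- output is capped at the input length, so it has the same size.
unfoldChunk :: (s -> Word8 -> (s, Word8)) -> s -> BS.ByteString -> BS.ByteString
unfoldChunk f s0 cs = fst (BS.unfoldrN (BS.length cs) next (s0, cs))
  where
    next (s, bs) = case BS.uncons bs of
      Nothing        -> Nothing
      Just (c, rest) -> let (s', w) = f s c in Just (w, (s', rest))

-- The escape-blanking step expressed with the combinator
-- (92 is ASCII '\', 95 is ASCII '_').
blankEscaped :: BS.ByteString -> BS.ByteString
blankEscaped = unfoldChunk step False
  where
    step True  _ = (False, 95)
    step False c = (c == 92, c)
```

Inside a conduit, each awaited chunk would be passed through a function like blankEscaped and the result yielded.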

Does anyone know of a way to go faster?

Cheers,

-John