How to write faster ByteString/Conduit code

3 Apr 2016

Hello Haskellers,

I’ve been trying to squeeze as much performance out of my code as possible,
and I’ve reached a point where I can’t figure out what more I can do.

Here is some example code:

blankEscapedChars :: MonadThrow m => Conduit BS.ByteString m BS.ByteString
blankEscapedChars = blankEscapedChars' ""

blankEscapedChars' :: MonadThrow m => BS.ByteString -> Conduit BS.ByteString m BS.ByteString
blankEscapedChars' rs = do
  mbs <- await
  case mbs of
    Just bs -> do
      let cs = if BS.length rs /= 0 then BS.concat [rs, bs] else bs
      let ds = fst (unfoldrN (BS.length cs) unescapeByteString (False, cs))
      yield ds
      blankEscapedChars' (BS.drop (BS.length ds) cs)
    Nothing -> when (BS.length rs > 0) (yield rs)
  where
    unescapeByteString :: (Bool, ByteString) -> Maybe (Word8, (Bool, ByteString))
    unescapeByteString (wasEscaped, bs) = case BS.uncons bs of
      Just (_, cs) | wasEscaped       -> Just (wUnderscore, (False, cs))
      Just (c, cs) | c /= wBackslash  -> Just (c, (False, cs))
      Just (c, cs)                    -> Just (c, (True, cs))
      Nothing                         -> Nothing

The function blankEscapedChars above finds every \ character and replaces
the character that follows it with a _. For a 1 MB in-memory JSON ByteString,
it benchmarks at about 6.6 ms.
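
For reference, the per-chunk transformation described here can be sketched as a pure function on a strict ByteString. This is only a sketch under assumptions: blankChunk is a hypothetical name, wBackslash and wUnderscore are assumed to be the ASCII codes 92 and 95 (their definitions are not shown in this post), and it uses Data.ByteString.mapAccumL instead of the unfoldrN approach in the code above. Whether it is any faster would need to be benchmarked.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Assumed values: ASCII backslash and underscore.
wBackslash, wUnderscore :: Word8
wBackslash  = 92
wUnderscore = 95

-- Hypothetical helper: blank the byte after each backslash in one chunk.
-- The Bool says whether the previous byte was an unconsumed backslash;
-- the returned Bool is the same flag after the last byte of the chunk.
blankChunk :: Bool -> BS.ByteString -> (Bool, BS.ByteString)
blankChunk = BS.mapAccumL step
  where
    step True  _ = (False, wUnderscore)     -- byte after a backslash
    step False c = (c == wBackslash, c)     -- pass through, note backslash
```

With OverloadedStrings enabled, blankChunk False "a\\\"b" evaluates to (False, "a\\_b"): the quote following the backslash becomes an underscore and the backslash itself is kept.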

In all my code the basic strategy is the same: await the next ByteString,
then use unfoldrN to produce a new ByteString for yielding.
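
The await/unfoldrN strategy can be modelled without conduit by treating the stream as a plain list of chunks. The sketch below makes several assumptions: blankStep and blankChunks are hypothetical names, the byte constants are assumed to be ASCII 92 and 95, and the list stands in for repeated await/yield. It reads the escape flag back from the leftover seed that unfoldrN returns and hands it to the next chunk, so a backslash at the very end of one chunk still blanks the first byte of the next (the code above starts every chunk with False).

```haskell
import qualified Data.ByteString as BS
import Data.List (mapAccumL)
import Data.Word (Word8)

-- Assumed values: ASCII backslash and underscore.
wBackslash, wUnderscore :: Word8
wBackslash  = 92
wUnderscore = 95

-- Same unfolding step as in the post.
unescapeByteString :: (Bool, BS.ByteString) -> Maybe (Word8, (Bool, BS.ByteString))
unescapeByteString (wasEscaped, bs) = case BS.uncons bs of
  Just (_, cs) | wasEscaped      -> Just (wUnderscore, (False, cs))
  Just (c, cs) | c /= wBackslash -> Just (c, (False, cs))
  Just (c, cs)                   -> Just (c, (True, cs))
  Nothing                        -> Nothing

-- Hypothetical helper: process one chunk, starting from the given flag.
-- unfoldrN returns the leftover seed once the length limit is reached,
-- and that seed holds the flag to carry into the next chunk.
blankStep :: Bool -> BS.ByteString -> (Bool, BS.ByteString)
blankStep esc bs =
  case BS.unfoldrN (BS.length bs) unescapeByteString (esc, bs) of
    (out, Just (esc', _)) -> (esc', out)
    (out, Nothing)        -> (False, out)  -- not expected: one byte out per byte in

-- Hypothetical driver: the list models the chunks a conduit would await.
blankChunks :: [BS.ByteString] -> [BS.ByteString]
blankChunks = snd . mapAccumL blankStep False
```

In a real conduit the same flag could be threaded through the recursive loop as an argument, in place of the leftover ByteString.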

Does anyone know of a way to go faster?

Cheers,

-John

John Ky
