
Hey Johan, The main thing to remember is that anything you wish to match on in a rule needs to not be inlined in the first pass. So to match "many" or "satisfy" robustly, you'll need: {-# NOINLINE [1] many #-} For example. johan.tibell:
Hi,
I'm trying (for the first time ever) to use RULES pragmas to achieve some nice speedups in my bytestring parsing library. The relevant code in my library's module is:
-- The module imports Control.Applicative which containes 'many' and 'some'.
-- | The parser @satisfy p@ succeeds for any byte for which the -- supplied function @p@ returns 'True'. Returns the byte that is -- actually parsed. satisfy :: (Word8 -> Bool) -> Parser Word8 satisfy p = Parser $ \s@(S bs pos eof) succ fail -> case S.uncons bs of Just (b, bs') -> if p b then succ b (S bs' (pos + 1) eof) else fail s Nothing -> if eof then fail s else IPartial $ \x -> case x of Just bs' -> retry (S bs' pos eof) Nothing -> fail (S bs pos True) where retry s' = unParser (satisfy p) s' succ fail
-- | @byte b@ parses a single byte @b@. Returns the parsed byte -- (i.e. @b@). byte :: Word8 -> Parser Word8 byte b = satisfy (== b)
-- --------------------------------------------------------------------- -- Rewrite rules
satisfyMany :: (Word8 -> Bool) -> Parser S.ByteString satisfyMany p = undefined -- More efficient implementation goes here.
satisfySome :: (Word8 -> Bool) -> Parser S.ByteString satisfySome p = undefined -- More efficient implementation goes here.
{-# RULES
"fmap/pack/many/satisfy" forall p. fmap S.pack (many (satisfy p)) = satisfyMany p
"fmap/pack/some/satisfy" forall p. fmap S.pack (some (satisfy p)) = satisfySome p #-}
In another module where I use the library I have this code:
pHeaders :: Parser [(S.ByteString, S.ByteString)] pHeaders = many header where header = liftA2 (,) fieldName (byte (c2w ':') *> spaces *> contents) fieldName = liftA2 (S.cons) letter fieldChars contents = liftA2 (S.append) (fmap S.pack $ some notEOL <* crlf) (continuation <|> pure S.empty) continuation = liftA2 (S.cons) ((c2w ' ') <$ some (oneOf (map c2w " \t"))) contents
-- It's important that all three of these definitions are kept on the -- top level to have RULES fire correctly. fieldChars = fmap S.pack $ many fieldChar
-- fieldChar = letter <|> digit <|> oneOf (map c2w "-_") fieldChar = satisfy isFieldChar where isFieldChar b = (isDigit $ chr $ fromIntegral b) || (isAlpha $ chr $ fromIntegral b) || (b `elem` map c2w "-_")
I want the fieldChars use of 'fmap S.pack $ many fieldChar' to trigger my rewrite rule "fmap/pack/many/satisfy" which it does in this case. The trouble is that the rule only triggers when I make at least fieldChars and fieldChar top-level definition and isFieldChar either a named local definition in fieldChar or a top-level definition. If I turn the predicate (isFieldChar) into to an anonymous lambda function it doesn't trigger, if I make either fieldChars or fieldChars a local defintion (in a where clause) of pHeaders it doesn't trigger. If I make fieldChar a local definition in fieldChars it doesn't trigger, etc.
It would be great if there was a way to make this a bit less fragile and have the rule trigger more often as it is potentially a huge performance win. I understand it's hard to guarantee that the rule always triggers but now it triggers in rare cases.
-- Johan _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users