Hi Alexis,
that's a very interesting example you have there!
So far, what we referred to as Nested CPR concerned unboxing for returned nested *records*, e.g., the `annotation` field in your example. That's what I try to exploit in
!1866, which after a rebase that I'll hopefully be doing this week, some more sleuthing and then documenting what I did will finally make it into master.
CPR'ing the Lambda, i.e., what is returned for `parser`, on the other hand, is a surprising new opportunity for what Nested CPR could do beyond unboxing records! And it's pretty simple, too: Because it's a function, we don't care about subtleties such as whether all callers actually evaluate the pair that deep (actually, that's wrong, as I realise below). I think it's entirely within the reach of !1866 today. So we could transform (provided that `(,) <$> a <*> b` inlines `<$>` and `<*>` and then will actually have the CPR property)
AnnotatedParser ann1 f <+> AnnotatedParser ann2 g = AnnotatedParser
{ annotation = Seq ann1 ann2
, parser = \s1 ->
let !(a, s2) = f s1
!(b, s3) = g s2
in ((,) <$> a <*> b, s3)
}
to
$w<+> :: Annotation
-> (String -> (Maybe a, String))
-> Annotation
-> (String -> (Maybe b, String))
-> (# Annotation, String -> (# Maybe (a, b), String #) #)
$w<+> ann1 f ann2 g =
(# Seq ann1 ann2
, \s1 -> case (\s1 -> let !(a, s2) = f s1
!(b, s3) = g s2
in ((,) <$> a <*> b) s1 of (p, q) -> (#p, q#), s3) #)
<+> :: AnnotatedParser a -> AnnotatedParser b -> AnnotatedParser (a, b)
<+> (AnnotatedParser ann1 f) (AnnotatedParser ann2 g) =
case $w<+> ann1 f ann2 g of
(# a, b #) -> AnnotatedParser (\s1 -> case a s1 of (# p, q#) -> (p, q)) b
{-# INLINE <+> #-}
Actually writing out the transformation tells me that this isn't always a win: We now have to allocate a lambda in the wrapper. That is only a win if that lambda cancels away at call sites! So we have to make sure that all call sites of the wrapper actually call the `parser`, so that the lambda simplifies away. If it doesn't, we have a situation akin to reboxing. So I was wrong above when I said "we don't care about subtleties such as whether all callers actually evaluate the pair that deep": We very much need to know whether all call sites call the lambda. Luckily, I implemented
just that for exploitation by Nested CPR! That's the reason why I need to rebase !1866 now. I'll ḱeep you posted.
---
You might wonder why CPR today doesn't care for lambdas. Well, they only make sense in nested scenarios (otherwise the function wasn't eta-expanded that far, for good reasons) and CPR currently doesn't bother unboxing records nestedly, which is what
#18174 discusses and what !1866 tries to fix.
Cheers,
Sebastian