Am I missing something about unfoldr?

The current definition: unfoldr :: (b -> Maybe (a, b)) -> b -> [a] unfoldr f b = case f b of Just (a,new_b) -> a : unfoldr f new_b Nothing -> [] I haven't yet looked over the Core to see if GHC is able to do a worker/wrapper transformation to inline it, but we definitely want to inline it (in some appropriate phase), because doing so has the potential to erase the Maybes. It also isn't a good producer for foldr/build. However, if we (re)write it like this: unfoldrB f b c n = go b where go b = case f b of Just (a,new_b) -> a `c` go new_b Nothing -> n unfoldr f b = build (unfoldrB f b) then we get some really nice fusion: foldr c n (unfoldr f b) = foldr c n (build (unfoldrB f b)) = unfoldrB f b c n The list is gone, and there are no new closures or data structures in its place. As a side benefit, we could write groupBy as an unfoldr, and make it a somewhat better producer, especially when the groups are small. I don't *think* there's any way to make groupBy a good consumer for foldr/build, unfortunately.

Hmm... I found something I missed. I need to prove that this doesn't break
when the arguments to unfold use seq (if in fact it doesn't). I will have
to look into it later.
On Jul 26, 2014 3:34 PM, "David Feuer"
The current definition:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a] unfoldr f b = case f b of Just (a,new_b) -> a : unfoldr f new_b Nothing -> []
I haven't yet looked over the Core to see if GHC is able to do a worker/wrapper transformation to inline it, but we definitely want to inline it (in some appropriate phase), because doing so has the potential to erase the Maybes. It also isn't a good producer for foldr/build. However, if we (re)write it like this:
unfoldrB f b c n = go b where go b = case f b of Just (a,new_b) -> a `c` go new_b Nothing -> n
unfoldr f b = build (unfoldrB f b)
then we get some really nice fusion:
foldr c n (unfoldr f b) = foldr c n (build (unfoldrB f b)) = unfoldrB f b c n
The list is gone, and there are no new closures or data structures in its place.
As a side benefit, we could write groupBy as an unfoldr, and make it a somewhat better producer, especially when the groups are small. I don't *think* there's any way to make groupBy a good consumer for foldr/build, unfortunately.

its worth point out that many uses of fusion in haskell libs punt on
preserving bottoms, that is, fusion can make *more* programs terminate. I
don't have a good example off hand mind you
On Sat, Jul 26, 2014 at 4:02 PM, David Feuer
Hmm... I found something I missed. I need to prove that this doesn't break when the arguments to unfold use seq (if in fact it doesn't). I will have to look into it later. On Jul 26, 2014 3:34 PM, "David Feuer"
wrote: The current definition:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a] unfoldr f b = case f b of Just (a,new_b) -> a : unfoldr f new_b Nothing -> []
I haven't yet looked over the Core to see if GHC is able to do a worker/wrapper transformation to inline it, but we definitely want to inline it (in some appropriate phase), because doing so has the potential to erase the Maybes. It also isn't a good producer for foldr/build. However, if we (re)write it like this:
unfoldrB f b c n = go b where go b = case f b of Just (a,new_b) -> a `c` go new_b Nothing -> n
unfoldr f b = build (unfoldrB f b)
then we get some really nice fusion:
foldr c n (unfoldr f b) = foldr c n (build (unfoldrB f b)) = unfoldrB f b c n
The list is gone, and there are no new closures or data structures in its place.
As a side benefit, we could write groupBy as an unfoldr, and make it a somewhat better producer, especially when the groups are small. I don't *think* there's any way to make groupBy a good consumer for foldr/build, unfortunately.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Foldr/build fusion can't remove bottoms; it *can* add them. That's why we
have to be very careful. I *think* we're safe in this case, though.
On Jul 26, 2014 4:28 PM, "Carter Schonwald"
its worth point out that many uses of fusion in haskell libs punt on preserving bottoms, that is, fusion can make *more* programs terminate. I don't have a good example off hand mind you
On Sat, Jul 26, 2014 at 4:02 PM, David Feuer
wrote: Hmm... I found something I missed. I need to prove that this doesn't break when the arguments to unfold use seq (if in fact it doesn't). I will have to look into it later. On Jul 26, 2014 3:34 PM, "David Feuer"
wrote: The current definition:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a] unfoldr f b = case f b of Just (a,new_b) -> a : unfoldr f new_b Nothing -> []
I haven't yet looked over the Core to see if GHC is able to do a worker/wrapper transformation to inline it, but we definitely want to inline it (in some appropriate phase), because doing so has the potential to erase the Maybes. It also isn't a good producer for foldr/build. However, if we (re)write it like this:
unfoldrB f b c n = go b where go b = case f b of Just (a,new_b) -> a `c` go new_b Nothing -> n
unfoldr f b = build (unfoldrB f b)
then we get some really nice fusion:
foldr c n (unfoldr f b) = foldr c n (build (unfoldrB f b)) = unfoldrB f b c n
The list is gone, and there are no new closures or data structures in its place.
As a side benefit, we could write groupBy as an unfoldr, and make it a somewhat better producer, especially when the groups are small. I don't *think* there's any way to make groupBy a good consumer for foldr/build, unfortunately.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

I think this is probably what Hinze et al, "Theory and Practice of Fusion"
describes as the fold/unfold law, "a fold after an unfold is a hylo" but I
don't know enough to read that paper.
On Jul 26, 2014 4:35 PM, "David Feuer"
Foldr/build fusion can't remove bottoms; it *can* add them. That's why we have to be very careful. I *think* we're safe in this case, though. On Jul 26, 2014 4:28 PM, "Carter Schonwald"
wrote: its worth point out that many uses of fusion in haskell libs punt on preserving bottoms, that is, fusion can make *more* programs terminate. I don't have a good example off hand mind you
On Sat, Jul 26, 2014 at 4:02 PM, David Feuer
wrote: Hmm... I found something I missed. I need to prove that this doesn't break when the arguments to unfold use seq (if in fact it doesn't). I will have to look into it later. On Jul 26, 2014 3:34 PM, "David Feuer"
wrote: The current definition:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a] unfoldr f b = case f b of Just (a,new_b) -> a : unfoldr f new_b Nothing -> []
I haven't yet looked over the Core to see if GHC is able to do a worker/wrapper transformation to inline it, but we definitely want to inline it (in some appropriate phase), because doing so has the potential to erase the Maybes. It also isn't a good producer for foldr/build. However, if we (re)write it like this:
unfoldrB f b c n = go b where go b = case f b of Just (a,new_b) -> a `c` go new_b Nothing -> n
unfoldr f b = build (unfoldrB f b)
then we get some really nice fusion:
foldr c n (unfoldr f b) = foldr c n (build (unfoldrB f b)) = unfoldrB f b c n
The list is gone, and there are no new closures or data structures in its place.
As a side benefit, we could write groupBy as an unfoldr, and make it a somewhat better producer, especially when the groups are small. I don't *think* there's any way to make groupBy a good consumer for foldr/build, unfortunately.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Sat, Jul 26, 2014 at 5:10 PM, David Feuer
I think this is probably what Hinze et al, "Theory and Practice of Fusion" describes as the fold/unfold law, "a fold after an unfold is a hylo" but I don't know enough to read that paper.
The two main styles of fusion are build/foldr and unfoldr/destroy. The difference is in which side of the slash has the recursion. GHC's choice to use build/foldr is because that worked more nicely with various other facets of the language; this is discussed in some of the early papers about list fusion in GHC. The issue we run into is that you can't really have unfoldr/foldr fusion because having recursion on both sides means they fight with each other. But! we can fuse the operations together into a single operation, and that single operation should behave nicely with both build/foldr and unfoldr/destroy styles of fusion. A catamorphism is the class of recursion schemes you get by generalizing over the list data type in foldr (hence foldr is the list instance of cata). Dually, an anamorphism is the class of corecursion schemes you get by generalizing over the list data type in unfoldr (hence unfoldr is the list instance of ana). If we use an anamorphism to expand some "seed" value into a structure (e.g., a list), but then we immediately use a catamorphism to destroy that structure to produce a "summary" value, then we can eliminate the intermediate structure and just go from seed to summary directly. That's what hylomorphisms do. So we have: roll :: F (fix F) -> fix F unroll :: fix F -> F (fix F) cata :: (F b -> b) -> fix F -> b cata f = f . fmap (cata f) . unroll ana :: (a -> F a) -> a -> fix F ana g = roll . fmap (ana g) . g hylo :: (F b -> b) -> (a -> F a) -> a -> b hylo f g = f . fmap (hylo f g) . g where cata f . ana g == {definition} f . fmap (cata f) . unroll . roll . fmap (ana g) . g == {unroll/roll fusion} f . fmap (cata f) . fmap (ana g) . g == {functor law} f . fmap (cata f . ana g) . g == {inductive hypothesis} f . fmap (hylo f g) . g == {definition} hylo f g More specifically for lists: hyloList :: b -> (x -> b -> b) -> (a -> Maybe (x,a)) -> a -> b hyloList fnil fcons g a = case g a of Nothing -> fnil Just (x,a') -> fcons x (hyloList fnil fcons g a') If we inline hyloList whenever g is known concretely, then the intermediate Maybe(_,_) structure will be fused away by the case-of-constructor transform. Hopefully that should be enough to get you started? -- Live well, ~wren

I think that "for lists" version is pretty much what I ended up with (I
haven't worked through what all the others might be about yet). I don't
understand the "fight" you refer to, unless you actually mean trying to
implement both foldr/build and destroy/unfoldr at the same time. If you
write hyloList in a more inliner-friendly fashion:
hyloList fnil fcons g = go
where
go a = case g a of
Nothing -> fnil
Just (x,a') -> fcons x (go a')
and then write
unfoldr g a = build $ \c n -> hyloList n c g a
Then foldr/build gets you
foldr c n (unfoldr g a) = hyloList n c g a
And then if g is statically known and sufficiently simple, the Maybes go
away.
On Aug 14, 2014 10:23 PM, "wren romano"
On Sat, Jul 26, 2014 at 5:10 PM, David Feuer
wrote: I think this is probably what Hinze et al, "Theory and Practice of Fusion" describes as the fold/unfold law, "a fold after an unfold is a hylo" but I don't know enough to read that paper.
The two main styles of fusion are build/foldr and unfoldr/destroy. The difference is in which side of the slash has the recursion. GHC's choice to use build/foldr is because that worked more nicely with various other facets of the language; this is discussed in some of the early papers about list fusion in GHC.
The issue we run into is that you can't really have unfoldr/foldr fusion because having recursion on both sides means they fight with each other. But! we can fuse the operations together into a single operation, and that single operation should behave nicely with both build/foldr and unfoldr/destroy styles of fusion.
A catamorphism is the class of recursion schemes you get by generalizing over the list data type in foldr (hence foldr is the list instance of cata). Dually, an anamorphism is the class of corecursion schemes you get by generalizing over the list data type in unfoldr (hence unfoldr is the list instance of ana). If we use an anamorphism to expand some "seed" value into a structure (e.g., a list), but then we immediately use a catamorphism to destroy that structure to produce a "summary" value, then we can eliminate the intermediate structure and just go from seed to summary directly. That's what hylomorphisms do. So we have:
roll :: F (fix F) -> fix F unroll :: fix F -> F (fix F)
cata :: (F b -> b) -> fix F -> b cata f = f . fmap (cata f) . unroll
ana :: (a -> F a) -> a -> fix F ana g = roll . fmap (ana g) . g
hylo :: (F b -> b) -> (a -> F a) -> a -> b hylo f g = f . fmap (hylo f g) . g
where
cata f . ana g == {definition} f . fmap (cata f) . unroll . roll . fmap (ana g) . g == {unroll/roll fusion} f . fmap (cata f) . fmap (ana g) . g == {functor law} f . fmap (cata f . ana g) . g == {inductive hypothesis} f . fmap (hylo f g) . g == {definition} hylo f g
More specifically for lists:
hyloList :: b -> (x -> b -> b) -> (a -> Maybe (x,a)) -> a -> b hyloList fnil fcons g a = case g a of Nothing -> fnil Just (x,a') -> fcons x (hyloList fnil fcons g a')
If we inline hyloList whenever g is known concretely, then the intermediate Maybe(_,_) structure will be fused away by the case-of-constructor transform.
Hopefully that should be enough to get you started?
-- Live well, ~wren

On Thu, Aug 14, 2014 at 10:44 PM, David Feuer
I think that "for lists" version is pretty much what I ended up with (I haven't worked through what all the others might be about yet). I don't understand the "fight" you refer to, unless you actually mean trying to implement both foldr/build and destroy/unfoldr at the same time.
Suppose that build does not exist. (Also that destroy does not exist.) Now try to think about performing list fusion in a foldr/unfoldr framework. What would that framework look like? how would it work? Because both foldr and unfoldr use value-level fixedpoints, you run into problems trying to make that foldr/unfoldr framework actually work out. Don't take my word for it, go read some of the early papers like A Short Cut to Deforestation (1993) by Gill, Launchbury, & Peyton Jones, or The Concatenate Vanishes (1987) by Wadler, and then try to work through implementing foldr/unfoldr. This is a good chunk of why the foldr/build stuff in base has so many explicitly non-recursive functions to decompose things into before rewriting. The problem isn't that the inliner doesn't like recursive functions, the problem is that fixedpoints get in the way of being able to see what the terms are in order to rewrite them and that you can't fuse away buffers when there's an impedance mismatch between producers and consumers. -- Live well, ~wren

I've read the Short Cut paper. Note that what it describes as fold/unfold
is completely unrelated to fusing foldr with unfoldr—they're talking about
"unfold" in Wadler's sense, which is part of a program transformation. I
just did work through what foldr/unfoldr fusion proper looks like—it looks
like hyloList. This all works out very well with map/unfoldr as well. I
don't think foldr/unfoldr makes for a proper "fusion framework", per se,
because you can't use it to make fusing transformers like map. I believe,
however, that bringing unfoldr into the foldr/build framework is still
valuable, because it's a convenient way to generate lists, and, if wrapped
in build, offers a limited but safe way to write good producers.
On Aug 15, 2014 12:55 AM, "wren romano"
On Thu, Aug 14, 2014 at 10:44 PM, David Feuer
wrote: I think that "for lists" version is pretty much what I ended up with (I haven't worked through what all the others might be about yet). I don't understand the "fight" you refer to, unless you actually mean trying to implement both foldr/build and destroy/unfoldr at the same time.
Suppose that build does not exist. (Also that destroy does not exist.) Now try to think about performing list fusion in a foldr/unfoldr framework. What would that framework look like? how would it work?
Because both foldr and unfoldr use value-level fixedpoints, you run into problems trying to make that foldr/unfoldr framework actually work out. Don't take my word for it, go read some of the early papers like A Short Cut to Deforestation (1993) by Gill, Launchbury, & Peyton Jones, or The Concatenate Vanishes (1987) by Wadler, and then try to work through implementing foldr/unfoldr.
This is a good chunk of why the foldr/build stuff in base has so many explicitly non-recursive functions to decompose things into before rewriting. The problem isn't that the inliner doesn't like recursive functions, the problem is that fixedpoints get in the way of being able to see what the terms are in order to rewrite them and that you can't fuse away buffers when there's an impedance mismatch between producers and consumers.
-- Live well, ~wren

One I saw not too long ago was where someone created an infinite Vector and then consumed it immediately. With optimizations turned on, the intermediate Vector was fused away. With optimizations turned off, it wasn't fused away, and the program eventually crashed when it ran out of memory for holding the infinite Vector. On Sat, Jul 26, 2014 at 11:28 PM, Carter Schonwald < carter.schonwald@gmail.com> wrote:
its worth point out that many uses of fusion in haskell libs punt on preserving bottoms, that is, fusion can make *more* programs terminate. I don't have a good example off hand mind you
On Sat, Jul 26, 2014 at 4:02 PM, David Feuer
wrote: Hmm... I found something I missed. I need to prove that this doesn't break when the arguments to unfold use seq (if in fact it doesn't). I will have to look into it later. On Jul 26, 2014 3:34 PM, "David Feuer"
wrote: The current definition:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a] unfoldr f b = case f b of Just (a,new_b) -> a : unfoldr f new_b Nothing -> []
I haven't yet looked over the Core to see if GHC is able to do a worker/wrapper transformation to inline it, but we definitely want to inline it (in some appropriate phase), because doing so has the potential to erase the Maybes. It also isn't a good producer for foldr/build. However, if we (re)write it like this:
unfoldrB f b c n = go b where go b = case f b of Just (a,new_b) -> a `c` go new_b Nothing -> n
unfoldr f b = build (unfoldrB f b)
then we get some really nice fusion:
foldr c n (unfoldr f b) = foldr c n (build (unfoldrB f b)) = unfoldrB f b c n
The list is gone, and there are no new closures or data structures in its place.
As a side benefit, we could write groupBy as an unfoldr, and make it a somewhat better producer, especially when the groups are small. I don't *think* there's any way to make groupBy a good consumer for foldr/build, unfortunately.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

But that doesn't look like it's actually a problem. Expand the foldr and
unfoldr one step each, apply case-of-case, and it looks like everything
comes together nicely.
On Jul 26, 2014 4:02 PM, "David Feuer"
Hmm... I found something I missed. I need to prove that this doesn't break when the arguments to unfold use seq (if in fact it doesn't). I will have to look into it later. On Jul 26, 2014 3:34 PM, "David Feuer"
wrote: The current definition:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a] unfoldr f b = case f b of Just (a,new_b) -> a : unfoldr f new_b Nothing -> []
I haven't yet looked over the Core to see if GHC is able to do a worker/wrapper transformation to inline it, but we definitely want to inline it (in some appropriate phase), because doing so has the potential to erase the Maybes. It also isn't a good producer for foldr/build. However, if we (re)write it like this:
unfoldrB f b c n = go b where go b = case f b of Just (a,new_b) -> a `c` go new_b Nothing -> n
unfoldr f b = build (unfoldrB f b)
then we get some really nice fusion:
foldr c n (unfoldr f b) = foldr c n (build (unfoldrB f b)) = unfoldrB f b c n
The list is gone, and there are no new closures or data structures in its place.
As a side benefit, we could write groupBy as an unfoldr, and make it a somewhat better producer, especially when the groups are small. I don't *think* there's any way to make groupBy a good consumer for foldr/build, unfortunately.
participants (4)
-
Carter Schonwald
-
David Feuer
-
Michael Snoyman
-
wren romano