
Hi, I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc. The main idea is to have an extended version of foldr: -- | A mapping between @a@ and @b@. data Wrap a b = Wrap (a -> b) (b -> a) foldrW :: (forall e. Wrap (f e) (e -> b -> b)) -> (a -> b -> b) -> b -> [a] -> b foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0 where go = unwrap $ \list z' -> case list of [] -> z' x:xs -> f x $ wrap go xs z' This allows the user to apply an arbitrary "worker-wrapper" transformation to the loop. Using this, foldl' can be defined as newtype Simple b e = Simple { runSimple :: e -> b -> b } foldl' :: (b -> a -> b) -> b -> [a] -> b foldl' f initial xs = foldrW (Wrap wrap unwrap) g id xs initial where wrap (Simple s) e k a = k $ s e a unwrap u = Simple $ \e -> u e id g x next acc = next $! f acc x The wrap and unwrap functions here ensure that foldl' gets compiled into a loop that returns a value of 'b', rather than a function 'b -> b', effectively un-CPS-transforming the loop. I put preliminary code and some more explanation on Github: https://github.com/takano-akio/ww-fusion Thank you, Takano Akio

Any input on this is appreciated. In particular, I'd like to know: if I
implement the idea as a patch to the base package, is there a chance it is
considered for merge?
-- Takano Akio
On Fri, Jan 3, 2014 at 11:20 PM, Akio Takano
Hi,
I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc.
The main idea is to have an extended version of foldr:
-- | A mapping between @a@ and @b@. data Wrap a b = Wrap (a -> b) (b -> a)
foldrW :: (forall e. Wrap (f e) (e -> b -> b)) -> (a -> b -> b) -> b -> [a] -> b foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0 where go = unwrap $ \list z' -> case list of [] -> z' x:xs -> f x $ wrap go xs z'
This allows the user to apply an arbitrary "worker-wrapper" transformation to the loop.
Using this, foldl' can be defined as
newtype Simple b e = Simple { runSimple :: e -> b -> b }
foldl' :: (b -> a -> b) -> b -> [a] -> b foldl' f initial xs = foldrW (Wrap wrap unwrap) g id xs initial where wrap (Simple s) e k a = k $ s e a unwrap u = Simple $ \e -> u e id g x next acc = next $! f acc x
The wrap and unwrap functions here ensure that foldl' gets compiled into a loop that returns a value of 'b', rather than a function 'b -> b', effectively un-CPS-transforming the loop.
I put preliminary code and some more explanation on Github:
https://github.com/takano-akio/ww-fusion
Thank you, Takano Akio

Hey akio, it's certainly an interesting idea. If you implement it, the first step would be to run a nofib before and after to benchmark the impact of the change. On Thursday, January 9, 2014, Akio Takano wrote:
Any input on this is appreciated. In particular, I'd like to know: if I implement the idea as a patch to the base package, is there a chance it is considered for merge?
-- Takano Akio
On Fri, Jan 3, 2014 at 11:20 PM, Akio Takano
wrote:
Hi,
I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc.
The main idea is to have an extended version of foldr:
-- | A mapping between @a@ and @b@. data Wrap a b = Wrap (a -> b) (b -> a)
foldrW :: (forall e. Wrap (f e) (e -> b -> b)) -> (a -> b -> b) -> b -> [a] -> b foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0 where go = unwrap $ \list z' -> case list of [] -> z' x:xs -> f x $ wrap go xs z'
This allows the user to apply an arbitrary "worker-wrapper" transformation to the loop.
Using this, foldl' can be defined as
newtype Simple b e = Simple { runSimple :: e -> b -> b }
foldl' :: (b -> a -> b) -> b -> [a] -> b foldl' f initial xs = foldrW (Wrap wrap unwrap) g id xs initial where wrap (Simple s) e k a = k $ s e a unwrap u = Simple $ \e -> u e id g x next acc = next $! f acc x
The wrap and unwrap functions here ensure that foldl' gets compiled into a loop that returns a value of 'b', rather than a function 'b -> b', effectively un-CPS-transforming the loop.
I put preliminary code and some more explanation on Github:
https://github.com/takano-akio/ww-fusion
Thank you, Takano Akio

I've hesitated to reply, because I have lots of questions but no time to investigate in. I'm looking at your wiki page https://github.com/takano-akio/ww-fusion
* Does your proposed new fold' run faster than the old one? You give no data.
* The new foldl' is not a "good consumer" in the foldr/build sense, which a big loss. What if you say fold' k z [1..n]; you want the intermediate list to vanish.
* My brain is too small to truly understand your idea. But since foldrW is non-recursive, what happens if you inline foldrW into fold', and then simplify? I'm betting you get something pretty similar to the old foldl'. Try in by hand, and with GHC and let's see the final optimised code.
* Under "motivation" you say "GHC generates something essentially like..." and then give some code. Now, if GHC would only eta-expand 'go' with a second argument, you'd get brilliant code. And maybe that would help lots of programs, not just this one. It's a slight delicate transformation but I've often thought we should try it; c.f #7994, #5809
Simon
From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Akio Takano
Sent: 09 January 2014 13:25
To: ghc-devs
Subject: Re: Extending fold/build fusion
Any input on this is appreciated. In particular, I'd like to know: if I implement the idea as a patch to the base package, is there a chance it is considered for merge?
-- Takano Akio
On Fri, Jan 3, 2014 at 11:20 PM, Akio Takano

Thank you for looking at this!
On Tue, Jan 14, 2014 at 1:27 AM, Simon Peyton Jones
I’ve hesitated to reply, because I have lots of questions but no time to investigate in. I’m looking at your wiki page https://github.com/takano-akio/ww-fusion
· Does your proposed new fold’ run faster than the old one? You give no data.
No, it runs just equally fast as the old one. At the Core level they are the same. I ran some criterion benchmarks: source: https://github.com/takano-akio/ww-fusion/blob/master/benchmarks.hs results: http://htmlpreview.github.io/?https://github.com/takano-akio/ww-fusion/blob/... The point was not to make foldl' faster, but to make it fuse well with good producers.
· The new foldl’ is not a “good consumer” in the foldr/build sense, which a big loss. What if you say fold’ k z [1..n]; you want the intermediate list to vanish.
For my idea to work, enumFromTo and all other good producers need to be redefined in terms of buildW, which fuses with foldrW. The definition of buildW and the relevant rules are here: https://github.com/takano-akio/ww-fusion/blob/master/WWFusion.hs
· My brain is too small to truly understand your idea. But since foldrW is non-recursive, what happens if you inline foldrW into fold’, and then simplify? I’m betting you get something pretty similar to the old foldl’. Try in by hand, and with GHC and let’s see the final optimised code.
I checked this and I see the same code as the old foldl', modulo order of arguments. This is what I expected.
· Under “motivation” you say “GHC generates something essentially like…” and then give some code. Now, if GHC would only eta-expand ‘go’ with a second argument, you’d get brilliant code. And maybe that would help lots of programs, not just this one. It’s a slight delicate transformation but I’ve often thought we should try it; c.f #7994, #5809
I agree that it would be generally useful if GHC did this transformation. However I don't think it's good enough for this particular goal of making foldl' fuse well. Consider a function that flattens a binary tree into a list: data Tree = Tip {-# UNPACK #-} !Int | Bin Tree Tree toList :: Tree -> [Int] toList tree = build (toListFB tree) {-# INLINE toList #-} toListFB :: Tree -> (Int -> r -> r) -> r -> r toListFB root cons nil = go root nil where go (Tip x) rest = cons x rest go (Bin x y) rest = go x (go y rest) Let's say we want to eliminate the intermediate list in the expression (sum (toList t)). Currently sum is not a good consumer, but if it were, after fusion we'd get something like: sumList :: Tree -> Int sumList root = go0 root id 0 go0 :: Tree -> (Int -> Int) -> Int -> Int go0 (Tip x) k = (k $!) . (x+) go0 (Bin x y) k = go0 x (go0 y k) Now, merely eta-expanding go0 is not enough to get efficient code, because the function will still build a partial application every time it sees a Bin constructor. For this recursion to work in an allocation-free way, it must be rather like: go1 :: Tree -> Int -> Int go1 (Tip x) n = x + n go1 (Bin x y) n = go1 y (go1 x n) And this is what we get if we define foldl' and toList in terms of foldrW and buildW. I think a similar problem arises whenever you define a good consumer that traverses a tree-like structure, and you want to use a strict fold to consume a list produced by that producer. Thank you, Takano Akio
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org] *On Behalf Of *Akio Takano *Sent:* 09 January 2014 13:25 *To:* ghc-devs *Subject:* Re: Extending fold/build fusion
Any input on this is appreciated. In particular, I'd like to know: if I implement the idea as a patch to the base package, is there a chance it is considered for merge?
-- Takano Akio
On Fri, Jan 3, 2014 at 11:20 PM, Akio Takano
wrote: Hi,
I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc.
The main idea is to have an extended version of foldr:
-- | A mapping between @a@ and @b@. data Wrap a b = Wrap (a -> b) (b -> a)
foldrW :: (forall e. Wrap (f e) (e -> b -> b)) -> (a -> b -> b) -> b -> [a] -> b foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0 where go = unwrap $ \list z' -> case list of [] -> z' x:xs -> f x $ wrap go xs z'
This allows the user to apply an arbitrary "worker-wrapper" transformation to the loop.
Using this, foldl' can be defined as
newtype Simple b e = Simple { runSimple :: e -> b -> b }
foldl' :: (b -> a -> b) -> b -> [a] -> b foldl' f initial xs = foldrW (Wrap wrap unwrap) g id xs initial where wrap (Simple s) e k a = k $ s e a unwrap u = Simple $ \e -> u e id g x next acc = next $! f acc x
The wrap and unwrap functions here ensure that foldl' gets compiled into a loop that returns a value of 'b', rather than a function 'b -> b', effectively un-CPS-transforming the loop.
I put preliminary code and some more explanation on Github:
https://github.com/takano-akio/ww-fusion
Thank you, Takano Akio

Akio
Aha! So you are really talking about replacing the *entire* foldr/build story with a new one, namely a foldW/buildW story. Presumably all producers and consumers (map, filter, take, drop etc) must be redefined using foldW and buildW instead of fold and build. Is that right?
That is much more significant than the wiki page describes. If you are serious about this, could you perhaps update the wiki page to describe what you propose? Do you believe that the new story will catch every case that the old one does? (Plus some new ones.) Does your data support that?
I'm really not sure about your Tree example. I agree that the foldl' style code gives the result that you show. But I tried the more straightforward version:
sumT :: Tree -> Int
sumT t = foldr (+) 0 (build (toListFB t))
This yielded pretty decent code:
FB.$wgo =
\ (w_sio :: FB.Tree) (ww_sir :: GHC.Prim.Int#) ->
case w_sio of _ {
FB.Tip rb_dgM -> GHC.Prim.+# rb_dgM ww_sir;
FB.Bin x_af0 y_af1 ->
case FB.$wgo y_af1 ww_sir of ww1_siv { __DEFAULT ->
FB.$wgo x_af0 ww1_siv
}
}
This builds no thunks. It does build stack equal to the depth of the tree. But your desired go1 code will also do exactly the same; go1 is strict in its second argument and hence will use call-by-value, and hence will build stack equal to the depth of the tree.
In short, I'm not yet seeing a benefit.
I am probably missing something important.
Suggestion: rather than just reply to this email (soon lost in the email stream), it would be easier for others to join in if you updated your wiki page to say (a) what you propose, and (b) how it can yield benefits that the current setup cannot. Then an email reply can say "go look at section 3" or whatever.
best wishes
Simon
From: Akio Takano [mailto:tkn.akio@gmail.com]
Sent: 14 January 2014 09:22
To: Simon Peyton Jones
Cc: ghc-devs
Subject: Re: Extending fold/build fusion
Thank you for looking at this!
On Tue, Jan 14, 2014 at 1:27 AM, Simon Peyton Jones

On Thu, Jan 16, 2014 at 4:20 AM, Simon Peyton Jones
Akio
Aha! So you are really talking about replacing the *entire* foldr/build story with a new one, namely a foldW/buildW story. Presumably all producers and consumers (map, filter, take, drop etc) must be redefined using foldW and buildW instead of fold and build. Is that right?
Yes
That is much more significant than the wiki page describes. If you are serious about this, could you perhaps update the wiki page to describe what you propose? Do you believe that the new story will catch every case that the old one does? (Plus some new ones.) Does your data support that?
I updated the file. Please see the section "Will the functions currently fusible continue to fuse well?" https://github.com/takano-akio/ww-fusion#will-the-functions-currently-fusibl...
I’m really not sure about your Tree example. I agree that the foldl’ style code gives the result that you show. But I tried the more straightforward version:
sumT :: Tree -> Int
sumT t = foldr (+) 0 (build (toListFB t))
This yielded pretty decent code:
FB.$wgo =
\ (w_sio :: FB.Tree) (ww_sir :: GHC.Prim.Int#) ->
case w_sio of _ {
FB.Tip rb_dgM -> GHC.Prim.+# rb_dgM ww_sir;
FB.Bin x_af0 y_af1 ->
case FB.$wgo y_af1 ww_sir of ww1_siv { __DEFAULT ->
FB.$wgo x_af0 ww1_siv
}
}
This builds no thunks. It does build stack equal to the depth of the tree. But your desired go1 code will also do exactly the same; go1 is strict in its second argument and hence will use call-by-value, and hence will build stack equal to the depth of the tree.
I don't think using foldr is a general replacement for foldl', because (1) it is less efficient when the input is a list and (2) it will change the meaning of the code when the operator to fold with is not associative. -- Akio
In short, I’m not yet seeing a benefit.
I am probably missing something important.
Suggestion: rather than just reply to this email (soon lost in the email stream), it would be easier for others to join in if you updated your wiki page to say (a) what you propose, and (b) how it can yield benefits that the current setup cannot. Then an email reply can say “go look at section 3” or whatever.
best wishes
Simon

Dear Akio, Am Freitag, den 03.01.2014, 23:20 +0900 schrieb Akio Takano:
I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc.
I’d like to evaluate your approach, but let me first note that I had been working on #7994 (make foldl a good consumer), and with my patches the compiler is smart enough to eta-expand go in all cases covered by nofib, using the existing foldr/build-fusion. That said, I do like your idea of making the worker/wrapper a bit more explicit, instead of relying on the compiler to do the transformation for us. So let’s see in what ways your proposal surpasses a smarter GHC. The Tree example is a good one, because there any form of eta expansion, just as you write, will not help. And I find that that Simons’s solution of using a foldr-based sum for Trees unsatisfying: We should indeed aim for „sum $ toList tree“ to produce good results. Given that Data.Map is a tree, and that is a common data structure and it’s toList a good producer, this is relevant. Can you implement build via buildW, so that existing code like "map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) can be used unmodified? But probably not... but that would mean a noticeable incompatibility and a burden on library authors using list fusion. In any case, I suggest you just dig in, create a branch of libraries/base and replace everything related to foldr/builder with your approach. First, do not actually change the definition of foldl. Then compare the nofib testruns (probably best with two separate working repo clones, starting from "make distclean"): Do the results differ? A lot of work went into foldr/build-fusion, so we want to be sure that we are not losing anything anywhere (or if we are, we want to know why). Then make foldl and foldl' a good consumer, as in the patch at the beginning of #7994. How large are the gains? How do they compare with the gains from the smarter GHC (numbers also in the ticket). If by then we have not found any regression, things look promising. Greetings, and I hope the delayed responses do not lesen your motivation, Joachim PS: I’m subscribed to the mailinglist, no need to CC me explicitly. -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

Hi Joachim,
On Wed, Jan 29, 2014 at 3:06 AM, Joachim Breitner
Dear Akio,
Am Freitag, den 03.01.2014, 23:20 +0900 schrieb Akio Takano:
I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc.
I'd like to evaluate your approach, but let me first note that I had been working on #7994 (make foldl a good consumer), and with my patches the compiler is smart enough to eta-expand go in all cases covered by nofib, using the existing foldr/build-fusion.
Nice.
That said, I do like your idea of making the worker/wrapper a bit more explicit, instead of relying on the compiler to do the transformation for us. So let's see in what ways your proposal surpasses a smarter GHC.
The Tree example is a good one, because there any form of eta expansion, just as you write, will not help. And I find that that Simons's solution of using a foldr-based sum for Trees unsatisfying: We should indeed aim for "sum $ toList tree" to produce good results. Given that Data.Map is a tree, and that is a common data structure and it's toList a good producer, this is relevant.
I agree. In fact, my original motivation was that I wanted to efficiently serialize a IntMap into a ByteString.
Can you implement build via buildW, so that existing code like "map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) can be used unmodified? But probably not... but that would mean a noticeable incompatibility and a burden on library authors using list fusion.
You can implement build in terms of buildW. However any list producer defined using that definition of build would produce good code if the final consumer is a left fold. The resulting code will be in CPS. On the other hand, I imagine that if we also annotate foldl with oneShot, this problem may become less severe.
In any case, I suggest you just dig in, create a branch of libraries/base and replace everything related to foldr/builder with your approach. First, do not actually change the definition of foldl. Then compare the nofib testruns (probably best with two separate working repo clones, starting from "make distclean"): Do the results differ? A lot of work went into foldr/build-fusion, so we want to be sure that we are not losing anything anywhere (or if we are, we want to know why).
Then make foldl and foldl' a good consumer, as in the patch at the beginning of #7994. How large are the gains? How do they compare with the gains from the smarter GHC (numbers also in the ticket).
If by then we have not found any regression, things look promising.
Thank you for the advice, I'll have a try. - Akio
Greetings, and I hope the delayed responses do not lesen your motivation, Joachim
PS: I'm subscribed to the mailinglist, no need to CC me explicitly.
-- Joachim "nomeata" Breitner mail@joachim-breitner.de * http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de * GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

Dear Akio, Am Freitag, den 31.01.2014, 16:54 +0900 schrieb Akio Takano:
Can you implement build via buildW, so that existing code like "map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) can be used unmodified? But probably not... but that would mean a noticeable incompatibility and a burden on library authors using list fusion.
You can implement build in terms of buildW. However any list producer defined using that definition of build would produce good code if the final consumer is a left fold. The resulting code will be in CPS. On the other hand, I imagine that if we also annotate foldl with oneShot, this problem may become less severe.
Hmm, I guess my question was not precise enough. Let me rephrase: To what extend can you provide the exsting foldr/build API _without_ losing the advantages of your approach? Or put differently: Could you add a section to the wiki that serves as a migration guide to those who want to port their producers and consumers to your system, without having to fully understand what’s going on? Another thing that would be very interesting: Your framework seems to be quite general: Are there other useful worker-wrapper-transformations that one would possibly want to apply to a fused computations, besides the one that makes foldl work well? Other examples of w/w-transformations in GHC include * Unboxing of parameters * Unboxing of return values, returning multiple values but maybe you can think of other interesting examples. Am I right that the _consumer_ of a fused computation decides which worker-wrapper pair to use? I still quite like the approach, mostly because it does so well for lists. I still have to fully grok it, though :-) Greetings, Joachim -- Joachim Breitner e-Mail: mail@joachim-breitner.de Homepage: http://www.joachim-breitner.de Jabber-ID: nomeata@joachim-breitner.de

Dar Akio, I just noticed that even with your approach, the code for foldl-as-foldr is not automatically beautiful. Consider this: I modified the eft function to do to some heavy work at each step (or at least to look like that): myEft :: Int -> Int -> [Int] myEft = \from to -> buildW (myEftFB from to) {-# INLINE myEft #-} expensive :: Int -> Int expensive = (1+) {-# NOINLINE expensive #-} myEftFB :: Int -> Int -> (Wrap f r) -> (Int -> r -> r) -> r -> r myEftFB from to (Wrap wrap unwrap) cons nil = wrap go from nil where go = unwrap $ \i rest -> if i <= to then cons i $ wrap go (expensive i) rest else rest {-# INLINE[0] myEftFB #-} Then I wanted to see if "sum [f..t]" using this code is good: sumUpTo :: Int -> Int -> Int sumUpTo f t = WW.foldl' (+) 0 (myEft f t) And this is the core I get for the inner loop: letrec { $wa :: GHC.Prim.Int# -> GHC.Types.Int -> GHC.Types.Int [LclId, Arity=1, Str=DmdType L] $wa = \ (ww2 :: GHC.Prim.Int#) -> case GHC.Prim.<=# ww2 ww1 of _ { GHC.Types.False -> GHC.Base.id @ GHC.Types.Int; GHC.Types.True -> let { e [Dmd=Just D(L)] :: GHC.Types.Int [LclId, Str=DmdType] e = F.expensive (GHC.Types.I# ww2) } in \ (acc :: GHC.Types.Int) -> case acc of _ { GHC.Types.I# x -> case e of _ { GHC.Types.I# ww3 -> $wa ww3 (GHC.Types.I# (GHC.Prim.+# x ww2)) } } }; } in $wa ww F.sumUpTo1 (GHC 7.6.3, -O). See how it is still building up partial applications. So I am a bit confused now: I thought the (or one) motivation for your proposal is to produce good code in these cases. Or am I using your code wrongly? Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

On Sat, Feb 1, 2014 at 12:17 AM, Joachim Breitner
Dar Akio,
I just noticed that even with your approach, the code for foldl-as-foldr is not automatically beautiful. Consider this:
I modified the eft function to do to some heavy work at each step (or at least to look like that):
myEft :: Int -> Int -> [Int] myEft = \from to -> buildW (myEftFB from to) {-# INLINE myEft #-}
expensive :: Int -> Int expensive = (1+) {-# NOINLINE expensive #-}
myEftFB :: Int -> Int -> (Wrap f r) -> (Int -> r -> r) -> r -> r myEftFB from to (Wrap wrap unwrap) cons nil = wrap go from nil where go = unwrap $ \i rest -> if i <= to then cons i $ wrap go (expensive i) rest else rest {-# INLINE[0] myEftFB #-}
Then I wanted to see if "sum [f..t]" using this code is good:
sumUpTo :: Int -> Int -> Int sumUpTo f t = WW.foldl' (+) 0 (myEft f t)
And this is the core I get for the inner loop:
letrec { $wa :: GHC.Prim.Int# -> GHC.Types.Int -> GHC.Types.Int [LclId, Arity=1, Str=DmdType L] $wa = \ (ww2 :: GHC.Prim.Int#) -> case GHC.Prim.<=# ww2 ww1 of _ { GHC.Types.False -> GHC.Base.id @ GHC.Types.Int; GHC.Types.True -> let { e [Dmd=Just D(L)] :: GHC.Types.Int [LclId, Str=DmdType] e = F.expensive (GHC.Types.I# ww2) } in \ (acc :: GHC.Types.Int) -> case acc of _ { GHC.Types.I# x -> case e of _ { GHC.Types.I# ww3 -> $wa ww3 (GHC.Types.I# (GHC.Prim.+# x ww2)) } } }; } in $wa ww F.sumUpTo1
(GHC 7.6.3, -O).
See how it is still building up partial applications. So I am a bit confused now: I thought the (or one) motivation for your proposal is to produce good code in these cases. Or am I using your code wrongly?
Yes, this is supposed to work. Fortunately it was an easy fix: https://github.com/takano-akio/ww-fusion/commit/ae26b18b135d92e0df7513e5efc0... Thank you for pointing this out! -- Akio
Greetings, Joachim
-- Joachim "nomeata" Breitner mail@joachim-breitner.de * http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de * GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

Dear Akio, Am Samstag, den 01.02.2014, 16:52 +0900 schrieb Akio Takano:
On Sat, Feb 1, 2014 at 12:17 AM, Joachim Breitner
wrote: Yes, this is supposed to work. Fortunately it was an easy fix: https://github.com/takano-akio/ww-fusion/commit/ae26b18b135d92e0df7513e5efc0...
Thank you for pointing this out!
much better, and this way I understand better what’s going on. I’m looking forward to your results from running nofib. Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

On Fri, Jan 31, 2014 at 6:18 PM, Joachim Breitner
Dear Akio,
Am Freitag, den 31.01.2014, 16:54 +0900 schrieb Akio Takano:
Can you implement build via buildW, so that existing code like "map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) can be used unmodified? But probably not... but that would mean a noticeable incompatibility and a burden on library authors using list fusion.
You can implement build in terms of buildW. However any list producer defined using that definition of build would produce good code if the final consumer is a left fold. The resulting code will be in CPS. On the other hand, I imagine that if we also annotate foldl with oneShot, this problem may become less severe.
Hmm, I guess my question was not precise enough. Let me rephrase: To what extend can you provide the exsting foldr/build API _without_ losing the advantages of your approach?
Sorry, I had a bad typo in the previous message: I meant Any list producer defined using that definition of build would *not* produce good code if the final consumer is a left fold. To answer your question: list producers defined using build will continue to compile, but will *not* be able to take any advantages of foldrW/buildW.
Or put differently: Could you add a section to the wiki that serves as a migration guide to those who want to port their producers and consumers to your system, without having to fully understand what's going on?
I added a section for this: https://github.com/takano-akio/ww-fusion#how-to-make-a-list-produer-work-wit...
Another thing that would be very interesting: Your framework seems to be quite general: Are there other useful worker-wrapper-transformations that one would possibly want to apply to a fused computations, besides the one that makes foldl work well? Other examples of w/w-transformations in GHC include * Unboxing of parameters * Unboxing of return values, returning multiple values but maybe you can think of other interesting examples.
I can't think of anything new, but I think it's often interesting to do a (nested) CPR transformation for a fused function. I added such an example (see serializeTree and foldIO_Ptr) : https://github.com/takano-akio/ww-fusion/blob/master/test.hs#L23 However this kind of tricks may become unnecessary once GHC starts to do nested CPRs.
Am I right that the _consumer_ of a fused computation decides which worker-wrapper pair to use?
Yes.
I still quite like the approach, mostly because it does so well for lists. I still have to fully grok it, though :-)
Greetings, Joachim
-- Joachim Breitner e-Mail: mail@joachim-breitner.de Homepage: http://www.joachim-breitner.de Jabber-ID: nomeata@joachim-breitner.de

Dear Takano, did you manage to apply fold/build fusion to the standard libraries, and run nofib? I guess your starting point should be http://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources and then you can start changing libraries/base. Best if you create two working copies, one that you do not change and one with your changes. Then you can run nofib in both: https://ghc.haskell.org/trac/ghc/wiki/Building/RunningNoFib and compare the results. Do you need help with that? Come by #ghc on freenode! Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

I modified the base library to use foldrW/buildW, with no changes to
foldl yet. nofib showed a very big regression in cryptarithm2, so I'm
looking into it.
Thank you for your help,
Akio
On Mon, Feb 10, 2014 at 7:50 PM, Joachim Breitner
Dear Takano,
did you manage to apply fold/build fusion to the standard libraries, and run nofib?

Dear Akio, Am Dienstag, den 11.02.2014, 08:04 +0900 schrieb Akio Takano:
I modified the base library to use foldrW/buildW, with no changes to foldl yet. nofib showed a very big regression in cryptarithm2, so I'm looking into it.
any news on this front? Did you find out what happened in cryptarithm2? Do you need help with that? I’m currently writing a paper on the Call Arity analysis and tried to summarize your suggestion in the “Related Work” section. Is there any citable document describing your variant of fusion, besides the github page? Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

On Thu, Mar 13, 2014 at 3:36 AM, Joachim Breitner
Dear Akio,
Am Dienstag, den 11.02.2014, 08:04 +0900 schrieb Akio Takano:
I modified the base library to use foldrW/buildW, with no changes to foldl yet. nofib showed a very big regression in cryptarithm2, so I'm looking into it.
any news on this front? Did you find out what happened in cryptarithm2? Do you need help with that?
I haven't found what happened in cryptarithm2. Maybe I'll look at it in this weekend.
I'm currently writing a paper on the Call Arity analysis and tried to summarize your suggestion in the "Related Work" section. Is there any citable document describing your variant of fusion, besides the github page?
No, unfortunately not. -- Akio
Greetings, Joachim
-- Joachim "nomeata" Breitner mail@joachim-breitner.de * http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de * GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

Dear Akio, Am Mittwoch, den 12.03.2014, 19:36 +0100 schrieb Joachim Breitner:
Dear Akio,
Am Dienstag, den 11.02.2014, 08:04 +0900 schrieb Akio Takano:
I modified the base library to use foldrW/buildW, with no changes to foldl yet. nofib showed a very big regression in cryptarithm2, so I'm looking into it.
any news on this front? Did you find out what happened in cryptarithm2? Do you need help with that?
I haven’t heard from you in quite some time. Are you still on this project? Recent investigations into fusion by David Feuer has increased interest in your approach (https://ghc.haskell.org/trac/ghc/ticket/9545). Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org

I've been looking into this a bit in the past day or so, and I feel like
some of the stuff in the repository doesn't make sense (to me, at least).
For instance, if you start examining the generated code, you'll see quirks
like, taking map as an example:
map f xs = go xs []
where
go xs n = case xs of
[] -> n
y:ys -> f y : go ys n
In other words, the loop passes along the empty list in an argument for the
base case, which is a waste. This stems from the definition of foldrW:
foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0
where
go = unwrap $ \list z' -> case list of
[] -> z'
x:xs -> f x $ wrap go xs z'
Specifically, the z' becomes this extra threaded argument, and never
disappears. It is possible to fix this by changing the definition of foldrW
to be:
...
[] -> z0
...
And ghc then recognizes that the z' being threaded is useless and
eliminates it. But this raises the question of why it's being threaded this
way in the first place. It seems like the types in Wrap are inappropriate
in some way, at least for the general case. But I'm not yet certain what
they should be.
There are also fusion problems with the current implementation that neither
I nor David have fully figured out yet. For instance:
bar = map (+1) (eft 0 1000)
does not fuse, even after trying many tweaks to the definitions (due to the
eft 0 1000 being pulled out into a let for reasons we don't yet understand;
it even happens with -fno-full-laziness). However:
bar = foldl (+) 0 (map (+1) (eft 0 1000))
fuses fully. The only way to fix the former I've yet found is making buildW
CONLIKE, but that may not be appropriate in general (probably isn't). I
have a sneaking suspicion that the strangeness mentioned in the first part
of this mail may be a contributing factor to this latter issue, too.
-- Dan
On Wed, Sep 3, 2014 at 3:47 PM, Joachim Breitner
Dear Akio,
Am Mittwoch, den 12.03.2014, 19:36 +0100 schrieb Joachim Breitner:
Dear Akio,
Am Dienstag, den 11.02.2014, 08:04 +0900 schrieb Akio Takano:
I modified the base library to use foldrW/buildW, with no changes to foldl yet. nofib showed a very big regression in cryptarithm2, so I'm looking into it.
any news on this front? Did you find out what happened in cryptarithm2? Do you need help with that?
I haven’t heard from you in quite some time. Are you still on this project?
Recent investigations into fusion by David Feuer has increased interest in your approach (https://ghc.haskell.org/trac/ghc/ticket/9545).
Greetings, Joachim
-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

To answer myself some after having fiddled with this more....
The failure-to-fuse is apparently only an issue if I put the fusible thing
in the same module as the combinators, for reasons I can't explain. If a
separate module imports and defines bar, things fuse fine.
I'm still not sure what to do about the weird nil-passing definitions. I
came up with one possibility, which is to create a different build:
nilWrap :: b -> Wrap (Simple b) b
nilWrap z = Wrap (\(Simple s) e r -> s e r) (\u -> Simple $ \e _ -> u e
z)
buildPlain :: (forall b f. Wrap f b -> (a -> b -> b) -> b -> b) -> [a]
buildPlain g = g (nilWrap []) (:) []
This uses the wrapping to plug in the same nil case at every step, which
eliminates the extra argument when used. But, I don't think this is usable.
Some definitions are okay with this wrapper, but others aren't, and I
believe that foldrW/buildW fusion can cause it to get into places where it
isn't okay. For instance, if we use nilWrap in foldr, then:
foldr f z (reverse xs)
does the wrong thing.
On Thu, Sep 4, 2014 at 5:20 PM, Dan Doel
I've been looking into this a bit in the past day or so, and I feel like some of the stuff in the repository doesn't make sense (to me, at least).
For instance, if you start examining the generated code, you'll see quirks like, taking map as an example:
map f xs = go xs [] where go xs n = case xs of [] -> n y:ys -> f y : go ys n
In other words, the loop passes along the empty list in an argument for the base case, which is a waste. This stems from the definition of foldrW:
foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0 where go = unwrap $ \list z' -> case list of [] -> z' x:xs -> f x $ wrap go xs z'
Specifically, the z' becomes this extra threaded argument, and never disappears. It is possible to fix this by changing the definition of foldrW to be:
... [] -> z0 ...
And ghc then recognizes that the z' being threaded is useless and eliminates it. But this raises the question of why it's being threaded this way in the first place. It seems like the types in Wrap are inappropriate in some way, at least for the general case. But I'm not yet certain what they should be.
There are also fusion problems with the current implementation that neither I nor David have fully figured out yet. For instance:
bar = map (+1) (eft 0 1000)
does not fuse, even after trying many tweaks to the definitions (due to the eft 0 1000 being pulled out into a let for reasons we don't yet understand; it even happens with -fno-full-laziness). However:
bar = foldl (+) 0 (map (+1) (eft 0 1000))
fuses fully. The only way to fix the former I've yet found is making buildW CONLIKE, but that may not be appropriate in general (probably isn't). I have a sneaking suspicion that the strangeness mentioned in the first part of this mail may be a contributing factor to this latter issue, too.
-- Dan
On Wed, Sep 3, 2014 at 3:47 PM, Joachim Breitner
wrote:
Dear Akio,
Am Mittwoch, den 12.03.2014, 19:36 +0100 schrieb Joachim Breitner:
Dear Akio,
Am Dienstag, den 11.02.2014, 08:04 +0900 schrieb Akio Takano:
I modified the base library to use foldrW/buildW, with no changes to foldl yet. nofib showed a very big regression in cryptarithm2, so I'm looking into it.
any news on this front? Did you find out what happened in cryptarithm2? Do you need help with that?
I haven’t heard from you in quite some time. Are you still on this project?
Recent investigations into fusion by David Feuer has increased interest in your approach (https://ghc.haskell.org/trac/ghc/ticket/9545).
Greetings, Joachim
-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
participants (5)
-
Akio Takano
-
Carter Schonwald
-
Dan Doel
-
Joachim Breitner
-
Simon Peyton Jones