[GHC] #9434: GHC.List.reverse does not fuse

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: libraries/base | Version: 7.9 Keywords: | Operating System: Architecture: Unknown/Multiple | Unknown/Multiple Difficulty: Easy (less than 1 | Type of failure: Runtime hour) | performance bug Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- As Edward Kmett speculated could be the case a couple months ago, Joachim Breitner's call arity analysis makes the Prelude version of `reverse` better than GHC's version. It's less clear to me whether it's beneficial to wrap it in `build`, but I think the answer is ''probably'' yes, based on the fact that doing so turns `foldr c n $ reverse xs` into `foldl (flip c) n xs`. {{{#!hs {-# INLINE reverse #-} reverse :: [a] -> [a] reverse xs = build $ \c n -> foldl (\a x -> x `c` a) n xs }}} This simplifies to {{{#!hs Rec { poly_go_r2uL poly_go_r2uL = \ @ a_a2nn ds_a2xO eta_Xh -> case ds_a2xO of _ { [] -> eta_Xh; : y_a2xT ys_a2xU -> poly_go_r2uL ys_a2xU (: y_a2xT eta_Xh) } end Rec } reverse reverse = \ @ a_a2nn eta_B1 -> poly_go_r2uL eta_B1 ([]) }}} which looks about the same as the current version in GHC.List. Behold the beauty when it is applied to an unfold (with a fusion-friendly version of `unfoldr`): {{{#!hs testReverseUnfoldr f q0 = reverse (unfoldr f q0) }}} simplifies to {{{#!hs testReverseUnfoldr testReverseUnfoldr = \ @ a_a2w3 @ b_a2w4 f_a2mn q0_a2mo -> letrec { go_a1QX go_a1QX = \ b1_a1Hy eta_B1 -> case f_a2mn b1_a1Hy of _ { Nothing -> eta_B1; Just ds_d2d8 -> case ds_d2d8 of _ { (a1_a1Hz, new_b_a1HA) -> go_a1QX new_b_a1HA (: a1_a1Hz eta_B1) } }; } in go_a1QX q0_a2mo ([]) }}} This looks exactly like a hand-written `unfoldl`! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Oh yes, of course. We want to wrap it in `build` to make it fuse with `map`. `reverse` inherently has lousy fusion behavior, but I think we should still do what we can. I think even improving the simple cases of `map f . reverse` and `reverse . map f` is sufficient to justify this change. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): Great. Just check that when fusion ''doesn't'' take place, the result is good. And do a `nofib` comparison for good luck. Then submit a patch. Thanks for doing all this work on fusion, David. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * status: new => patch Comment: The nofib results look generally good, if I'm reading them right. It appears that with regard to allocations, the biggest benefits are to kahan (-7.8%), gamteb (-4.1%), and scs (-1.4%). The loser, for some reason, seems to be constraints (+3.6%). I don't have much faith in my (good- looking) timing measurements—the results look ... very wonky. I'm hoping maybe someone who's better at such things can try to measure those properly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): That looks helpful. But allocation probably should never go up. Do you know why it does so in constraints? I use `-ticky` a lot to track down these kind of things. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:4 simonpj]:
That looks helpful. But allocation probably should never go up. Do you know why it does so in constraints?
I use `-ticky` a lot to track down these kind of things.
Simon
I don't know how to use `-ticky` yet. How would I get nofib compiled with that? My best guess about constraints is that it's all about the evil spirit in the machine, by which I mean the inliner. One phenomenon I saw in some of my little tests was that inlining thresholds make things rather less compositional than one might like. That is, if `f x = ... g ...` and `g y = ... h ...`, then inlining `h` can make `g` "too big", so then `g` won't be inlined in `f` unless someone leans on the compiler to do so. My bet is that something that calls reverse is not being inlined as a result, and therefore some other optimization opportunity is missed. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): Ticky ticky: https://ghc.haskell.org/trac/ghc/wiki/Debugging/TickyTicky Inlining: yes, it's difficult. Would love solution. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): I've improved the rules for writing `reverse` back to a recursive form. It appears to be good to fuse in most cases. The main exceptions seem to be `filter p . reverse` (although `reverse . filter p` is good) and `takeWhile . reverse` (although `reverse . takeWhile` is good). The latter was investigated by Dan Doel. The problem seems to be that the closures built up in the bad cases end up allocating more than the conses they replace. Even the worst case for `nofib` (a `filter . reverse` form) isn't really too terrible. I think we're probably better off making this change than not. Note: rules to try to convert `filter p . reverse` back to a (slightly) better version seem inherently too fragile to want to bother with. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): AT least for `filter p . reverse`, couldn’t you rewrite it to `reverse . filter p` if that is for whatever reason better? Seems to be a nice small optimization (though, like many of them, probably of little practical relevance.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:8 nomeata]:
AT least for `filter p . reverse`, couldn’t you rewrite it to `reverse . filter p` if that is for whatever reason better? Seems to be a nice small optimization (though, like many of them, probably of little practical relevance.)
Unfortunately, that changes semantics. `filter (==1) (reverse [undefined,1,2]) = 1:undefined` but `reverse (filter (==1) [undefined,1,2]) = undefined` -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Replying to [comment:8 nomeata]:
AT least for `filter p . reverse`, couldn’t you rewrite it to `reverse . filter p` if that is for whatever reason better? Seems to be a nice small optimization (though, like many of them, probably of little
#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:9 dfeuer]: practical relevance.)
Unfortunately, that changes semantics. `filter (==1) (reverse
[undefined,1,2]) = 1:undefined` but `reverse (filter (==1) [undefined,1,2]) = undefined` And to go along with that, as dolio mentioned to me, the performance may be worse if only part of the list is used and the filter predicate is expensive. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Ok, sorry, didn’t think it through. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * status: patch => new Comment: Letting `reverse` fuse with `foldr` generally seems to be causing more than its fair share of problems, including issues relating to arity analysis. I've been working today on different implementations of `dropWhileEnd`. One I think seems particularly pleasant looks like this: {{{#!hs dropWhileEnd p xs = build (\c n -> foldr (dweFB p c) (const n) xs []) {-# INLINE dropWhileEnd #-} dweFB p c = \x r trues -> if p x then r (x : trues) else foldr c (x `c` r []) (reverse trues) }}} I tested this like so: {{{#!hs testDWE = foldl (*) 1 $ dropWhileEnd (>10) [1::Int .. 1000] }}} If `reverse` is prevented from fusing with `foldr`, this generates what looks to me like very good code (note: if you're looking at the Core, that inner `letrec` becomes a let-no-escape), generating far fewer conses and boxes than the current implementation, even if that one is made `INLINE`. If they're allowed to fuse, however, Call Arity fails to work its magic, and the result is a higher-order mess. So I think for now the answer is probably to just use something like the Report Prelude version that fuses with a `build` but not a `foldr`, and add a rule to exchange `map` with `reverse` in some fashion. I'll come up with a patch to do that shortly. Of course, if Joachim or someone can figure out a better solution or a better characterization of the problem, that'd be great. Side note: for reasons I can't begin to figure out, if the `foldl` above is replaced with `foldl'`, then it doesn't fuse at all. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * status: new => patch Comment: I think that should take care of the problems I was seeing. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: infoneeded Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by nomeata): * status: patch => infoneeded Comment: Is this understanding correct: `reverse` is made a good consumer, and additionally, it is a good producer when immediately consumed by `map`. You might make the rules simpler by using `mapReverseFB revId` instead of `revFB`: No need to have both `mapMapReverse` and `mapReverse` then. There is quite a bit of logic in this patch. Would you mind adding test cases for this, so that this behaviour is not accidentally broken in the future? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: infoneeded Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:14 nomeata]:
Is this understanding correct: `reverse` is made a good consumer, and additionally, it is a good producer when immediately consumed by `map`.
It's made a good consumer, yes. It never becomes a good producer, per se; rather, it fuses with map to form something else that's not a good producer but fuses with map. The purpose of this is to shift the break we introduce in the fusion "pipeline" to the left in the hope that it will coalesce with another such break further on.
You might make the rules simpler by using `mapReverseFB revId` instead of `revFB`: No need to have both `mapMapReverse` and `mapReverse` then.
That's a good idea.
There is quite a bit of logic in this patch. Would you mind adding test cases for this, so that this behaviour is not accidentally broken in the future?
I can add some correctness tests for sure. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: infoneeded Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): I also just realized it may be better to ditch `mapreverse` altogether and actually interchange the `map` with the `reverse` so as to be sure not to interfere with the `map/map` rule. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9434: GHC.List.reverse does not fuse -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: infoneeded Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata):
I can add some correctness tests for sure.
Not just functional correctness, but also that things fuse as they should, in the spirit of `T9339`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9434#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC