[GHC] #9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: new Priority: normal | Milestone: Component: libraries/base | Version: 7.9 Keywords: | Operating System: Architecture: Unknown/Multiple | Unknown/Multiple Difficulty: Project (more | Type of failure: than a week) | None/Unknown Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Takano Akio's [https://github.com/takano-akio/ww-fusion worker-wrapper fusion] modifies `foldr/build` fairly conservatively to allow safe fusion of `foldl` and friends without relying on arity analysis. This advantage is important for two reasons that I know of: 1. As discussed in Joachim Breitner's original paper, the current arity analysis is unable to infer arity correctly when facing certain forms of recursion. 2. The current arity analysis, and probably ''any'' practical arity analysis, depends on inlining to a degree that can sometimes be unhealthy. I would love to make functions like `hPutStr` fuse, but I suspect doing so safely at present would likely cause a code explosion—the function being folded is too large to want to inline to make it available for arity analysis following fusion. I don't understand it completely yet, but it looks like `foldrW/buildW` can eliminate a lot of this uncertainty. Furthermore, it appears that functions currently written using `foldr` in a "right-handed" way can remain unchanged, using an implementation of `foldr` in terms of `foldrW`. Only "left-handed" folds will need to be rewritten to take advantage of this framework, along with all the `build`s. That said, `foldrW/buildW` probably has its weaknesses too. The basic fusion rule looks like this. {{{#!hs {-# RULES "foldrW/buildW" forall f z (i :: Wrap f b) (g :: forall c g . (Wrap g c) -> (a -> c -> c) -> c -> c) . foldrW i f z (buildW g) = g i f z #-} }}} If `g` and `i` are not inlined sufficiently to merge with each other, I imagine this could produce worse results than `foldr/build` when the latter works properly. The only way to really get a feel for performance would be to carefully modify everything to work as well as it can with this framework and see how the results compare. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: new Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Project (more Unknown/Multiple | than a week) Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): === Update === Takano Akio's draft implementation has one or a few saturation issues and missing pragmas that prevent some necessary inlining when the fusion pipeline is not "capped" with a `foldrW`. This seems easy to fix, and I have done so for `map`. Dan Doel recognized that `foldrW/buildW` fusion, applied to situations currently handled with `foldr/build`, tends to produce recursive forms that pass a static argument around and around without touching it. He discovered that a modified version of `buildW`, call it `buildWForgetful` for now, can be used instead of `buildW` when recursion defining the builder satisfies a simple condition common to many important examples. One serious difficulty in dealing with this framework is the difficulty of ''understanding'' it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: new Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Project (more Unknown/Multiple | than a week) Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): === Progress === Dan Doel has made a lot of progress implementing `foldrW/buildW` versions of major list-processing functions, with a drop of help from me. I've done some work on refactoring and rewrite rules. The `foldM` implementation seems to need the monadic right identity law, `m >>= return = m`, to be implemented in `RULES` in order to get clean Core for the general case (inlining may clean it up at some point for specific instances). I really hope no one is breaking this law. === Challenges === The `buildWForgetful` thing is now fairly clearly the wrong way to go. It seems fine if a "static" producer is tied directly to a pure consumer, but if it's tied to a transformer, bugs go flying everywhere. It therefore seems likely to need some help from the compiler to clean up these static args. The current SAT criterion for what should be transformed is ''not'' sufficient to clean up the static args `foldrW/buildW` sometimes produces. In particular, SAT currently will ''only'' touch functions with at least two static args, which these generally don't. It may be possible to find other criteria for SAT being likely to be good. One particularly obnoxious situation I've seen a couple times is a static argument whose value is actually known at compile time. For example, the "index too large or list empty" error message gets passed around through the recursion in `(!!)`. I don't think preserving that sort of static argument can ''ever'' be good. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: closed Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: wontfix | Architecture: Unknown/Multiple Operating System: | Difficulty: Project (more Unknown/Multiple | than a week) Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * status: new => closed * resolution: => wontfix Comment: Challenges seem to be piling up. The biggest problem is that no one has been able to find local criteria to ensure correctness. The interesting wrappers don't generally satisfy the trivial `wrap . unwrap = id` correctness condition; rather, they rely on more complex interactions with the loop bodies they're embedded in. To ensure correctness, we would need two things: 1. Rules about how a producer can/must use the wrapper, cons, and nil it receives, and 2. Rules about what properties the wrapper, cons, and nil a consumer uses must have to ensure correct behavior when used by a sufficiently well- behaved producer. In essence, these three things have to travel and transform ''together'' in a correct fashion as they travel from a consumer through transformers to a producer. Takano Akio did not write any such things down, and Joachim was unable to make contact. Joachim and Dan couldn't think of anything. Edward K. said this whole thing looked frightening. It may be that one of the wizards out there can come up with a way to make this work, but until then, I think it's probably best to set it aside. Joachim has some other ideas for addressing the underlying issues; hopefully one of them will work out. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: closed Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: wontfix | Architecture: Unknown/Multiple Operating System: | Difficulty: Project (more Unknown/Multiple | than a week) Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by quantheory): I don't think that I have enough information to resolve this, but I've noticed some things that may help clarify the situation, for any onlookers or future developers who haven't already figured them out. Apologies if any of this is too obvious. 1) The `wrap`, `unwrap`, `cons`, and `nil` definitions have to be examined in combination to check correctness. All are, in effect, defined by the consumer. I think that it's a bit confusing to have a `Wrap` type that only contains `wrap`/`unwrap` when (aside from the trivial `wrap . unwrap = id` case) the four have to be examined together to be understood. I'm agnostic as to whether that should be addressed by replacing `Wrap` with a new type that contains all four, or if `wrap`/`unwrap` pairs are so closely linked that it justifies pairing them separately. Aside from `isoSimple`, I'm not sure if many `Wrap` definitions would be reusable for very many consumers. 2) Producers are relatively constrained due to the use of `forall` in `buildW` and `Wrap`. Since consumers choose the definition of `wrap`/`unwrap`/`cons`/`nil` they are much freer. In an ideal (but unproven) case, all producers using buildW would be good, given sufficient constraints on of consumers. 3) These rules seem to me to be ''sufficient'' to ensure that a consumer is correct (disregarding the treatment of _|_): {{{ wrap . unwrap $ cons = cons wrap . unwrap $ (\ _ _ -> nil) = \ _ _ -> nil }}} I've started to sketch a proof of this, but to be frank I'm rather new to this sort of analysis and not at all confident. I'd be perfectly happy either to discuss this, or for someone else to beat me there, or give a counterexample. (I'm also not sure that `unwrap . wrap = id` is important, though probably pairs with this property are easier to understand.) 4) I have much less confidence that my conditions in (3) are necessary than that they are sufficient. They hold for `isoSimple` and the `foldl` implementation by Takano Akio, but I have not quite wrapped my head around, for instance, the scanl implementation here: https://github.com/dolio/ww-fusion This seems to get to the vague neighborhood of the condition on cons in (3), but not quite there, which suggests that either that condition needs to be expanded, or else that this implementation fails for some use of scanl that I haven't tested. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: closed Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: wontfix | Architecture: Unknown/Multiple Operating System: | Difficulty: Project (more Unknown/Multiple | than a week) Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:5 quantheory]:
1) The `wrap`, `unwrap`, `cons`, and `nil` definitions have to be examined in combination to check correctness. All are, in effect, defined by the consumer.
I think that it's a bit confusing to have a `Wrap` type that only contains `wrap`/`unwrap` when (aside from the trivial `wrap . unwrap = id` case) the four have to be examined together to be understood. I'm agnostic as to whether that should be addressed by replacing `Wrap` with a new type
Quite possibly. that contains all four, or if `wrap`/`unwrap` pairs are so closely linked that it justifies pairing them separately. Aside from `isoSimple`, I'm not sure if many `Wrap` definitions would be reusable for very many consumers. Unclear. `wrap . unwrap = id` seems to strong—I don't ''think'' any non- trivial wrappers satisfy it. Joachim thinks there could be a way to use parametricity to come up with a better criterion, and he may very well be right, but whether even a weaker one would be sufficient to implement enough useful functions is unclear.
2) Producers are relatively constrained due to the use of `forall` in `buildW` and `Wrap`. Since consumers choose the definition of `wrap`/`unwrap`/`cons`/`nil` they are much freer. In an ideal (but unproven) case, all producers using buildW would be good, given sufficient constraints on of consumers.
4) I have much less confidence that my conditions in (3) are necessary
What makes a producer valid is not so clear, but yes, it seems less complicated than the consumer side. than that they are sufficient. They hold for `isoSimple` and the `foldl` implementation by Takano Akio, but I have not quite wrapped my head around, for instance, the scanl implementation here:
This implementation fails for some reason you haven't tested. There are very bad interactions between the "static" stuff and `reverse`, and between `scanl` and `reverse`, leading to incorrect results. I tend to blame the static stuff and the `scanl` wrapper, but we don't really know if they would work right in all circumstances without `reverse`. Your efforts to find "local" correctness criteria are appreciated. My sense is that the `scanl` wrapper and the static stuff are problematic because they throw away information that someone else needs. The static stuff is an optimization that may be obviated by work on the static argument transformation. Whether there is a valid wrapper for `scanl` is as yet unclear. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Unclear. `wrap . unwrap = id` seems to strong—I don't ''think'' any non-
#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: closed Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: wontfix | Architecture: Unknown/Multiple Operating System: | Difficulty: Project (more Unknown/Multiple | than a week) Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by quantheory): I have done a little bit more on this, so can elaborate. trivial wrappers satisfy it. I agree. From what I've seen so far, `wrap . unwrap = id` only fits in cases where `build`/`foldr` already worked in the first place.
What makes a producer valid is not so clear, but yes, it seems less complicated than the consumer side. [...] This implementation fails for some reason you haven't tested.
My hope had been that if the definition of a good consumer could be nailed down well enough, then producers would be trivial. But it's looking increasingly difficult to get this right. I did find the issues with scanl and reverse; in fact, even after fixing `reverse . scanl f z $ xs`, I still had issues with `reverse . scanl f z . reverse $ xs`. But I did eventually fix both cases with this change: https://github.com/quantheory/ww- fusion/commit/74618328c34572270154dbd93eb441f448760e0e It seems that, on top of needing `wrap . unwrap $ cons = cons`, if you have multiple layers of wrapper, there are more conditions needed to ensure that one wrapper doesn't break an invariant relied upon by an underlying wrapper. My first attempted fix for scanl used `k` twice, attempting to throw away the result in one case. But it turned out that, when combined with `reverse`, the extra `k b` would be moved to where it could insert extra, wrong values into the stream. This is why, in the commit linked above, I use `n` in `scanlWrap`: presumably `n` should never add new values. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: task | Status: closed Priority: normal | Milestone: Component: | Version: 7.9 libraries/base | Keywords: Resolution: wontfix | Architecture: Unknown/Multiple Operating System: | Difficulty: Project (more Unknown/Multiple | than a week) Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:7 quantheory]:
I did find the issues with scanl and reverse; in fact, even after fixing `reverse . scanl f z $ xs`, I still had issues with `reverse . scanl f z . reverse $ xs`. But I did eventually fix both cases with this change:
https://github.com/quantheory/ww- fusion/commit/74618328c34572270154dbd93eb441f448760e0e
I must admit I'm still nervous about that implementation, for the very simple and unscientific reason that your unwrapper has a `_` in it: {{{#!hs (\u -> Env $ \k b -> unwrap $ \e _ -> u e k b) }}} The fact that it makes me nervous does not mean that it's actually wrong, of course. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: task | Status: closed Priority: normal | Milestone: Component: libraries/base | Version: 7.9 Resolution: wontfix | Keywords: | StaticArgumentTransformation Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by mpickering): * keywords: => StaticArgumentTransformation -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9545: Evaluate Takano Akio's foldrW/buildW fusion framework as a possible replacement for foldr/build -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: (none) Type: task | Status: closed Priority: normal | Milestone: Component: libraries/base | Version: 7.9 Resolution: wontfix | Keywords: | StaticArgumentTransformation Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by maoe): * cc: maoe (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9545#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC