[GHC] #11029: Performance loss due to eta expansion

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime Unknown/Multiple | performance bug Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Given the attached file, at both {{{-O2}}} and {{{-O0}}}, GHC translates the function: {{{#!hs test1 [1,2,3,4,5,6,7,8,9,10] = \x -> x test1 _ = \x -> negate x }}} To be equivalent to: {{{#!hs test0 [1,2,3,4,5,6,7,8,9,10] x = x test0 _ x = negate x }}} When applied in a loop with something like: {{{#!hs map (test1 [1..]) [1..1000] }}} The eta-expanded variant is 3x slower. Adding a trace breaks that transformation, and then the code goes 3x faster. Specifically: {noformat} test2 [1,2,3,4,5,6,7,8,9,10] = \x -> x test2 _ = trace "here" $ \x -> negate x {noformat} Timings, as reported by Criterion under O2 with GHC 7.10.2, are: {{{ benchmarking test0 = 40.99 ns (40.96 ns .. 41.02 ns) benchmarking test1 = 41.09 ns (41.06 ns .. 41.14 ns) benchmarking test2 = 17.74 ns (17.68 ns .. 17.81 ns) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by NeilMitchell): * Attachment "Main.hs" added. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by NeilMitchell: Old description:
Given the attached file, at both {{{-O2}}} and {{{-O0}}}, GHC translates the function:
{{{#!hs test1 [1,2,3,4,5,6,7,8,9,10] = \x -> x test1 _ = \x -> negate x }}}
To be equivalent to:
{{{#!hs test0 [1,2,3,4,5,6,7,8,9,10] x = x test0 _ x = negate x }}}
When applied in a loop with something like:
{{{#!hs map (test1 [1..]) [1..1000] }}}
The eta-expanded variant is 3x slower. Adding a trace breaks that transformation, and then the code goes 3x faster. Specifically:
{noformat} test2 [1,2,3,4,5,6,7,8,9,10] = \x -> x test2 _ = trace "here" $ \x -> negate x {noformat}
Timings, as reported by Criterion under O2 with GHC 7.10.2, are:
{{{ benchmarking test0 = 40.99 ns (40.96 ns .. 41.02 ns) benchmarking test1 = 41.09 ns (41.06 ns .. 41.14 ns) benchmarking test2 = 17.74 ns (17.68 ns .. 17.81 ns) }}}
New description: Given the attached file, at both {{{-O2}}} and {{{-O0}}}, GHC translates the function: {{{#!hs test1 [1,2,3,4,5,6,7,8,9,10] = \x -> x test1 _ = \x -> negate x }}} To be equivalent to: {{{#!hs test0 [1,2,3,4,5,6,7,8,9,10] x = x test0 _ x = negate x }}} When applied in a loop with something like: {{{#!hs map (test1 [1..]) [1..1000] }}} The eta-expanded variant is 3x slower. Adding a trace breaks that transformation, and then the code goes 3x faster. Specifically: {{{#!hs test2 [1,2,3,4,5,6,7,8,9,10] = \x -> x test2 _ = trace "here" $ \x -> negate x }}} Timings, as reported by Criterion under O2 with GHC 7.10.2, are: {{{ benchmarking test0 = 40.99 ns (40.96 ns .. 41.02 ns) benchmarking test1 = 41.09 ns (41.06 ns .. 41.14 ns) benchmarking test2 = 17.74 ns (17.68 ns .. 17.81 ns) }}} -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata):
Given the attached file, at both -O2 and -O0
does this imply that it is not happening with `-O` (which I thought is the most common optimization level)? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I have a large collection of eta-related tickets. See "Arity" in [wiki:Status/SLPJ-Tickets]. This one is awkward. If we have {{{ f True = \x -> .. f False = \y -> .. }}} and `f` is usually called applied to two arguments, it's really much better to eta-expand. But if we have `map (f v) xs` it probably isn't better to eta-expand. Generally, GHC will eta expand if the only duplication is pattern matching; but as you point out that is not always right. Currently there is no notion of how ''much'' pattern matching is duplicated, and that might not be hard to add. Another avenue would be to give programmers a way to control the behaviour, along the lines of the existing `oneShot` [https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc- prim-0.4.0.0/GHC-Magic.html docs]. I wish I knew a really good way to deal with this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): JFTR, the relevant code is this bit in `CoreArity.lhs`: {{{#!hs arityType env (Case scrut _ _ alts) | exprIsBottom scrut || null alts = ABot 0 -- Do not eta expand -- See Note [Dealing with bottom (1)] | otherwise = case alts_type of ABot n | n>0 -> ATop [] -- Don't eta expand | otherwise -> ABot 0 -- if RHS is bottomming -- See Note [Dealing with bottom (2)] ATop as | not (ae_ped_bot env) -- See Note [Dealing with bottom (3)] , ae_cheap_fn env scrut Nothing -> ATop as | exprOkForSpeculation scrut -> ATop as | otherwise -> ATop (takeWhile isOneShotInfo as) where alts_type = foldr1 andArityType [arityType env rhs | (_,_,rhs) <- alts] }}} and you can prevent this behaviour with `-fpedantic-bottoms`. And it looks that I made this eta-expansion more aggressive in changeset:2931d19e90d2366f2ce308d65a36333336ca6059 when trying to fix #2915. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by NeilMitchell): @nomeata: I confirm {{{-O1}}} has exactly the same behaviour. As for fixes, my first thought is that eta-expansion is valuable even if you are duplicating more than patterns, and sometimes duplicating patterns is too much. My hypothetical solution would be: * Make each function have multiple versions at different arities, so you'd generate core for {{{test1/1}}} and {{{test1/2}}}. (The functions might refer to each other, or not be generated if it didn't seem valuable, to keep the size down.) * Have the call site select between {{{test1/1}}} and {{{test1/2}}} based on the number of arguments available. * After that, you might want to change the desugarer, so that the two functions above produce the same Core. Pattern matching on the first argument could happen after the first argument is supplied, before the second argument. It would be nice if the 2 argument version still went 3x faster when used partially applied. Trying to figure out the arity of a function at it's definition site seems difficult, and can never be optimal for all uses. For context, I got lead down this path starting at view patterns, which have similar considerations but with arbitrarily expensive "patterns", which GHC never duplicates: http://neilmitchell.blogspot.co.uk/2015/10 /viewpatterns-and-lambda-expansion.html. In Shake someone submitted a patch to refactor and move a lambda to the left of an {{{=}}}, which somewhat surprisingly can be much slower. Adding a {{{oneShot}}} equivalent would be very useful, to make such optimisations explicit and robust. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by NeilMitchell): * cc: ndmitchell@… (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11029: Performance loss due to eta expansion -------------------------------------+------------------------------------- Reporter: NeilMitchell | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): Code duplication is an easy way out of many optimization problems; unfortunately, you easily get into exponential blow ups. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11029#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC