[GHC] #10992: Performance regression

#10992: Performance regression --------------------------------------+--------------------------------- Reporter: aleator | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Keywords: performane | Operating System: Linux Architecture: x86_64 (amd64) | Type of failure: None/Unknown Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: --------------------------------------+--------------------------------- For some reason Data.List.sum is much slower than the naive recursive definition for it. Does not seem happen in 7.8. {{{#!hs import Data.List import Criterion.Main sum1 [] = 0 sum1 (x:xs) = x + sum1 xs sum2 = foldr (+) 0 sum3 = foldl' (+) 0 sum4 = getSum #. foldMap Sum main = do let els = [1..100000] defaultMain [ bench "sum0" $ whnf (sum :: [Int] -> Int) els ,bench "sum" $ whnf (sum :: [Int] -> Int) els ,bench "sum1" $ whnf (sum1:: [Int] -> Int) els ,bench "sum2" $ whnf (sum2:: [Int] -> Int) els ,bench "sum3" $ whnf (sum3:: [Int] -> Int) els ,bench "sum4" $ whnf (sum4:: [Int] -> Int) els ] {- benchmarking sum0 time 128.0 ms (121.7 ms .. 133.8 ms) 0.997 R² (0.994 R² .. 1.000 R²) mean 132.8 ms (130.1 ms .. 136.3 ms) std dev 4.622 ms (2.980 ms .. 6.577 ms) variance introduced by outliers: 11% (moderately inflated) benchmarking sum time 131.0 ms (125.7 ms .. 136.7 ms) 0.998 R² (0.993 R² .. 1.000 R²) mean 131.9 ms (128.6 ms .. 136.7 ms) std dev 5.971 ms (2.903 ms .. 9.649 ms) variance introduced by outliers: 11% (moderately inflated) benchmarking sum1 time 26.29 ms (25.43 ms .. 27.24 ms) 0.995 R² (0.991 R² .. 0.998 R²) mean 26.30 ms (25.73 ms .. 26.98 ms) std dev 1.293 ms (953.2 μs .. 1.804 ms) variance introduced by outliers: 15% (moderately inflated) benchmarking sum2 time 26.39 ms (25.49 ms .. 27.26 ms) 0.995 R² (0.991 R² .. 0.998 R²) mean 26.33 ms (25.77 ms .. 26.95 ms) std dev 1.295 ms (941.5 μs .. 1.878 ms) variance introduced by outliers: 15% (moderately inflated) benchmarking sum3 time 7.695 ms (7.670 ms .. 7.722 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 7.703 ms (7.691 ms .. 7.733 ms) std dev 51.21 μs (25.43 μs .. 99.41 μs) benchmarking sum4 time 26.29 ms (25.40 ms .. 27.21 ms) 0.995 R² (0.989 R² .. 0.998 R²) mean 26.36 ms (25.88 ms .. 27.08 ms) std dev 1.364 ms (954.5 μs .. 1.899 ms) variance introduced by outliers: 20% (moderately inflated) -} }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10992 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10992: Performance regression due to lack of inlining of `foldl` and `foldl'`. ---------------------------------+-------------------------------------- Reporter: aleator | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: performane Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | ---------------------------------+-------------------------------------- Comment (by nomeata): I looked into this, and the cause is clear: In 7.10, `foldl (+) 0` (which is how `GHC.List.sum` is defined) resp. `foldl' (+) 0` are not inlined here. If they were, GHC would be able to transform both to the ideal code corresponding to an explicit recursion with accumulator. If I use these definitions: {{{ sum0 xs = sum xs sum3 xs = foldl' (+) 0 xs sum5 xs = foldl (+) 0 xs }}} then I get good code in all cases. The reason is that `foldl` is set up so that it inlines once the third argument is present: {{{ {-# INLINE foldl #-} foldl k z0 xs = ... {-# INLINE foldl' #-} foldl' k z0 xs = }}} This makes sense when trying to achieve list fusion, but even with two arguments (or maybe even one), inlining would be useful for the strictness analyzer to fire. I’m however not sure how relevant unsaturated calls to `foldl` are in practice. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10992#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10992: Performance regression due to lack of inlining of `foldl` and `foldl'`. -------------------------------------+------------------------------------- Reporter: aleator | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 7.10.2 Resolution: | Keywords: performane Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by thomie): * cc: ekmett (added) * failure: None/Unknown => Runtime performance bug * component: Compiler => Core Libraries Comment: Replying to [comment:1 nomeata]:
If I use...: {{{ sum0 xs = sum xs }}} then I get good code...
Tricky. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10992#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10992: Performance regression due to lack of inlining of `foldl` and `foldl'`. -------------------------------------+------------------------------------- Reporter: aleator | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 7.10.2 Resolution: | Keywords: performane Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): I should add that I’m inclined to suggest to change the definitions to {{{ {-# INLINE foldl #-} foldl k z0 = \xs -> ... {-# INLINE foldl' #-} foldl' k z0 = \xs -> ... }}} which should fix the problem at hand, but I’d like to have at least a second opinion on this by someone. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10992#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10992: Performance regression due to lack of inlining of `foldl` and `foldl'`. -------------------------------------+------------------------------------- Reporter: aleator | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 7.10.2 Resolution: | Keywords: performane Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): comment:3 sounds plausible to me. Needs a `Note` to explain! (And a nofib run.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10992#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC