
#16004: Vector performance regression in GHC 8.6
-------------------------------------+-------------------------------------
        Reporter:  guibou            |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.3
       Component:  Compiler          |              Version:  8.6.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by AndreasK):

 Reproduced with 8.4.3 and 8.6.1.

 For the original code, using -fno-full-laziness, performance is almost the
 same for 8.4 and 8.6, and what little difference there is probably comes
 from using a different branch order at the Cmm level.

 {{{
 $ ~/bench-exe.exe ./test-8.6-nofloat.exe -- ./test-8.4.exe
 benchmarking execute: ./test-8.6-nofloat.exe
 time                 1.632 s    (1.352 s .. 1.859 s)
                      0.997 R²   (0.988 R² .. 1.000 R²)
 mean                 1.629 s    (1.600 s .. 1.658 s)
 std dev              49.03 ms   (0.0 s .. 49.85 ms)
 variance introduced by outliers: 19% (moderately inflated)

 benchmarking execute: ./test-8.4.exe
 time                 1.646 s    (1.493 s .. 1.863 s)
                      0.998 R²   (0.994 R² .. NaN R²)
 mean                 1.597 s    (1.560 s .. 1.622 s)
 std dev              37.88 ms   (0.0 s .. 43.65 ms)
 variance introduced by outliers: 19% (moderately inflated)
 }}}

 The difference between full-laziness on and off seems to be that with
 full-laziness we float out the creation of the [0..n] list, instead of
 transforming the code into a simple loop as intended.

 So we end up with this inner loop that passes around the list explicitly.
 I assume deforestation fails here?
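 To make the floating concrete, here is a minimal sketch of the kind of
 source that can trigger it. It is not the reporter's program: `sumPairs`
 and the `IORef` accumulator are hypothetical stand-ins for the vector
 reads and writes in the ticket. The only point is that the inner
 `[0 .. n]` does not depend on the outer index, so full laziness is free to
 share one list across all outer iterations.

```haskell
module Main (main) where

import Control.Monad (forM_)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the ticket's program: a nested loop whose
-- inner range [0 .. n] is independent of the outer index i.
sumPairs :: Int -> IO Int
sumPairs n = do
  acc <- newIORef 0
  forM_ [0 .. n] $ \i ->
    -- This [0 .. n] is a candidate for floating out of the lambda. With
    -- full laziness enabled GHC may build the list once and share it,
    -- so the inner loop walks list cells instead of counting an Int#.
    forM_ [0 .. n] $ \j ->
      modifyIORef' acc (+ (i * j))
  readIORef acc

main :: IO ()
main = sumPairs 1000 >>= print
```

 Compiling the real program with `-fno-full-laziness` is how the
 8.6 timings above were obtained; whether the list in a sketch like this
 one is actually floated depends on the GHC version and optimisation level.
 The Core that follows is the inner loop from the ticket's program, not
 from this sketch.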
 {{{
 joinrec {
   go2_s7mc
   go2_s7mc ds3_X6W3 eta2_X3r
     = case ds3_X6W3 of {
         [] -> jump exit2_XF eta2_X3r;
         : y3_X6Yn ys2_X6Yq ->
           case readDoubleArray#
                  (ipv1_a6zy `cast` Co:50)
                  (+# (*# x1_a5F9 1000#) x_a69v)
                  (eta2_X3r `cast` Co:14)
           of
           { (# ipv4_X6am, ipv5_X6ao #) ->
           case y3_X6Yn of { I# y4_X6c8 ->
           case writeDoubleArray#
                  (ipv1_a6zy `cast` Co:50)
                  (+# (*# x1_a5F9 1000#) y4_X6c8)
                  ipv5_X6ao
                  ipv4_X6am
           of s'#1_X6b3
           { __DEFAULT ->
           jump go2_s7mc ys2_X6Yq (s'#1_X6b3 `cast` Co:13)
           }
           }
           }
       }; }
 }}}

 I'm not an expert in the deforestation machinery, so I'm leaving that for
 someone else.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16004#comment:4
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler