
#8763: forM_ [1..N] does not get fused (10 times slower than go function)
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Compiler          |              Version:  7.6.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  Runtime           |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:  #7206             |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by sgraf):

 Nofib suggests that this regresses allocations in `integer` by 6.0% and
 counted instructions by 0.1%.

 I had a look at the simplified Core and it seems that the regression is
 entirely due to the new definition, although I'm not sure where exactly
 it allocates more. Maybe it's due to an increase in the closure size of
 `go_up` because of `single`?

 Here's the [https://www.diffchecker.com/FrxIUoRQ Core diff] and the
 [https://github.com/sgraf812/ghc/blob/cf4c1a52916fbf1b6acadd9a2477672b876a860...
 new definition of efdtIntUpFB for reference].

 It seems that `c` is still not inlined, even with the new definition. I
 assume that's because there are multiple occurrences of `c`, which were
 probably duplicated before the inliner had a chance to inline the
 argument `c`. It would have been better to introduce a join point first.

 Maybe loopification helps here? Indeed,
 [https://ghc.haskell.org/trac/ghc/ticket/14068#comment:47 #14068] suggests
 that something beneficial happens, maybe more so with this patch.

 Or we could introduce some kind of annotation mechanism to tell GHC to be
 careful not to duplicate occurrences of certain parameters that occur
 once (`f {-# HUGE #-} c n = ...`).

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8763#comment:56>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
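
For readers without the ticket history, the two loop shapes that the ticket title compares can be sketched roughly as below. This is only a minimal illustration, not the reporter's benchmark: `sumForM` and `sumGo` are hypothetical names, and the `IORef` accumulator is an assumption chosen to keep the example short.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Monad (forM_)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Loop written with forM_ over an enumeration. Whether this is fast
-- depends on GHC fusing [1 .. n] with its consumer, so that no
-- intermediate list is built.
sumForM :: Int -> IO Int
sumForM n = do
  ref <- newIORef 0
  forM_ [1 .. n] $ \i -> modifyIORef' ref (+ i)
  readIORef ref

-- The same loop as an explicit recursive worker (the "go function").
sumGo :: Int -> IO Int
sumGo n = do
  ref <- newIORef 0
  let go !i
        | i > n     = pure ()
        | otherwise = modifyIORef' ref (+ i) >> go (i + 1)
  go 1
  readIORef ref

main :: IO ()
main = do
  a <- sumForM 1000
  b <- sumGo 1000
  print (a, b)  -- prints (500500,500500)
```

Both functions compute the same sum. Comparing their simplified Core (for example with `-O2 -ddump-simpl`) is one way to check whether the enumeration was fused in a given GHC version.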