
#8763: forM_ [1..N] does not get fused (allocates 50% more) -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #7206 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): Sure will do. Let's begin with the original problem. There's two functions in [[attachment:reprof.hs]]. One is `f`, having a loop of the `forM_ [2..n] body` form, the other is `g` which has the `forM_ [2,3..n] body` form. The former doesn't allocate (56kB in total), whereas the latter allocates quite a lot (960MB). Why is that? The arithmetic sequences desugar, rewrite and specialise to `build (\c t -> eftIntFB c z 2 n)` and `build (\c z -> efdtIntFB c z 2 3 n)`, respectively. That cancels away with `forM_`'s implementation in terms of `foldr`: {{{ forM_= flip mapM_ mapM_ body = {- definition -} foldr ((>>) . body) (return ()) = {- eta expand the section -} foldr (\x k -> body x >> k) (return ()) = {- (>>) of ST, written as `s -> (a, s)` for lighter syntax -} foldr (\x k s -> case (body x s) of (_, s') -> k s') (return ()) }}} Note how `k` occurs in tail position within the lambda. Now, this cancels with the definition of `ef(d)tIntFB`: {{{ foldr (\x k s -> case (body x s) of (_, s') -> k s') (return ()) . build (\c z -> efdtIntFB c z 2 3 n) = {- foldr/build -} efdtIntFB (\x k s -> case (body x s) of (_, s') -> k s') (return ()) 2 3 n }}} So, that lambda is what is bound to `c` when `ef(d)tIntFB` gets inlined. Now, the implementation of `eftIntFB` only has a single occurrence of `c`, so it will inline properly and bind its `k` parameter (the continuation in tail position) to the recursive `go` call here [[https://github.com/ghc/ghc/blob/fa3143c76ac77ee96fd89559cacc089205abaa20/lib...]]. The call to `k` turns into a tail call within `go`s body: {{{ go x = I# x `c` if isTrue# (x ==# y) then n else go (x +# 1#) ==> c = \x k s -> case (body x s) of (_, s') -> k s' go x s = case (body (I# x) s) of (_, s') -> if isTrue# (x ==# y) then n s' else go (x +# 1#) s' }}} And `go` can be made a join point. The same isn't possible in the current `efdtIntFB`, because it duplicates `c` by branching on whether to count up or down (and also within the loop itself, anyway): {{{ efdtIntFB :: (Int -> r -> r) -> r -> Int# -> Int# -> Int# -> r efdtIntFB c n x1 x2 y | isTrue# (x2 >=# x1) = efdtIntUpFB c n x1 x2 y | otherwise = efdtIntDnFB c n x1 x2 y }}} Now, when `efdtInt{Up,Dn}FB` gets inlined, we end up with a let-bound `c` that still tail-calls its argument `k`, and is even tail-called itself within `go_up`/`go_dn` (here https://github.com/ghc/ghc/blob/fa3143c76ac77ee96fd89559cacc089205abaa20/lib...): {{{ let c x k s = case (body x s) of (_, s') -> k s' in ... let go_up x | isTrue# (x ># y') = I# x `c` n | otherwise = I# x `c` go_up (x +# delta) }}} Note that `go_up` tail calls `c` and passes itself as the `k` parameter. If `c` was inlined, all would be fine and `go_up` would turn into a join point. That's not the case because `c` is duplicated in `efdtIntFB` and then one more time in `efdtInt{Up,Dn}FB`. My first implementation in comment:52 (for which you provided a simplification in comment:54) dealt with the latter, while the idea in comment:60 is supposed to deal with the former. Sadly, case-of-case seems to undo the painful de-duplication of the `c` parameter (that's what comment:71 is about). Why doesn't LLF help here? Well, lifting out `c` to top-level gets rid of allocations for `c`, but there's still at least the allocation for the thunk for `go_up (x+1)` (the `Int` box goes away because of strictness). Also, the call to `go_up` is still an unknown call, as opposed to the simple join call we would get by inlining `c`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8763#comment:74 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler