
#12603: INLINE and manually inlining produce different code
-------------------------------------+-------------------------------------
        Reporter:  bgamari           |                Owner:  bgamari
            Type:  bug               |               Status:  new
        Priority:  high              |            Milestone:  8.2.1
       Component:  Compiler          |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #12747 #12781     |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by simonpj):

 OK I know what is happening here.

 Without an INLINE on `fgFromInt` we get:
 {{{
 -- Initially
 fgFromInt w = w + (2^8)
 attrFromIntINLINE w = Attr (fgFromInt w)

 -- After float-out
 lvl = 2^8
 fgFromInt w = w + lvl
 attrFromIntINLINE w = Attr (fgFromInt w)

 -- After inlining
 attrFromIntINLINE w = case w of I# w' ->
                       case lvl of I# lvl' ->
                       Attr (w' +# lvl')
 }}}
 The `Attr` constructor has one strict field, which is represented unboxed.
 We only compute `(2^8)` once.

 But with an INLINE on `fgFromInt` we get this:
 {{{
 -- Initially
 fgFromInt w = w + (2^8)
 attrFromIntINLINE w = Attr (fgFromInt w)

 -- After float-out
 lvl = 2^8
 fgFromInt w = w + lvl  {- INLINE rhs = w + (2^8) -}
 attrFromIntINLINE w = Attr (fgFromInt w)

 -- After inlining
 attrFromIntINLINE w = case w of I# w' ->
                       case 2# ^# 8# of lvl' ->
                       Attr (w' +# lvl')
 }}}
 The INLINE pragma promises to inline what you wrote, not some optimised
 version thereof.  So we inline `w + (2^8)`.  In principle we should
 optimise that just as well after it has been inlined.  We've missed the
 float-out pass, but there's another one later.

 Alas, however, by the time the second float-out pass runs, the `(2^8)` has
 been transformed to its unboxed form, and currently we don't float those.
 Result: we compute `(2^8)` on each iteration, rather than just once.
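
 For readers who want to look at the two cases side by side, here is a
 minimal sketch, not the ticket's actual test case.  It assumes `Attr` is a
 data type with a single strict `Int` field, as described above; the names
 `fgFromIntInl` and `attrFromInt` are hypothetical and exist only to keep
 the two variants apart in one module.

```haskell
module Main (main) where

-- Assumption: one strict field, matching the description of Attr above.
data Attr = Attr !Int deriving (Eq, Show)

-- No pragma: GHC may float (2^8) to the top level before inlining.
fgFromInt :: Int -> Int
fgFromInt w = w + (2 ^ (8 :: Int))

-- INLINE: the right-hand side as written is what gets inlined at call sites.
-- (Hypothetical name, used only to distinguish the two variants.)
fgFromIntInl :: Int -> Int
fgFromIntInl w = w + (2 ^ (8 :: Int))
{-# INLINE fgFromIntInl #-}

attrFromInt :: Int -> Attr
attrFromInt w = Attr (fgFromInt w)

attrFromIntINLINE :: Int -> Attr
attrFromIntINLINE w = Attr (fgFromIntInl w)

main :: IO ()
main = do
  -- Both variants compute the same values; only the generated code differs.
  print (map attrFromInt [0 .. 1000] == map attrFromIntINLINE [0 .. 1000])
  print (attrFromIntINLINE 1)
```

 The program only checks that the two variants agree on results.  Whether
 the `(2^8)` is shared or recomputed has to be read off the Core, for
 instance by compiling with `-O2 -ddump-simpl`; what you see there will
 depend on the GHC version.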

 There's even a `Note` about not floating unlifted expressions in
 `SetLevels`:
 {{{
 Note [Unlifted MFEs]
 ~~~~~~~~~~~~~~~~~~~~
 We don't float unlifted MFEs, which potentially loses big opportunities.
 For example:
         \x -> f (h y)
 where h :: Int -> Int# is expensive. We'd like to float the (h y) outside
 the \x, but we don't because it's unboxed.  Possible solution: box it.
 }}}

 ---------------
 What to do?  The bad thing is really the inability to float expressions of
 unboxed type, because that inability makes GHC vulnerable to these "phase
 effects", where optimisation depends delicately on the exact order things
 happen in.

 Perhaps we could fix that by boxing.  So, just before doing the floating
 stuff `SetLevels` could do
 {{{
 e :: Int#
 --->
   (case e of y -> \void. y) void
 --->
   let v = /\a -> case e of y -> \void. y
   in v a void
 }}}
 at least when `e` is not cheap.  Now it can float the `(case e of y -> I#
 y)`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12603#comment:26
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler