
#9279: Optimisation bug ------------------------------+-------------------------------------------- Reporter: simonmar | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime performance bug Unknown/Multiple | Test Case: Difficulty: Unknown | Blocking: Blocked By: | Related Tickets: | ------------------------------+-------------------------------------------- I have a strange problem with a local function binding that is not being inlined. I've attached the repro code, build it like this: {{{ $ ghc -O2 Haxl/Core/Monad.hs }}} There are several modules, but the one with the problem is `Haxl/Core/Monad.hs`. In particular `Haxl.Core.Monad.$fApplicativeGenHaxl2`, which contains this fragment: {{{ let { $wa4_s6YQ $wa4_s6YQ = \ w_s6YF _ ww1_s6YM _ _ w1_s6YI -> case GHC.Prim.readMutVar# ipv7_X5Ne w1_s6YI of _ { (# ipv10_X5QO, ipv11_X5QQ #) -> case ipv11_X5QQ of _ { Haxl.Core.Monad.IVarFull a2_a3kK -> case lvl6_r7qi of wild3_00 { }; Haxl.Core.Monad.IVarEmpty dt2_d4Vs -> case GHC.Prim.writeMutVar# ipv7_X5Ne (Haxl.Core.Monad.IVarFull w_s6YF) ipv10_X5QO of s2#_a5OR { __DEFAULT -> case GHC.Prim.readMutVar# dt2_d4Vs s2#_a5OR of _ { (# ipv12_X5R0, ipv13_X5R2 #) -> case GHC.Prim.readMutVar# ww1_s6YM ipv12_X5R0 of _ { (# ipv14_X5SS, ipv15_X5SU #) -> letrec { go_a5ti go_a5ti = \ ds10_a5tj -> case ds10_a5tj of _ { [] -> ipv15_X5SU; : y_a5to ys_a5tp -> GHC.Types.: (y_a5to w_s6YF) (go_a5ti ys_a5tp) }; } in case go_a5ti ipv13_X5R2 of x'_a5P6 { __DEFAULT -> case GHC.Prim.writeMutVar# ww1_s6YM x'_a5P6 ipv14_X5SS of s2#1_a5P7 { __DEFAULT -> (# s2#1_a5P7, Haxl.Core.Monad.cacheRequest6 #) } } } } } } } } in let { a2_s6lH a2_s6lH = \ w_s6YF _ w2_s6YH w3_s6YI -> case w2_s6YH of _ { Haxl.Core.Monad.SchedState ww1_s6YL ww2_s6YM ww3_s6YN ww4_s6YO -> $wa4_s6YQ w_s6YF ww1_s6YL ww2_s6YM ww3_s6YN ww4_s6YO w3_s6YI } } in }}} I want `$wa4` to be inlined at its single occurrence in `a2`. I believe the reason it is not being inlined is that `a2` is a wrapper, but the situation seems silly because the wrapper isn't going away either, so we have a redundant closure being built (this is in my inner loop). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Optimisation bug --------------------------------------------+------------------------------ Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime performance bug | Unknown/Multiple Test Case: | Difficulty: Unknown Blocking: | Blocked By: | Related Tickets: --------------------------------------------+------------------------------ Changes (by simonmar): * owner: => simonpj -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Optimisation bug --------------------------------------------+------------------------------ Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime performance bug | Unknown/Multiple Test Case: | Difficulty: Unknown Blocking: | Blocked By: | Related Tickets: --------------------------------------------+------------------------------ Comment (by simonpj): So how do I reproduce this? I get {{{ simonpj@cam-05-unx:~/tmp/T9279$ ghc -O2 Haxl/Core/Monad.hs Haxl/Core/Types.hs:66:8: Could not find module ‘Data.Aeson’ Perhaps you meant Data.Version (from base) Use -v to see a list of the files searched for. Haxl/Core/Types.hs:67:8: Could not find module ‘Data.Hashable’ Use -v to see a list of the files searched for. Haxl/Core/Types.hs:72:8: Could not find module ‘Data.HashMap.Strict’ Perhaps you meant Data.IntMap.Strict (from containers-0.5.5.1) Data.Map.Strict (from containers-0.5.5.1) Use -v to see a list of the files searched for. Haxl/Core/Monad.hs:73:8: Could not find module ‘Control.Concurrent.STM’ Perhaps you meant Control.Concurrent.QSem (from base) Control.Concurrent (from base) Control.Concurrent.Chan (from base) Use -v to see a list of the files searched for. }}} Also can you try with `-ddump-inlinings -dverbose-core2core -ddump-occur- anal` and add the log file? Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Optimisation bug -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Differential Revisions: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime Unknown/Multiple | performance bug Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | -------------------------------------+------------------------------------- Comment (by simonmar): You also need to install `aeson`, `unordered-containers`, `stm` and `text` (I think those should be enough). I've collected the log you asked for, it's too big to attach here so I've put it at [http://community.haskell.org/~simonmar/log.bz2]. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Optimisation bug -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Differential Revisions: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime Unknown/Multiple | performance bug Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | -------------------------------------+------------------------------------- Changes (by simonpj): * cc: nicolas.frisby@… (added) Comment: OK I can see what is going on; it is more or less as you say. We have something like {{{ let worker = \x. BIG in let {-# INLINE wrapper #-} wrapper = \y. ...worker.... in ...map wrapper xs.... }}} (Technically, the wrapper has a "stable inlining" which I have not written above, which is a copy of the body of the wrapper.) Now, GHC is (in general rightly) unhappy about inlining `worker` in `wrapper`'s rhs, even though there is currently just one textual occurrence of `worker`, because there might (now or in the future) be many occurrences of `wrapper`, and the INLINE pragma means we'll make a copy of `wrapper`'s rhs at each occurrence. However, in this case, `wrapper` is not inlined because it's not applied to anything, so nothing is gained. But something is lost. I can see various possible solutions: * Lambda-lift local function definitions, late in the pipeline. This is something Nicolas Frisby was working on, with promising results, but he sadly he never completed the work. I'm copying him on this ticket. * Towards the end of the pipeline, forget that wrappers are special by discarding their INLINE pragmas. This is a bit delicate; I think you would ''not'' want to do this for things marked INLINE by the programmer. And certainly not for top-level things either (else you'd defeat the entire w/w idea). I wonder how often this happens. Might not be hard to find out; how often does a local function definition with an INLINE pragma appear in the final Core? Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Differential Revisions: | Operating System: Unknown/Multiple Architecture: | Type of failure: Runtime Unknown/Multiple | performance bug Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | -------------------------------------+------------------------------------- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nfrisby): Regarding status of late lambda lift, please see ticket:9374#comment:3. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nfrisby): Based only on the eye-balled free variable analysis of $wa4, I think that the current late lambda lift implementation would lift $wa4. ipv7_X5Ne is the only free variable I see in the closure, so replacing $wa4 with ip7_X5Ne in a2_s6lH would not constitute closure growth, which is what would most likely prevent the lift. (I'm assuming $wa4 is not LNE.) I'll attempt the repro with my local build and let you know for sure. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nfrisby): 9c2904c, the tip of `wip/llf`, using my recommended settings (cf LateLamLift) does lift our `$wa4` (or whatever lead to it). The new attachment is the output of `/home/nifr/tmp/llf-for-9279/bin/ghc -fforce-recomp -O2 $(echo $(cat ~/ghc-llf/llf-nr10-r6)) -c Haxl/Core/Monad.hs -ddump-llf -ddump-simpl -dsuppress-all`. The subcommand spews out the "recommended settings" that I refer to as llf-nr10-r6. HTH. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: => LateLamLift -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by sgraf): * Attachment "repro2.tgz" added. Repro without reference to unordered-containers/hashable -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): The current tip of my lambda lifting branch will not lift `$wa4`, because that would result in a top-level function with 7 parameters, which is two more than available hardware registers. Since the above link with the reproduction is offline, I had to delete/rename stuff from the original reproduction until it compiled again myself, but I think I have identified the binding which corresponds to the old `$wa4`: {{{ let { $wlvl_smyy = sat-only \r [w_smyz ww_smyA ww1_smyB ww2_smyC ww3_smyD void_X1V] case readMutVar# [ipv7_smyu void#] of { Unit# ipv11_smyH -> case ipv11_smyH of { IVarFull _ -> lvl12_rmt8; IVarEmpty dt2_smyL -> let { sat_smyM = CCCS IVarFull! [w_smyz]; } in case writeMutVar# [ipv7_smyu sat_smyM void#] of s2#_smyN { (##) -> case readMutVar# [dt2_smyL void#] of { Unit# ipv13_smyQ -> <the previous go loop was inlined a few times> }; }; }; }; } in let { lvl20_smz3 = \r [w_smz4 w1_smz5 w2_smz6 void_X1U] case w2_smz6 of { SchedState ww1_smz9 ww2_smza ww3_smzb ww4_smzc -> $wlvl_smyy w_smz4 ww1_smz9 ww2_smza ww3_smzb ww4_smzc void#; }; } in }}} There's also this debug output: {{{ stgLiftLams:goodToLift [$wlvl_smyy] args spill on stack }}} Which tells me that it won't lift entirely due to `$wlvl` taking to many parameters. Which is a shame, because it's pretty clear that 3 parameters are absent. I suspect some Demand Analyser / WW thing at fault here. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): You could try `-flate-dmd-anal`, which comes with -O2. It runs the demand analyser late in the pipeline, precisely to eliminate useless absent args. Does it help? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): `-flate-dmd-anal` doesn't seem to have any effect. Weird, considering that `DmdAnal` definitely sees the opportunity, but WW doesn't seem to exploit it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): The absent case in WWLib.hs requires that we can `mk_absent_let` for that particular type. For the 3 absent parameters here are of type `MutVar# s a`, `MutVar# s a` and `TVar# s a`. We ''could'' extend `Literal.absentLiteralOf` for these unlifted, boxed cases simply by returning (the equivalent of) NULL. Alternatively, we could make sure that we don't unpack these things in the first place: If we had taken lifted `MutVar`s/`TVar`s, `mk_absent_let` would successfully conjure a lifted let binding with an absent error (so I'd say). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): If I apply this diff {{{ diff --git a/compiler/basicTypes/Literal.hs b/compiler/basicTypes/Literal.hs index 21f4a92290..4444f69c7b 100644 --- a/compiler/basicTypes/Literal.hs +++ b/compiler/basicTypes/Literal.hs @@ -618,6 +618,8 @@ absentLiteralOf tc = lookupUFM absent_lits (tyConName tc) absent_lits :: UniqFM Literal absent_lits = listToUFM [ (addrPrimTyConKey, MachNullAddr) + , (mutVarPrimTyConKey, MachNullAddr) + , (tVarPrimTyConKey, MachNullAddr) , (charPrimTyConKey, MachChar 'x') , (intPrimTyConKey, mkMachIntUnchecked 0) , (int64PrimTyConKey, mkMachInt64Unchecked 0) }}} Everything WWs properly and the troubling binding `$wlvl` gets lifted to top-level. Given that this might bite somewhere else, should we maybe generalise `absentLiteralOf` by returning `MachNullAddr` whenever it gets a boxed type? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): Here's a diff that handles all `UnliftedRep`s uniformly: {{{ diff --git a/compiler/basicTypes/Literal.hs b/compiler/basicTypes/Literal.hs index 21f4a92290..0e9f25f51e 100644 --- a/compiler/basicTypes/Literal.hs +++ b/compiler/basicTypes/Literal.hs @@ -62,6 +62,7 @@ import Binary import Constants import DynFlags import Platform +import RepType import UniqFM import Util @@ -614,11 +615,14 @@ literalType (LitNumber _ _ t) = t absentLiteralOf :: TyCon -> Maybe Literal -- Return a literal of the appropriate primitive -- TyCon, to use as a placeholder when it doesn't matter -absentLiteralOf tc = lookupUFM absent_lits (tyConName tc) +absentLiteralOf tc + | tyConPrimRep tc == [UnliftedRep] + = ASSERT (isUnliftedTyCon tc) Just MachNullAddr + | otherwise + = lookupUFM absent_lits (tyConName tc) absent_lits :: UniqFM Literal -absent_lits = listToUFM [ (addrPrimTyConKey, MachNullAddr) - , (charPrimTyConKey, MachChar 'x') +absent_lits = listToUFM [ (charPrimTyConKey, MachChar 'x') , (intPrimTyConKey, mkMachIntUnchecked 0) , (int64PrimTyConKey, mkMachInt64Unchecked 0) , (wordPrimTyConKey, mkMachWordUnchecked 0) }}} But now `MachNullAddr` isn't always a literal of type `Addr#`. In particular, the definition {{{ -- | Find the Haskell 'Type' the literal occupies literalType :: Literal -> Type literalType MachNullAddr = addrPrimTy literalType (MachChar _) = charPrimTy literalType (MachStr _) = addrPrimTy }}} is probably a lie. But then it also lies for `MachStr` and `MachLabel`s, so maybe this isn't such a bad thing? The binding should be immediately eliminated by the simplifier, after all. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): OK, this diff in isolation provokes Core Lint errors everywhere: {{{ *** Core Lint errors : in result of Worker Wrapper binds *** <no location info>: warning: [RHS of ww_scA1 :: Array# b_sczT] The type of this binder doesn't match the type of its RHS: ww_scA1 Binder's type: Array# b_sczT }}} So, it's pretty clear we would have to add an additional field to `MachNullAddr` (`MachNull` would probably be more appropriate) for the type it represents, similar to `LitNumber`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): `Note [Absent errors]` reads {{{ Note: I did try the experiment of using an error thunk for unlifted things too, relying on the simplifier to drop it as dead code. But this is fragile - It fails when profiling is on, which disables various optimisations - It fails when reboxing happens. E.g. data T = MkT Int Int# f p@(MkT a _) = ...g p.... where g is /lazy/ in 'p', but only uses the first component. Then 'f' is /strict/ in 'p', and only uses the first component. So we only pass that component to the worker for 'f', which reconstructs 'p' to pass it to 'g'. Alas we can't say ...f (MkT a (absentError Int# "blah"))... bacause `MkT` is strict in its Int# argument, so we get an absentError exception when we shouldn't. Very annoying! So absentError is only used for lifted types. }}} But I suppose having just use `NULL` for the other unlifted+boxed cases is OK. We've been doing it for `Addr#` for a long time. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): Ah, just saw that `AddrRep /= UnliftedRep`... Also, defining `| MachNull Type` makes it hard to define `Ord` and `Binary` instances. Urgh. Not sure how to proceed. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): Extending the existing approach by adding e.g. `| MachNullMutVar` doesn't work, because we still don't know which type to return in `literalType`. The type arguments `s` and `a` to the `MutVar#` constructor are lost. So there is no way around this issue unless we store `Type`s in `Literal` that are relevant in `Ord` and `Binary` instances. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): There are two things going on in this ticket 1. Should we inline wrappers late in the pipeline? See comment:4 2. Can we do a better job for "absent" arguments of unlifted types. I'll concentrate on (2) in this comment; but we should not lose sight of (1). In fact it might be better to make (2) a new ticket and leave this one for (1) -- or vice versa. For a long time, the worker/wrapper splitter has given up on absent arguments of certain unlifted types: see `Literal.absentLiteralOf` and `Note [Absent errors]` in `WwLib`. This is very annoying because it means that we get left with functions that take a bunch of arguments they do not use, as in this ticket. For lifted types T we build an absent value as a thunk of form {{{ aBSENT_ERROR_ID @T "Used absent value" }}} This does two things A. It gives us something, of the right type, to use in place of the value we aren't passing any more. B. It gives an extra sanity check: if that value is ever used (a compiler bug) we'll get a runtime error message. For unlifted types we don't have thunks, so we can't do this. As you can see in `absentLiteralOf`, for some types we just make up a silly value: e.g. for `Char#` we use `'x#'`; for `Int#` we use `0#`. Note, however that * Substituting a particular value serves purpose (A) but not purpose (B). A compiler bug would go undetected. This is sad: e.g. #11126 is a real bug that was detected by (B). But I see no way out. * It doesn't work for `Array#`, `MutVar#`, `TVar#` etc because we have no available literal values of those types. So Sebastian is suggesting that we add a new literal value -- call it a '''rubbish value''' -- which can work for any (unlifted type), extending `Literal` something like this {{{ data Literal = ... | RubbishLit Type }}} We need to store the type so we can still do `literalType`. Now * Maybe we could get rid of `MachNullAddr` in favour of this new literal. * I think -- but I am not sure -- that this literal should never occur in code generation. For example, we should never pass a rubbish value to a function. Before then dead-code elimination should have got rid of it I'm not 100% certain, but if this was true, it'd be a great sanity check. * Yes, `Literal` has `Eq` and `Ord` -- but I'm not sure why. Try removing them and seeing what happens! (Generally I think it'd be better to define `eqLit` and `cmpLit` and cal them, rather than use `==` and `>`; so much easier to grep for! And in fact, we do have `eqType` and `cmpType`. * Do we need to spit out a `RubbishLit` in the `Binary` instance. This seems more likely, because perhaps these rubbish values can occur in unfoldings, which are serialised as their parse tree. But the we can just serialise the `Type`. It won't happen much. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Concerning comment@4 I have not looked in detail at the example in the Description, but I'm guessing that we have a situation like this {{{ f x xs = let g :: Int -> Int g y = y+x in map g xs }}} We see that `g` is strict, so we w/w it thus: {{{ f x xs = let $wg :: Int# -> Int# $wg y# = case x of I# x# -> y# + x# g :: Int -> Int {-# Stable unfolding = \y -> <as below> #-} g y = case y of I# y# -> case $wg y# of r# -> I# r# in map g xs }}} But alas, `g` never gets to inline, so it is all in vain. Worse, we have lost out, because `map` calls `g` which calls `$wg` and that's slower than what we started with. What we want is this: * If the only call to `$wg` is from `g`, then inline it back in. Currently there are two calls to `gw`, one in the RHS of `g` and one in the stable unfolding of `g`. If we simply nuked the stable unfolding to `g` (which was added by w/w), then there'd only be one call to `gw` and we'd inine it happily. On the other hand, if the body of f was `(map g xs, g 7)`, then the `g 7` would by now have turned into a call of `$wg`, so whether we inlined `$wg` would depend on how big `$wg` is, which is absolutely fine. Arguably should not do this nuking stuff until after `TidyCore`, which generates bindings to put in the interface file. We want to leave `f`'s RHS undisturbed until then, in case `f` itself is inlined in other modules. (An alternative view: it'd be ok to dump the w/w split in before `TidyCore` because we'll rediscover the strictness (and perhaps better strictness) in any module that inlines `f`.) My conclusion: * for local functions (i.e. bound by a `let`, or at top level but not exported) * that have been w/w'd * at a fairly late stage in the pipeline * kill off the stable-unfolding introduced by w/w * and simplify I think it'd be interesting to try this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): Thanks for your input! I created #15627 to track progress on the `mk_absent_let` thing. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure
allocation
-------------------------------------+-------------------------------------
Reporter: simonmar | Owner: simonpj
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.2
Resolution: | Keywords: LateLamLift
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Krzysztof Gogolewski

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): The patch in comment:23 nails issue (2) of comment:20, leaving issue (1) open. So we should not close this ticket. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9279: Local wrapper function remains in final program; result = extra closure allocation -------------------------------------+------------------------------------- Reporter: simonmar | Owner: simonpj Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift, | Simplifier Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: LateLamLift => LateLamLift, Simplifier -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9279#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC