[GHC] #15770: Missing optimisation opportunity in code gen for always-saturated applications?

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- As a part of the work on moving some fragile analyses from Core to final phase before code gen (#9718) I found an unused analysis in CoreToStg which basically assigns each binder this: {{{ data StgBinderInfo = NoStgBinderInfo | SatCallsOnly -- All occurrences are *saturated* *function* calls -- This means we don't need to build an info table and -- slow entry code for the thing -- Thunks never get this value }}} The idea is , as mentioned in the comment, for `SatCallsOnly` binder we can avoid generating code for slow entry. However this is currently not used (see Phab:D5232), and this ticket is to decide whether to use or remove this. I found these comments in `StgCmmClosure` that explains how this became unused and how it was used before: {{{ ----------------------------------------------------------------------------- -- staticClosureRequired ----------------------------------------------------------------------------- {- staticClosureRequired is never called (hence commented out) SimonMar writes (Sept 07) It's an optimisation we used to apply at one time, I believe, but it got lost probably in the rewrite of the RTS/code generator. I left that code there to remind me to look into whether it was worth doing sometime {- Avoiding generating entries and info tables ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ At present, for every function we generate all of the following, just in case. But they aren't always all needed, as noted below: [NB1: all of this applies only to *functions*. Thunks always have closure, info table, and entry code.] [NB2: All are needed if the function is *exported*, just to play safe.] * Fast-entry code ALWAYS NEEDED * Slow-entry code Needed iff (a) we have any un-saturated calls to the function OR (b) the function is passed as an arg OR (c) we're in the parallel world and the function has free vars [Reason: in parallel world, we always enter functions with free vars via the closure.] * The function closure Needed iff (a) we have any un-saturated calls to the function OR (b) the function is passed as an arg OR (c) if the function has free vars (ie not top level) Why case (a) here? Because if the arg-satis check fails, UpdatePAP stuffs a pointer to the function closure in the PAP. [Could be changed; UpdatePAP could stuff in a code ptr instead, but doesn't seem worth it.] [NB: these conditions imply that we might need the closure without the slow-entry code. Here's how. f x y = let g w = ...x..y..w... in ...(g t)... Here we need a closure for g which contains x and y, but since the calls are all saturated we just jump to the fast entry point for g, with R1 pointing to the closure for g.] * Standard info table Needed iff (a) we have any un-saturated calls to the function OR (b) the function is passed as an arg OR (c) the function has free vars (ie not top level) NB. In the sequential world, (c) is only required so that the function closure has an info table to point to, to keep the storage manager happy. If (c) alone is true we could fake up an info table by choosing one of a standard family of info tables, whose entry code just bombs out. [NB In the parallel world (c) is needed regardless because we enter functions with free vars via the closure.] If (c) is retained, then we'll sometimes generate an info table (for storage mgr purposes) without slow-entry code. Then we need to use an error label in the info table to substitute for the absent slow entry code. -} staticClosureRequired :: Name -> StgBinderInfo -> LambdaFormInfo -> Bool staticClosureRequired binder bndr_info (LFReEntrant top_level _ _ _ _) -- It's a function = ASSERT( isTopLevel top_level ) -- Assumption: it's a top-level, no-free-var binding not (satCallsOnly bndr_info) staticClosureRequired binder other_binder_info other_lf_info = True -} }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): Even if there were saturated calls only, we'd still need the info table and slow entry code to support the heap/stack checks. And we'd need the info table for the closure, which might be referred to by SRTs. And the function has to be not externally visible. So it seems like only in very narrow circumstances could we save anything here. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): Oh, and most of the time there isn't any slow entry code because we have built-in argument patterns that cover a good proportion of functions. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Changes (by osa1): * differential: => Phab:D5232 Comment: Great, because this means less work for #9718. Shall we merge Phab:D5232?
info table and slow entry code to support the heap/stack checks
Out of curiosity, why do we need entry code for heap/stack checks? Do we use slow entry code as continuation when we do GC? Or something else? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): Yes, the slow entry code is used as the continuation for the heap check for a function entry point, which saves a lot of code. See `stg_gc_fun` in `rts/HeapStackCheck.cmm`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): I'm conveniently using `satCallsOnly` when performing [https://github.com/sgraf812/ghc/blob/7839fe7fdb1a0b89d1edbff56f7ab6cee6293b0... late lambda lifting]. It would not be terribly upsetting to see this go, but I found it convenient not having to compute the same analysis info myself. The same arguments as for free variables apply, though: It's occurrence information which is probably fragile enough to just be generated on demand. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Changes (by sgraf): * cc: sgraf (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
The same arguments as for free variables apply, though: It's occurrence information which is probably fragile enough to just be generated on demand.
I think so too. That is, let's not generate it at the beginning. If LLF wants it, it's probably best just to calculate it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Commit this? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): You mean Phab:D5232? sgraf is using StgBinderInfo for late lambda lift so I thought we probably shouldn't remove it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): I'm convinced Phab:D5232 makes sense, so we should do this. Phab:D5224 can wait. Will there be an analysis that recovers the functionality or do I have to write my own (occurrence analyser on STG)? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
Will there be an analysis that recovers the functionality or do I have to write my own (occurrence analyser on STG)?
I think the latter. It's extremely easy of course. The only tricky bit is to decide how to record the results in the tree. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Comment (by sgraf): I continued discussion of the consequences of this ticket for LLF in https://ghc.haskell.org/trac/ghc/ticket/9476#comment:49. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15770: Missing optimisation opportunity in code gen for always-saturated
applications?
-------------------------------------+-------------------------------------
Reporter: osa1 | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version:
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D5232
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ömer Sinan Ağacan

#15770: Missing optimisation opportunity in code gen for always-saturated applications? -------------------------------------+------------------------------------- Reporter: osa1 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5232 Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: => CodeGen -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15770#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC