
#9476: Implement late lambda-lifting -------------------------------------+------------------------------------- Reporter: simonpj | Owner: sgraf Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: LateLamLift Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #8763 #13286 | Differential Rev(s): Wiki Page: LateLamLift | -------------------------------------+------------------------------------- Comment (by simonpj): The non-recursive case
f-5-<n>: This varied max number of non-recursive args between 5 and 10 (attachment:nonrec-5-6-8-10. Allowing up to 8 parameters had great effect on allocations in cichelli (-10.4%),
Hmm. Consider {{{ let f x = e in ...(f e1)...(f e2)... }}} where `e` has a lot of free vars `y1` .. `y20`. If we lift we get a top- level `f` with 21 args, and the calls look like: {{{ ...(f y1 .. y20 e1) ... (f y1 .. y20 e2)... }}} Now * In the original form we store `y1`..`y20` into the function closure for `f`, once. * In the lifted form we store (most of) `y1`..`y20` on the stack, once for each call. So if we had a lot of calls, there's be more memory traffic. Plus of course, any case-expression evals would need to save `y1`..`y20` on the stack, and the reload them to make the call... I was thinking that maybe for non-rec functions we could claim that lifting was a win regardless of the number of parameters, but I don't think that's true. So it's a bit surprising to me that even with a threshold of 10 we have such a small hit on instruction count. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9476#comment:30 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler