
#8763: forM_ [1..N] does not get fused (10 times slower than go function)
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Compiler          |              Version:  7.6.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  Runtime           |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:  #7206             |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by sgraf):

Replying to [comment:61 simonpj]:
> Looks plausible to me, but needs a careful Note to explain the issues.
>
> But before we go too far with this, I'd like to point to [wiki:LateLamLift late lambda lifting]. In the core reported in comment:44, all we need to do is lambda-lift `$wc` and `go_up` to top level, and all will be well, I claim. And that is precisely what late lambda lifting does. And the result might be faster than the very careful code above, because of the extra argument passing and case-testing it has to do.
>
> To me LLF is low-hanging fruit. There are promising results described on the wiki page, and the whole join-point business eliminates its principal shortcoming.
>
> I wonder if, before going ahead with this somewhat-delicate `efdtIntFB` business, it might be fun to re-awaken LLF?

I'll give it a try. Without understanding all operational consequences of LLF, I'd still guess that making sure all `c`s inline would be more beneficial in this scenario.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8763#comment:62
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
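
For readers unfamiliar with the transformation under discussion, here is a minimal source-level sketch of what lambda lifting a local loop such as `go_up` amounts to. It is not the Core from comment:44 and not the real `efdtIntFB`: the names `foldUpLocal`, `foldUpLifted` and `goUpLifted` are hypothetical, the step is assumed to be positive, and the overflow checks the real code needs are deliberately left out.

```haskell
module Main where

-- Before lifting: go_up is a local function. It closes over the free
-- variables c, n, delta and y, so a closure for it is allocated at run time.
foldUpLocal :: (Int -> r -> r) -> r -> Int -> Int -> Int -> r
foldUpLocal c n x delta y = go_up x
  where
    go_up i
      | i > y     = n
      | otherwise = i `c` go_up (i + delta)

-- After lifting: the former free variables are passed as explicit
-- arguments, so the loop can live at top level and needs no closure.
-- The cost is the extra argument passing on every iteration.
goUpLifted :: (Int -> r -> r) -> r -> Int -> Int -> Int -> r
goUpLifted c n delta y i
  | i > y     = n
  | otherwise = i `c` goUpLifted c n delta y (i + delta)

foldUpLifted :: (Int -> r -> r) -> r -> Int -> Int -> Int -> r
foldUpLifted c n x delta y = goUpLifted c n delta y x

main :: IO ()
main = do
  -- Both versions compute the same fold; only the shape of the loop differs.
  print (foldUpLocal  (+) 0 1 1 10)
  print (foldUpLifted (+) 0 1 1 10)
  print (foldUpLocal (:) [] 1 2 9 == foldUpLifted (:) [] 1 2 9)
```

The sketch shows only the shape of the rewrite. Whether the lifted form is actually faster than making sure `c` inlines is exactly the open question in this thread.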