
#11735: Optimize coercionKind
-------------------------------------+-------------------------------------
        Reporter:  goldfire          |                Owner:  (none)
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.10.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Compile-time      |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by tdammers):

 Replying to [comment:35 goldfire]:
 > I have a lingering concern: ''why'' did the old `coercionKindRole`
 > perform so miserably? In a call to `coercionRole`, the kind
 > calculations should never be forced. So what takes up all the memory?
 > Is it really just the tuples? If so, then we've discovered a major way
 > to speed up other areas of GHC: convert tuples to be unboxed. Even
 > better, we've discovered a major missing optimization, which could
 > probably automate the unboxing somehow.

 I think the hypothesis was something like this: suppose we have a deeply
 nested `NthCo` and we want to calculate its `coercionKind`. In order to
 do that, we need its `coercionRole`, which involves recursing through the
 whole thing (O(n)); but once we have that result, we only use it to
 decide whether we need to keep recursing, and then we throw it away. In
 the next step we calculate the `coercionRole` again, so the whole thing
 ends up O(n²). Whereas if we cache the roles throughout, each of them is
 calculated only once, at construction time, and we never get the
 quadratic badness.

 So it's not that the role calculation forces the kind calculation, but
 the other way around: in order to calculate the correct kind for an
 `NthCo`, we need to know its role, and potentially also the roles of all
 of its children. So in a way, the caching serves as memoization.

 > So I wonder if there are more opportunities here. None of this changes
 > the current direction of travel (caching is a good idea, regardless of
 > my question here), but perhaps suggests another future line of inquiry.
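 The hypothesised quadratic behaviour can be sketched with a toy model.
 This is not GHC's real `Coercion` type: `Co`, `depth`, and the cost
 counters below are illustrative stand-ins, with node depth playing the
 part of the role, and the "cost" functions counting recursive calls.

```haskell
module Main where

-- A chain of NthCo-like nodes.
data Co = Leaf | Nth Co

depth :: Co -> Int
depth Leaf    = 0
depth (Nth c) = 1 + depth c

-- Uncached: answering the "role" query at a node walks all the way down
-- (depth c + 1 calls), and the kind calculation asks at every level of an
-- n-deep chain, so the total number of calls is O(n^2).
uncachedCost :: Co -> Int
uncachedCost Leaf          = 1
uncachedCost c@(Nth inner) = (depth c + 1) + uncachedCost inner

-- Cached: store the answer in the node at construction time, as the
-- caching patch does for roles; every later query is O(1), so walking the
-- whole chain costs only O(n) in total.
data CoC = LeafC | NthC Int CoC   -- the Int caches this node's depth

mkNthC :: CoC -> CoC
mkNthC c = NthC (1 + cachedDepth c) c

cachedDepth :: CoC -> Int
cachedDepth LeafC      = 0
cachedDepth (NthC d _) = d

cachedCost :: CoC -> Int
cachedCost LeafC      = 1
cachedCost (NthC _ c) = 1 + cachedCost c

main :: IO ()
main = do
  let co  = iterate Nth Leaf !! 10      -- a chain 10 levels deep
      coC = iterate mkNthC LeafC !! 10
  print (uncachedCost co)               -- quadratic: 66 calls at depth 10
  print (cachedCost coC)                -- linear: 11 calls
```

 At depth 10 the uncached walk already makes 66 calls versus 11 for the
 cached one; the gap grows quadratically with nesting depth, matching the
 O(n²) behaviour described above.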
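 As for the tuple-unboxing remark, a minimal sketch of the idea follows.
 `kindRoleBoxed` is a hypothetical stand-in for a function like
 `coercionKindRole` that returns a pair on every recursive step: the boxed
 version allocates a fresh heap tuple per call, while the `(# , #)` version
 returns both components without that allocation. The functions and their
 arithmetic are illustrative only.

```haskell
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Boxed result: each call allocates a (,) constructor on the heap.
kindRoleBoxed :: Int -> (Int, Int)
kindRoleBoxed n = (n * 2, n `mod` 3)

-- Unboxed result: the components are returned directly (typically in
-- registers), so no intermediate tuple is allocated.
kindRoleUnboxed :: Int -> (# Int, Int #)
kindRoleUnboxed n = (# n * 2, n `mod` 3 #)

main :: IO ()
main = do
  print (kindRoleBoxed 7)
  -- Unboxed tuples cannot be stored or printed directly; they must be
  -- scrutinised immediately, which is also why they only help when the
  -- result is consumed right away.
  case kindRoleUnboxed 7 of
    (# k, r #) -> print (k, r)
```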
 Considering how the Grammar.hs example still takes about 20 seconds to
 compile, and a few rather whopping candidates pop up in the profile: yes,
 I think it is very likely that we can find other opportunities. I will
 definitely look into the tuple unboxing, and also try to get to the
 bottom of the CoreTidy and simplCast cost centres. Who knows, maybe
 they're somehow related, even.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/11735#comment:37>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler