
#11735: Optimize coercionKind
-------------------------------------+-------------------------------------
        Reporter:  goldfire          |                Owner:  (none)
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.10.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Compile-time      |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by tdammers):

 Replying to [comment:35 goldfire]:
 > I have a lingering concern: ''why'' did the old `coercionKindRole`
 > perform so miserably? In a call to `coercionRole`, the kind
 > calculations should never be forced. So what takes up all the memory?
 > Is it really just the tuples? If so, then we've discovered a major way
 > to speed up other areas of GHC: convert tuples to be unboxed. Even
 > better, we've discovered a major missing optimization, which could
 > probably automate the unboxing somehow.

 I think the hypothesis was something like this: suppose we have a deeply
 nested `NthCo` and we want to calculate its `coercionKind`. In order to
 do that, we need its `coercionRole`, which involves recursing through the
 whole thing (O(n)); but once we have that result, we only use it to
 decide whether we need to keep recursing, and then we throw it away. In
 the next step we calculate the `coercionRole` again, so the whole thing
 ends up O(n²). Whereas if we cache the roles throughout, each of them is
 calculated only once, at construction time, and we never get the
 quadratic badness.

 So it's not that the role calculation forces the kind calculation, but
 the other way around: in order to calculate the correct kind for an
 `NthCo`, we need to know its role, and potentially also the roles of all
 of its children. So in a way, the caching serves as memoization.

 > So I wonder if there are more opportunities here. None of this changes
 > the current direction of travel (caching is a good idea, regardless of
 > my question here), but perhaps suggests another future line of inquiry.
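 The hypothesised quadratic behaviour can be sketched with a toy model.
 This is not GHC's real `Coercion` type: `Co`, `depth`, and the cost
 counters below are illustrative stand-ins, with node depth playing the
 part of the role, and the "cost" functions counting recursive calls.

```haskell
module Main where

-- A chain of NthCo-like nodes.
data Co = Leaf | Nth Co

depth :: Co -> Int
depth Leaf    = 0
depth (Nth c) = 1 + depth c

-- Uncached: answering the "role" query at a node walks all the way down
-- (depth c + 1 calls), and the kind calculation asks at every level of an
-- n-deep chain, so the total number of calls is O(n^2).
uncachedCost :: Co -> Int
uncachedCost Leaf          = 1
uncachedCost c@(Nth inner) = (depth c + 1) + uncachedCost inner

-- Cached: store the answer in the node at construction time, as the
-- caching patch does for roles; every later query is O(1), so walking the
-- whole chain costs only O(n) in total.
data CoC = LeafC | NthC Int CoC   -- the Int caches this node's depth

mkNthC :: CoC -> CoC
mkNthC c = NthC (1 + cachedDepth c) c

cachedDepth :: CoC -> Int
cachedDepth LeafC      = 0
cachedDepth (NthC d _) = d

cachedCost :: CoC -> Int
cachedCost LeafC      = 1
cachedCost (NthC _ c) = 1 + cachedCost c

main :: IO ()
main = do
  let co  = iterate Nth Leaf !! 10      -- a chain 10 levels deep
      coC = iterate mkNthC LeafC !! 10
  print (uncachedCost co)               -- quadratic: 66 calls at depth 10
  print (cachedCost coC)                -- linear: 11 calls
```

 At depth 10 the uncached walk already makes 66 calls versus 11 for the
 cached one; the gap grows quadratically with nesting depth, matching the
 O(n²) behaviour described above.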
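 As for the tuple-unboxing remark, a minimal sketch of the idea follows.
 `kindRoleBoxed` is a hypothetical stand-in for a function like
 `coercionKindRole` that returns a pair on every recursive step: the boxed
 version allocates a fresh heap tuple per call, while the `(# , #)` version
 returns both components without that allocation. The functions and their
 arithmetic are illustrative only.

```haskell
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Boxed result: each call allocates a (,) constructor on the heap.
kindRoleBoxed :: Int -> (Int, Int)
kindRoleBoxed n = (n * 2, n `mod` 3)

-- Unboxed result: the components are returned directly (typically in
-- registers), so no intermediate tuple is allocated.
kindRoleUnboxed :: Int -> (# Int, Int #)
kindRoleUnboxed n = (# n * 2, n `mod` 3 #)

main :: IO ()
main = do
  print (kindRoleBoxed 7)
  -- Unboxed tuples cannot be stored or printed directly; they must be
  -- scrutinised immediately, which is also why they only help when the
  -- result is consumed right away.
  case kindRoleUnboxed 7 of
    (# k, r #) -> print (k, r)
```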
 Considering how the Grammar.hs example still takes about 20 seconds to
 compile, and a few rather whopping candidates pop up in the profile: yes,
 I think it is very likely that we can find other opportunities. I will
 definitely look into the tuple unboxing, and also try to get to the
 bottom of the CoreTidy and simplCast cost centres. Who knows, maybe
 they're somehow related, even.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/11735#comment:37>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler