Re: [GHC] #11735: Optimize coercionKind

30 Jan 2018

      #11735: Optimize coercionKind
-------------------------------------+-------------------------------------
        Reporter:  goldfire          |                Owner:  (none)
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.10.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by tdammers):

 Replying to [comment:49 goldfire]:
...
It does mean that running time will be proportional to the sum of the
 indices in the `NthCo`s, but that's to be expected.
Ah, you are right, this is where I went wrong - the `nth` calls follow the
 indices, not the nested NthCos themselves. Also explains why I'm not
 seeing the kind of differences in the profiles that I expected.
...
...
The un-refactored version has `coercionKind`, which is simply
recursive (O(n)), and we have `coercionRole`, which recurses via
 `coercionKindRole`. The latter is really just a matter of calling
 `coercionKind` and `coercionRole` individually though; the `coercionRole`
 call makes `coercionRole` simply recursive (O(n)), but the `coercionKind`
 call introduces another O(n), making the entire thing also quadratic.
...
Agreed here. But note that this would remain quadratic even if all the
indices in the `NthCo`s were 1. So, if you call the original
 `coercionKindRole` quadratic, then this would be ''cubic'' (but I don't
 think that's a fair characterization -- there's nothing raised to the
 third power here).
...
...
So in terms of big-O, HEAD and "un-refactored" should be the same. Why
one perfoms better than the other is somewhat unclear to me though.
...
Disagree here, as explained above. The old `coercionKindRole` was
asymptotically better. But now that we cache roles, it should be good
 again.

 Profiling results don't agree, but other than that it seems plausible.
 Maybe we're hitting some sort of edge case here?
...
...
...
On the other hand, the separated version really is quadratic... and
...
...
Yes, but we don't know if it's actually big-O-faster, or just happens
have a more favorable constant factor.
Fair enough.
I'm stymied as to why my patches make things worse. Maybe moving the
 `isReflexiveCo` check in `addCoerce` to the top was a bad idea? Try moving
yet it's faster (on this test case)! That's the conundrum.
 that back to where it was and then try again. The `simplCast` stuff should
 be vastly better than it was!

 This one has me baffled as well. There's a slight possibility that the
 profiling runs I did were contaminated with other activity on the same
 machine, so I might do another run on a separate machine with nothing else
 going on.

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11735#comment:51
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler