
#11735: Optimize coercionKind -------------------------------------+------------------------------------- Reporter: goldfire | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Replying to [comment:49 goldfire]:
It does mean that running time will be proportional to the sum of the indices in the `NthCo`s, but that's to be expected.
Ah, you are right, this is where I went wrong - the `nth` calls follow the indices, not the nested NthCos themselves. Also explains why I'm not seeing the kind of differences in the profiles that I expected.
The un-refactored version has `coercionKind`, which is simply
recursive (O(n)), and we have `coercionRole`, which recurses via `coercionKindRole`. The latter is really just a matter of calling `coercionKind` and `coercionRole` individually though; the `coercionRole` call makes `coercionRole` simply recursive (O(n)), but the `coercionKind` call introduces another O(n), making the entire thing also quadratic.
Agreed here. But note that this would remain quadratic even if all the
indices in the `NthCo`s were 1. So, if you call the original `coercionKindRole` quadratic, then this would be ''cubic'' (but I don't think that's a fair characterization -- there's nothing raised to the third power here).
So in terms of big-O, HEAD and "un-refactored" should be the same. Why
one perfoms better than the other is somewhat unclear to me though.
Disagree here, as explained above. The old `coercionKindRole` was
asymptotically better. But now that we cache roles, it should be good again. Profiling results don't agree, but other than that it seems plausible. Maybe we're hitting some sort of edge case here?
On the other hand, the separated version really is quadratic... and
Yes, but we don't know if it's actually big-O-faster, or just happens
have a more favorable constant factor.
Fair enough.
I'm stymied as to why my patches make things worse. Maybe moving the `isReflexiveCo` check in `addCoerce` to the top was a bad idea? Try moving
yet it's faster (on this test case)! That's the conundrum. that back to where it was and then try again. The `simplCast` stuff should be vastly better than it was! This one has me baffled as well. There's a slight possibility that the profiling runs I did were contaminated with other activity on the same machine, so I might do another run on a separate machine with nothing else going on. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11735#comment:51 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler