
#11735: Optimize coercionKind
-------------------------------------+-------------------------------------
        Reporter:  goldfire          |                Owner:  (none)
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.10.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |   Unknown/Multiple
 Type of failure:  Compile-time      |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by tdammers):

Replying to [comment:40 goldfire]:

> Sorry, still confused. :(
>
> How are `coercionKind` and `coercionRole` ''mutually'' recursive? I see
> that `coercionRole` calls `coercionKind` but not the other way around.

Sorry for the confusion; I shouldn't be trying to explain things I only
half understand myself when tired.

Looking at HEAD on `master`, we have the `coercionKindRole` function,
which is simply recursive for `NthCo` and has the `nth` call embedded, so
for nested `NthCo`s its behavior should be quadratic.

The un-refactored version has `coercionKind`, which is simply recursive
(O(n)), and `coercionRole`, which recurses via `coercionKindRole`. The
latter really just calls `coercionKind` and `coercionRole` individually,
though: the `coercionRole` call makes `coercionRole` simply recursive
(O(n)), but the `coercionKind` call introduces another O(n) at every
level, making the whole thing quadratic as well. Calling this "mutually
recursive" was of course a brainfart on my side, since `coercionKind`
never calls back into `coercionRole`. So in terms of big-O, HEAD and
"un-refactored" should be the same; why one performs better than the
other is still somewhat unclear to me.
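To make the two recursion patterns concrete, here's a toy sketch. The
types and helper names (`LeafCo`, `KCon`, `nthRole`, ...) are made up for
illustration and are not GHC's real `Coercion`, `Kind`, or `Role`; the
comments describe only this toy code, not the real implementation.

{{{
-- CoercionSketch.hs: toy model of the two recursion patterns above.
module CoercionSketch where

-- Crude stand-ins for GHC's Kind and Role; KCon models a type-constructor
-- application whose arguments NthCo can project.
data Kind = KCon [Kind] | KVar
  deriving Show

type Role = Int

data Coercion
  = NthCo Int Coercion        -- project the d-th argument of a coercion
  | LeafCo Kind Role          -- stand-in for every other constructor

-- The per-level lookup that NthCo needs.
getNth :: Int -> Kind -> Kind
getNth d (KCon args) = args !! d
getNth _ KVar        = KVar   -- degenerate case; good enough for a sketch

-- Combined traversal (the coercionKindRole style): a single recursion per
-- level, with the getNth lookup embedded in it.
coercionKindRole :: Coercion -> (Kind, Role)
coercionKindRole (LeafCo k r) = (k, r)
coercionKindRole (NthCo d co) =
  let (k, r) = coercionKindRole co
  in (getNth d k, nthRole d k r)

-- Separated traversal 1: coercionKind is simply recursive, one step per
-- NthCo in the chain.
coercionKind :: Coercion -> Kind
coercionKind (LeafCo k _) = k
coercionKind (NthCo d co) = getNth d (coercionKind co)

-- Separated traversal 2: to work out the role of the d-th argument,
-- coercionRole needs the kind of the inner coercion, so it calls
-- coercionKind at every level; that call re-walks the whole remaining
-- chain each time, which is the extra linear factor discussed above.
coercionRole :: Coercion -> Role
coercionRole (LeafCo _ r) = r
coercionRole (NthCo d co) = nthRole d (coercionKind co) (coercionRole co)

-- Hypothetical stand-in for the nthRole logic; in GHC the result depends
-- on the tycon in the kind, here we just pass the inner role through.
nthRole :: Int -> Kind -> Role -> Role
nthRole _d _kind innerRole = innerRole
}}}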
> But you're right that I'm trying to understand better why there's a
> performance improvement in this patch (even before any caching). In the
> nested `NthCo` case, I'm pretty sure your refactor would be worse. But
> in the test case at hand (which I assume doesn't have nested `NthCo`s --
> haven't looked), your change is clearly an improvement.
>
> However, I don't think your analysis above is really the problem. I
> would expect the running time of `coercionKind` or `coercionRole` on
> nested `NthCo`s to be linear in the sum of the `d`s -- that is, we'll
> have to add together all the indices. You've shown above that the old
> recursion pattern (from `coercionKindRole`) traverses down the linked
> list twice (once in `getNth` and once in `nthRole`), but this shouldn't
> change the asymptotic complexity. And, usually, `d` is quite small, so I
> wouldn't expect this to show up at all, really. I still don't think
> we've quite gotten to the bottom of why separating out `coercionKind`
> and `coercionRole` should effect a performance improvement.

Indeed, it's not the problem -- or rather, the un-refactoring alone
doesn't fix anything. If it makes a difference at all, it's most likely
just a constant-factor improvement, nothing fundamental. But that's not
why we did it anyway: the reason was to make it easier to implement the
caching (storing the precalculated role in the `NthCo` itself), which
reduces one of the linear terms to a constant.
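To spell out the caching idea, here's a second sketch in the same toy
style (again, `mkNthCo` and friends are hypothetical illustrations of the
approach, not the actual patch): the role is computed once, when the
`NthCo` is built, and merely read back afterwards.

{{{
-- CachedRoleSketch.hs: toy model of caching the role in the NthCo node.
module CachedRoleSketch where

-- Same toy stand-ins as in the previous sketch.
data Kind = KCon [Kind] | KVar

type Role = Int

-- The one change: NthCo carries its role, computed once at construction
-- time instead of being recomputed on every query.
data Coercion
  = NthCo Role Int Coercion   -- cached role, argument index, inner coercion
  | LeafCo Kind Role

-- Hypothetical smart constructor: the kind/role work is paid here, once.
mkNthCo :: Int -> Coercion -> Coercion
mkNthCo d co = NthCo (nthRole d (coercionKind co) (coercionRole co)) d co

coercionKind :: Coercion -> Kind
coercionKind (LeafCo k _)   = k
coercionKind (NthCo _ d co) = getNth d (coercionKind co)

-- Querying the role is now O(1) per NthCo: the linear term that came from
-- recomputing the inner coercion's kind at every level is gone.
coercionRole :: Coercion -> Role
coercionRole (LeafCo _ r)  = r
coercionRole (NthCo r _ _) = r

getNth :: Int -> Kind -> Kind
getNth d (KCon args) = args !! d
getNth _ KVar        = KVar

nthRole :: Int -> Kind -> Role -> Role
nthRole _d _kind innerRole = innerRole
}}}

Of course this only stays correct if every `NthCo` is built through the
smart constructor, which is the price of turning that linear factor into
a constant.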
> On the other hand, the separated version really is quadratic... and yet
> it's faster (on this test case)! That's the conundrum.
Yes, but we don't know whether it's actually faster in big-O terms, or
just happens to have a more favorable constant factor.
> Please don't let my nit-picking slow you down or discourage you. It's
> just that I think you've hit something quite interesting, and, well, I'm
> interested. :)
Absolutely not; your input has been super helpful, and I much prefer hard
criticism over empty praise. Bring it on :)

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/11735#comment:45>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler