
Hi Daniel,

Thanks for clearing up the mystery about the dependence on prime factorization! Very interesting.
> At first glance, merging c chunks of length m each is O(c*m); so,
> since sorting each chunk is O(m*log m), the overall cost would be
> O(c*m*log m) = O(n*log m) = O(n*log(n/c)).
No. In the given use case, m is a small constant and c is proportional to n, so O(n*log m) collapses to O(n).
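For reference, here is a minimal sketch of the chunk-sort-and-merge strategy we're discussing (my own names and code, not necessarily the posted implementation). It assumes every element sits at most m positions from its place in the fully sorted list:

import Data.List (sortBy)

-- Sketch of sortAlmostBy: assumes each element is at most m
-- positions away from its position in the fully sorted list.
sortAlmostBy :: Int -> (a -> a -> Ordering) -> [a] -> [a]
sortAlmostBy m cmp = go . map (sortBy cmp) . chunksOf m
  where
    -- Merge each sorted chunk with the leftover of the previous
    -- merge; the first m elements of every merge are final.
    go []         = []
    go [c]        = c
    go (c1:c2:cs) = done ++ go (rest : cs)
      where (done, rest) = splitAt m (merge c1 c2)

    -- Standard stable merge of two sorted lists.
    merge [] ys = ys
    merge xs [] = xs
    merge xxs@(x:xs) yys@(y:ys)
      | cmp y x == LT = y : merge xxs ys
      | otherwise     = x : merge xs yys

-- Split a list into chunks of length k (last one may be shorter).
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf k xs = h : chunksOf k t
  where (h, t) = splitAt k xs

Each merge emits m final elements at O(m) cost, so the merging pass is O(n) on top of the O(n*log m) spent sorting the chunks.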
> It has larger overhead than sortBy,
That's true, but it's asymptotically better. Given Dennis' use case, it sounds like m will probably always be less than 10, and the input size will typically be a few hundred, though it could reach a few thousand. sortBy would indeed be just fine in those cases.
> so if n is small or m is large, sortBy is likely to be better, but
> if n is large and m relatively small, sortAlmostBy can be
> significantly (but not dramatically, excluding extreme cases where
> you can do much better) faster.
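To make that crossover concrete, a hypothetical dispatcher along these lines (building on the sketch above; the 1000-element cutoff is a placeholder, not a measured crossover point) would capture the rule of thumb:

import Data.List (sortBy)

-- Hypothetical dispatcher: the cutoff is made up, not benchmarked.
-- For small n, sortBy's lower constant factor wins; for large n
-- with small m, the O(n*log m) bound pays off. Note that taking
-- the length forces the spine of the list up front.
sortAdaptiveBy :: Int -> (a -> a -> Ordering) -> [a] -> [a]
sortAdaptiveBy m cmp xs
  | n < 1000 || 2 * m >= n = sortBy cmp xs       -- overhead dominates
  | otherwise              = sortAlmostBy m cmp xs
  where n = length xs

-- Example: every element is within 10 positions of sorted order.
main :: IO ()
main = print (sortAdaptiveBy 10 compare [2,1,3,5,4,7,6,9,8,10 :: Int])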
At most, m will be on the order of 100, if we need to process all the parts of a large symphony orchestra combined together (a very unlikely use case, admittedly). In that case the input size could be as much as 10^6, or possibly even more for something like the Mahler. (Come to think of it, it's pretty amazing that Mahler wrote so many notes and actually got people to play them.)

There I believe sortAlmostBy could be quite an improvement: roughly n*log2 n ~ 10^6 * 20 comparisons for sortBy versus n*log2 m ~ 10^6 * 7 for sortAlmostBy, about a 3x reduction. I'd be interested to hear ideas to make it even better, though.

Thanks,
Yitz