
On Tuesday 27 September 2011, 18:34:01, Yitzchak Gale wrote:
Hi Daniel,
Thanks for clearing up the mystery about the dependence on prime factorization! Very interesting.
at first glance, merging c chunks of length m each is O(c*m); so, since sorting each chunk is O(m*log m), it'd overall be
O(c*m*log m) = O(n*log m) = O(n*log (n/c))
No. In the given use case m is a small constant and c is proportional to n, so the overall complexity is O(n).
Not a contradiction. If m is a constant (> 1), then O(n*log m) = O(n), and sorting a chunk of length m is O(m*log m) = O(1).
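To make the chunk-and-merge idea concrete, here is a minimal sketch (the names chunksOf and sortAlmostBy' are mine, not the actual code from the thread): split the input into chunks of length m, sort each chunk, and lazily merge the sorted chunks. Sorting a chunk costs O(m*log m), which is O(1) for constant m; under the almost-sorted assumption (every element at most about m places from its final position), each chunk interleaves only with its neighbour, so the merging pass is O(n) overall.

```haskell
import Data.List (sortBy)

-- Split a list into chunks of length m (the last chunk may be shorter).
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf m xs = let (h, t) = splitAt m xs in h : chunksOf m t

-- Hypothetical sketch of sortAlmostBy: sort each chunk, then merge.
-- The foldr-based merge is always correct (it is just repeated merging
-- of sorted lists); thanks to laziness it runs in O(n) when each chunk
-- interleaves only with its immediate neighbour, and degrades gracefully
-- on arbitrary input.
sortAlmostBy' :: Int -> (a -> a -> Ordering) -> [a] -> [a]
sortAlmostBy' m cmp = foldr merge2 [] . map (sortBy cmp) . chunksOf m
  where
    merge2 as@(a:as') bs@(b:bs')
      | cmp b a == LT = b : merge2 as bs'
      | otherwise     = a : merge2 as' bs
    merge2 [] bs = bs
    merge2 as [] = as
```

This is only a sketch under the stated assumptions, not the implementation being benchmarked below.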
It has larger overhead than sortBy,
That's true. But it's asymptotically better.
Given Dennis' use case, it sounds like m will probably always be < 10. Input size will typically be a few hundred, but could be a few thousand. sortBy would indeed be just fine in those cases.
so if n is small or m is large, sortBy is likely to be better; but if n is large and m is relatively small, sortAlmostBy can be significantly faster (though not dramatically so, excluding extreme cases where you can do much better).
At most m will be on the order of 100 if we need to process all the parts of a large symphony orchestra combined together (a very unlikely use case admittedly), and in that case the input size could be as much as 10^6, or possibly even more for something like the Mahler.
Yeah, but log (10^6) is still a fairly small number, so counting reduction steps, it wouldn't be dramatically better. However, I forgot to take memory usage into account, so in terms of wall-clock (or CPU) time, it could be dramatically better even for not too large n and not too small m.
(Come to think of it, it's pretty amazing that Mahler wrote so many notes and actually got people to play them.)
People do stranger things. People listen to Wagner, for example.
There I believe sortAlmostBy could be quite an improvement.
Certainly. (I've just included Data.List.sort in the benchmark; for these short lists, it takes less than twice as long as sortAlmostBy.)
I'd be interested to hear ideas to make it even better though.
mergeAABy :: (a -> a -> Ordering) -> [[a]] -> [a]
mergeAABy cmp ((x:xs):(y:ys):zss) = foo x xs y ys zss
  where
    foo u us v vs wss = case cmp u v of
        GT -> v : case vs of
                    (b:bs) -> foo u us b bs wss
                    []     -> u : us ++ mergeAABy cmp wss
        _  -> u : case us of
                    (c:cs) -> foo c cs v vs wss
                    []     -> case wss of
                                ((w:ws):kss) -> foo v vs w ws kss
                                _            -> v : vs
mergeAABy _ [xs] = xs
mergeAABy _ _    = []

seems to be a bit faster (I have made the lists longer, replacing 2000 with 5000):

dafis@schwartz:~/Haskell/BeginnersTesting> ./benchAlmost -nis 20 -kis 20
(20,20)      -- print parameters to make sure they're evaluated
284090563    -- print sum of all random Ints to make sure they're evaluated
warming up
estimating clock resolution...
mean is 2.183860 us (320001 iterations)
found 41404 outliers among 319999 samples (12.9%)
  2443 (0.8%) low severe
  38961 (12.2%) high severe
estimating cost of a clock call...
mean is 50.65112 ns (14 iterations)
found 2 outliers among 14 samples (14.3%)
  2 (14.3%) high mild

benchmarking Yitz
collecting 100 samples, 1 iterations each, in estimated 6.537795 s
mean: 67.05025 ms, lb 66.59363 ms, ub 67.62786 ms, ci 0.950
std dev: 2.616184 ms, lb 2.179318 ms, ub 3.100722 ms, ci 0.950

benchmarking Mine
collecting 100 samples, 1 iterations each, in estimated 5.566311 s
mean: 56.91069 ms, lb 56.54846 ms, ub 57.37602 ms, ci 0.950
std dev: 2.100948 ms, lb 1.722233 ms, ub 2.585056 ms, ci 0.950

With n = k = 50:

benchmarking Yitz
collecting 100 samples, 1 iterations each, in estimated 7.366776 s
mean: 75.63603 ms, lb 75.17202 ms, ub 76.20199 ms, ci 0.950
std dev: 2.617970 ms, lb 2.227238 ms, ub 3.040563 ms, ci 0.950

benchmarking Mine
collecting 100 samples, 1 iterations each, in estimated 6.517696 s
mean: 68.00708 ms, lb 67.26089 ms, ub 69.04105 ms, ci 0.950
std dev: 4.463628 ms, lb 3.484169 ms, ub 5.950722 ms, ci 0.950

n = k = 100:

benchmarking Yitz
collecting 100 samples, 1 iterations each, in estimated 9.756303 s
mean: 87.39906 ms, lb 86.90397 ms, ub 87.99232 ms, ci 0.950
std dev: 2.763955 ms, lb 2.382008 ms, ub 3.168909 ms, ci 0.950

benchmarking Mine
collecting 100 samples, 1 iterations each, in estimated 7.505393 s
mean: 77.05801 ms, lb 76.51954 ms, ub 77.74727 ms, ci 0.950
std dev: 3.117756 ms, lb 2.594126 ms, ub 3.769922 ms, ci 0.950
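For anyone who wants to try the merge in isolation: a small standalone module (mergeAABy is reproduced from the message above so this compiles on its own; the example inputs are mine). It merges a list of non-empty sorted chunks, assuming each chunk interleaves only with its immediate neighbour -- the "almost sorted" precondition.

```haskell
-- mergeAABy, reproduced from the message above.
mergeAABy :: (a -> a -> Ordering) -> [[a]] -> [a]
mergeAABy cmp ((x:xs):(y:ys):zss) = foo x xs y ys zss
  where
    foo u us v vs wss = case cmp u v of
        GT -> v : case vs of
                    (b:bs) -> foo u us b bs wss
                    []     -> u : us ++ mergeAABy cmp wss
        _  -> u : case us of
                    (c:cs) -> foo c cs v vs wss
                    []     -> case wss of
                                ((w:ws):kss) -> foo v vs w ws kss
                                _            -> v : vs
mergeAABy _ [xs] = xs
mergeAABy _ _    = []
```

For example, mergeAABy compare [[1,2],[3,4],[5,6],[7,8]] yields [1..8], and overlapping neighbours such as [[1,3],[2,4]] merge to [1,2,3,4].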