>From libraries/base/Data/List.hs:
> If I heap profile the random_list case with only 10000 then I see
> random_list peaks at using about 2.5M of memory, whereas in the same
> program using List.sort it uses only 100k.
This is because the random number generator is evaluated too lazy.
We
can make mergesort more eager by replacing it with natural mergesort.
i.e.
mergesort :: (a -> a -> Ordering) -> [a] -> [a]
mergesort cmp = mergesort' cmp . map wrap
gets replaced by
mergesort :: (a -> a -> Ordering) -> [a] -> [a]
mergesort cmp = mergesort' cmp . map runner
-- | decomposes list into monotonic runs
runner :: (a -> a -> Ordering) -> [a] -> [[a]]
runner _ [] = []
runner cmp l = runner' l
where
-- | increasing runs
runner' xss@(x:xs) = case findrun (\a b->cmp a b/=GT) [x] xs of
(run, []) -> [reverse run]
([_], _) -> runner'' xss
(run, rest) -> reverse run : runner'' rest
-- | decreasing runs.
-- | We consider (x>y) instead of (x>=y) to ensure stability.
runner'' xss@(x:xs) = case findrun (\a b->cmp a b==GT) [x] xs of
(run, []) -> [run]
([_], _) -> runner' xss
(run, rest) -> run : runner' rest
-- | Do the work.
findrun _ a [] = (a, [])
findrun less a xss@(x:xs)
| (head a) `less` x = findrun less (x:a) xs
| otherwise = (a, xss)
For the Demo attatched, the speedup is more than threefold for N=10^5
and the stack usage drops from 7Mb to below 1M (with flag -O1).
Additionally, natural msort has the additional advantage, that it is
O(N) for (almost) (anti-)sorted lists. It is also a littlebit faster
on average, because runner actually does some work whereas wrap does
not.
Greetings,
--
Frieder Kalisch
f.kalisch(a)tphys.uni-heidelberg.de