
From libraries/base/Data/List.hs: If I heap profile the random_list case with only 10000 then I see random_list peaks at using about 2.5M of memory, whereas in the same program using List.sort it uses only 100k.
This is because the random number generator is evaluated too lazy. We can make mergesort more eager by replacing it with natural mergesort. i.e. mergesort :: (a -> a -> Ordering) -> [a] -> [a] mergesort cmp = mergesort' cmp . map wrap gets replaced by mergesort :: (a -> a -> Ordering) -> [a] -> [a] mergesort cmp = mergesort' cmp . map runner -- | decomposes list into monotonic runs runner :: (a -> a -> Ordering) -> [a] -> [[a]] runner _ [] = [] runner cmp l = runner' l where -- | increasing runs runner' xss@(x:xs) = case findrun (\a b->cmp a b/=GT) [x] xs of (run, []) -> [reverse run] ([_], _) -> runner'' xss (run, rest) -> reverse run : runner'' rest -- | decreasing runs. -- | We consider (x>y) instead of (x>=y) to ensure stability. runner'' xss@(x:xs) = case findrun (\a b->cmp a b==GT) [x] xs of (run, []) -> [run] ([_], _) -> runner' xss (run, rest) -> run : runner' rest -- | Do the work. findrun _ a [] = (a, []) findrun less a xss@(x:xs) | (head a) `less` x = findrun less (x:a) xs | otherwise = (a, xss) For the Demo attatched, the speedup is more than threefold for N=10^5 and the stack usage drops from 7Mb to below 1M (with flag -O1). Additionally, natural msort has the additional advantage, that it is O(N) for (almost) (anti-)sorted lists. It is also a littlebit faster on average, because runner actually does some work whereas wrap does not. Greetings, -- Frieder Kalisch f.kalisch@tphys.uni-heidelberg.de
participants (1)
-
Frieder Kalisch