
On Jan 15, 2008 8:16 AM, Bertram Felgenhauer
Marcus D. Gabriel wrote:
By a rather indirect route, I discovered that I obtain an almost factor of two improvement in performance in Data.List.sort if I make one small change in the implementation of the function merge which supports mergesort and hence sortBy and sort. Admittedly, the improvement was only noticeable to me when sorting for example one million random Int. The current code in libraries/base/Data/List.hs for merge is
merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a] merge cmp xs [] = xs merge cmp [] ys = ys [snip]
and all that I did was
merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a] merge cmp [] ys = ys merge cmp xs [] = xs [snip]
that is, I swapped the order of the first two lines. By the way, the second version is much less likely to overflow the stack than the first version.
Can some confirm this? If you confirm it, can someone explain to me why one obtains this performance improvement? I currently just do not grasp the point.
I'm not sure why there is a performance difference, but there is one major difference in the behaviour of these two implementations:
The library version evaluates the list from right to left (i.e. it compares the last two elements of the list first), because the third argument of 'merge' is forced before the second one.
Swapping those two lines of 'merge' results in a version which compares the first two elements of the list first.
This means that the library version produces a stack overflow on lists generated in an iterate like fashion (say, take 1000000 [0..]). The modified version produces a stack overflow on the reverse of that list, but I believe such lists are much rarer in practice.
Bertram, are you sure that's right? There's already a stack overflow in "take 1000000 [0..]" itself (try just "last $ take 1000000 [0..]"). See the bug http://hackage.haskell.org/trac/ghc/ticket/1997 By using [0..1000000] (which doesn't trigger that bug), I can compute both of last $ sort [0..1000000] last $ sort $ reverse [0..1000000] without any stack overflows. (Using ghc-6.8.2) Best, -Judah