
On 16 September 2013 18:35, Florian Lammel
Hi,
Hi Florian,
I've just started learning Haskell and am trying to implement some basic algorithms.
I'm a beginner in Haskell as well so my advice may be corrected by someone else :)
My shot at merge sort looks like this:
mergeSort :: Ord a => [a] -> [a] mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = merge (mergeSort as) (mergeSort bs) where (as, bs) = splitInHalf xs
splitInHalf :: [a] -> ([a], [a]) splitInHalf [] = ([], []) splitInHalf [x] = ([x], []) splitInHalf (x:y:xys) = (x:xs, y:ys) where (xs, ys) = splitInHalf xys
merge :: Ord a => [a] -> [a] -> [a] merge xs [] = xs merge [] ys = ys merge (x:xs) (y:ys) = if x < y then x:(merge xs (y:ys)) else y:(merge ys (x:xs))
As far as I can tell, my implementation is more or less the same as on rosetta code [0], literate programs [1] and several stack overflow questions [2][3].
Well actually the Rosetta code link shows 3 different implementations and yours is the first of the three. Did you try the second one (the bottom up merge)? This is also what's used in the answer to the SO post [3] and according to the author there it's roughly what's used by ghc for Data.List.sort so it's presumably a good algorithm.
The algorithm works, but for large lists (for example, 500k random numbers), I get a stack overflow.
I'm not sure how this gets optimised but your call-stack is essentially as long as the input list. That's only going to work with such big inputs if the recursion is somehow optimised which I think happens because of lazy evaluation. However this depends on your merge function pulling a similar number of items from the 'as' list as from the 'bs' list. Otherwise (I think) you'll have a whole load of splitInHalf frames in the stack and maybe that gives your overflow. I may be *entirely* wrong but I think the problem comes from returning two lists from a function as you do in splitInHalf but then not consuming the same number of items from each at any one time. I don't think this can be optimised in quite the same lazy manner. It would be fine if you consumed from both lists in unison as in e.g. a merge-sum algorithm.
So my question is: How can I change my code so that it works on larger inputs? Or, more generally, how can I write recursive algorithms in functional languages that require more nested function calls than fit in the stack?
Have you tried the other algorithm from e..g Rosetta code? That one looks like it would work better in a lazy evaluation context since it just works sequentially on a list of lists. Oscar