
Hi, I've just started learning Haskell and am trying to implement some basic algorithms. My shot at merge sort looks like this: mergeSort :: Ord a => [a] -> [a] mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = merge (mergeSort as) (mergeSort bs) where (as, bs) = splitInHalf xs splitInHalf :: [a] -> ([a], [a]) splitInHalf [] = ([], []) splitInHalf [x] = ([x], []) splitInHalf (x:y:xys) = (x:xs, y:ys) where (xs, ys) = splitInHalf xys merge :: Ord a => [a] -> [a] -> [a] merge xs [] = xs merge [] ys = ys merge (x:xs) (y:ys) = if x < y then x:(merge xs (y:ys)) else y:(merge ys (x:xs)) As far as I can tell, my implementation is more or less the same as on rosetta code [0], literate programs [1] and several stack overflow questions [2][3]. The algorithm works, but for large lists (for example, 500k random numbers), I get a stack overflow. So my question is: How can I change my code so that it works on larger inputs? Or, more generally, how can I write recursive algorithms in functional languages that require more nested function calls than fit in the stack? Regards Florian [0] http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#Haskell [1] http://en.literateprograms.org/Merge_sort_(Haskell) [2] http://stackoverflow.com/questions/7554226/haskell-merge-sort-sorting-words-... [3] http://stackoverflow.com/questions/1215432/merge-sort-in-haskell

Hi, I've just started learning Haskell and am trying to implement some basic algorithms. My shot at merge sort looks like this: mergeSort :: Ord a => [a] -> [a] mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = merge (mergeSort as) (mergeSort bs) where (as, bs) = splitInHalf xs splitInHalf :: [a] -> ([a], [a]) splitInHalf [] = ([], []) splitInHalf [x] = ([x], []) splitInHalf (x:y:xys) = (x:xs, y:ys) where (xs, ys) = splitInHalf xys merge :: Ord a => [a] -> [a] -> [a] merge xs [] = xs merge [] ys = ys merge (x:xs) (y:ys) = if x < y then x:(merge xs (y:ys)) else y:(merge ys (x:xs)) As far as I can tell, my implementation is more or less the same as on rosetta code [0], literate programs [1] and several stack overflow questions [2][3]. The algorithm works, but for large lists (for example, 500k random numbers), I get a stack overflow. So my question is: How can I change my code so that it works on larger inputs? Or, more generally, how can I write recursive algorithms in functional languages that require more nested function calls than fit in the stack? Regards Florian [0] http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#Haskell [1] http://en.literateprograms.org/Merge_sort_(Haskell) [2] http://stackoverflow.com/questions/7554226/haskell-merge-sort-sorting-words-... [3] http://stackoverflow.com/questions/1215432/merge-sort-in-haskell

On 16 September 2013 18:35, Florian Lammel
Hi,
Hi Florian,
I've just started learning Haskell and am trying to implement some basic algorithms.
I'm a beginner in Haskell as well so my advice may be corrected by someone else :)
My shot at merge sort looks like this:
mergeSort :: Ord a => [a] -> [a] mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = merge (mergeSort as) (mergeSort bs) where (as, bs) = splitInHalf xs
splitInHalf :: [a] -> ([a], [a]) splitInHalf [] = ([], []) splitInHalf [x] = ([x], []) splitInHalf (x:y:xys) = (x:xs, y:ys) where (xs, ys) = splitInHalf xys
merge :: Ord a => [a] -> [a] -> [a] merge xs [] = xs merge [] ys = ys merge (x:xs) (y:ys) = if x < y then x:(merge xs (y:ys)) else y:(merge ys (x:xs))
As far as I can tell, my implementation is more or less the same as on rosetta code [0], literate programs [1] and several stack overflow questions [2][3].
Well actually the Rosetta code link shows 3 different implementations and yours is the first of the three. Did you try the second one (the bottom up merge)? This is also what's used in the answer to the SO post [3] and according to the author there it's roughly what's used by ghc for Data.List.sort so it's presumably a good algorithm.
The algorithm works, but for large lists (for example, 500k random numbers), I get a stack overflow.
I'm not sure how this gets optimised but your call-stack is essentially as long as the input list. That's only going to work with such big inputs if the recursion is somehow optimised which I think happens because of lazy evaluation. However this depends on your merge function pulling a similar number of items from the 'as' list as from the 'bs' list. Otherwise (I think) you'll have a whole load of splitInHalf frames in the stack and maybe that gives your overflow. I may be *entirely* wrong but I think the problem comes from returning two lists from a function as you do in splitInHalf but then not consuming the same number of items from each at any one time. I don't think this can be optimised in quite the same lazy manner. It would be fine if you consumed from both lists in unison as in e.g. a merge-sum algorithm.
So my question is: How can I change my code so that it works on larger inputs? Or, more generally, how can I write recursive algorithms in functional languages that require more nested function calls than fit in the stack?
Have you tried the other algorithm from e..g Rosetta code? That one looks like it would work better in a lazy evaluation context since it just works sequentially on a list of lists. Oscar

On Mon, Sep 16, 2013 at 2:34 PM, Oscar Benjamin
I'm not sure how this gets optimised but your call-stack is
GHC's stack is a pattern match stack, *not* a call stack. While implementations are allowed to differ in how they produce non-strict evaluation, GHC uses graph reduction; "calls" are not necessarily done in the order you would expect in a procedural language. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 16 September 2013 19:39, Brandon Allbery
On Mon, Sep 16, 2013 at 2:34 PM, Oscar Benjamin
wrote: I'm not sure how this gets optimised but your call-stack is
GHC's stack is a pattern match stack, *not* a call stack. While implementations are allowed to differ in how they produce non-strict evaluation, GHC uses graph reduction; "calls" are not necessarily done in the order you would expect in a procedural language.
Okay so I'm not so clear on the terminology that should be used for these things. My understanding is that lazy recursion over a list is fine in Haskell where it would be problematic in procedural languages, for example this function: timesTwo (x:xs) = 2*x:(timesTwo xs) recurses for every item in a list and could blow the stack for even small lists in a procedural language but is okay in Haskell. My understanding is that it's because it just evaluates whichever elements of the list are needed at the time so if you consume the list e.g. printing out the items then it never builds up any actual accumulating stack/heap of in-memory data during processing. However when you do this splitInHalf (x:y:xys) = (x:xs, y:ys) where (xs, ys) = splitInHalf xys and print out 1000s of values from one list and not the other then surely *something* has to build up in memory. How else would it remember which items belong in the other list? (This is what will happen when the OP tries to sort 500k random numbers since sqrt(500k) is roughly 1000.) Or am I still just not getting it? Oscar

Le 16 sept. 2013 19:35, "Florian Lammel"
splitInHalf :: [a] -> ([a], [a]) splitInHalf [] = ([], []) splitInHalf [x] = ([x], []) splitInHalf (x:y:xys) = (x:xs, y:ys) where (xs, ys) = splitInHalf xys
Try the following :
where ~(xs, ys) = splitInHalf xys
And let us know if that help. -- Chaddaï

I an iterative solution works much better. Here's an example, using your
merge function.
mergeSort1 :: Ord a => [a] -> [a]
mergeSort1 = go . map (:[])
where go [] = []
go [xs] = xs
go xss = go (mergePairs xss)
mergePairs (xs:ys:pairs) = merge xs ys : mergePairs pairs
mergePairs pairs = pairs
h> :!ghc -O2 Merge -fforce-recomp
[1 of 1] Compiling Merge ( Merge.hs, Merge.o )
h> :m + Data.List
h> :load Merge
Ok, modules loaded: Merge.
h> :set +s
h> length $ let n = 1000000 in mergeSort [n, n-1 .. 1]
1000000
(5.15 secs, 2924212128 bytes)
h> length $ let n = 1000000 in mergeSort1 [n, n-1 .. 1]
1000000
(1.58 secs, 1027356744 bytes)
h> length $ let n = 1000000 in sort [n, n-1 .. 1]
1000000
(0.24 secs, 107481296 bytes)
The source to Data.List's sort is clever, you should give it a read:
http://www.haskell.org/ghc/docs/latest/html/libraries/base/src/Data-List.htm...
On Mon, Sep 16, 2013 at 10:35 AM, Florian Lammel
Hi,
I've just started learning Haskell and am trying to implement some basic algorithms. My shot at merge sort looks like this:
mergeSort :: Ord a => [a] -> [a] mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = merge (mergeSort as) (mergeSort bs) where (as, bs) = splitInHalf xs
splitInHalf :: [a] -> ([a], [a]) splitInHalf [] = ([], []) splitInHalf [x] = ([x], []) splitInHalf (x:y:xys) = (x:xs, y:ys) where (xs, ys) = splitInHalf xys
merge :: Ord a => [a] -> [a] -> [a] merge xs [] = xs merge [] ys = ys merge (x:xs) (y:ys) = if x < y then x:(merge xs (y:ys)) else y:(merge ys (x:xs))
As far as I can tell, my implementation is more or less the same as on rosetta code [0], literate programs [1] and several stack overflow questions [2][3].
The algorithm works, but for large lists (for example, 500k random numbers), I get a stack overflow.
So my question is: How can I change my code so that it works on larger inputs? Or, more generally, how can I write recursive algorithms in functional languages that require more nested function calls than fit in the stack?
Regards Florian
[0] http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#Haskell [1] http://en.literateprograms.org/Merge_sort_(Haskell) [2] http://stackoverflow.com/questions/7554226/haskell-merge-sort-sorting-words-... [3] http://stackoverflow.com/questions/1215432/merge-sort-in-haskell
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
participants (5)
-
Bob Ippolito
-
Brandon Allbery
-
Chaddaï Fouché
-
Florian Lammel
-
Oscar Benjamin