RE: Weird profiling behaviour

Simon Marlow

26 Jun 2002 26 Jun '02

1:35 p.m.

...
...
The default definition of sortBy uses insertion sort

...
I have vague recollection of the wisdom of this choice being questioned

And now I think I'm about question it as well...

5.02 uses quicksort, but 5.04 will use mergesort instead which has much more predictable performance behaviour. Cheers, Simon

Show replies by date

Koen Claessen

26 Jun 26 Jun

2:06 p.m.

New subject: mergesort

Simon, | 5.02 uses quicksort, but 5.04 will use mergesort | instead which has much more predictable performance | behaviour. What implementation of mergesort are you using? (Could you send me code?) I found that all implementations of mergesort I tried perform badly on large list (like 10000 elements). GHC blows out of stack in that case, but not GHC's current built-in sort function. I assume you also realize that quicksort behaves more lazy than mergesort, which might be the reason for the above. I guess there should be a module from which one can pick the best known implementation of a particular sorting algorithm. Regards, /Koen.

Andrew J Bromage

11:26 p.m.

New subject: mergesort

G'day all. On Wed, Jun 26, 2002 at 04:06:50PM +0200, Koen Claessen wrote:

...

What implementation of mergesort are you using? (Could you send me code?)

I don't know for sure, but I'd be willing to bet it's Carsten Holst's merge sort with Andy Gill's partition function. If you have the GHC sources handy, you'll find it in ghc/compiler/utils/Util.lhs and the function is called mergeSort (note the capitalisation). There are a number of variations for various alternate orderings in there too. If you don't have the GHC source and don't want to download it for some reason, I've temporarily put a copy of the relevant file at: http://alicorna.net/Util.lhs This will be removed in a few days. Cheers, Andrew Bromage

ketil＠ii.uib.no

27 Jun 27 Jun

6:49 a.m.

New subject: Weird profiling behaviour

"Simon Marlow" writes:

...

5.02 uses quicksort,

That's funny, since I see quadratic scaling, I must be hitting worst case both times? 'sort' and 'sortBy' *are* implemented in the same way, right? -kzm -- If I haven't seen further, it is by standing in the footprints of giants

Colin Runciman

7:39 a.m.

New subject: Weird profiling behaviour

Ketil Z. Malde wrote:

...

...
5.02 uses quicksort,

That's funny, since I see quadratic scaling, I must be hitting worst case both times? 'sort' and 'sortBy' *are* implemented in the same way, right?

Implementations of QuickSort on lists usually take the easy option of using the head of the list as the threshold value for partitioning. As a consequence QuickSort does indeed degenerate to quadratic cost in the worst case. Also, curiously enough, it could just as well be the problem that your int-sorting phase has too *little* sorting to do, as this common version of quickSort degenerates both for in-order and reverse-order inputs. Regards Colin R

ketil＠ii.uib.no

9:10 a.m.

New subject: Weird profiling behaviour

Colin Runciman writes:

...

Also, curiously enough, it could just as well be the problem that your int-sorting phase has too *little* sorting to do, as this common version of quickSort degenerates both for in-order and reverse-order inputs.

*lights go on* Of course! While I have about 90K values to sort, it's only a range from 0 to about 5-600, and a less than even distribution at that. (I must be a lot denser than I thought. Colin, if you ever happen to pass by, do let me know, I think I owe you a beer.) Okay: bucket sort; does anybody know of a nice bucket sort I can rip off? :-) (Actually, while I haven't done the math or the tests to say for sure, I suspect a trivial mod to QS where equals are kept in a separate list might do just fine. Would that be a sensible modification to put in the standard libraries, I wonder?) Thanks again! -kzm -- If I haven't seen further, it is by standing in the footprints of giants

Eray Ozkural

15 Jul 15 Jul

3:44 p.m.

New subject: Weird profiling behaviour

On Thursday 27 June 2002 12:10, Ketil Z. Malde wrote:

...

*lights go on*

Of course! While I have about 90K values to sort, it's only a range from 0 to about 5-600, and a less than even distribution at that. (I must be a lot denser than I thought. Colin, if you ever happen to pass by, do let me know, I think I owe you a beer.)

Why don't you implement a decent quicksort (like randomized quicksort) that works on an array? -- Eray Ozkural Comp. Sci. Dept., Bilkent University, Ankara www: http://www.cs.bilkent.edu.tr/~erayo GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C

8387

Age (days ago)

8406

Last active (days ago)

List overview

Download

6 comments

6 participants

participants (6)

Andrew J Bromage
Colin Runciman
Eray Ozkural
ketil＠ii.uib.no
Koen Claessen
Simon Marlow