ANN: psqueue-benchmarks - benchmarks of priority queue implementations

(This is a slightly detailed email. If you are the maintainer of one of the packages benchmarked here, you might want to read it though.) Today I was looking for a Priority Queue that also allows a delete operation (some call this a "Priority Search Queue"). I found http://stackoverflow.com/questions/6976559/comparison-of-priority-queue-impl... and after looking at the 10 queue alternatives, I got to the following conclusions: * Only 3 of them allow to delete entries (are p*s*queues) * Most of them are outdated or at least unmaintained for the last 3-5 years * There was an effort to get a priority queue into containers (see the stackoverflow link), but it was not agreed on * Those efforts were driven by Louis Wasserman, who wrote one of the 3 psqueues (queuelike), but now only maintains a non-ps-queue (pqueue) that seems to be the most popular priority queue implementation PSQueue implementations ----------------------- The three packages that are psqueues are: - PSQueue (http://hackage.haskell.org/package/PSQueue-1.1) * original implementation from paper of Ralf Hinze * last upload 2008 * no test suite (but small commented out QC properties), no benchmarks * no code repository - fingertree-psqueue (http://hackage.haskell.org/package/fingertree-psqueue-0.3) * last upload 2011 * no tests, no benchmarks * no code repository - queuelike (http://hackage.haskell.org/package/queuelike-1.0.9) * last upload 2009 * no tests, no benchmarks * no code repository Benchmarks ---------- Unfortunately, none of them had tests, code in repositories or any indication about their real-world performance, so I made some criterion benchmarks. You can find them here: https://github.com/nh2/psqueue-benchmarks Graphs: http://htmlpreview.github.com/?https://raw.github.com/nh2/psqueue-benchmarks... Benchmark results ----------------- * PSQueue throws a stack space overflow if you try to put in 100000 Ints * PSQueue suffers from some significant worst case in terms of queue creation, sometimes creating a queue from random numbers just takes 5 times longer (1st graph). This only happens sometimes (despite Criterion) * queuelike creation is instant - it seems to work around my benchmark somehow * converting a queuelike queue to a list surprisingly takes 10 times longer than with the other packages * in terms of average performance, the three are quite close to each other (apart from the point above). queuelike seems fastest, fingertree-psqueue is second and PSQueue slowest, with a difference of +30% to the next one My questions to the maintainers ------------------------------- @Scott: Do you have an idea why PSQueue stack overflows? @Louis: Why did you switch from queuelike to pqueue? Could you put the code up somewhere manageable (repo)? Is it possible to make pqueue a full pSqueue implementation? Niklas

Hi Niklas,
* PSQueue throws a stack space overflow if you try to put in 100000 * Ints
A slightly different implementation is used in GHC: https://github.com/ghc/packages-base/blob/master/GHC/Event/PSQ.hs Could you test it? If this code also has the same problem, I need to fix it. --Kazu

I do not know why it overflows. It's been a while, but isn't the answer
usually "too much laziness"? Maybe try changing the foldr in fromList to
foldr'? I would try it out quickly but do not have ghc installed on any
computers here.
I am happy start a repo for this library, but there is not much history to
import so anyone else may do it. I'm not sure how hackage upload
permissions work... I guess I just change the maintainer field in the
.cabal file from myself to someone else...? Any volunteers?
On Thu, Mar 28, 2013 at 11:16 PM, Kazu Yamamoto
Hi Niklas,
* PSQueue throws a stack space overflow if you try to put in 100000 * Ints
A slightly different implementation is used in GHC:
https://github.com/ghc/packages-base/blob/master/GHC/Event/PSQ.hs
Could you test it? If this code also has the same problem, I need to fix it.
--Kazu

Hey Scott, I quickly tried your suggestion, plugging in foldr' from Data.Foldable and sprinkling a few seqs in some places, but it doesn't help the stack overflow. On Fri 29 Mar 2013 16:23:55 GMT, Scott Dillard wrote:
I do not know why it overflows. It's been a while, but isn't the answer usually "too much laziness"? Maybe try changing the foldr in fromList to foldr'? I would try it out quickly but do not have ghc installed on any computers here.
I am happy start a repo for this library, but there is not much history to import so anyone else may do it. I'm not sure how hackage upload permissions work... I guess I just change the maintainer field in the .cabal file from myself to someone else...? Any volunteers?
On Thu, Mar 28, 2013 at 11:16 PM, Kazu Yamamoto
mailto:kazu@iij.ad.jp> wrote: Hi Niklas,
> * PSQueue throws a stack space overflow if you try to put in 100000 > * Ints
A slightly different implementation is used in GHC:
https://github.com/ghc/packages-base/blob/master/GHC/Event/PSQ.hs
Could you test it? If this code also has the same problem, I need to fix it.
--Kazu

I had a 5 second look at the PSQueue implementation and here's what I got
so far:
* fromList should use foldl'.
* LTree should be spine strict (i.e. strict in the (LTree k p) fields).
* Winner should be strict in the (LTree k p) field and probably in all
other fields as well.
This is a nice example showing that strict fields is the right default. If
fields need to be lazy there should ideally be a comment explaining why
that is needed (e.g. in the case of finger trees and lists).
On Fri, Mar 29, 2013 at 9:53 AM, Niklas Hambüchen
Hey Scott,
I quickly tried your suggestion, plugging in foldr' from Data.Foldable and sprinkling a few seqs in some places, but it doesn't help the stack overflow.
On Fri 29 Mar 2013 16:23:55 GMT, Scott Dillard wrote:
I do not know why it overflows. It's been a while, but isn't the answer usually "too much laziness"? Maybe try changing the foldr in fromList to foldr'? I would try it out quickly but do not have ghc installed on any computers here.
I am happy start a repo for this library, but there is not much history to import so anyone else may do it. I'm not sure how hackage upload permissions work... I guess I just change the maintainer field in the .cabal file from myself to someone else...? Any volunteers?
On Thu, Mar 28, 2013 at 11:16 PM, Kazu Yamamoto
mailto:kazu@iij.ad.jp> wrote: Hi Niklas,
> * PSQueue throws a stack space overflow if you try to put in 100000 > * Ints
A slightly different implementation is used in GHC:
https://github.com/ghc/packages-base/blob/master/GHC/Event/PSQ.hs
Could you test it? If this code also has the same problem, I need to fix it.
--Kazu
_______________________________________________ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell

Hey Kazu, I added GHC's PSQ to the benchmark, the new figures are on http://htmlpreview.github.com/?https://raw.github.com/nh2/psqueue-benchmarks... No, it does not stack overflow, and it seems to perform slightly better than the other implementations; it also doesn't suffer from the toList slowness problem as does listlike. However, it is probably not as generally usable as it hardcodes the priorities to be Doubles. (So far I only benchmark really trivial things and the other functions could be benchmarked as well.) Niklas On 29/03/13 06:16, Kazu Yamamoto (山本和彦) wrote:
Hi Niklas,
* PSQueue throws a stack space overflow if you try to put in 100000 * Ints
A slightly different implementation is used in GHC:
https://github.com/ghc/packages-base/blob/master/GHC/Event/PSQ.hs
Could you test it? If this code also has the same problem, I need to fix it.
--Kazu

Hi Niklas,
No, it does not stack overflow, and it seems to perform slightly better than the other implementations; it also doesn't suffer from the toList slowness problem as does listlike.
Thanks. It's nice.
However, it is probably not as generally usable as it hardcodes the priorities to be Doubles.
I think that you can import the tips of GHC PSQ to original PSQ. P.S. If you need test cases, you can find some properties for Heap (priority queue) here: https://github.com/kazu-yamamoto/llrbtree/blob/master/test/Heap.hs You can add some properties relating dilatation to them. --Kazu

I actually found a (potential) problem with the GHC implementation. See here: https://github.com/nh2/psqueue-benchmarks/blob/db89731c5b4bdd2ff2ef81022a65f... If I fromList 1000000 entries into the queue, it stack space overflows. I got the same problem with the fingertree implementation, so maybe I just construct the test case wrong and cause the stack space overflow myself, but it works with the other two implementations. Also, looking at the updated graph: http://htmlpreview.github.com/?https://raw.github.com/nh2/psqueue-benchmarks... we can see that GHC's queue is 3 times slower than queuelike for "findmin sequential". Where could the stack overflows come from? Niklas On 30/03/13 09:07, Kazu Yamamoto (山本和彦) wrote:
Hi Niklas,
No, it does not stack overflow, and it seems to perform slightly better than the other implementations; it also doesn't suffer from the toList slowness problem as does listlike.
Thanks. It's nice.
However, it is probably not as generally usable as it hardcodes the priorities to be Doubles.
I think that you can import the tips of GHC PSQ to original PSQ.
P.S.
If you need test cases, you can find some properties for Heap (priority queue) here:
https://github.com/kazu-yamamoto/llrbtree/blob/master/test/Heap.hs
You can add some properties relating dilatation to them.
--Kazu

Does not compiles under ghc 7.6.2
Date: Sat, 13 Apr 2013 11:09:13 +0800 From: mail@nh2.me To: kazu@iij.ad.jp CC: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] ANN: psqueue-benchmarks - benchmarks of priority queue implementations
I actually found a (potential) problem with the GHC implementation.
See here:
https://github.com/nh2/psqueue-benchmarks/blob/db89731c5b4bdd2ff2ef81022a65f...

Hi,
See here:
https://github.com/nh2/psqueue-benchmarks/blob/db89731c5b4bdd2ff2ef81022a65f...
If I fromList 1000000 entries into the queue, it stack space overflows.
Are you sure that this is a bug of GHC PSQ? I think that "replicateM _GHC_CRASH_N" causes "Stack space overflow". If you compile it with -rtsopts and run it +RTS -K100M, I guess you don't see the problem. --Kazu

On Fri, 29 Mar 2013, Niklas Hambüchen wrote:
(This is a slightly detailed email. If you are the maintainer of one of the packages benchmarked here, you might want to read it though.)
Could you please put your experiences the Wiki? This would help others to choose a package.

Bearing in mind that I haven't looked at this in several years...
Why did you switch from queuelike to pqueue?
Because I liked the API better?
Could you put the code up somewhere manageable (repo)?
I had it up on darcs, but since that's not there any more, I don't have any more source history than you do.
Is it possible to make pqueue a full pSqueue implementation? Not without rewriting the API and data structure...or, equivalently, just starting from scratch.
Louis Wasserman
wasserman.louis@gmail.com
http://profiles.google.com/wasserman.louis
On Thu, Mar 28, 2013 at 10:20 PM, Niklas Hambüchen
(This is a slightly detailed email. If you are the maintainer of one of the packages benchmarked here, you might want to read it though.)
Today I was looking for a Priority Queue that also allows a delete operation (some call this a "Priority Search Queue").
I found
http://stackoverflow.com/questions/6976559/comparison-of-priority-queue-impl... and after looking at the 10 queue alternatives, I got to the following conclusions:
* Only 3 of them allow to delete entries (are p*s*queues) * Most of them are outdated or at least unmaintained for the last 3-5 years * There was an effort to get a priority queue into containers (see the stackoverflow link), but it was not agreed on * Those efforts were driven by Louis Wasserman, who wrote one of the 3 psqueues (queuelike), but now only maintains a non-ps-queue (pqueue) that seems to be the most popular priority queue implementation
PSQueue implementations -----------------------
The three packages that are psqueues are:
- PSQueue (http://hackage.haskell.org/package/PSQueue-1.1) * original implementation from paper of Ralf Hinze * last upload 2008 * no test suite (but small commented out QC properties), no benchmarks * no code repository
- fingertree-psqueue (http://hackage.haskell.org/package/fingertree-psqueue-0.3) * last upload 2011 * no tests, no benchmarks * no code repository
- queuelike (http://hackage.haskell.org/package/queuelike-1.0.9) * last upload 2009 * no tests, no benchmarks * no code repository
Benchmarks ----------
Unfortunately, none of them had tests, code in repositories or any indication about their real-world performance, so I made some criterion benchmarks. You can find them here:
https://github.com/nh2/psqueue-benchmarks Graphs:
http://htmlpreview.github.com/?https://raw.github.com/nh2/psqueue-benchmarks...
Benchmark results -----------------
* PSQueue throws a stack space overflow if you try to put in 100000 Ints
* PSQueue suffers from some significant worst case in terms of queue creation, sometimes creating a queue from random numbers just takes 5 times longer (1st graph). This only happens sometimes (despite Criterion)
* queuelike creation is instant - it seems to work around my benchmark somehow
* converting a queuelike queue to a list surprisingly takes 10 times longer than with the other packages
* in terms of average performance, the three are quite close to each other (apart from the point above). queuelike seems fastest, fingertree-psqueue is second and PSQueue slowest, with a difference of +30% to the next one
My questions to the maintainers -------------------------------
@Scott: Do you have an idea why PSQueue stack overflows?
@Louis: Why did you switch from queuelike to pqueue? Could you put the code up somewhere manageable (repo)? Is it possible to make pqueue a full pSqueue implementation?
Niklas

On Fri, 29 Mar 2013, Louis Wasserman wrote:
Bearing in mind that I haven't looked at this in several years...
Why did you switch from queuelike to pqueue?
Because I liked the API better?
Could you put the code up somewhere manageable (repo)?
I had it up on darcs, but since that's not there any more, I don't have any more source history than you do.
Was it on code.haskell.org? Then it might have been moved to a non-web directory after the last attack 2011.

Does that mean the repo is still there without web access or gone? On 29/03/13 20:14, Henning Thielemann wrote:
Was it on code.haskell.org? Then it might have been moved to a non-web directory after the last attack 2011.

On Fri, 29 Mar 2013, Niklas Hambüchen wrote:
On 29/03/13 20:14, Henning Thielemann wrote:
Was it on code.haskell.org? Then it might have been moved to a non-web directory after the last attack 2011.
Does that mean the repo is still there without web access
I assume that.
or gone?
If the original author has not deleted it, then it should still be somewhere: http://www.haskell.org/pipermail/haskell-cafe/2011-February/089352.html

Hey Louis, I think that queuelike is still a nice psqueue implementation (and I personally don't dislike the api), so may I ask two more questions: * Do you have any clue why toList is 10 times slower than in the other implementation? It is based on extract, and queuelike's extract is very fast compared to the others ... that is weird. * What could I do such that queuelike creation is not measured as instant? Using whnf does not seem to be enough. Thank you Niklas

I don't remember the answer to either of your questions, I'm afraid --
queuelike was last updated in 2009 (!), and that's really the last time I
looked at it. That said, I'm not sure I follow how queuelike is a psqueue
at all as opposed to a pqueue?
Louis Wasserman
wasserman.louis@gmail.com
http://profiles.google.com/wasserman.louis
On Fri, Mar 29, 2013 at 3:17 PM, Niklas Hambüchen
Hey Louis,
I think that queuelike is still a nice psqueue implementation (and I personally don't dislike the api), so may I ask two more questions:
* Do you have any clue why toList is 10 times slower than in the other implementation? It is based on extract, and queuelike's extract is very fast compared to the others ... that is weird.
* What could I do such that queuelike creation is not measured as instant? Using whnf does not seem to be enough.
Thank you Niklas

On 30/03/13 06:44, Louis Wasserman wrote:
That said, I'm not sure I follow how queuelike is a psqueue at all as opposed to a pqueue?
Louis, you are actually right. I was tricked by the delete function, which takes only the queue, not the key, so it simply pops the top - queuelike is not a psqueue.
participants (7)
-
Branimir Maksimovic
-
Henning Thielemann
-
Johan Tibell
-
Kazu Yamamoto
-
Louis Wasserman
-
Niklas Hambüchen
-
Scott Dillard