Progress on priority queues for containers

A brief update: I'm continuing to work on a priority queue implementation for containers. I've pretty much settled on a variation on a binomial heap that actually has the type-checker guarantee the correct binomial structure. It has the following properties: - Amortized O(1) insertion. There are variations (skew binomial heaps) with guaranteed O(1) insertion, but they add tremendously to the complexity of the implementation, if they work at all with this approach (I've encountered some difficulties with this approach). I think this is fine, as-is. - Worst-case O(log n) union. (If you're working in a single-threaded fashion, I am pretty convinced that you can treat union as amortized O(0), but O(log n) is pretty good in any event.) - O(1) find-min. - Worst-case O(log n) extract. The primary competing implementation is a pairing heap implementation, with the following properties: - O(1) insertion, union, and find-min, worst-case and amortized. - Amortized O(log n) extract, but worst-case O(n). If used in a persistent context, it's possible that a user could encounter a situation that performs linear-time extraction operations many times. (This is my primary concern with this implementation, especially given that containers users will be expecting a reliable data structure that won't break down.) For a simple heap sort, the pairing heap is just about twice as fast as the binomial heap, but I think having reasonable worst-case guarantees for containers users is important enough to dominate that concern. There has also been some discussion of using priority search queues -- priority queues in which you can also find previously inserted elements -- but I think this is more appropriate for a separate package, and not something that is necessarily needed by most users. I've also written a wide variety of utility functions, which you can see in the implementation herehttp://hackage.haskell.org/trac/ghc/attachment/ticket/3909/containers-pqueue.... In particular, a number of Data.List functions are modified to apply to priority queues, including take, drop, splitAt, takeWhile, dropWhile, span, break, filter, and partition. All of these are implemented asymptotically optimally. Do people have any other thoughts? Louis Wasserman wasserman.louis@gmail.com http://profiles.google.com/wasserman.louis

Hello, I have uploaded a small benchmark to the ticket.
What do people think of the following?
Put pairing heaps (by far the fastest implementation so far, nearly 2.3x faster than binomial heaps) into containers. Launch packages containing high-quality priority search queues and real-time-ish binomial queues. (The binomial queue implementation guarantees that deleteMin has worst-case time O(log n), while pairing heaps have worst-case O(n). For people with real-time speed needs, this is relevant.) Endorse these packages in the documentation in Data.PQueue, for people with these specialized needs.
I think PQueue in containers should work in a persistent setting, which pairing heaps do not. That is why I put lazy pairing heaps in the benchmark. This variant is from Okasaki's book Purely Functional Data Structures and is _conjectured_ (and believed, at least by Okasaki and me :) to have same amortized bounds in persistent setting. Personally I would put lazy pairing heaps to the containers (beware, the current implementation has to be yet worked if we agree). If not, I would use binomial heaps or maybe benchmark leftist heaps. Have a nice day, Milan
participants (2)
-
Louis Wasserman
-
Milan Straka