Progress on priority queues for containers

7 Mar 2010

      A brief update:

I'm continuing to work on a priority queue implementation for containers.
I've pretty much settled on a variation on a binomial heap that actually has
the type-checker guarantee the correct binomial structure.  It has the
following properties:

   - Amortized O(1) insertion.  There are variations (skew binomial heaps)
   with guaranteed O(1) insertion, but they add tremendously to the complexity
   of the implementation, if they work at all with this approach (I've
   encountered some difficulties with this approach).  I think this is fine,
   as-is.
   - Worst-case O(log n) union.  (If you're working in a single-threaded
   fashion, I am pretty convinced that you can treat union as amortized O(0),
   but O(log n) is pretty good in any event.)
   - O(1) find-min.
   - Worst-case O(log n) extract.

The primary competing implementation is a pairing heap implementation, with
the following properties:

   - O(1) insertion, union, and find-min, worst-case and amortized.
   - Amortized O(log n) extract, but worst-case O(n).  If used in a
   persistent context, it's possible that a user could encounter a situation
   that performs linear-time extraction operations many times.  (This is my
   primary concern with this implementation, especially given that containers
   users will be expecting a reliable data structure that won't break down.)

For a simple heap sort, the pairing heap is just about twice as fast as the
binomial heap, but I think having reasonable worst-case guarantees for
containers users is important enough to dominate that concern.

There has also been some discussion of using priority search queues --
priority queues in which you can also find previously inserted elements --
but I think this is more appropriate for a separate package, and not
something that is necessarily needed by most users.

I've also written a wide variety of utility functions, which you can see in
the implementation
herehttp://hackage.haskell.org/trac/ghc/attachment/ticket/3909/containers-pqueue....
In particular, a number of Data.List functions are modified to apply to
priority queues, including take, drop, splitAt, takeWhile, dropWhile, span,
break, filter, and partition.  All of these are implemented asymptotically
optimally.

Do people have any other thoughts?

Louis Wasserman
wasserman.louis@gmail.com
http://profiles.google.com/wasserman.louis

Louis Wasserman

Milan Straka

tags

participants (2)