
I've been puzzled by the lack of a priority queue implementation in the hierarchical libraries for some time. Sure, you could use a Data.Map as a priority queue, but something about having to pay O(log n) to find the smallest element bugs me. So, I decided to implement my own priority queue with optimal asymptotic time complexity. That is: insert : O(1) findMin : O(1) meld : O(1) deleteMin : O(log n) It's based heavily on the strategy detailed in the paper "Optimal Purely Functional Priority Queues" by Gerth StØlting Brodal and Chris Okasaki. This is asymptotically better than other, simpler, data structures, but the "constant factor" is likely very high (I haven't done any sort of serious benchmarking). I figured that if you have a small data set, the speed isn't going to matter much anyway, and if you have a large enough data set the asymptotically better data structure is faster. Anyway This was just something I threw together when I had some spare time. There may be misstakes in there, there's probably tons of room for improvement, but at least it's a priority queue! Download at: http://www.dtek.chalmers.se/~sylvan/PriorityQueue/ ----------------------------------------------------------------------------------------------------------- File Description ----------------------------------------------------------------------------------------------------------- PrimPQ.hs The internal workings of the skewed binomial tree PriorityQueue.hs The main priority queue module PriorityQueueTests.hs Some quickcheck properties ----------------------------------------------------------------------------------------------------------- /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

I have something I call a "HashHeap" which I use in my "RevModeDiff" (automatic differentiation) module. Both are here: http://ofb.net/~frederik/futility/ I wrote it when I was first learning Haskell so it's a bit unpolished, but I think some reincarnation of the same concept might be a good complement to a priority queue implementation such as yours. What it is is a map of priorities to hashtables of keys to values. Each value has a priority and a key. This allows you to insert something with a non-unique priority and then remove it later. You can also remove everything with the smallest priority - all the values in the smallest-priority hash table. It does take log(n) time to find the top priority. In my application this is not a problem, though, since many things share the same priority. If I were to have taken the purely functional route of using a map of (priority,value) tuple keys then it might be a bigger problem. However, my values are not Ord so that route was not allowed. I often find that when I want to use priority queues, I'm annoyed by the fact that I can't remove something which is not the minimum. Suppose I'm scheduling jobs and I want to cancel a job, etc. I can imagine that with an imperative data structure I could have it both ways - deletion of internal elements, and O(1) findMin. Is this not possible with a functional implementation? It doesn't seem to be a feature of your implementation. If it were possible, it seems like it might be nice to have it as part of the interface. One more thing, about the order of arguments: insertBy :: (a -> a -> Ordering) -- ^ An ordering function -> a -- ^ The element to insert -> PriorityQueue a -- ^ The priority queue in which the element will be inserted -> PriorityQueue a -- ^ The resulting priority queue I see the same kind of argument ordering in Data.Map. Why is the data structure last? Usually one would think that the arguments should be ordered from "most constant" to "least constant" so that currying is as useful as possible. But surely the data structure is "more constant" than its elements, even (especially?) if it is persistent. Cheers, Frederik On Thu, Aug 11, 2005 at 08:16:56PM +0200, Sebastian Sylvan wrote:
I've been puzzled by the lack of a priority queue implementation in the hierarchical libraries for some time. Sure, you could use a Data.Map as a priority queue, but something about having to pay O(log n) to find the smallest element bugs me.
So, I decided to implement my own priority queue with optimal asymptotic time complexity. That is: insert : O(1) findMin : O(1) meld : O(1) deleteMin : O(log n)
It's based heavily on the strategy detailed in the paper "Optimal Purely Functional Priority Queues" by Gerth StØlting Brodal and Chris Okasaki.
This is asymptotically better than other, simpler, data structures, but the "constant factor" is likely very high (I haven't done any sort of serious benchmarking). I figured that if you have a small data set, the speed isn't going to matter much anyway, and if you have a large enough data set the asymptotically better data structure is faster.
Anyway This was just something I threw together when I had some spare time. There may be misstakes in there, there's probably tons of room for improvement, but at least it's a priority queue!
Download at: http://www.dtek.chalmers.se/~sylvan/PriorityQueue/
----------------------------------------------------------------------------------------------------------- File Description ----------------------------------------------------------------------------------------------------------- PrimPQ.hs The internal workings of the skewed binomial tree PriorityQueue.hs The main priority queue module PriorityQueueTests.hs Some quickcheck properties -----------------------------------------------------------------------------------------------------------
/S
-- Sebastian Sylvan +46(0)736-818655 UIN: 44640862 _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On 8/14/05, Frederik Eaton
I often find that when I want to use priority queues, I'm annoyed by the fact that I can't remove something which is not the minimum. Suppose I'm scheduling jobs and I want to cancel a job, etc. I can imagine that with an imperative data structure I could have it both ways - deletion of internal elements, and O(1) findMin. Is this not possible with a functional implementation?
Well you could get it at a fairly expensive cost. It is possible to take, say, a finiteMap implementation and "adapt it" to have a globally stored root which gets you O(1) findMin and (log n) for most everything else. This implementation gets optimal time complexities for other operations as well, though. O(1) for everyting except deleteMin which is (log n). O(1) for meld is, for example, quite neat.
I see the same kind of argument ordering in Data.Map. Why is the data structure last? Usually one would think that the arguments should be ordered from "most constant" to "least constant" so that currying is as useful as possible. But surely the data structure is "more constant" than its elements, even (especially?) if it is persistent.
I would think that the most constant parameter is the ordering function. The second most constant is debatable, but I tend to think that it's more common to insert the same key into a list of priority queues, for example, rather than insert different keys in to the same priority queue (that is, as long as the queue is side-effect free)... As a side note. After I posted this I found that the implementation of Edison was a lot further along than expected (there just wasn't any documentation). And apparently if you load the package "data" you get access to all that (LazyPairingHeap seems to be faster than my implementation). I happen to think that any language with self respect needs a standard data structures library. Even if "standard" is just de-facto standard. Personally I think they should include at least one implementation of all the regular data structures, along with the type class hierarchy of Edison, in the hierarchical libraries. But barring that I'm willing to settle for a nice Edison-like library with haddock documentation and a cabalised installation. I have a vague memory of running across a pseudo-complete data structures library (not Edison) a while back, but I'm having trouble finding it. /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

Sebastian Sylvan wrote:
I can imagine that with an imperative data structure I could have it both ways - deletion of internal elements, and O(1) findMin. Is this not possible with a functional implementation? It doesn't seem to be a feature of your implementation. If it were possible, it seems like it might be nice to have it as part of the interface.
When implementing a physical simulation (many-body 1d gravity) I needed a priority queue to schedule interactions that could handle many updates. Each interaction produces 2 or 3 scheduling updates. My first project in Haskell was a direct port of the imperative prototype of this simulation. Mutable arrays (in the ST monad) held a binary priority heap and a second array indexing each event in the heap. Obviously a more elegant solution was desirable. Then I ran across references to the Priority Search Queue. See Ralph Hinze's "A Simple Implementation Technique for Priority Search Queues". http://archive.cs.uu.nl/pub/RUU/CS/techreps/CS-2001/2001-09.pdf You get 0(1) findMin and O(log n) deleteMin, delete, and update. This is just what I was looking for and much more flexible as the foundation of a discreet event-based simulation library, the core of which I intend to implement when I find the time... But now, thanks to this thread, I'll first have to find time to check out Lazy Pairing Heaps and the Edison Library in general. Joel

I see the same kind of argument ordering in Data.Map. Why is the data structure last? Usually one would think that the arguments should be ordered from "most constant" to "least constant" so that currying is as useful as possible. But surely the data structure is "more constant" than its elements, even (especially?) if it is persistent.
I would think that the most constant parameter is the ordering function. The second most constant is debatable, but I tend to think that it's more common to insert the same key into a list of priority queues, for example, rather than insert different keys in to the same priority queue (that is, as long as the queue is side-effect free)...
Maybe it's a 'foldr' versus 'foldl' thing, where you and Data.Map have chosen 'foldr'. Personally I prefer 'foldl'. After all, traversing the list in order is more intuitive. Frederik
participants (3)
-
Frederik Eaton
-
Joel Koerwer
-
Sebastian Sylvan