
On Wed, Mar 03, 2010 at 03:23:01PM +0100, Milan Straka wrote:
I have started an internship in Cambridge and will spend 3 months on Haskell. One of the possible projects is improving the containers library. Things I am thinking about: - measuring the performance of existing implementations (notably Map and Set) and possibly improve them (either without or with API changes), - adding Queue and a PriorityQueue, - maybe adding a generalized trie, - maybe adding a hashtable (more like a trie of the hashes, something in the line of Ideal hash trees; and maybe a 'classic' hash modifiable in the ST monad?)
The containers package aims to provide general purpose immutable containers in portable Haskell. So things like generalized tries or mutable hash tables, while worthwhile, should be in different packages. Ordinary tries (i.e. ListMaps) would fit though. Data.Sequence already provides a queue. A Banker's queue would be a little faster, but IMO the difference isn't enough to justify the special case. The two-stack implementation would be faster, but it performs poorly in a persistent setting, so fails the "general purpose" criterion. A priority queue and (priority search queue) would be useful. It would be important to have an efficient union operation. Then you have to decide whether this needs to be stable. It's a trade-off between speed and utility. Skew heaps are simple to implement and very fast, but don't support stable union. Implementations that do seem to be a bit slower.