
Hi all, This is a request for early feedback on something that I'd like to propose for the next major HP release. Milan Straka showed [1] that it's possible to design a persistent set (and thus also map) for string-like keys that's about twice as fast as the current Data.Set using hashing and a Patricia tree (i.e. IntMap). The Clojure community has also managed to create a fast persistent map, based on Bagwell's hash array mapped trie. I'd like to propose that we get the necessary infrastructure in place for creating fast and easy to use hash-based data structure. Here's what I think is needed: * A type class for things that can be hashed, Hashable. class Hashable a where hash :: a -> Word An alternative would be to have `hash` return an Int, which would make the method easier to use with Haskell data structures, which are indexed using Ints, but Word feels like the more correct type to me. The Hashable type class would have to go in either base, containers, or a new package (there's currently a hashable package on Hackage.) * Instances of Hashable for common types, in particular ByteString and Text. I already have fast instances for Text and ByteString, based on MurmurHash2 [2]. * Two new data types in the containers package, Data.HashMap and Data.HashSet. These would be persistent data structures, initially based on IntMap, and have the same or similar API to Data.Map and Data.Set.. In the future we could also add mutable HashTables, operating in the ST monad, or other hash-based data types (persistent or mutable). Any comments? Johan 1. http://research.microsoft.com/~simonpj/papers/containers/containers.pdf 2. http://sites.google.com/site/murmurhash/