ANN: unordered-containers - a new, faster hashing-based containers library

Hi all, I am delighted to announce the release of preview versions of two new packages: unordered-containers 0.1 Efficient hashing-based container types. http://hackage.haskell.org/package/unordered-containers hashable 1.1 Efficient hashing of common types. http://hackage.haskell.org/package/hashable These packages address a need for faster container types in Haskell, especially for string types like ByteString and Text. The package achieves better performance by using hashing and a Patricia trie (also known as a radix tree), instead of a balanced tree, in its implementation. The provided HashMap type is three times faster than Data.Map for lookups and twice as fast for inserts, for all key types I've tried, including ByteString, String, and Int. I'm calling this a preview release, even though the code hash been well tested both for correctness and performance, as the API is quite basic and doesn't provide a HashSet type yet. One thing you'll notice is that the API is split into a value-strict and a value-lazy version, making it easier to use in a strict setting. If you want to contribute, please get copies of the source trees from here: * git clone https://github.com/tibbe/unordered-containers.git * git clone https://github.com/tibbe/hashable.git Alternatively, just fork the projects on GitHub: http://github.com/tibbe As I mentioned in my talk on hashing-based containers earlier this week, I'm working on an even faster version, based on hash-array mapped tries, but it won't be ready until GHC 7.2. Johan

What are your thoughts on iterative construction of maplike datastructures? Could something like builder work for maps, too? In the project I'm working on, I have a function that receives a bunch of YAML fragments and builds a big YAML map out of them. -- Jason Dusek Linux User #510144 | http://counter.li.org/

Hi Johan,
this work is highly appreciated, thanks!
* Johan Tibell
Hi all,
I am delighted to announce the release of preview versions of two new packages:
unordered-containers 0.1 Efficient hashing-based container types. http://hackage.haskell.org/package/unordered-containers
hashable 1.1 Efficient hashing of common types. http://hackage.haskell.org/package/hashable
These packages address a need for faster container types in Haskell, especially for string types like ByteString and Text. The package achieves better performance by using hashing and a Patricia trie (also known as a radix tree), instead of a balanced tree, in its implementation.
-- Roman I. Cheplyaka :: http://ro-che.info/ Don't worry what people think, they don't do it very often.

On 2/18/11 8:38 PM, Johan Tibell wrote:
Hi all,
I am delighted to announce the release of preview versions of two new packages:
unordered-containers 0.1 Efficient hashing-based container types. http://hackage.haskell.org/package/unordered-containers
How does, or will, this package differ from Milan Straka's http://hackage.haskell.org/package/hashmap ? -- Live well, ~wren

On Sat, Feb 19, 2011 at 4:49 PM, wren ng thornton
How does, or will, this package differ from Milan Straka's http://hackage.haskell.org/package/hashmap ?
It's a continuation of that work. The unordered-containers implementation is more space efficient and a bit faster. Milan also uses a Patricia trie in his implementation. I plan to switch to a hash-array mapped trie (HAMT) in the future. It's about 25% faster for lookups but at the moment quite a bit slower for inserts. Once we got some new faster array copy primops in GHC we can hopefully improve the insert time a lot. A HAMT also has much lower space overhead per entry. Johan

On Sat, Feb 19, 2011 at 4:38 AM, Johan Tibell
Hi all,
I am delighted to announce the release of preview versions of two new packages:
unordered-containers 0.1 Efficient hashing-based container types. http://hackage.haskell.org/package/unordered-containers
hashable 1.1 Efficient hashing of common types. http://hackage.haskell.org/package/hashable
These packages address a need for faster container types in Haskell, especially for string types like ByteString and Text. The package achieves better performance by using hashing and a Patricia trie (also known as a radix tree), instead of a balanced tree, in its implementation.
The provided HashMap type is three times faster than Data.Map for lookups and twice as fast for inserts, for all key types I've tried, including ByteString, String, and Int.
I'm calling this a preview release, even though the code hash been well tested both for correctness and performance, as the API is quite basic and doesn't provide a HashSet type yet. One thing you'll notice is that the API is split into a value-strict and a value-lazy version, making it easier to use in a strict setting.
If you want to contribute, please get copies of the source trees from here:
* git clone https://github.com/tibbe/unordered-containers.git * git clone https://github.com/tibbe/hashable.git
Alternatively, just fork the projects on GitHub:
As I mentioned in my talk on hashing-based containers earlier this week, I'm working on an even faster version, based on hash-array mapped tries, but it won't be ready until GHC 7.2.
Johan
What about making Hashable subclass of Eq class Eq a => Hashable a where I think it's is obvious from Hasable class properties that some kind of equality is needed and I think it will reduce type class constraints in actual hash table implementation in unordered-containers package. Also I think that value of hash functions is obviously a Monoid and it will be convenient to have Monoid instance newtype Hash = Hash Int instance Monoid Hash where mempty = Hash 0 Hash a `mappend` Hash b = Hash (a `combine` b) class Eq a => Hashable a where hash :: a -> Hash hashWithSalt :: Hash -> a -> Hash hashWithSalt salt x = salt `mappend` hash x -- Victor Nazarov

2011-02-23 13:56, Victor Nazarov skrev:
Also I think that value of hash functions is obviously a Monoid and it will be convenient to have Monoid instance
newtype Hash = Hash Int instance Monoid Hash where mempty = Hash 0 Hash a `mappend` Hash b = Hash (a `combine` b)
class Eq a => Hashable a where hash :: a -> Hash hashWithSalt :: Hash -> a -> Hash
hashWithSalt salt x = salt `mappend` hash x
Monoid would be a good idea if combine was associative :) Prelude Data.Hashable> let a = hash "a" Prelude Data.Hashable> let b = hash "b" Prelude Data.Hashable> let c = hash "c" Prelude Data.Hashable> (a `combine` b) `combine` c 198573605 Prelude Data.Hashable> a `combine` (b `combine` c) 177445 / Emil

On Wed, Feb 23, 2011 at 4:56 AM, Victor Nazarov
What about making Hashable subclass of Eq
class Eq a => Hashable a where
I think it's is obvious from Hasable class properties that some kind of equality is needed and I think it will reduce type class constraints in actual hash table implementation in unordered-containers package.
The documentation could perhaps be worded better: if 'a' is also an instance of 'Eq', then the stated property should hold. You can imagine things that can be hashed by not compared for equality. Also, I don't know if type classes are a good way to model sub-typing relationships (see Oleg's email about the Functor/Applicative/Monad proposal). Johan
participants (6)
-
Emil Axelsson
-
Jason Dusek
-
Johan Tibell
-
Roman Cheplyaka
-
Victor Nazarov
-
wren ng thornton