
I took a quick look, and they seem to be a replacement for pure Maps, but not for
mutable HashTables. Sorry if that isn't the case. I don't know whether
Data.HashTable has improved, but its performance used to be very poor
compared with other languages.
The point is that pure data structures cannot be used as shared data in
multithreaded environments. They must be wrapped in mutable blocking
references, such as MVars, and the whole update process blocks every other
thread (it is worse when using TVars). So they are not a replacement for
mutable data structures such as Data.HashTable.
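A minimal sketch of what I mean, assuming the pure map is shared through an MVar (SharedMap, insertShared and lookupShared are just illustrative names):

import Control.Concurrent.MVar
import qualified Data.Map as Map

-- A pure Data.Map shared between threads via an MVar.
type SharedMap k v = MVar (Map.Map k v)

-- Every writer has to take the MVar, so all updates are serialized:
-- while one thread is inside modifyMVar_, the others block.
insertShared :: Ord k => k -> v -> SharedMap k v -> IO ()
insertShared k v ref = modifyMVar_ ref (return . Map.insert k v)

lookupShared :: Ord k => k -> SharedMap k v -> IO (Maybe v)
lookupShared k ref = fmap (Map.lookup k) (readMVar ref)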
For this reason I think that an improved, mutable replacement for
Data.HashTable is needed, if this hasn't been done already. Are there
improvements to it that I don't know about?
2011/2/23 Max Bolingbroke
On 23 February 2011 05:31, Johan Tibell wrote:
On Tue, Feb 22, 2011 at 9:19 PM, Johan Tibell wrote:
Initial numbers suggest that lookup gets 3% slower and insert/delete 6% slower. The upside is O(1) size.
Can someone come up with a real world example where O(1) size is important?
I'm a bit sceptical that it is (I was not convinced by the earlier strict-set-inclusion argument, since that's another Data.Map feature I've never used). I thought of some other possibilities though:
1. If copying an unordered collection to a flat array, you can improve the constant factors (not the asymptotics) with O(1) size by pre-allocating the array.
2. If building a map in a fixed-point loop (I personally do this a lot) where you know that the key uniquely determines the element, you can test whether a fixed point has been reached in O(1) by just comparing the sizes. Depending on what you are taking a fixed point of, this may change the asymptotics.
3. Some map-combining algorithms work more efficiently when one of their two arguments is smaller. For example, Data.Map.union is most efficient for (bigmap `union` smallmap). If you don't care about which of the two input maps wins when they contain the same keys, you can improve constant factors by comparing the sizes of the inputs (in O(1)) and flipping the arguments if you got (smallmap `union` bigmap) instead of the desirable way round. (A sketch of points 2 and 3 follows after this list.)
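For what it's worth, a rough sketch of points 2 and 3, using Data.Map (where size is already O(1)) just to illustrate the idea; unionBigFirst and fixpointMap are hypothetical names, and the fixed-point test assumes the step function only ever adds entries:

import qualified Data.Map as Map

-- Point 3: if either argument may "win" on duplicate keys, put the
-- larger map on the left, where Data.Map.union is cheapest.
unionBigFirst :: Ord k => Map.Map k v -> Map.Map k v -> Map.Map k v
unionBigFirst a b
  | Map.size a >= Map.size b = Map.union a b
  | otherwise                = Map.union b a

-- Point 2: a fixed-point loop where the key determines the element and
-- each step only adds entries, so an unchanged size means we are done.
fixpointMap :: Ord k => (Map.Map k v -> Map.Map k v) -> Map.Map k v -> Map.Map k v
fixpointMap step m
  | Map.size m' == Map.size m = m
  | otherwise                 = fixpointMap step m'
  where m' = step m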
Personally I don't find any of these *particularly* compelling. But a ~6% slowdown for this functionality is not too bad - have you had a chance to look at the core to see if the cause of the slowdown manifests itself at that level? Perhaps it is possible to tweak the code to make this cheaper.
Also, what was the size of the collections you used in your benchmark? I would expect the relative cost of maintaining the size to get lower as you increased the size of the collection.
Max