Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

24 Feb 2011


      On Wed, Feb 23, 2011 at 08:45:47AM +0000, Max Bolingbroke wrote:
...
3. Some map combining algorithms work more efficiently when one of
their two arguments are smaller. For example, Data.Map.union is most
efficient for (bigmap `union` smallmap). If you don't care about which
of the two input maps wins when they contain the same keys, you can
improve constant factors by testing the size of the map input to size
(in O(1)) and flipping the arguments if you got (smallmap `union`
bigmap) instead of the desirable way round.
This is a most unfortunate feature of Data.Map, forcing users to bend the
logic of their application to suit the library.  Ideally you'd want binary
operations that are symmetric and associative performance-wise, freeing
users to follow the structure of their problem domain.  The documentation
(and the original paper) doesn't quantify the gain for such unions,
but hopefully it achieves the optimum: O(m log(n/m)), where m and n are
the sizes of the smaller and larger trees respectively.  Then building
a tree costs O(n log(n)), regardless of the way you do it.

Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

Ross Paterson