
On Wed, Feb 23, 2011 at 08:45:47AM +0000, Max Bolingbroke wrote:
3. Some map combining algorithms work more efficiently when one of their two arguments are smaller. For example, Data.Map.union is most efficient for (bigmap `union` smallmap). If you don't care about which of the two input maps wins when they contain the same keys, you can improve constant factors by testing the size of the map input to size (in O(1)) and flipping the arguments if you got (smallmap `union` bigmap) instead of the desirable way round.
This is a most unfortunate feature of Data.Map, forcing users to bend the logic of their application to suit the library. Ideally you'd want binary operations that are symmetric and associative performance-wise, freeing users to follow the structure of their problem domain. The documentation (and the original paper) doesn't quantify the gain for such unions, but hopefully it achieves the optimum: O(m log(n/m)), where m and n are the sizes of the smaller and larger trees respectively. Then building a tree costs O(n log(n)), regardless of the way you do it.