On Thu, Feb 23, 2012 at 10:13 PM, wren ng thornton
<wren@freegeek.org> wrote:
On 2/23/12 9:16 PM, Clark Gaebel wrote:
Looking at IntMap's left-biased 'union' function [1], I noticed that the
complexity is O(n+m) where n is the size of the left map, and m is the size
of the right map.
Since insertion [2] is O(min(n, W)) [ where W is the number of bits in an
Int ], wouldn't it be more efficient to just fold 'insert' over one of the
lists for a complexity of O(m*min(n, W))? This would degrade into O(m) in
the worst case, as opposed to the current O(n+m).
The important things to bear in mind here are (1) the constant factors actually matter in practice, and (2) what's actually going on. While O(min(n,W)) is correct, it's incorrect to think about it as just a constant (or as just a linear function). While technically incorrect, it's better to think of it as O(log n) in order to get an intuition for how it works. And O(m+n) is much nicer than O(m*log n).
Doing a fold with insert means that we must pay for the cost of traversing one of the maps entirely, and the cost of walking the spine for a lookup/insert m times. Whereas, with the merge function we only have to traverse the portions of the spines which intersect, and we only have to do it in one pass. In doing the fold, we're essentially ignoring the fact that the maps have a trie structure, since we have to traverse from the top for every insert; whereas for the merge, we make use of the structure in order to avoid redundant traversals of the top part of the structure.
Thus, the merge is doing less work. So, in theory, it should be faster. However, again, the thing to beware of is the constant factors. In particular, big-O algorithmic analysis doesn't really account for things like locality and cache coherence, so one should always be on the lookout for places where duplicating work is actually faster in practice. If you're curious, you can always implement your own union using the fold-with-insert method and then run some benchmarks.
--
Live well,
~wren
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe