
Jan-Willem Maessen wrote:
Just to clarify: divide and conquer splits one tree on the root value of the other (possibly avoiding enforcing the balance metric until after joining trees, though not obvious how / if that's useful)? The definition of "divide and conquer" on trees without a fixed structure is rather unclear, which is why the question comes up in the first place.
The Divide and conquer algorithm presented in the Adams paper treats the two trees differently, depending on size. The algorithm used in AVL lib doesn't do this, it treats them both the same. You split each tree by the root of the other tree, then do the sub-unions on the three resulting ranges, then join these using the 2 orignal roots as "bridging" elements. This seems to give better results, and also aesthetically (an important consideration) seems more natural. With AVL I don't think there's really anything to be gained by not balancing intermediate trees as balancing costs practically nothing and has obvious advantages. Still it's not perfect. If the two trees are about the same size this still seemed to cost about 20% more comparisons than a noddy list merge union would. It's a big win over lists if there's a substantial difference in tree sizes though (and a big win over Hedge in all cases I think). It would be nice if someone could do a good theoretical analysis of the performance of these algorithms. I didn't because my maths wasn't good enough. I just chose algorithms empirically to minimise comparison counts (not execution times), which is the right thing to do for polymorphic implementations. Regards -- Adrian Hey