Re: [Haskell-cafe] lookup tables & style guidelines

27 Apr 2008

      Jan-Willem Maessen wrote:
...
Just to clarify: divide and conquer splits one tree on the root value of 
the other (possibly avoiding enforcing the balance metric until after 
joining trees, though not obvious how / if that's useful)?  The 
definition of "divide and conquer" on trees without a fixed structure is 
rather unclear, which is why the question comes up in the first place.
The Divide and conquer algorithm presented in the Adams paper treats
the two trees differently, depending on size. The algorithm used in
AVL lib doesn't do this, it treats them both the same. You split each
tree by the root of the other tree, then do the sub-unions on the
three resulting ranges, then join these using the 2 orignal roots as
"bridging" elements.

This seems to give better results, and also aesthetically (an important
consideration) seems more natural.

With AVL I don't think there's really anything to be gained by not
balancing intermediate trees as balancing costs practically nothing
and has obvious advantages.

Still it's not perfect. If the two trees are about the same size this
still seemed to cost about 20% more comparisons than a noddy list
merge union would. It's a big win over lists if there's a substantial
difference in tree sizes though (and a big win over Hedge in all
cases I think).

It would be nice if someone could do a good theoretical analysis of
the performance of these algorithms. I didn't because my maths wasn't
good enough. I just chose algorithms empirically to minimise
comparison counts (not execution times), which is the right thing to
do for polymorphic implementations.

Regards
--
Adrian Hey