
On Tue, 23 Mar 2004, Daan Leijen wrote:
> 1) However, both Data.FiniteMap and DData.Map use the correct ratio.
I have seen this stated in your code, but that's not the bug that I have described. (Message attached again.)
> 2) replace Adams' "<" comparisons with "<=" comparisons.
That's not "my" bug either.
I myself redid the proofs of Adams' algorithms some years ago when writing DData -- there were some errors, but I think that calling it "bugged" is rather harsh :-)
The precondition problem is not easy to fix. However, I acknowledge that your balancing checks are really good evidence for the correctness of the algorithm. (I didn't notice them at first, because you check via an external function, not in a smart constructor, and I looked only there.) Having mathematically unproven code is a common thing and should indeed be no problem. It's just that I personally prefer algorithms that I understand, i.e. where I know why they are correct. For Adams' algorithm this is not the case, and I think other people would have serious problems, too, when asked to explain "under what precondition does 'balance' produce a correctly balanced tree?" and "why does 'join' always observe that precondition?".
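For concreteness, here is a minimal sketch of what such an external balancing check might look like. All names here are hypothetical (this is not the DData or FiniteMap source), assuming the usual size-annotated tree and a weight ratio of 4:

```haskell
-- Hypothetical sketch, not the DData source: a size-annotated tree
-- and an external check of the weight-balance invariant.
data Tree a = Tip | Bin Int a (Tree a) (Tree a)

size :: Tree a -> Int
size Tip           = 0
size (Bin n _ _ _) = n

ratio :: Int
ratio = 4  -- assumed weight ratio; one of several choices Adams discusses

-- At every node, neither subtree may be more than 'ratio' times
-- heavier than the other; trees of at most two elements are exempt.
balanced :: Tree a -> Bool
balanced Tip = True
balanced (Bin _ _ l r) =
     (size l + size r <= 1
        || (size l <= ratio * size r && size r <= ratio * size l))
  && balanced l && balanced r
```

Running such a predicate over the whole tree after each operation is what provides the empirical evidence mentioned above; a smart constructor would instead enforce the same condition locally at every 'Bin'.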
> In general, I would like to stress that one tends to overrate the usefulness of balancing -- balancing is automatic for random data and is only needed to avoid some pathological worst cases. Almost any choice of balancing scheme will do.
Right, and I also see worst-case bounds rather as a kind of assurance, not a direct efficiency booster. However, worst-case bounds also allow the use of simple operations in unforeseen situations; they make them more generally usable. For example (although ordering is not balancing), one usually provides a separate "fromOrderedList" operation to create ordered structures. Algorithms with good worst-case performance can do that job, too, without special functions (which is also easier for the user), and what's more, they work on "almost ordered" lists, too. In general, when using unbalanced collections (and structures without good worst-case bounds in general), one has to be _more careful_.

Robert

On Tue, 23 Mar 2004, Robert Will wrote:
To prove the correctness of the algorithm, by the way, it suffices to give a precondition for 'balance' (T' in Adams' paper) and prove that 'merge' (Adams' 'concat3') observes that precondition. That's the bug in Adams' proof: Adams proved 'balance' correct under the precondition that the balancing invariant is violated by at most one (size left <= size right + 1, and vice versa). This precondition is observed by every other call to 'balance', but not (?) by 'merge'.
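A hedged sketch of that precondition, with hypothetical names, reading "violated by at most one" as "off by one element", i.e. the state after a single insertion or deletion on one side:

```haskell
-- Hypothetical sketch of the precondition under which 'balance' is
-- proved correct: both subtrees are balanced in themselves, and the
-- weight invariant at the root is off by at most one element -- the
-- situation after one insert or delete, but not in general after
-- 'merge'/'concat3', which may join subtrees of wildly different sizes.
data Tree a = Tip | Bin Int a (Tree a) (Tree a)

size :: Tree a -> Int
size Tip           = 0
size (Bin n _ _ _) = n

ratio :: Int
ratio = 4  -- assumed weight ratio

balanced :: Tree a -> Bool
balanced Tip = True
balanced (Bin _ _ l r) =
     (size l + size r <= 1
        || (size l <= ratio * size r && size r <= ratio * size l))
  && balanced l && balanced r

-- The precondition 'balance' may assume about its two subtrees:
precondition :: Tree a -> Tree a -> Bool
precondition l r =
     balanced l && balanced r
  && size l <= ratio * size r + 1
  && size r <= ratio * size l + 1
```

'merge' can be handed two balanced trees whose sizes differ arbitrarily, so this predicate does not obviously hold at its internal calls to 'balance' -- which is exactly the gap in the proof described above.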
> The current algorithm looks good enough for me,
Yeah, in DData it looks better than in FiniteMap, where they just disabled the failing assertions. Oh, perhaps the algorithm is wrong with respect to the simple balancing invariant, but correct with respect to some more complicated (amortised?) invariant. That would explain Adams' failure (and my reluctance) to prove it _and_ its good behaviour in practice.
> As a side note, I've re-implemented the "SpellCheck" test with DData, and it behaves ok.
As it does with many other applications...