
Hello,
Hello Milan,
Thank you for your good job. I have several suggestions and questions.
* Suggestion
-- That leaves two reasonable variants, delta=3 and delta=4, both with ratio=2. --
According my tests, integer solutions is (3,2) and (4,2) ONLY. But this comments can be read as if there are other integer solutions with ratio other than 2. What about saying that (3,2) and (4,2) are only integer solutions? (i.e. Valid integer value of ratio is 2 only)
I did not try to play with other ratios than 2, so I did not know they did not work.
* Question
Did you prove (3,2) and (4,2) are valid mathematically? As you know, tests cannot prove absence of bugs.
Unfortunately, we (I and Mr. Hirai) cannot give mathematically proof of Adams's scheme where weight = size. This is mainly because Adams's scheme treats small trees are special and it makes mathematical analysis difficult.
As I said to you personally, we (I and Mr. Hirai) are working of the proof of Nievergelt and Reingold scheme by Coq. Because of weight = size + 1, there is not special condition and mathmatical analysis is much easier. In N&R's scheme, there is only one integer solution: (3,2).
I can prove that (3,2) and (4,2) are mathematically valid, I will post the proof soon.
* Question
-- In the benchmarks, delta=3 is faster on insert operations, but delta=4 has better overall performance, so we use delta=4. --
Smaller delta makes tree more balanced. So, intuitively delta=3 should be faster on lookup operations but slower on insert and delete operations. Do you have any idea why your result is against the intuition?
Actually, the height of the trees depends much on input data, not only on delta. The lookups are nearly identical with sorted and random data. I did more measurements and in the end I think delta = 3 is better. It is faster on inserts than delta = 4 and a bit slower on deletes, and the other operations are nearly identical.
* Question
-- Note: in contrast to Adam's paper, we perform the rebalance even in the case when (size left == delta * size right), instead when (size left > delta * size) as in the paper. Both are correct, but the former is slightly faster overall. --
If we perform rebalance for size left == delta * size right, intuitively insert and delete operations should be lower but lookup operations should be faster. Do you have any idea why your result is against the intuition?
Personally I don't like rebalance for size left == delta * size right because it is against the definition of *f-balance*.
I agree, Currently the patch rebalances only if size left > delta * size right.
* Suggestion
Both the original balance function and your balance function have unnecessary condition check. Suppose we are inserting an element to a right subtree. In this case, only left rotation would happen. But we call the balance function and choose left rotation or right rotation.
If we divide balance into balanceL and balanceR, we can remove this unnecessary condition and clarify the code.
Since the alter function requires the balance function, we cannot remove the balance function. So, we should provide balance, balanceL, and balanceR.
Thanks, a good idea. I did that and the performance increased. I mentioned your name in the patch description. Thanks, Milan