
Hi Axel, <snip>
In the context of program analysis, one tracks a lot of information about the state of a program at a given program point. This state is propagated around loops and conditionals. The implicit sharing of nodes in functional data structures are very well suited for this.
However, whenever two states need to be joined at the loop head or at the end of a conditional, the two trees that represent the state need to be joined by performing some sort of join operation on each element of a tree. For all sensible domains a `join` a = a. Given that a loop or branch of a conditional usually touches only few variables, it is prudent not to perform the join operation on all elements of the tree. Instead, a tree-difference operation is required that traverses the two trees to be joined and calculates the difference between them. In particular, whenever two references are, in fact, the same pointer, the subtree does not need to be considered. This way, a join between two trees of size n reduces to joining only m elements where m is the maximum number of elements in each tree. This is a tremendous win for n
m.
Any easy way of comparing pointers? I mean, if I have something like Tree a = N | B (Tree a) a (Tree a) and have (l::Tree a) and (r::Tree a), can I ask about the "physical equality"? The join would probably take O(m log n), because an insertion of 1 element changes the whole path to the new element, which is probably of length O(log n), so log n pointers change in the structure. You also need this time O(m log n) to construct the new data structure, surely for small m (one could hope for O(m log (n/m)). Still O(m log n) is definitely better than n for m << n :) Milan