
Hi all, I've been doing some work with the GHC API, and I'm kind of confused as to how Uniques are supposed to work. I've found that each Name has a Unique that is its primary identification (at least for internal names). That means that two identifiers are different iff their Uniques are different, even when their OccNames are the same. This also means that two identifiers are equal when their Uniques are equal, even if their OccNames are different (though I think this situations is not supposed to exist and is really undefined?). Now, from the name "Unique", one would assume that each different identifier in a compiler run gets a different Unique, even when they are in different functions or modules. However, I've found some evidence to suggest otherwise. In particular, I've been looking at applying substitutions using CoreSubst. As a part of a substitution, the set of identifiers that are in scope are included with the substitutions to perform. I don't completely understand the requirements on this InScopeSet, but I do see that any identifiers that are bound in the to-be-substituted-in expression that are also in scope, will have their Unique changed. This makes sense when you want Uniques to be really unique, since \a_1 -> \a_1 -> \a_1 will then become \a_1 -> \a_2 -> \a_2. This does not sound related to substitution, but apparantly substition always performs this de-shadowing (as I think it's called?) implicitely. Giving this some further thought, it makes some sense, since when I'm performing the substituion \x_3 -> a_4 (which has a free variable a_4), on the expression \a_4 -> x_3, this must of course result in \a_5 -> a_4, not \a_4 -> a_4. So, it seems the InScopeSet should at least contain the free variables of the substited values. Looking a bit closer, this might be exactly what the definition at CoreSubst says: "Precisely, the in-scope set must be a superset of the free vars of the substitution range that might possibly clash with locally-bound variables in the thing being substituted in." I'm not completely sure what the "substituion range" is, but I can imagine this being the expression part of a binder -> expression substitution? Anyway, back to my original point. The above suggests that Uniques are not so unique after all. If every binder everywhere would get its own Unique, the Unique replacement CoreSubst does would never be neccesary. So, how is the uniqueness of Unique really defined? I can imagine that the generation of new Uniques is avoided as much as possible for performance reasons, and thus uniqueness is guaranteed only "on demand", by running deshadowing / substitutions when things could be clashing? Any thoughts on what kind of non-uniqueness is allowed and what is not? Are there common pitfalls when working with uniques? When to apply deshadowing or other uniqueness generating techniques? On a related note, is my understanding correct that deshadowing will only provide uniqueness on each "path" through the expression tree, and not over the entire tree? i.e., the expression let x_6 = \y_7 -> y_7; z_8 = \y_7 -> \y_7 in x_6 is a deshadowed expressions, even though both of the y_7 lambda binders refer to a different variable? If so, is there any function that gives complete uniqueness over an entire tree? Gr. Matthijs