[jhc] Re: Ids in Jhc.

13 Feb 2008

      (I'm gonna forward this response to the jhc list too, if that's okay)

On Thu, Feb 14, 2008 at 12:26:49AM +0100, Lemmih wrote:
...
I've been wondering about your use of Ids in Jhc.
As far as I can see, the rules for Ids goes as following:
 * named ids are odd
 * unnamed ids are even
 * id '0' has special significance.
Can you tell me a bit about how unnamed ids are used, why id '0' has
special meaning, and how you generate new, unique unnamed ids.
Yes. that is correct.

named and unnamed ids behave the same way, except named ids get a pretty
name when printed out as opposed to just a numebr and are generally
derived from a user supplied name. Also, all 'top level' names are
always 'named' to ensure they remain globally unique.

named Ids are odd, these are looked up in a table that is implemented
via a hash table written in c in the StringTable/ directory. due to an
intentional quirk of its implementation, it only generates odd, positive
values. Conversion between a number and its string form is constant
time in both directions.

zero is a 'special' id in that it can be treated normally except it can
never appear in an expression, only in a binder. so \ v0 -> ... is okay,
but \ x -> .. v0 ... is not. This means you can always use 0 as a binder
when you know a value is not used without worrying about shadowing a
real definition. 

unnamed ids are arbitrary positive even numbers, there is no particular
need to allocate them in a special way, they are _not_ globally unique
so you just need to ensure you don't shadow any existing variables when
assigning new ones. in general, this consists of having a set of 'in
scope' names (which is often something you need to keep track of anyway)
and you choose some value that is not in the 'in scope' set and add it
to the set. this ensures ids don't shadow each other, but not that they
are unique.

negative ids are used inside of a few specialized routines internally
and should never make it out into the world. for instance, in the
unification algorithm they are used to represent alpha-renamed values so
we don't need to scan for used names before hand, however the terms
returned have their original ids restored.

hope this helps...

It is quite bothersome to me that 'Id' is just an Int and not a
newtype... that is something I have been slowing getting the code ready
to fix for a while now..

        John 

-- 
John Meacham - ⑆repetae.net⑆john⑈