Re: [Haskell-cafe] Tiger compiler: variable escaping analysis phase

22 Jun 2010

      On Tue, Jun 22, 2010 at 09:33:22AM -0300, José Romildo Malaquias wrote:
...
In the variable escaping analysis phase of the compiler, a function
findEscape looks for escaping variables and record this information in
the escape fields of the abstract syntax. To do this the entire abstract
syntax tree is traversed looking for escaping uses of every variable,
and, when found, the field is set to true.
findEscape uses a symbol table to accomplish its work, binding variable
names to a reference to boolean (the same reference used in the abstract
syntax tree). When processing a variable declaraction, for instance, it
inserts the variable name into the symbol table, binding it to its
escaping field. When processing an expression which is a simple
variable, if the variable occurs in a nested function, its binding in
the symbol table is set to true. This reflects directly in the abstract
syntax tree of the variable declaration, as the escape field in the
variable declaration and the binding in the symbol table are the same
reference to bool.
Idea of pure algorithm:

 1) Update the symbol table itself, that is: instead of using

       (a) Map Symbol (IORef Bool)

    use

       (b) Map Symbol Bool.

    This doesn't change the complexity of the algorithm as
    searching and updating have the same complexity for many data
    structures (maps and hashtables, for example).

    In reality, you don't need to use a Map at all.  Just use

       (c) Set Symbol

    and symbols not in the set do not escape.  Using (a) gives
    you O(n log k) for this step, where n is the size of the AST
    and k is the number of symbols.  On the other hand, (c) gives
    you O(n log e), where e is the number of escaping symbols.

 2) After doing the whole analysis, use a function
    'addEscapeInfo' to process the whole pure AST again adding
    information about escaped symbols.  This is O(n log e) as
    well.
...
The second option is more elegant in my point of view, but would be much
less efficient than the first one.
While O(n log e) is better than O(n log k), probably the
constants in the pure algorithm are higher than their impure
counterparts.  I guess you could also try to write:

  1) Take an 'AST e' into 'AST (STRef s Bool, e)' in O(n).

  2) Use the impure algorithm inside ST monad in O(n log k).

  3) Take 'AST (STRef s Bool, e)' into 'AST (Bool, e)' in O(n).

  4) 'runST' on the above steps to get a pure function from
     'AST e' into 'AST (Bool, e)'.

The ST monad has the same runtime overhead as IO.  Steps 1) and
3) are used to hide the ST monad from the rest of the compiler.

Cheers,

--
Felipe.

Re: [Haskell-cafe] Tiger compiler: variable escaping analysis phase

Felipe Lessa