
On 18 September 2010 12:25, Johan Tibell
Hi Simon, Thanks for the pointers! On Fri, Sep 17, 2010 at 6:29 PM, Simon Peyton-Jones
wrote: GHC already collects all RdrNames for imported things, for use when reporting unused imports. But it doesn’t collect the SrcSpan of the occurrences, nor does it collect occurrences of locally-bound things.
I found the spot where the collected RdrNames are used to generate the unused import warnings, but I don't quite understand where they are gathered. Is there an AST traversal function somewhere that gathers these RdrNames? If so, I could use it as a blue print to write my own traversal.
I suggest you write a general traversal looking like
data Gather var res
= Gather { g_empty :: res
, g_union :: res -> res -> res
, g_occ :: Located var -> res
, g_del :: Located var -> res -> res }
getExpr :: Gather v res -> HsExpr v -> res
.. and similarly for each other data type...
Could you expand a little bit on this design? Is the idea that the Gather data type carries functions to apply in different parts of the AST? What's "occ" short for, OccName? What about "del"? There are different kind of ASTs (e.g. after renaming, after type checking, etc), which one should I use if I want to gather all qualified names?
You probably want the renamed AST. The typechecked AST is essentially only a list of top-level bindings. The utility functions Simon suggest look to me like a special sort of fold. g_empty and g_union are clear. g_occ records the occurrence (hence "occ") of a variable and adjusts the fold state accordingly. g_del is most likely to be intended for binders, i.e., a place where a variable goes out of scope when going up. A particular Gather operation is of course not obliged to actually delete anything. So, maybe the "g_del" better be called "g_bind". While Gather on the surface may look polymorphic in the "v" argument, in practise it isn't really. Some parts of the AST will be bound to "error" thunks depending on whether you have a parse AST, a renamed AST or typechecked AST. You also may want different traversal modes. E.g., the renamed AST explicitly fills in which ">>", ">>=", "return", etc. are used by a "do" statement, and it depends on the kind of analysis your doing whether these desugarer-introduced nodes should be included or not. You could also use http://hackage.haskell.org/package/ghc-syb. Here's an example that customises the traversal depending on the stage: http://github.com/nominolo/ghc-syb/blob/master/utils/GHC/SYB/Utils.hs / Thomas