
Christopher Done
Hi all,
Given a TypecheckedModule, what's the most direct way given a Var expression retrieved from the AST, to determine:
1) that it's a class method e.g. `read`
2) that it's a generic call (no instance chosen) e.g. `Read a => a -> String`
3) or if it's a resolved instance, then which instance is it and which package, module and declaration is that defined in?
Starting with this file that has a TypecheckedModule in it: https://gist.github.com/chrisdone/6fcb9f1cba6324148d481fcd4eab6af6#file-ghc-...
I presume at this point that instance resolution has taken place. I'm not sure that dictionaries or chosen instances are inserted into the AST, or whether just the resolved types are inserted e.g. `Int -> String`, where I want e.g. `Read Int`, which might lead me to finding the matching instance from an InstEnv or so.
I'd like to do some analyses of Haskell codebases, and the fact that calls to class methods are opaque is a bit of a road-blocker. Any handy tips? Prior work?
It'd be neat in tooling to just hit a goto-definition key on `read` and be taken to the instance implementation rather than the class definition.
Indeed that would be great.

I believe (1) is quite straightforward: you can recognize a class operation by looking at the function's IdDetails (specifically looking for ClassOpId). This contains the Class to which the method belongs.

Getting back to the instance is a bit trickier. I'll admit I don't know whether there is a convenient way to do this. However, I can try to fill in some background and give a few ideas.

First let's review how typeclass evidence is represented in HsSyn (apologies if this is already known). For concreteness, let's consider the program,

    showList :: Show a => [a] -> String
    showList x = show x

After typechecking this will likely turn into something like (taken from the output of -ddump-tc -fprint-typechecker-elaboration):

    AbsBindsSig [a_a1hj] [$dShow_a1hl]
      {Exported type: Hi.showList :: forall a. Show a => [a] -> String
                      [LclId]
       Bind: showList_a1hk x_azo = show @ [a_a1hj] $dShow_a1hn x_azo
       Evidence: EvBinds{[W] $dShow_a1hn = GHC.Show.$fShow[] @[a_a1hj] [$dShow_a1hl]}}

This AbsBind represents a binding abstracted over a dictionary argument ($dShow_a1hl :: Show a_a1hj). The "Evidence" section gives a list of evidence bindings which the desugarer will wrap the RHS in; in this case the typechecker has built a `Show [a_a1hj]` instance from the `Show a => Show [a]` instance defined in GHC.Show and the abstracted `$dShow_a1hl` dictionary.

The `show` call site will then look something like this in HsSyn:

    HsApp
      (HsWrap
         (WpEvApp $dShow_a1hn)
         (HsWrap
            (WpTyApp a_a1hj)
            (HsVar GHC.Show.show)))
      (HsVar x_azo)

Here the typechecker has wrapped the `show` occurrence in a pair of HsWrappers which apply its type and dictionary arguments.

This suggests an approach to identify "generic" call sites (item (2) above): look at whether the RHS of the call site's dictionary is lambda-bound or not. In the above case we see that it is not lambda-bound but rather a concrete dictionary: `GHC.Show.$fShow[]`.
You can know that this is a dictionary by looking at its IdDetails (specifically, it is of the DFunId variety).

By contrast, if we have a generic call site:

    printIt :: Show a => a -> IO ()
    printIt x = putStrLn $ show x

we see that the evidence binding is headed by a lambda-bound dictionary:

    AbsBindsSig [a_a1AP] [$dShow_a1AR]
      {Exported type: printIt :: forall a. Show a => a -> IO ()
                      [LclId]
       Bind: printIt_a1AQ x_a12W = putStrLn $ show @ a_a1AP $dShow_a1AV x_a12W
       Evidence: EvBinds{[W] $dShow_a1AV = $dShow_a1AR}}

Of course, in the case that you have a concrete dictionary you *also* want to know the source location of the instance declaration from which it arose. I'm afraid this may be quite challenging as this isn't information we currently keep. Currently interface files don't really keep any information that might be useful to IDE tooling users. It's possible that we could add such information, although it's unclear exactly what this would look like. It would be great to hear more from tooling users regarding what information they would like to see.

Also relevant here is the HIE file GSoC project [1] being worked on this summer by Zubin Duggal (CC'd).
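To make the evidence-chasing idea above a little more concrete, here is a minimal sketch of the classification step. It is a toy model in plain Haskell, not code against the GHC API: `ToyEv`, `CallSite` and `classify` are hypothetical names made up for illustration, evidence variables are plain strings, and the assumption is that any dictionary variable without an evidence binding is lambda-bound by the enclosing AbsBinds. In real code the "is this a DFun" test would presumably come from the Id's IdDetails rather than from a constructor tag.

```haskell
-- Toy stand-in for the right-hand side of an evidence binding.
-- (Hypothetical type; GHC's own evidence representation is richer.)
data ToyEv
  = ToyVar String           -- another evidence variable, e.g. $dShow_a1AR
  | ToyDFun String [ToyEv]  -- instance dictionary function applied to sub-dictionaries

-- Result of classifying the dictionary applied at a call site.
data CallSite
  = Generic String   -- ends at a variable with no binding (assumed lambda-bound)
  | Resolved String  -- ends at this instance's dictionary function
  deriving (Eq, Show)

-- Evidence bindings as an association list: variable name -> right-hand side.
type ToyBinds = [(String, ToyEv)]

-- Follow evidence bindings from the dictionary variable applied at the call
-- site until reaching a dictionary function or an unbound variable.
classify :: ToyBinds -> String -> CallSite
classify binds = go []
  where
    go seen v
      | v `elem` seen = Generic v  -- defensive: stop if bindings form a cycle
      | otherwise =
          case lookup v binds of
            Nothing             -> Generic v
            Just (ToyVar v')    -> go (v : seen) v'
            Just (ToyDFun df _) -> Resolved df

main :: IO ()
main = do
  -- Evidence bindings transcribed from the two dumps above.
  let showListBinds =
        [("$dShow_a1hn", ToyDFun "GHC.Show.$fShow[]" [ToyVar "$dShow_a1hl"])]
      printItBinds =
        [("$dShow_a1AV", ToyVar "$dShow_a1AR")]
  print (classify showListBinds "$dShow_a1hn")
  print (classify printItBinds "$dShow_a1AV")
```

With the bindings from the two dumps, the `showList` call site comes out as resolved to `GHC.Show.$fShow[]` and the `printIt` call site as generic. Note that this only looks at the head of the dictionary: the `Show [a]` dictionary is reported as resolved even though its sub-dictionary is still the lambda-bound `$dShow_a1hl`.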
Also, listing all functions that use throw# or functions defined in terms of throw# or FFI calls would be helpful, especially for doing audits. If I could immediately list all partial functions in a project, then list all call-sites, it would be a very convenient way when doing an audit to see whether partial functions (such as head) are used with the proper preconditions or not.
This may be non-trivial; you may be able to get something along these lines out of the strictness signature present in IdInfo. However, I suspect this will be a bit fragile (e.g. we don't even run demand analysis with -O0 IIRC).

Cheers,

- Ben

[1] https://ghc.haskell.org/trac/ghc/wiki/HIEFiles