RE: A couple of GHC-API questions

| This is what I mean by “resolving” the types. For single-module programs | this is trivial, we can do something like | | resolve x = do rn <- hscParseIdentifier x | hscTcRnLookupRdrName rn | | For multi-module programs it becomes trickier because we also have to | resolve the types that we’ve imported from other modules. No, that is no harder **provided those types are in scope**. So suppose you have module M where import A( Foo ) f :: Int -> Int {-# LIQUID f :: { x | ..blah..Foo..blah } -> Int #-} Here I am supposing that the Liquid Haskell specification is in a pragma of M, and mentions an imported data type Foo. To resolve the string "Foo" to the Name A.Foo (or Bar.Foo, or whatever), hscTcRnLookupRdrName will work just fine. If you import Foo qualified, then of course you'll have to use a qualified name in the source module M where import qualified A( Foo ) f :: Int -> Int {-# LIQUID f :: { x | ..blah..A.Foo..blah } -> Int #-} If you *don't* import Foo at all, then it's utterly non-obvious where to look for it, so I don't suppose you are doing that. In short, why doesn’t hscTcRnLookupRdrName do the job? Incidentally, it doesn't make sense to ask if a Name is "in scope". Only RdrNames can be "in scope" or "not in scope". Simon | example from earlier, when we type-check module B we have to turn the | String “Foo” into the Name A.Foo. This is problematic because module B | imported module A qualified, so “Foo” is not in scope inside B but | “A.Foo” is. | | I believe GHC might be avoiding this issue via the .hi files, so when you | import a Type from another module, it is already using TyCons instead of | Strings. | | > Later you say "is there a simple way to ask GHC to resolve the name x | in the context of module | > m". You could mean | > * Imagine that the unqualified name "x" appeared in module m. How do I | look it up, in m's | > top-level lexical environment. | > but I don’t think that is what you mean. | | I think this is almost exactly what I mean, except that I want to be able | to look up the unqualified “x” as well as the qualified | “SomeModuleThatMayHaveBeenRenamed.x” inside m’s top-level environment. | This is more or less what the | DynamicLoading.lookupRdrNameInModuleForPlugins function that I’ve copied | and tweaked does, but it requires the presence of the original source | code. I’m hoping there may be some other function out there that does the | same thing without requiring the source code, so we can use it for the | base libraries as well. | | > I'm confused. Could you be more concrete? | > Possibly this may help? | https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/NameType | | Thanks for the reference, I’m familiar with most of the info there | already from browsing GHC’s source, but this is laid out much more | newcomer-friendly manner :) | | Hopefully I’ve made my question a bit clearer. | | Eric

I think I see where the confusion is now. In your example, hscTcRnLookupRdrName should work perfectly. I’m thinking of a different scenario, where we are trying to use the specification of a function that has been imported from another module. Suppose we have module List where data List = Nil | Cons Int List {-# LIQUID measure length :: List -> Int #-} replicate :: Int -> Int -> List {-# LIQUID replicate :: x:Int -> n:Int -> {v:List | length v = n } #-} head :: List -> Int {-# LIQUID head :: {v:List | length v > 0} -> Int #-} module Main where import qualified List as L main = print . L.head $ L.replicate 5 10 --- LiquidHaskell should be able to prove that the call to L.head in Main.main is safe. In order to do so we have to first figure out what types were given to L.head and L.replicate. When we first parse in the specifications from List, we get types that refer to the String "List" which is parsed into the RdrName Unqual (OccName tcName "List") hscTcRnLookupRdrName will *rightly complain* that this RdrName is not in scope because we’re currently in the context of Main; we have to instead look for it in the top-level environment of the List module. I think part of the confusion is coming from the fact that we don’t process each module individually, resolve all of the types, and serialize them to disk somewhere. Rather, when we check a module, we parse in ALL of the specifications it could possibly refer to (using the transitive closure of the module imports) and try to resolve all of them before moving to the actual constraint generation and solving step. Does that make more sense? Eric On July 24, 2014 at 15:54:27, Simon Peyton Jones (simonpj@microsoft.com) wrote:
| This is what I mean by “resolving” the types. For single-module programs | this is trivial, we can do something like | | resolve x = do rn <- hscParseIdentifier x | hscTcRnLookupRdrName rn | | For multi-module programs it becomes trickier because we also have to | resolve the types that we’ve imported from other modules.
No, that is no harder **provided those types are in scope**. So suppose you have
module M where import A( Foo ) f :: Int -> Int {-# LIQUID f :: { x | ..blah..Foo..blah } -> Int #-}
Here I am supposing that the Liquid Haskell specification is in a pragma of M, and mentions an imported data type Foo.
To resolve the string "Foo" to the Name A.Foo (or Bar.Foo, or whatever), hscTcRnLookupRdrName will work just fine.
If you import Foo qualified, then of course you'll have to use a qualified name in the source
module M where import qualified A( Foo ) f :: Int -> Int {-# LIQUID f :: { x | ..blah..A.Foo..blah } -> Int #-}
If you *don't* import Foo at all, then it's utterly non-obvious where to look for it, so I don't suppose you are doing that.
In short, why doesn’t hscTcRnLookupRdrName do the job?

| I think part of the confusion is coming from the fact that we don’t | process each module individually, resolve all of the types, and | serialize them to disk somewhere. Rather, when we check a module, we | parse in ALL of the specifications it could possibly refer to (using | the transitive closure of the module imports) and try to resolve all of | them before moving to the actual constraint generation and solving | step. Yes that makes more sense. But it's not a good fit with GHC's way of working. You want to take some random module M, parse it afresh to find its LIQUID bits, and then resolve what those things mean. But that *necessarily* means that you must reconstruct the top-level lexical environment, which is formed from the import and top-level declarations. Constructing this environment -- the GlobalRdrEnv -- is not a simple matter. Much of RnNames does precisely that. Alas, we do not serialise this environment into M's interface file M.hi. We could do so, and doing so would be somewhat useful for GHCi, but it's absolutely not needed in the usual way of things. So I suppose what you could do is to reconstruct it afresh using GHC's existing code to do so. If you call RnNames.rnImports on the import declarations of your module, you'll get back a GlobalRdrEnv in which you can look up the RdrNames that occur in the LIQUID declarations. Actually that'll deal only with the imported things. You also want the locally defined top-level things. To do that you could use things like getLocalNonValBinders (see the call in RnSource.rnSrcDecls), but it might be easier to take the TypeEnv computed from the module's interface file, and turn that back into a bunch of global binders. But it's not so easy. What about module M( f ) where {-# LIQUID f :: { x | ...g... } -> Int f = blah g = blah Here g is only mentioned form f's LIQUID spec. So g will have disappeared altogether from M's interface file, discarded as dead code. The natural alternative would be to serialise the LIQUID things into M.hi. That would entail parsing and renaming them in GHC, and extending the interface file format to handle it. I think it would be simpler. But it would be a bit more invasive of GHC. There is some support for serialising random stuff into interface files, called "annotations". You write {-# ANN f <Haskell expression> #-} and GHC will record in the interface file that f is associated with expression. See http://www.haskell.org/ghc/docs/latest/html/users_guide/extending-ghc.html#a... Maybe there's a neater way to do this, that neither involves changing GHC nor all the faff above. If so I'd be happy to change GHC to accommodate the neater way. I'd like GHC to accommodate extensions like LH much more easily. Simon | -----Original Message----- | From: Eric Seidel [mailto:gridaphobe@gmail.com] | Sent: 25 July 2014 02:24 | To: Simon Peyton Jones | Cc: ghc-devs@haskell.org | Subject: RE: A couple of GHC-API questions | | I think I see where the confusion is now. In your example, | hscTcRnLookupRdrName should work perfectly. I’m thinking of a different | scenario, where we are trying to use the specification of a function | that has been imported from another module. Suppose we have | | module List where | | data List = Nil | Cons Int List | {-# LIQUID measure length :: List -> Int #-} | | replicate :: Int -> Int -> List | {-# LIQUID replicate :: x:Int -> n:Int -> {v:List | length v = n } #-} | | head :: List -> Int | {-# LIQUID head :: {v:List | length v > 0} -> Int #-} | | | module Main where | import qualified List as L | | main = print . L.head $ L.replicate 5 10 | | --- | | LiquidHaskell should be able to prove that the call to L.head in | Main.main is safe. In order to do so we have to first figure out what | types were given to L.head and L.replicate. | | When we first parse in the specifications from List, we get types that | refer to the String | | "List" | | which is parsed into the RdrName | | Unqual (OccName tcName "List") | | hscTcRnLookupRdrName will *rightly complain* that this RdrName is not | in scope because we’re currently in the context of Main; we have to | instead look for it in the top-level environment of the List module. | | I think part of the confusion is coming from the fact that we don’t | process each module individually, resolve all of the types, and | serialize them to disk somewhere. Rather, when we check a module, we | parse in ALL of the specifications it could possibly refer to (using | the transitive closure of the module imports) and try to resolve all of | them before moving to the actual constraint generation and solving | step. | | Does that make more sense? | | Eric | | On July 24, 2014 at 15:54:27, Simon Peyton Jones | (simonpj@microsoft.com) wrote: | > | This is what I mean by “resolving” the types. For single-module | > | programs this is trivial, we can do something like | > | | > | resolve x = do rn <- hscParseIdentifier x hscTcRnLookupRdrName rn | > | | > | For multi-module programs it becomes trickier because we also have | > | to resolve the types that we’ve imported from other modules. | > | > No, that is no harder **provided those types are in scope**. So | > suppose you have | > | > module M where | > import A( Foo ) | > f :: Int -> Int | > {-# LIQUID f :: { x | ..blah..Foo..blah } -> Int #-} | > | > Here I am supposing that the Liquid Haskell specification is in a | > pragma of M, and mentions an imported data type Foo. | > | > To resolve the string "Foo" to the Name A.Foo (or Bar.Foo, or | > whatever), hscTcRnLookupRdrName will work just fine. | > | > If you import Foo qualified, then of course you'll have to use a | > qualified name in the source | > | > module M where | > import qualified A( Foo ) | > f :: Int -> Int | > {-# LIQUID f :: { x | ..blah..A.Foo..blah } -> Int #-} | > | > If you *don't* import Foo at all, then it's utterly non-obvious where | > to look for it, so I don't suppose you are doing that. | > | > In short, why doesn’t hscTcRnLookupRdrName do the job? |

On July 24, 2014 at 23:49:10, Simon Peyton Jones (simonpj@microsoft.com) wrote:
| I think part of the confusion is coming from the fact that we don’t | process each module individually, resolve all of the types, and | serialize them to disk somewhere. Rather, when we check a module, we | parse in ALL of the specifications it could possibly refer to (using | the transitive closure of the module imports) and try to resolve all of | them before moving to the actual constraint generation and solving | step.
Yes that makes more sense.
But it's not a good fit with GHC's way of working. You want to take some random module M, parse it afresh to find its LIQUID bits, and then resolve what those things mean. But that *necessarily* means that you must reconstruct the top-level lexical environment, which is formed from the import and top-level declarations. Constructing this environment -- the GlobalRdrEnv -- is not a simple matter. Much of RnNames does precisely that.
Alas, we do not serialise this environment into M's interface file M.hi. We could do so, and doing so would be somewhat useful for GHCi, but it's absolutely not needed in the usual way of things.
This is basically what I expected, and I wouldn’t want to argue for adding more things to the .hi files unless there are other compelling use cases.
So I suppose what you could do is to reconstruct it afresh using GHC's existing code to do so. If you call RnNames.rnImports on the import declarations of your module, you'll get back a GlobalRdrEnv in which you can look up the RdrNames that occur in the LIQUID declarations.
Actually that'll deal only with the imported things. You also want the locally defined top-level things. To do that you could use things like getLocalNonValBinders (see the call in RnSource.rnSrcDecls), but it might be easier to take the TypeEnv computed from the module's interface file, and turn that back into a bunch of global binders.
But it's not so easy. What about
module M( f ) where
{-# LIQUID f :: { x | ...g... } -> Int f = blah
g = blah
Here g is only mentioned form f's LIQUID spec. So g will have disappeared altogether from M's interface file, discarded as dead code.
Thanks for these pointers! I’m not too concerned about the dead code issue since Liquid Types generally won’t refer to top-level binders (they’re either going to be functions, which can’t appear in the types in the first place, or constants, which can be inlined).
The natural alternative would be to serialise the LIQUID things into M.hi. That would entail parsing and renaming them in GHC, and extending the interface file format to handle it. I think it would be simpler. But it would be a bit more invasive of GHC.
There is some support for serialising random stuff into interface files, called "annotations". You write {-# ANN f #-} and GHC will record in the interface file that f is associated with expression. See http://www.haskell.org/ghc/docs/latest/html/users_guide/extending-ghc.html#a...
Yes, I’ve looked at the annotations stuff in the past and have imagined converting LiquidHaskell to a GHC Plugin that pulls the types out of annotations. I *think* it could work, though there are two immediate wrinkles: 1. The docs say that annotations can only be applied to top-level binders/declarations. We’ve found it *very* helpful to annotate nested binders in some cases to reduce the inference burden (it’s also extremely helpful for debugging the code/spec). Is this restriction in place because of the aforementioned issue of not serializing the GlobalRdrEnv? If so, could the restriction be lifted with the caveat that annotations on nested binders will *not* be exported? I think that would be sufficient for us as long as we can get a hold of the Core before any sort of inlining happens. 2. The docs say that you can only annotate binders that are declared in the same module. We’ve similarly found it useful to be able to *assume* stronger types for certain imported functions, though this is of course unsound. Could we also lift the same-module restriction, again with the caveat that the annotations will *not* be exported? These are just some things to think about, I doubt I would have time to attempt serious changes to LiquidHaskell along these lines until next year at the earliest.
Maybe there's a neater way to do this, that neither involves changing GHC nor all the faff above. If so I'd be happy to change GHC to accommodate the neater way. I'd like GHC to accommodate extensions like LH much more easily.
I still think that a Template Haskell-based approach could be really nice without requiring any changes to GHC (beyond the profiling issue we’ve run into). The reason I like this idea is that it feels more lightweight to me, users would just express their specifications as Haskell values, and they (and we) wouldn’t have to worry about making sure that the right flags are passed to GHC to find imported modules etc. Has anyone ever tried to blend Template Haskell with Annotations/Plugins? I’m imagining something like the following t_head = head ::: [liquid| {v:[a] | length v > 0} -> a|] head [] = error head (x:xs) = x {-# ANN head t_head #-} Here the spec for head exists alongside the implementation as a Haskell value, so it’s immediately available for other tools to build on top of, but it’s also attached to head as an Annotation so that Plugins like a future version of the liquidhaskell verifier, or perhaps an optimization pass, can make use of it during compilation. The ANN pragma seems a bit redundant here, but I really like the idea of having these types readily available at multiple levels and I don’t quite see how to accomplish that with Template Haskell or Annotations alone. Thanks for all of the comments! Eric

| 1. The docs say that annotations can only be applied to top-level | binders/declarations. We’ve found it *very* helpful to annotate nested | binders in some cases to reduce the inference burden (it’s also | extremely helpful for debugging the code/spec). Is this restriction in | place because of the aforementioned issue of not serializing the | GlobalRdrEnv? If so, could the restriction be lifted with the caveat | that annotations on nested binders will *not* be exported? I think that | would be sufficient for us as long as we can get a hold of the Core | before any sort of inlining happens. I think the main awkwardness is that there's no very convenient way to attach the annotation to a nested binder. For built in things like strictness it's all kept in the IdInfo. You could imagine a kind of extensible IdInfo, with a field of [Dynamic], being the annotations. To look up, find the Dynamic with the expected type (e.g. LiquidHaskellSpec). Not very hard, I think, but still work. | 2. The docs say that you can only annotate binders that are declared in | the same module. We’ve similarly found it useful to be able to *assume* | stronger types for certain imported functions, though this is of course | unsound. Could we also lift the same-module restriction, again with the | caveat that the annotations will *not* be exported? I don’t think that would be hard to lift. | I still think that a Template Haskell-based approach could be really | nice without requiring any changes to GHC (beyond the profiling issue | we’ve run into). That sounds good, but I can't really say more because I don't understand what "a Template-Haskell-based approach" really is. I'd be happy to have a Skype call about it, with Ranjit and Niki, if you wanted. Simon
participants (2)
-
Eric Seidel
-
Simon Peyton Jones