Exposing newtype coercions to Haskell

Hi, this is related to http://hackage.haskell.org/trac/ghc/wiki/NewtypeWrappers#Proposal3. I tried to hack up a little prototype of this, and this “works” now: import GHC.NT newtype Age = Age Int deriving Show ageNT :: NT Age Int ageNT = createNT newtype MyList a = MyList [a] deriving Show myListNT :: NT (MyList a) [a] myListNT = createNT main = do let n = 1 :: Int let a = coerce (sym ageNT) 1 let l1 = [a] let l2 = coerce (listNT ageNT) l1 let l3 = coerce (sym myListNT) l2 print a print l2 print l3 will print Age 1 [1] MyList [1] and no unsafeCoerce and no map is used in the production of this output. The GHC.NT module provides: data NT a b coerce :: NT a b -> a -> b refl :: NT a a sym :: NT a b -> NT b a trans :: NT a b -> NT b c -> NT a c createNT :: NT a b listNT :: NT a b -> NT [a] [b] where createNT takes the place of the "deriving foo :: NT a b" syntax: At compile time, it is checked if its first type argument is a newtype of the second, and the corresponding coercion is inserted, otherwise an error is (well, will be) printed. The idea is that these coercions will be guaranteed to be free (or almost free; I am not sure if the coercion witness (NT a b) will actually be optimized away). The prototype can be found at https://github.com/nomeata/nt-coerce and is implemented as a GHC plugin compatible with GHC 7.6.3 (I chose this route as it means not having to compile GHC). You can check if it works for you via $ ghc -dcore-lint -package ghc --make test.hs && ./test The code itself is, well, at most prototype code quality and contains quite a number of hacks and other warts, mostly due to my inexperience with GHC hacking, and I could make use of some advice. Some more concrete questions: * How do I look up a module (here GHC.NT.Type) and a name therein (NT) in CoreM? I tried to go via getOrigNameCache, but passing the ModuleName name to lookupModuleEnv yielded Nothing, although it seems to me to be precisely the same name that I find in moduleEnvKeys. If I take the ModuleName from that list, it works. Similar problems with the OccEnv. Please see https://github.com/nomeata/nt-coerce/blob/edef9f4d4889dde651bb086e76c576f84e... for what I tried (and how I worked around it). This work-around also prevents building the package with cabal right now. * I did not find generic Core traversal functions like traverse :: (Monad m) => (Expr a -> m (Maybe (Expr a))) -> Expr a -> m (Expr a) which I defined in https://github.com/nomeata/nt-coerce/blob/edef9f4d4889dde651bb086e76c576f84e.... Are such traversals re-implemented everywhere or did I just not search well enough? * On the core level, once I have a TyCon, I also have all DataCons available. Is there already code present that checks if the currently compiled module really should see the data constructors, i.e. if they are exported? I see code in normaliseFfiType, but that seems to be integrated in the type checker, and it seems I need a GlobalRdrEnv; can I get such a thing in a Core2Core pass? Thanks, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

Good stuff. The basic idea seems good to me. I'm not sure that a plugin is the way to go here, though it's a good start. · Exactly which programs are type-correct? Eg coerce (listNT createNT) It all depends on exactly what type args createNT is applied to. · A function f = createNT probably would not be accepted. But if ‘f’ was inlined sufficiently early (before the plugin) it might. · Giving an error message in terms of exactly the text the user wrote is harder. · I’m not sure how you expect to deal with essential NT arguments: newtype T a = MkT a a tNT :: NT a b -> NT (T a) (T b) tNT n = createNT ??? On the whole I think a ‘deriving’ form would be preferable. And then yes, you’ll need to build HEAD. To your questions: · To do these kind of things, CoreM will need more reader stuff. In particular: o The global TypeEnv o The GlobalRdrEnv · I don’t think we have generic Core traversal functions, perhaps because almost every such pass needs to deal with binders. Simon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of Joachim Breitner | Sent: 01 July 2013 13:00 | To: glasgow-haskell-users@haskell.org | Subject: Exposing newtype coercions to Haskell | | Hi, | | this is related to | http://hackage.haskell.org/trac/ghc/wiki/NewtypeWrappers#Proposal3. | I tried to hack up a little prototype of this, and this “works” now: | | import GHC.NT | | newtype Age = Age Int deriving Show | | ageNT :: NT Age Int | ageNT = createNT | | newtype MyList a = MyList [a] deriving Show | | myListNT :: NT (MyList a) [a] | myListNT = createNT | | | main = do | let n = 1 :: Int | let a = coerce (sym ageNT) 1 | let l1 = [a] | let l2 = coerce (listNT ageNT) l1 | let l3 = coerce (sym myListNT) l2 | print a | print l2 | print l3 | | will print | | Age 1 | [1] | MyList [1] | | and no unsafeCoerce and no map is used in the production of this output. | | | The GHC.NT module provides: | | data NT a b | coerce :: NT a b -> a -> b | refl :: NT a a | sym :: NT a b -> NT b a | trans :: NT a b -> NT b c -> NT a c | createNT :: NT a b | listNT :: NT a b -> NT [a] [b] | | where createNT takes the place of the "deriving foo :: NT a b" syntax: | At compile time, it is checked if its first type argument is a newtype of the second, | and the corresponding coercion is inserted, otherwise an error is (well, will be) | printed. | | The idea is that these coercions will be guaranteed to be free (or almost free; I am | not sure if the coercion witness (NT a b) will actually be optimized away). | | | The prototype can be found at | https://github.com/nomeata/nt-coerce | and is implemented as a GHC plugin compatible with GHC 7.6.3 (I chose this | route as it means not having to compile GHC). You can check if it works for you via | $ ghc -dcore-lint -package ghc --make test.hs && ./test | | The code itself is, well, at most prototype code quality and contains quite a | number of hacks and other warts, mostly due to my inexperience with GHC | hacking, and I could make use of some advice. Some more concrete questions: | | * How do I look up a module (here GHC.NT.Type) and a name therein (NT) in | CoreM? I tried to go via getOrigNameCache, but passing the ModuleName name to | lookupModuleEnv yielded Nothing, although it seems to me to be precisely the | same name that I find in moduleEnvKeys. If I take the ModuleName from that list, | it works. Similar problems with the OccEnv. | Please see | https://github.com/nomeata/nt- | coerce/blob/edef9f4d4889dde651bb086e76c576f84e8f8aec/GHC/NT/Plugin.hs#L9 | 9 | for what I tried (and how I worked around it). This work-around also prevents | building the package with cabal right now. | | * I did not find generic Core traversal functions like traverse :: (Monad m) => | (Expr a -> m (Maybe (Expr a))) -> Expr a -> m (Expr a) which I defined in | https://github.com/nomeata/nt- | coerce/blob/edef9f4d4889dde651bb086e76c576f84e8f8aec/GHC/NT/Plugin.hs#L2 | 31. | Are such traversals re-implemented everywhere or did I just not search well | enough? | | * On the core level, once I have a TyCon, I also have all DataCons available. Is | there already code present that checks if the currently compiled module really | should see the data constructors, i.e. if they are exported? I see code in | normaliseFfiType, but that seems to be integrated in the type checker, and it | seems I need a GlobalRdrEnv; can I get such a thing in a Core2Core pass? | | | Thanks, | Joachim | | -- | Joachim “nomeata” Breitner | mail@joachim-breitner.de • http://www.joachim-breitner.de/ | Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C | Debian Developer: nomeata@debian.org

Hi, Am Dienstag, den 02.07.2013, 14:11 +0000 schrieb Simon Peyton-Jones:
I'm not sure that a plugin is the way to go here, though it's a good start.
as I said, it is just to host the prototype that allows for quick experimentation and allows people to test it without having to compile GHC.
· I’m not sure how you expect to deal with essential NT arguments:
newtype T a = MkT a a
tNT :: NT a b -> NT (T a) (T b)
tNT n = createNT ???
I planned to use tNT = error "magic message to GHC.NT.Plugin" that is then replaced by GHC.NT.Plugin (and if for some reason it is missed by it it stays type safe)
To your questions:
· To do these kind of things, CoreM will need more reader stuff. In particular:
o The global TypeEnv o The GlobalRdrEnv
Ok, I guess this means that I’ll need to leave the easy land of Plugin experimentation very soon... I also noticed a problem with my logic for creating the NT-lifting function: Just having the constructors of C in scope is not sufficient to safely provide NT a b -> NT (C a) (C b) as the parameters of the constructor might wrap a in another type constructor, i.e. data Foo a = Foo (Set a) then we certainly don’t want the user to be able to write deriving fooNT :: NT a b -> NT (Foo a) (Foo b) as he can easily get the unwanted (Set a) -> (Set b) coercion for arbitrary a b from that. So in this example, the compiler should at most accept deriving fooNT :: NT (Set a) (Set b) -> NT (Foo a) (Foo b) but it feels a bit weird that the „internals“ of Foo are relevant here. Following this line of thought it means that for data Proxy a = Proxy we would allow deriving proxyNT :: NT (Proxy a) (Proxy b) without a "NT a b" parameter, as long as the Proxy data constructor is in scope. Greetings, Joachim -- Joachim Breitner e-Mail: mail@joachim-breitner.de Homepage: http://www.joachim-breitner.de ICQ#: 74513189 Jabber-ID: nomeata@joachim-breitner.de

| I also noticed a problem with my logic for creating the NT-lifting | function. Suppose | data C a = MkC (Foo a) | Just having the constructors of C in scope is not sufficient | to safely provide | NT a b -> NT (C a) (C b) | as the parameters of the constructor might wrap a in another type | constructor, eg | data Foo a = Foo (Set a) | | then we certainly don’t want the user to be able to write | deriving fooNT :: NT a b -> NT (Foo a) (Foo b) Dually, suppose Foo was an *abstract* type, where we can't see the constructors of Foo. But the programmer as exported fooNT :: NT a b -> NT (Foo a) (Foo b). Then we *do* want to be able to derive cNT :: NT a b -> NT (C a) (C b) Instead maybe we say deriving cNT :: NT a b -> NT (C a) (C b) using( fooNT ) listing the imported witnesses that we use. Or maybe we say simply deriving cNT :: NT a b -> NT (C a) (C b) and magically use any NT-typed things that are in scope. This clearly deserves treatment on the wiki page. The criterion "could you write it by hand?" remains a good one. Simon

Hi, Am Dienstag, den 02.07.2013, 16:28 +0000 schrieb Simon Peyton-Jones:
| I also noticed a problem with my logic for creating the NT-lifting | function. Suppose | data C a = MkC (Foo a) | Just having the constructors of C in scope is not sufficient | to safely provide | NT a b -> NT (C a) (C b) | as the parameters of the constructor might wrap a in another type | constructor, eg | data Foo a = Foo (Set a) | | then we certainly don’t want the user to be able to write | deriving fooNT :: NT a b -> NT (Foo a) (Foo b)
Dually, suppose Foo was an *abstract* type, where we can't see the constructors of Foo. But the programmer as exported fooNT :: NT a b -> NT (Foo a) (Foo b). Then we *do* want to be able to derive cNT :: NT a b -> NT (C a) (C b) Instead maybe we say deriving cNT :: NT a b -> NT (C a) (C b) using( fooNT ) listing the imported witnesses that we use. Or maybe we say simply deriving cNT :: NT a b -> NT (C a) (C b) and magically use any NT-typed things that are in scope.
Is this really the compiler’s job here? After all, the programmer would be able to write deriving cNT' :: NT (Foo a) (Foo b) -> NT (C a) (C b) cNT :: NT a b -> NT (C a) (C b) cNT = cNT' . fooNT and expose just cNT to his users, so no expressiveness is lost by not providing automatic support here.
This clearly deserves treatment on the wiki page.
Added.
The criterion "could you write it by hand?" remains a good one.
Agreed. Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

| Is this really the compiler’s job here? After all, the programmer would be able to | write | | deriving cNT' :: NT (Foo a) (Foo b) -> NT (C a) (C b) | cNT :: NT a b -> NT (C a) (C b) | cNT = cNT' . fooNT | | and expose just cNT to his users, so no expressiveness is lost by not providing | automatic support here. True. But it could get pretty inconvenient. data Foo a = F1 [a] (Tree [a]) (Maybe a) [Maybe a] We could restrict you (only ) to saying deriving fooNT :: NT [a] [b] -> NT (Tree [a]) (Tree [b]) -> NT (Maybe a) (Maybe b) -> NT [Maybe a] [Maybe b] -> NT (Foo a) (Foo b) but the excitement would wear off pretty fast :-). It'd be nicer to write deriving fooNT :: NT a b -> NT (Foo a) (Foo b) and have the compiler use the other lexically-in-scope NT values to solve the problem. This is, after all, precisely what the type-class inference machinery does. There's something a bit strange about looking for all NT values that are lexically in scope; it all amounts to something pretty similar to named type-class instances. Simon

Hi, small update: I generalized the code at https://github.com/nomeata/nt-coerce/blob/9349dd3/GHC/NT/Plugin.hs a bit, it is now able to handle to create NT-values for unwarpping arbitrary newtypes and for lifting across type constructors. It does so unconditionally, i.e. does _not_ yet check whether the constructors are in scope and whether the user is allowed to cast the types occurring in the data constructors. So what works is this: NT values for newtypes without or with type arguments, and possibly recursive: newtype Age = Age Int deriving Show ageNT :: NT Age Int ageNT = deriveThisNT -- temporary syntax for deriving listNT ... newtype MyList a = MyList [a] deriving Show myListNT :: NT (MyList a) [a] myListNT = deriveThisNT newtype R a = R [R a] deriving Show rNT :: NT (R a) [R a] rNT = deriveThisNT NT values for type constructors, replacinge all or some of the type parameters: listNT :: NT a b -> NT [a] [b] listNT = deriveThisNT myListNT' :: NT a b -> NT (MyList a) (MyList b) myListNT' = deriveThisNT data F a b c = F a b c deriving Show fNT :: NT a a' -> NT (F a b c) (F a' b c) fNT = deriveThisNT What does not work is creating a NT value between a newtype and its representation if type arguments change on the way, e.g. bar :: NT (MyList Age) [Int] but the user can construct that himself: bar = myListNT' ageNT `trans` myListNT The next step would be to add the safeguards about the visibility of constructors and about the types of data constructor parameters. Especially the latter is still not clear to me: For example with data Foo a = Foo (Bar a) is it really sufficient to check whether a "barNT:: NT a b -> NT (Bar a) (Bar b)" exists? After all, I would not need barNT in the generated implementation of fooNT, so the user could “provide” this value via undefined and nobody would notice. I get the feeling that already the typing rule for TyConAppCo should expect coercions between the actual types in the data constructors rather than just between the type constructor parameters, so that the implementation barNT would actually have to use fooNT. Maybe that would simplify the two-equalities-approach. But that is just an uneducated gut feeling. BTW: Is this still on-topic on glasgow-haskell-users or should I move to ghc-dev? Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

Hi again, Am Mittwoch, den 03.07.2013, 10:01 +0200 schrieb Joachim Breitner:
Am Dienstag, den 02.07.2013, 16:28 +0000 schrieb Simon Peyton-Jones:
| I also noticed a problem with my logic for creating the NT-lifting | function. Suppose | data C a = MkC (Foo a) | Just having the constructors of C in scope is not sufficient | to safely provide | NT a b -> NT (C a) (C b) | as the parameters of the constructor might wrap a in another type | constructor, eg | data Foo a = Foo (Set a) | | then we certainly don’t want the user to be able to write | deriving fooNT :: NT a b -> NT (Foo a) (Foo b)
Dually, suppose Foo was an *abstract* type, where we can't see the constructors of Foo. But the programmer as exported fooNT :: NT a b -> NT (Foo a) (Foo b). Then we *do* want to be able to derive cNT :: NT a b -> NT (C a) (C b) Instead maybe we say deriving cNT :: NT a b -> NT (C a) (C b) using( fooNT ) listing the imported witnesses that we use. Or maybe we say simply deriving cNT :: NT a b -> NT (C a) (C b) and magically use any NT-typed things that are in scope.
Is this really the compiler’s job here? After all, the programmer would be able to write
deriving cNT' :: NT (Foo a) (Foo b) -> NT (C a) (C b) cNT :: NT a b -> NT (C a) (C b) cNT = cNT' . fooNT
and expose just cNT to his users, so no expressiveness is lost by not providing automatic support here.
Hmm, I notice that this is not fully thought through. A problem is that on the one hand, the operations we have on newtypes (which allow us to lift coercions between the _type_ constructor parameters) suggest that we want cNT :: NT a b -> NT (C a) (C b) while the “do it by hand” intuition suggests that cNT :: NT (Foo a) (Foo b) -> NT (C a) (C b) should be provided. But I only now notice that this function will not be easily implemented. I guess in this case it could be using NthCo to get a ~ b from Foo a ~ Foo b, but this is probably shaky. This tension between the type constructor oriented coercions and data constructor oriented policy needs a bit more thought. Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

This is an exciting effort! Just a quick reaction to Simon's comments on
CoreM.
On Tue, Jul 2, 2013 at 9:11 AM, Simon Peyton-Jones
To your questions:
**
**· **To do these kind of things, CoreM will need more reader stuff. In particular:****
**o **The global TypeEnv****
**o **The GlobalRdrEnv
For my light experimentation, I have recovered these two values from the ModGuts that all plugins receive. Hopefully someone will shout out if there's pitfalls to avoid. * The mg_rdr_env field is of type GlobalRdrEnv. * compiler/main/GHC.hs defines a function compileCore with a local definition that rebuilds a TypeEnv. I extracted this:
\guts -> HscTypes.typeEnvFromEntities (CoreSyn.bindersOfBinds (mg_binds guts)) (mg_tcs guts) (mg_fam_insts guts)
and it has worked so far. HTH and good luck!

Hi, Am Dienstag, den 02.07.2013, 12:56 -0500 schrieb Nicolas Frisby:
For my light experimentation, I have recovered these two values from the ModGuts that all plugins receive. Hopefully someone will shout out if there's pitfalls to avoid.
* The mg_rdr_env field is of type GlobalRdrEnv.
strange, why did I miss that? But I can’t get it to work, even looking up an element that I took from the GRE itself returns []: let e' = head (head (occEnvElts env)) putMsgS $ showSDoc dflags (ppr e') putMsgS $ showSDoc dflags (ppr (lookupGRE_RdrName (nameRdrName (gre_name e')) env)) prints: GHC.NT.Type.NT imported from `GHC.NT.Type' at GHC/NT.hs:5:1-18 (and originally defined at GHC/NT/Type.hs:6:6-7) [] Also, trying to construct a RdrName that I can look up fails: let rdrName = mkRdrQual (mkModuleName "GHC.NT.Type") (mkTcOcc "NT") putMsgS $ showSDoc dflags (ppr (lookupGRE_RdrName rdrName env)) prints also just []. What am I doing wrong? Thanks, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

On Wed, Jul 3, 2013 at 5:33 AM, Joachim Breitner
[snip]
strange, why did I miss that?
But I can’t get [the GlobalRdrEnv lookup] to work, even looking up an element that I took from the GRE itself returns []:
let e' = head (head (occEnvElts env)) putMsgS $ showSDoc dflags (ppr e') putMsgS $ showSDoc dflags (ppr (lookupGRE_RdrName (nameRdrName (gre_name e')) env))
prints:
GHC.NT.Type.NT imported from `GHC.NT.Type' at GHC/NT.hs:5:1-18 (and originally defined at GHC/NT/Type.hs:6:6-7) []
Also, trying to construct a RdrName that I can look up fails:
let rdrName = mkRdrQual (mkModuleName "GHC.NT.Type") (mkTcOcc "NT") putMsgS $ showSDoc dflags (ppr (lookupGRE_RdrName rdrName env))
prints also just [].
What am I doing wrong?
Thanks, Joachim
lookupGRE_RdrName looks something up in the environment and then prunes the results via `pickGREs`. I suspect your lookup is succeeding, but all of your results are getting pruned away.
pickGREs :: RdrName -> [GlobalRdrElt] -> [GlobalRdrElt] -- ^ Take a list of GREs which have the right OccName
-- Pick those GREs that are suitable for this RdrName
-- And for those, keep only only the Provenances that are suitable
-- *****Only used for Qual and Unqual, not Orig or Exact*****
Emphasis mine. NB that `nameRdrName` builds a RdrName via the Exact constructor. That's why your first attempt failed (I think). Your second attempt fails for a sneakier reason, something I posted an email about a little while ago (cf "invoking front-end phases from within a GHC plugin?"). The GlobalRdrEnv is keyed on the getUnique result of an OccName. That unique is derived from the unique of the FastString contained in the OccName. The uniques of FastStrings are allocated linearly via a global variable. Unfortunately, your plugin image and the hosting compiler's image have distinct copies of this global variable, so FastStrings from your plugin do not get the same unique as the similar FastStrings that were used to build the GlobalRdrEnv's keys. Your plugin creates FastStrings when calling things like `mkTcOcc`, for example. Unless we eventually include the FastStrings' global variable in the `CoreMonad.reinitializeGlobals` mechanism, I think the available workaround here is to filter `occEnvElts env` for gre_names where `occNameString` and `occNameSpace` match what you're looking for. Robustness may require also constraining the gre_prov field. Good luck.
participants (3)
-
Joachim Breitner
-
Nicolas Frisby
-
Simon Peyton-Jones