GhcPlugin-writing and "finding things"

Dear GHC-ers, I'm working on a plugin for GHC that should help compile the library with which this plugin is to ship. What this plugin does is traverse the CoreProgram(s) to find things of types defined in my library and optimizes them. I have worked out how to "find" things, but I was wondering whether the API could be improved for plugin-writers. For the sake of argument, I have the following: - module Foo: library for users to import, containing functions, ADTs etc - module Foo.Plugin: GhcPlugin that compiles out all uses of things in Foo
module Foo where
data Foo x = Foo x
runFoo :: Foo x -> x runFoo (Foo x) = x
This example is trivial and I imagine GHC will have no trouble eliminating most cases of this, but imagine more complex stuff. Now, if I want to traverse the CoreProgram in my plugin, I need to find occurrences of these, so somewhere there's stuff like:
pass tcFoo _ _ (NonRec b expr) | varType b `containsTyConAnywhere` tcFoo = {- clever stuff to compile out Foo -}
My problem is "getting" tcFoo in this example. Below is how I do it now. Maybe I'm being thick, or maybe there's just no simpler way. This is my 'plugin' function in Foo.Plugin:
plugin = Plugin $ \opts todo -> do hsc <- getHscEnv dfs <- getDynFlags fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing mod <- case fr of Found ml m -> return m _ -> panic "Failed to (unambiguously) find 'Foo' (using findImportedModule)" onc <- getOrigNameCache let nms = lookupWithDefaultModuleEnv nms (panic "No names defined for module 'Foo'") mod find_ d occ fnd nm = maybe (fail $ "Failed to find " ++ d ++ " '" ++ nm ++ "'") fnd (lookupOccEnv nms $ occ nm) tcFind = find_ "TyCon" mkTcOcc lookupTyCon dcFind = find_ "DataCon" mkDataOcc lookupDataCon idFind = find_ "Id" mkVarOcc lookupId tcFoo <- tcFind "Foo" dcFoo <- dcFind "Foo" idRunFoo <- idFind "runFoo" return $ CoreDoPluginPass "Foo optimisation" (pass tcFoo dcFoo idRunFoo) : todo
I have the following questions: 1) Is this a/the right way to "find" those things in the plugin? 2) There seems to be a lot to gain with quasi-quoting a la Template Haskell for people writing plugins to go with a library that they wrote. Can such QQ be done? Has it been considered? 3) Is findImportedModule the right function to find my starting point to begin with? 4) What is the 'Maybe FastString' argument in findImportedModule for? I've been trying to put in the FSs of PackageIDs, but they make the lookup fail. This (dumb) example really made me nervous:
fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing mod <- case fr of Found ml m -> do fr' <- liftIO $ findImportedModule hsc (moduleName m) (packageIdFS $ packageId m)
Here, fr' should always be a "Found ml' m'" such that ml == ml' and m == m', but... it consistently results in NotFound{} for me. Also, I find this especially round-about. Shouldn't Paths_Foo.hs (the Cabal-generated file) maybe contain variables for every module in the package? In my case it would thus contain some "modFoo :: Module" Comments and suggestions more than welcome! Regards, Philip

Have you considered using HERMIT for this? I think this is a rough approximation of what you are trying to do (using HERMIT): import HERMIT.Plugin import HERMIT.Dictionary plugin = hermitPlugin $ \ opts -> firstPhase $ run $ tryR $ innermostR $ promoteBindR compileFooBindR compileFooBindR :: RewriteH CoreBind compileFooBindR = prefixFailMsg "compileFooBindR failed: " $ do NonRec b rhs <- idR -- only match on non-recursive bindings tcFoo <- findTyConT "Foo" -- can be fully qualified name if target code doesn't import directly guardMsg (varType b `containsTyConAnywhere` tyFoo) "does not contain Foo" -- abort if binder doesn't contain Foo in type return $ NonRec b $ {- magicCompileFunction -} rhs The goal of HERMIT is to make writing these plugins easier. For instance, if you give a fully qualified name to findTyConT (or the other find functions), and HERMIT can't find the name in scope in the target module, it'll look in the package database for the appropriate interface and load it. You can even run your compilation functions interactively and view their output in a REPL. To do so, change your plugin to: plugin = hermitPlugin $ firstPhase . interactive exts exts :: Externals exts = [ external "compile-foo" (promoteBindR compileFooBindR) [ "compiles bindings involving Foo" ] ] {- compileFooBindR as before -} Then you can navigate around your AST and use the "compile-foo" command to test out your compilation. If you want to try, I'd highly recommend using the latest from github, rather than what is on hackage: https://github.com/ku-fpg/hermit Here are a few examples of larger HERMIT plugins: https://github.com/xich/hermit-syb/blob/master/hermit-syb/HERMIT/Optimizatio... https://github.com/conal/lambda-ccc/blob/master/src/LambdaCCC/Reify.hs#L866 Let me know if you have questions! Andrew On Wed, Jul 23, 2014 at 11:06 AM,
Dear GHC-ers,
I'm working on a plugin for GHC that should help compile the library with which this plugin is to ship. What this plugin does is traverse the CoreProgram(s) to find things of types defined in my library and optimizes them. I have worked out how to "find" things, but I was wondering whether the API could be improved for plugin-writers.
For the sake of argument, I have the following: - module Foo: library for users to import, containing functions, ADTs etc - module Foo.Plugin: GhcPlugin that compiles out all uses of things in Foo
module Foo where
data Foo x = Foo x
runFoo :: Foo x -> x runFoo (Foo x) = x
This example is trivial and I imagine GHC will have no trouble eliminating most cases of this, but imagine more complex stuff. Now, if I want to traverse the CoreProgram in my plugin, I need to find occurrences of these, so somewhere there's stuff like:
pass tcFoo _ _ (NonRec b expr) | varType b `containsTyConAnywhere` tcFoo = {- clever stuff to compile out Foo -}
My problem is "getting" tcFoo in this example. Below is how I do it now. Maybe I'm being thick, or maybe there's just no simpler way. This is my 'plugin' function in Foo.Plugin:
plugin = Plugin $ \opts todo -> do hsc <- getHscEnv dfs <- getDynFlags fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing mod <- case fr of Found ml m -> return m _ -> panic "Failed to (unambiguously) find 'Foo' (using findImportedModule)" onc <- getOrigNameCache let nms = lookupWithDefaultModuleEnv nms (panic "No names defined for module 'Foo'") mod find_ d occ fnd nm = maybe (fail $ "Failed to find " ++ d ++ " '" ++ nm ++ "'") fnd (lookupOccEnv nms $ occ nm) tcFind = find_ "TyCon" mkTcOcc lookupTyCon dcFind = find_ "DataCon" mkDataOcc lookupDataCon idFind = find_ "Id" mkVarOcc lookupId tcFoo <- tcFind "Foo" dcFoo <- dcFind "Foo" idRunFoo <- idFind "runFoo" return $ CoreDoPluginPass "Foo optimisation" (pass tcFoo dcFoo idRunFoo) : todo
I have the following questions:
1) Is this a/the right way to "find" those things in the plugin? 2) There seems to be a lot to gain with quasi-quoting a la Template Haskell for people writing plugins to go with a library that they wrote. Can such QQ be done? Has it been considered? 3) Is findImportedModule the right function to find my starting point to begin with? 4) What is the 'Maybe FastString' argument in findImportedModule for? I've been trying to put in the FSs of PackageIDs, but they make the lookup fail. This (dumb) example really made me nervous:
fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing mod <- case fr of Found ml m -> do fr' <- liftIO $ findImportedModule hsc (moduleName m) (packageIdFS $ packageId m)
Here, fr' should always be a "Found ml' m'" such that ml == ml' and m == m', but... it consistently results in NotFound{} for me. Also, I find this especially round-about. Shouldn't Paths_Foo.hs (the Cabal-generated file) maybe contain variables for every module in the package? In my case it would thus contain some "modFoo :: Module"
Comments and suggestions more than welcome!
Regards, Philip
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Dear Andrew, Thanks for your suggestion. I had considered it earlier and decided against it for the extra dependencies. Maybe I was too picky there. I will give it another go. Could there possibly be a subset of hermit that plugin-writers could depend on, but that have fewer dependencies? I find it hard to explain to people why they require things like ansi-terminal if they want to use my parser-combinators. I still think this isn't an unreasonable use case to take on board for future GHC API design, though. Also, the thing with findImportedModule still scares me. Regards, Philip
-----Original Message----- From: xichekolas@gmail.com [mailto:xichekolas@gmail.com] On Behalf Of Andrew Farmer Sent: woensdag 23 juli 2014 19:22 To: Holzenspies, P.K.F. (EWI) Cc: glasgow-haskell-users@haskell.org Subject: Re: GhcPlugin-writing and "finding things"
Have you considered using HERMIT for this? I think this is a rough approximation of what you are trying to do (using HERMIT):
import HERMIT.Plugin import HERMIT.Dictionary
plugin = hermitPlugin $ \ opts -> firstPhase $ run $ tryR $ innermostR $ promoteBindR compileFooBindR
compileFooBindR :: RewriteH CoreBind compileFooBindR = prefixFailMsg "compileFooBindR failed: " $ do NonRec b rhs <- idR -- only match on non-recursive bindings tcFoo <- findTyConT "Foo" -- can be fully qualified name if target code doesn't import directly guardMsg (varType b `containsTyConAnywhere` tyFoo) "does not contain Foo" -- abort if binder doesn't contain Foo in type return $ NonRec b $ {- magicCompileFunction -} rhs
The goal of HERMIT is to make writing these plugins easier. For instance, if you give a fully qualified name to findTyConT (or the other find functions), and HERMIT can't find the name in scope in the target module, it'll look in the package database for the appropriate interface and load it.
You can even run your compilation functions interactively and view their output in a REPL. To do so, change your plugin to:
plugin = hermitPlugin $ firstPhase . interactive exts
exts :: Externals exts = [ external "compile-foo" (promoteBindR compileFooBindR) [ "compiles bindings involving Foo" ] ]
{- compileFooBindR as before -}
Then you can navigate around your AST and use the "compile-foo" command to test out your compilation.
If you want to try, I'd highly recommend using the latest from github, rather than what is on hackage:
https://github.com/ku-fpg/hermit
Here are a few examples of larger HERMIT plugins:
https://github.com/xich/hermit-syb/blob/master/hermit- syb/HERMIT/Optimization/SYB.hs#L28 https://github.com/conal/lambda- ccc/blob/master/src/LambdaCCC/Reify.hs#L866
Let me know if you have questions!
Andrew
Dear GHC-ers,
I'm working on a plugin for GHC that should help compile the library with which this plugin is to ship. What this plugin does is traverse the CoreProgram(s) to find things of types defined in my library and optimizes them. I have worked out how to "find" things, but I was wondering whether the API could be improved for plugin-writers.
For the sake of argument, I have the following: - module Foo: library for users to import, containing functions, ADTs etc - module Foo.Plugin: GhcPlugin that compiles out all uses of things in Foo
module Foo where
data Foo x = Foo x
runFoo :: Foo x -> x runFoo (Foo x) = x
This example is trivial and I imagine GHC will have no trouble eliminating most cases of this, but imagine more complex stuff. Now, if I want to traverse the CoreProgram in my plugin, I need to find occurrences of these, so somewhere there's stuff like:
pass tcFoo _ _ (NonRec b expr) | varType b `containsTyConAnywhere` tcFoo = {- clever stuff to compile out Foo -}
My problem is "getting" tcFoo in this example. Below is how I do it now. Maybe I'm being thick, or maybe there's just no simpler way. This is my 'plugin' function in Foo.Plugin:
plugin = Plugin $ \opts todo -> do hsc <- getHscEnv dfs <- getDynFlags fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing mod <- case fr of Found ml m -> return m _ -> panic "Failed to (unambiguously) find 'Foo' (using findImportedModule)" onc <- getOrigNameCache let nms = lookupWithDefaultModuleEnv nms (panic "No names defined for module 'Foo'") mod find_ d occ fnd nm = maybe (fail $ "Failed to find " ++ d ++ " '" ++ nm ++ "'") fnd (lookupOccEnv nms $ occ nm) tcFind = find_ "TyCon" mkTcOcc lookupTyCon dcFind = find_ "DataCon" mkDataOcc lookupDataCon idFind = find_ "Id" mkVarOcc lookupId tcFoo <- tcFind "Foo" dcFoo <- dcFind "Foo" idRunFoo <- idFind "runFoo" return $ CoreDoPluginPass "Foo optimisation" (pass tcFoo dcFoo idRunFoo) : todo
I have the following questions:
1) Is this a/the right way to "find" those things in the plugin? 2) There seems to be a lot to gain with quasi-quoting a la Template Haskell for people writing plugins to go with a library that they wrote. Can such QQ be done? Has it been considered? 3) Is findImportedModule the right function to find my starting
4) What is the 'Maybe FastString' argument in findImportedModule for? I've been trying to put in the FSs of PackageIDs, but they make the lookup fail. This (dumb) example really made me nervous:
fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing mod <- case fr of Found ml m -> do fr' <- liftIO $ findImportedModule hsc (moduleName m) (packageIdFS $ packageId m)
Here, fr' should always be a "Found ml' m'" such that ml == ml' and m == m', but... it consistently results in NotFound{} for me. Also, I find
On Wed, Jul 23, 2014 at 11:06 AM,
wrote: point to begin with? this especially round-about. Shouldn't Paths_Foo.hs (the Cabal-generated file) maybe contain variables for every module in the package? In my case it would thus contain some "modFoo :: Module" Comments and suggestions more than welcome!
Regards, Philip
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Philip You are right: there are some missing pieces. * First you need to ask where your plugin's special library module "Foo" is in the file system. This is what findImportedModule is for, and it seems quite reasonable. However, it (or some variant) should be available to you in CoreM. * Next, suppose you special library module defines a special type "T". You need to get its Name. For this you a CoreM variant of IfaceEnv.lookupOrig. The function CoreMonad.getOrigNameCache is far too low level and should be killed. Instead, CoreMonad should expose lookupOrig :: Module -> OccName -> CoreM Name It should be an easy function to write, using IfaceEnv.lookupOrig; maybe a tiny bit of refactoring. * Next you want to get from T's Name to T's TyCon. Here CoreMonad is fine: it offers lookupThing :: Name -> CoreM TyThing This function calls TcEnv.tcLookupGlobal, which will automatically load Foo.hi if need be. So your code should look like foo_mod <- findImportedModule "Foo" t_name <- lookupOrig foo_mod (mkTcOcc "T") t_tycon <- lookupThing t_name corresponding to these three steps. I suspect that the error cases of findImported module should be dealt with via exceptions in CoreM, to de-clutter the code. Some of the above suggests a bit of cleaning up of the CoreM API. Would someone like to undertake that? I can advise, but I don't want to lead. Simon | -----Original Message----- | From: Glasgow-haskell-users [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of p.k.f.holzenspies@utwente.nl | Sent: 23 July 2014 17:07 | To: glasgow-haskell-users@haskell.org | Subject: GhcPlugin-writing and "finding things" | | Dear GHC-ers, | | I'm working on a plugin for GHC that should help compile the library | with which this plugin is to ship. What this plugin does is traverse | the CoreProgram(s) to find things of types defined in my library and | optimizes them. I have worked out how to "find" things, but I was | wondering whether the API could be improved for plugin-writers. | | For the sake of argument, I have the following: | - module Foo: library for users to import, containing functions, ADTs | etc | - module Foo.Plugin: GhcPlugin that compiles out all uses of things in | Foo | | > module Foo where | > | > data Foo x = Foo x | > | > runFoo :: Foo x -> x | > runFoo (Foo x) = x | | | This example is trivial and I imagine GHC will have no trouble | eliminating most cases of this, but imagine more complex stuff. Now, if | I want to traverse the CoreProgram in my plugin, I need to find | occurrences of these, so somewhere there's stuff like: | | > pass tcFoo _ _ (NonRec b expr) | > | varType b `containsTyConAnywhere` tcFoo | > = {- clever stuff to compile out Foo -} | | My problem is "getting" tcFoo in this example. Below is how I do it | now. Maybe I'm being thick, or maybe there's just no simpler way. This | is my 'plugin' function in Foo.Plugin: | | > plugin = Plugin $ \opts todo -> do | > hsc <- getHscEnv | > dfs <- getDynFlags | > fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing | > mod <- case fr of | > Found ml m -> return m | > _ -> panic "Failed to (unambiguously) find 'Foo' (using | findImportedModule)" | > onc <- getOrigNameCache | > let nms = lookupWithDefaultModuleEnv nms (panic "No names defined | for module 'Foo'") mod | > find_ d occ fnd nm | > = maybe | > (fail $ "Failed to find " ++ d ++ " '" ++ nm ++ "'") | > fnd | > (lookupOccEnv nms $ occ nm) | > tcFind = find_ "TyCon" mkTcOcc lookupTyCon | > dcFind = find_ "DataCon" mkDataOcc lookupDataCon | > idFind = find_ "Id" mkVarOcc lookupId | > tcFoo <- tcFind "Foo" | > dcFoo <- dcFind "Foo" | > idRunFoo <- idFind "runFoo" | > return $ CoreDoPluginPass "Foo optimisation" (pass tcFoo dcFoo | > idRunFoo) : todo | | I have the following questions: | | 1) Is this a/the right way to "find" those things in the plugin? | 2) There seems to be a lot to gain with quasi-quoting a la Template | Haskell for people writing plugins to go with a library that they | wrote. Can such QQ be done? Has it been considered? | 3) Is findImportedModule the right function to find my starting point | to begin with? | 4) What is the 'Maybe FastString' argument in findImportedModule for? | I've been trying to put in the FSs of PackageIDs, but they make the | lookup fail. This (dumb) example really made me nervous: | | > fr <- liftIO $ findImportedModule hsc (mkModuleName "Foo") Nothing | > mod <- case fr of | > Found ml m -> do | > fr' <- liftIO $ findImportedModule hsc (moduleName m) | > (packageIdFS $ packageId m) | | Here, fr' should always be a "Found ml' m'" such that ml == ml' and m | == m', but... it consistently results in NotFound{} for me. Also, I | find this especially round-about. Shouldn't Paths_Foo.hs (the Cabal- | generated file) maybe contain variables for every module in the | package? In my case it would thus contain some "modFoo :: Module" | | Comments and suggestions more than welcome! | | Regards, | Philip | | | | | | | | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
participants (3)
-
Andrew Farmer
-
p.k.f.holzenspies@utwente.nl
-
Simon Peyton Jones