Generating valid Haskell code using the GHC API pretty printer

Hello everyone, we are trying to use the GHC API for a source-to-source transformation on Haskell programs. The result of parsing and typechecking a module enables us to apply the transformation, but writing the transformed module back using the pretty printer (Outputable) generates invalid Haskell code. For one thing, since even the names defined in the current module are fully qualified, the resulting code is not valid anymore. This can be worked around, but there is another issue: Simply reading the following program and then writing it out using the pretty printer renders the resulting code invalid.
module Main where main = do if True then putStrLn "longlonglonglonglonglongline" else return () longlonglonglonglonglonglonglonglonglonglonglongname $ "test" longlonglonglonglonglonglonglonglonglonglonglongname = putStrLn
The result looks like this:
Main.main = do if GHC.Bool.True then System.IO.putStrLn "longlonglonglonglonglongline" else GHC.Base.return () Main.longlonglonglonglonglonglonglonglonglonglonglongname GHC.Base.$ "test" Main.longlonglonglonglonglonglonglonglonglonglonglongname = System.IO.putStrLn
There are two different problems in this output: 1) the indentation of "if ... then ... else" violates the "do"-block layout rule 2) the indentation of the long function call is invalid It looks like those problems could be avoided if the pretty printer could be configured to consistently use "do { ... ; ... }" notation, but we have been unable to figure out how. Is there a canonical way to use the GHC API to pretty print to valid Haskell code? Kind regards, Jan Appended is our current code to execute the transformation above (the module to be read is expected in a file "dummy.hs" for simplicity). Please excuse if this might not be a minimal example. module Main where import GHC import GHC.Paths import Outputable main = do x <- runGhc (Just libdir) $ do dflags <- getSessionDynFlags setSessionDynFlags (dflags { hscTarget = HscNothing, ghcLink = NoLink }) target <- guessTarget "dummy.hs" Nothing setTargets [target] load LoadAllTargets graph <- getModuleGraph let unparsedmod = head graph parsedmod <- parseModule unparsedmod typecheckedmod <- typecheckModule parsedmod let Just renamedsource = renamedSource typecheckedmod (group,_,_,_,_) = renamedsource moduledings = (ms_mod unparsedmod) return (showSDoc (ppr group)) putStr $ x ++ "\n" -- If you're happy and you know it, syntax error!

Hallo Kiel!-)
we are trying to use the GHC API for a source-to-source transformation on Haskell programs. The result of parsing and typechecking a module enables us to apply the transformation, but writing the transformed module back using the pretty printer (Outputable) generates invalid Haskell code.
It would help to know more about your intended application. For instance, if there is a need for user-visible output, the requirements are very different from those of a simple preprocessor. In the former case, pretty-printing isn't going to get you far, even if you do write your own pretty printer - the AST doesn't have enough information to reconstruct the input program (one would need to combine the lexer's token stream with the modified AST to print out modified code - similarly to what was done in HaRe). In the latter case, you might not need to pretty-print, just modify the input before passing it on to later phases of GHC. Since you don't seem to care about formatting or comments, I suspect it is the latter - have you looked into Template Haskell as an alternative? As Thomas pointed out, there is some control over what gets qualified and what doesn't; the two indentation items you list look like plain bugs in search of a ticket and patch. Please record them on GHC's trac. Claus

I've fixed GHC's pretty-printer to print do-notation using braces and semi-colons, which is much more robust. I hope that's useful SImon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of Jan Schaumlöffel | Sent: 22 July 2009 14:28 | To: glasgow-haskell-users@haskell.org | Subject: Generating valid Haskell code using the GHC API pretty printer | | Hello everyone, | | we are trying to use the GHC API for a source-to-source transformation | on Haskell programs. The result of parsing and typechecking a module | enables us to apply the transformation, but writing the transformed | module back using the pretty printer (Outputable) generates invalid | Haskell code. | | For one thing, since even the names defined in the current module are | fully qualified, the resulting code is not valid anymore. | | This can be worked around, but there is another issue: Simply reading | the following program and then writing it out using the pretty printer | renders the resulting code invalid. | | > module Main where | > main = do | > if True then putStrLn "longlonglonglonglonglongline" | > else return () | > longlonglonglonglonglonglonglonglonglonglonglongname $ "test" | > longlonglonglonglonglonglonglonglonglonglonglongname = putStrLn | | The result looks like this: | | > Main.main = do if GHC.Bool.True then | > System.IO.putStrLn "longlonglonglonglonglongline" | > else | > GHC.Base.return () | > Main.longlonglonglonglonglonglonglonglonglonglonglongname | > GHC.Base.$ | > "test" | > Main.longlonglonglonglonglonglonglonglonglonglonglongname = System.IO.putStrLn | | There are two different problems in this output: | | 1) the indentation of "if ... then ... else" violates the "do"-block | layout rule | 2) the indentation of the long function call is invalid | | It looks like those problems could be avoided if the pretty printer | could be configured to consistently use "do { ... ; ... }" notation, | but we have been unable to figure out how. Is there a canonical way | to use the GHC API to pretty print to valid Haskell code? | | Kind regards, | Jan | | Appended is our current code to execute the transformation above (the | module to be read is expected in a file "dummy.hs" for simplicity). | Please excuse if this might not be a minimal example. | | module Main where | | import GHC | import GHC.Paths | import Outputable | | main = do | x <- runGhc (Just libdir) $ do | dflags <- getSessionDynFlags | setSessionDynFlags (dflags { hscTarget = HscNothing, | ghcLink = NoLink }) | | target <- guessTarget "dummy.hs" Nothing | setTargets [target] | load LoadAllTargets | | graph <- getModuleGraph | let unparsedmod = head graph | | parsedmod <- parseModule unparsedmod | typecheckedmod <- typecheckModule parsedmod | let Just renamedsource = renamedSource typecheckedmod | (group,_,_,_,_) = renamedsource | moduledings = (ms_mod unparsedmod) | | return (showSDoc (ppr group)) | putStr $ x ++ "\n" | | -- | If you're happy and you know it, syntax error!

Hello, sorry for the delay in replying, I have been away from my computer during the last weeks. Simon Peyton-Jones wrote:
I've fixed GHC's pretty-printer to print do-notation using braces and semi-colons, which is much more robust. I hope that's useful
This is certainly useful, and it seems to fix the issues for me. Thank you very much! Thomas Schilling wrote:
That said, if you're trying to do source-to-source transformations you probably want to keep the original layout as much as possible. GHC's pretty-printer isn't designed for that. GHC's syntax tree has very accurate source locations, so you could start from there and build your own pretty printer.
Niklas Broberg wrote:
In general, unless you actually want to use any other components of the GHC API, e.g. evaluate your code, then I see no reason to use the GHC API for source manipulation. haskell-src-exts simply does that better (and definitely better than haskell-src). But I couldn't tell if that's enough for the original poster's needs. :-)
Well, the application is a debugger for Concurrent Haskell programs. It reads a program and replaces certain function calls with calls to location-aware wrappers that allow stepping the program and highlighting the original source location. The result of this transformation is then compiled and executed; it is not intended to be seen by the user (who is shown the original source). So I guess what we need is not a "pretty printer" but a "valid printer". We need accurate original source locations and we need to access the syntax tree after typechecking to be able to tell which functions to replace. The GHC API looked like the way to go and it does so even more with Simon's patch. Thanks for your input, Jan -- If you're happy and you know it, syntax error!
participants (3)
-
Claus Reinke
-
Jan Schaumlöffel
-
Simon Peyton-Jones