
I did send this mail a week ago or so to the haskell-libraries mailing list, but this was probably the wrong list. I have in the meantime tested with all cabbal files on haddock, and have fixed some issues. This patch adds a pretty printer for GenericPackageDescription. It fixes ticket http://hackage.haskell.org/trac/hackage/ticket/647. I added a new module Distribution.PackageDescription.PrettyPrint which exports writeGenericPackageDescription and showGenericPackageDescription. I use the field printers from Parse.hs, so that I had to add export statements for them in Distribution.PackageDescription.Parse. I decided to not use a class based approach because: 1. their is already the Text class, which has a parse I didn't want to write, and 2. and more important, I didn't want to use -XFlexibleInstances because I guess cabal should be portable. I have not removed the old printer code, although this may be a good idea (maybe mark it as deprecated first). Then it would make sense as well to rename writeGenericPackageDescription to writePackageDescription and showGenericPackageDescription to showPackageDescription, to be symmetric. I tested with the code from Appendix 1 and all the cabal files on hackage. The test succeds when writing and reading a package description results in an equal description. This is more rigorous then required, cause different GenericPackageDescr.. can have the same semantic, so for example the order of exe definitions is not relevant. But for our needs (GUI editor of cabal files for leksah), e.g. preserving of order is important and it will help to test the parser and printer to make this approach work. Furthermore I want to print out the cabal file as short as possible, e.g. not printing all empty fields like the old printer did. Testing the code I found the following problems: 1. Found up an obvious "age old" bug: hunk ./Distribution/PackageDescription/Parse.hs 359 - ghcProfOptions (\val binfo -> binfo{ghcSharedOptions=val}) + ghcSharedOptions (\val binfo -> binfo{ghcSharedOptions=val}) 2. The order of elements are reversed by the parser. E.g executables: (repos, flags, lib, exes, tests) <- getBody return (repos, flags, lib, exes ++ [(exename, flds)], tests) since the rest is parsed in the line before, the element needs to be prepended. so I changed to: (repos, flags, lib, exes, tests) <- getBody return (repos, flags, lib, (exename, flds): exes, tests) The same applies to test fields. The opposite order problem applies to storeXFields.., which is called in the order of processing, so I changed it to storeXFieldsPD (f@('x':'-':_),val) pkg = Just pkg{ customFieldsPD = (customFieldsPD pkg) ++ [(f,val)]} 3. The next problem is that we print nothing for empty fields , which works ok, except for compiler opts, where empty opts differ from not given opts, so I added this line to optsField in ParseUtils, which causes empty opts to be parsed, as if no opts are given: update f opts [] = [(f,opts)] 4. The next problem is that we loose final newlines with showFreeText. Astonishing enough the culpid is lines from the List module , which evals lines "a\n" to ["a"]. By the way even the claim that "'unlines' is an inverse operation to 'lines'" from the List module is not true, as unlines (lines "a") -> "a\n" (see Appendix 2). So I use my own unlines' now!. 5. This is connected to the CondTree structure. When I write a Cabl file like: ... library extensions: ForeignFunctionInterface if flag(use_xft) build-depends: X11-xft >= 0.2, utf8-string extensions: ForeignFunctionInterface ... the extension ForeignFunctionInterface in the conditional branch is superflous, as it is specified in any case, so the printer can simplify it to: ... library extensions: ForeignFunctionInterface if flag(use_xft) build-depends: X11-xft >= 0.2, utf8-string ... However the parse result for the two cabal files will differ. Now I gave up here, and added a variable simplifiedPrinting which disables the printer smartness for now. However, I hope someone can make the parser smarter in this respect. 6. If free text starts with a line with a dot, the dot must be in a new line. 7. Version tags have to be printed, which was not the case before (why?). 8. VersionRangeParens needs to be printed. 9. Compiler options have to be put in a canonical order after parsing, which is easy as CompilerFlavors do derive Ord. Kind regards Jürgen Nicklisch-Franken PS. I have added a remark about the new test suite syntax below. ------------------- Appendix1: Test code main = do allCabalFiles <- ... res <- mapM test allCabalFiles if and res then putStrLn "all test succeeded" else putStrLn $ "tests failed: " ++ show (length (filter (== False) res)) ++ " of: " ++ show (length res) test :: FilePath -> IO Bool test cabalFilePath = catch (do gpd <- readPackageDescription silent cabalFilePath writeGenericPackageDescription tempPath gpd gpd2 <- readPackageDescription silent tempPath if gpd == gpd2 then do putStr "." return True else do putStrLn $ "failed for " ++ cabalFilePath return False) (\e -> do putStrLn $ "runtime error " ++ show e ++ " for " ++ cabalFilePath return False) ------------------ Appendix2: lines and unlines revised: -- | 'unlines' is an inverse operation to 'lines'. -- It joins lines, after appending a terminating newline to each. unlines' :: [String] -> String unlines' [] = [] unlines' [x] = x unlines' (x:xs) = x ++ ('\n' : unlines' xs) -- | 'lines' breaks a string up into a list of strings at newline -- characters. The resulting strings do not contain newlines. lines' :: String -> [String] lines' [] = [""] lines' s = let (l, s') = break (== '\n') s in l : case s' of [] -> [] (_:s'') -> lines' s'' ---------------- Appendix 3: I find the test suite syntax problematic, for an exe I have to remeber the combination: test-suite foo type: exitcode-stdio-1.0 main-is: main.hs , and for a module: test-suite bar type: library-1.0 test-module: Bar So its the combination of exitcode-stdio- and main-is or library- and test-module which does the trick. Wouldn't it be easier to write something like: test-suite foo test-version: 1.0 main-is: main.hs , and for a module: test-suite bar test-version: 1.0 test-module: Bar

On Tue, 2010-10-12 at 00:30 +0200, Jürgen Nicklisch-Franken wrote:
I did send this mail a week ago or so to the haskell-libraries mailing list, but this was probably the wrong list.
I have in the meantime tested with all cabbal files on haddock, and have fixed some issues.
This patch adds a pretty printer for GenericPackageDescription. It fixes ticket http://hackage.haskell.org/trac/hackage/ticket/647.
Fantastic! Thanks for your contribution.
I added a new module Distribution.PackageDescription.PrettyPrint which exports writeGenericPackageDescription and showGenericPackageDescription. I use the field printers from Parse.hs, so that I had to add export statements for them in Distribution.PackageDescription.Parse.
Right.
I decided to not use a class based approach because: 1. their is already the Text class, which has a parse I didn't want to write, and 2. and more important, I didn't want to use -XFlexibleInstances because I guess cabal should be portable.
Yep.
I have not removed the old printer code, although this may be a good idea (maybe mark it as deprecated first). Then it would make sense as well to rename writeGenericPackageDescription to writePackageDescription and showGenericPackageDescription to showPackageDescription, to be symmetric.
Yes.
I tested with the code from Appendix 1 and all the cabal files on hackage. The test succeds when writing and reading a package description results in an equal description. This is more rigorous then required, cause different GenericPackageDescr.. can have the same semantic, so for example the order of exe definitions is not relevant. But for our needs (GUI editor of cabal files for leksah), e.g. preserving of order is important and it will help to test the parser and printer to make this approach work. Furthermore I want to print out the cabal file as short as possible, e.g. not printing all empty fields like the old printer did.
Right. Preserving order and omitting blank fields makes perfect sense.
Testing the code I found the following problems:
1. Found up an obvious "age old" bug: hunk ./Distribution/PackageDescription/Parse.hs 359 - ghcProfOptions (\val binfo -> binfo{ghcSharedOptions=val}) + ghcSharedOptions (\val binfo -> binfo{ghcSharedOptions=val})
Yes, I actually just fixed this one a week or two ago when I was going through my patch queue. So this change clashes with your patch, but that's the only clash so it's not a bit problem.
2. The order of elements are reversed by the parser. E.g executables: (repos, flags, lib, exes, tests) <- getBody return (repos, flags, lib, exes ++ [(exename, flds)], tests) since the rest is parsed in the line before, the element needs to be prepended. so I changed to: (repos, flags, lib, exes, tests) <- getBody return (repos, flags, lib, (exename, flds): exes, tests) The same applies to test fields.
Fine.
The opposite order problem applies to storeXFields.., which is called in the order of processing, so I changed it to storeXFieldsPD (f@('x':'-':_),val) pkg = Just pkg{ customFieldsPD = (customFieldsPD pkg) ++ [(f,val)]}
Fine.
3. The next problem is that we print nothing for empty fields , which works ok, except for compiler opts, where empty opts differ from not given opts, so I added this line to optsField in ParseUtils, which causes empty opts to be parsed, as if no opts are given: update f opts [] = [(f,opts)]
That line was there already, you added the line: + update _ opts l | and (map null opts) = l update f opts [] = [(f,opts)] update f opts ((f',opts'):rest) | f == f' = (f, opts' ++ opts) : rest
4. The next problem is that we loose final newlines with showFreeText. Astonishing enough the culpid is lines from the List module , which evals lines "a\n" to ["a"]. By the way even the claim that "'unlines' is an inverse operation to 'lines'" from the List module is not true, as unlines (lines "a") -> "a\n" (see Appendix 2). So I use my own unlines' now!.
Yes. I've noticed this before. If only we had had QuickCheck when designing the list module we might have picked more sensible properties.
5. This is connected to the CondTree structure. When I write a Cabl file like: ... library extensions: ForeignFunctionInterface if flag(use_xft) build-depends: X11-xft >= 0.2, utf8-string extensions: ForeignFunctionInterface ... the extension ForeignFunctionInterface in the conditional branch is superflous, as it is specified in any case, so the printer can simplify it to: ... library extensions: ForeignFunctionInterface if flag(use_xft) build-depends: X11-xft >= 0.2, utf8-string ... However the parse result for the two cabal files will differ. Now I gave up here, and added a variable simplifiedPrinting which disables the printer smartness for now. However, I hope someone can make the parser smarter in this respect.
I think it is better to stick to the syntax when pretty printing. If we want to do semantic simplifications that should be done separately. So should simplifiedPrinting be True or False? Currently it says: -- | Recompile with false for regression testing simplifiedPrinting :: Bool simplifiedPrinting = False
6. If free text starts with a line with a dot, the dot must be in a new line.
Right. In future we're going to allow free text without the silly dot stuff. I wonder how we'll handle things then.
7. Version tags have to be printed, which was not the case before (why?).
Actually that was on purpose. We must not print tags as they make no sense at all. Tags are a crazy idea that doesn't work at all. Some old packages use tags and there's not a lot we can do about that, but we have to ignore them or we'll go insane. So I cannot accept this bit of the patch. If you have another suggestion I'm interested to hear. Perhaps we should ignore them when parsing too. Would that work?
8. VersionRangeParens needs to be printed.
Fine. We already have functions to simplify version ranges if we need to do that.
9. Compiler options have to be put in a canonical order after parsing, which is easy as CompilerFlavors do derive Ord.
Why is this? Can we not preserve the order of the original cabal file? Would you like to send in the test code as a patch, to add to the Cabal testsuite? That way we can make sure the round-trip parser/pretty-printer property remains true in future. Ideally we would make the test program use the hackage tarball index and check all of them.
PS. I have added a remark about the new test suite syntax below.
Appendix 3: I find the test suite syntax problematic, for an exe I have to remeber the combination:
test-suite foo type: exitcode-stdio-1.0 main-is: main.hs
, and for a module:
test-suite bar type: library-1.0 test-module: Bar
So its the combination of exitcode-stdio- and main-is or library- and test-module which does the trick. Wouldn't it be easier to write something like:
test-suite foo test-version: 1.0 main-is: main.hs , and for a module: test-suite bar test-version: 1.0 test-module: Bar
You mean you want to make the test type implicit, depending on which fields are specified? I think it's important to specify the test type because there are not necessarily just two. There are two now but the point of the design is to allow for test protocols that we have not yet thought of, as we get more experience with the system. Those new protocols may well reuse the same fields, so the fields used may not uniquely determine the test type. Note that the current Cabal code will remind you of the right fields, depending on the test type that you specify. It also tells you which test types there are. Duncan

Hi Duncan, it took me a long time to return to the issue, but I'm very busy at work (which by fortune is an industrial Haskell project). Since only few points are in question, I wipe away all the other stuff to make it more clearly arranged.
5. This is connected to the CondTree structure. When I write a Cabl file like: ... library extensions: ForeignFunctionInterface if flag(use_xft) build-depends: X11-xft >= 0.2, utf8-string extensions: ForeignFunctionInterface ... the extension ForeignFunctionInterface in the conditional branch is superflous, as it is specified in any case, so the printer can simplify it to: ... library extensions: ForeignFunctionInterface if flag(use_xft) build-depends: X11-xft >= 0.2, utf8-string ... However the parse result for the two cabal files will differ. Now I gave up here, and added a variable simplifiedPrinting which disables the printer smartness for now. However, I hope someone can make the parser smarter in this respect.
I think it is better to stick to the syntax when pretty printing. If we want to do semantic simplifications that should be done separately.
So should simplifiedPrinting be True or False? Currently it says:
-- | Recompile with false for regression testing simplifiedPrinting :: Bool simplifiedPrinting = False
I think we should then keep False as default.
6. If free text starts with a line with a dot, the dot must be in a new line.
Right. In future we're going to allow free text without the silly dot stuff. I wonder how we'll handle things then.
7. Version tags have to be printed, which was not the case before (why?).
Actually that was on purpose. We must not print tags as they make no sense at all. Tags are a crazy idea that doesn't work at all. Some old packages use tags and there's not a lot we can do about that, but we have to ignore them or we'll go insane.
So I cannot accept this bit of the patch. If you have another suggestion I'm interested to hear. Perhaps we should ignore them when parsing too. Would that work?
Ignoring them when parsing would make the tests succeed in all cases. (Anyhow, all tests succeed now with newest Hackage cabal files and latest Cabal version, were my tag printing patch has disappeared?!)
9. Compiler options have to be put in a canonical order after parsing, which is easy as CompilerFlavors do derive Ord.
Why is this? Can we not preserve the order of the original cabal file?
That's true. I can take out reordering without problems.(attached patch)
Would you like to send in the test code as a patch, to add to the Cabal testsuite? That way we can make sure the round-trip parser/pretty-printer property remains true in future. Ideally we would make the test program use the hackage tarball index and check all of them.
I can do this, but were do I find the Cabal self testsuite? I attach my test file in case you want to help with this. I have to download the file from Hackage manually.
PS. I have added a remark about the new test suite syntax below.
Appendix 3: I find the test suite syntax problematic, for an exe I have to remeber the combination:
test-suite foo type: exitcode-stdio-1.0 main-is: main.hs
, and for a module:
test-suite bar type: library-1.0 test-module: Bar
So its the combination of exitcode-stdio- and main-is or library- and test-module which does the trick. Wouldn't it be easier to write something like:
test-suite foo test-version: 1.0 main-is: main.hs , and for a module: test-suite bar test-version: 1.0 test-module: Bar
You mean you want to make the test type implicit, depending on which fields are specified? I think it's important to specify the test type because there are not necessarily just two. There are two now but the point of the design is to allow for test protocols that we have not yet thought of, as we get more experience with the system. Those new protocols may well reuse the same fields, so the fields used may not uniquely determine the test type.
Note that the current Cabal code will remind you of the right fields, depending on the test type that you specify. It also tells you which test types there are.
Ok, I just can say that the combination of test type and version in combination with the other fields was a little difficult for me to get. So I add the patch to get rid of reordering compiler options. The priniting of version tags has gone anyway, after I updated to newest version. Jürgen
participants (2)
-
Duncan Coutts
-
Jürgen Nicklisch-Franken