[Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes, performance improvement

Hello, I'd like to propose bugfixes, documentation fixes and a performance improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't effect the expected behaviour of the PP library. I've written a QuickCheck test suite for the pretty printer (to test the improvement), and found two bugs and some misconceptions/ ambiguities in the documentation. Additionally, there is a microbenchmark for the suggested improvement. Both are available at http://code.haskell.org/~bhuber/Text/ PrettyPrint/. Note that the QuickCheck tests need access to all top- level names in HughesPJ (i.e. ignore the export list). In summary, I propose to * fix a bug in fillNB and one in fillNB/sepNB * correct documentation on laws and invariants. * add more efficient implementations of vcat,hsep,hcat More specifically: (1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys) (2) Bugfix [f](cat|sep): do not allow overlapping ($$) in vertical composition (3) Lazy implementations of vcat,hcat and hsep (4) Law <t2> isn't always true (5) Invariant 5 should be made stronger (6) Change the comment about negative indentation The details follow below (maybe a little long). best regards, benedikt = = = = = = = = ======================================================================== == Details (Long) == Bug Fixes -------------------------------------- (1) Bugfix fillNB Law <l1> states that
sep (ps++[empty]++qs) = sep (ps ++ qs) ...ditto hsep, hcat, vcat, fill...
In the current implementation, this fails for the paragraph fill variants.
render' $ fill True [ text "c", text "c",empty, text "c", text "b"] where render' = renderStyle (Style PageMode 7 1.4)
c c c b
The reason is a missing test for Empty in fillNB 2) Bugfix: overlap and f?(cat|sep) The specification for cat/sep: * oneLiner (hcat/hsep ps) `union` vcat ps [*] But currently cat, sep, fcat and fsep attempt to overlap the second line with the first one, i.e. they use `foldr ($$) empty ps' instead of `foldr ($+$) empty ps' [*]. I assume this is a mistake. This bug can lead to situations, where the line in the right argument of Union is actually longer:
prettyDoc$ cat [ text "a", nest 2 ( text "b") ]
text "a"; union (text "b"; empty) (nilabove; nest 1; text "b"; empty)
renderStyle (Style PageMode 1 1) $ cat [ text "a", nest 2 ( text "b") ]
"a b"
In the implementation, we call `nilAbove False' instead of `nilAbove True' (see patch). Performance Improvements -------------------------------------- 3) Improving the space/time performance of vcat, hsep, hcat vcat (hsep,cat) is implemented in an unneccessarily strict way. We only get some output after all of vcat's arguments are evaluated and checked against being Empty. This can be improved by only checking the right argument of foldr against being Empty, and then applying an Empty-filter on the resulting Doc. Space improvement is obvious. The microbenchmark (code.haskell.org/~bhuber/Text/PrettyPrint/ HughesPJPerfCheck.hs) suggests that the improvements in time are remarkable too. Documentation fixes -------------------------------------- The QuickCheck tests revealed that 4) Law <t2> isn't always true <t2> text "" <> x = x, if x non-empty only holds if x does not start with nest (obviously, because of law <n6>). 5) The invariant 5 should be extended: A @NoDoc@ may only appear on the first line of the left argument of an union. Therefore, the right argument of an union can never be equivalent to the empty set. because this assumption is used when rendering. 6) Change the comment about negative indentation In the source code we have: -- (spaces n) generates a list of n spaces -- It should never be called with 'n' < 0, but that can happen for reasons I don't understand If we compose a <> b, and the first line of b is deeply nested, but other lines of b are not, then, because <> eats the nest, the pretty printer will try to layout some of b's lines with negative indentation: doc |0123345 ------------------ d1 |a d2 | b |c d1<>d2 |ab c| Here is the reason why this is rather unavoidable: In John Hughes paper, there is a line stating that "composing layouts does not change the layouts being composed". Another one states that "<> should eat up nests" But this leads to negative indentation - imho a user error. == End ==

Thank you for doing this Benedict. I've added your more detailed comments to ticket #2393 so that they are preserved. Ian: would you like to apply? I'm not sure how to integrate the QuickCheck tests, but I bet you know. Benedict: while you are in the area, would you like to take a swing at http://hackage.haskell.org/trac/ghc/ticket/1337, and 1176? Simon | -----Original Message----- | From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On Behalf Of Benedikt Huber | Sent: 24 June 2008 13:12 | To: libraries@haskell.org | Subject: [Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes, performance improvement | | Hello, | | I'd like to propose bugfixes, documentation fixes and a performance | improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't | effect the expected behaviour of the PP library. | | I've written a QuickCheck test suite for the pretty printer (to test | the improvement), and found two bugs and some misconceptions/ | ambiguities in the documentation. Additionally, there is a | microbenchmark for the suggested improvement. | Both are available at http://code.haskell.org/~bhuber/Text/ | PrettyPrint/. Note that the QuickCheck tests need access to all top- | level names in HughesPJ (i.e. ignore the export list). | | In summary, I propose to | * fix a bug in fillNB and one in fillNB/sepNB | * correct documentation on laws and invariants. | * add more efficient implementations of vcat,hsep,hcat | | More specifically: | | (1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys) | | (2) Bugfix [f](cat|sep): do not allow overlapping ($$) in vertical | composition | | (3) Lazy implementations of vcat,hcat and hsep | | (4) Law <t2> isn't always true | | (5) Invariant 5 should be made stronger | | (6) Change the comment about negative indentation |

Hi Benedikt,
Great work, and many thanks!
One ticket that might be interesting, while you are in the area, is:
http://hackage.haskell.org/trac/ghc/ticket/1217
The code is attached, I went through all the API review months and
months ago, and everyone agreed that the function should be added, but
under the name "zeroWidthText".
I'm guessing it isn't too much effort to tack this on while you are
working on it? I am currently limited to Hugs only (not enough disk
space to install GHC and no SSH access...), so am unable to patch any
libraries myself.
Thanks
Neil
On 6/24/08, Simon Peyton-Jones
Thank you for doing this Benedict. I've added your more detailed comments to ticket #2393 so that they are preserved.
Ian: would you like to apply? I'm not sure how to integrate the QuickCheck tests, but I bet you know.
Benedict: while you are in the area, would you like to take a swing at http://hackage.haskell.org/trac/ghc/ticket/1337, and 1176?
Simon
| -----Original Message----- | From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On Behalf Of Benedikt Huber | Sent: 24 June 2008 13:12 | To: libraries@haskell.org | Subject: [Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes, performance improvement | | Hello, | | I'd like to propose bugfixes, documentation fixes and a performance | improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't | effect the expected behaviour of the PP library. | | I've written a QuickCheck test suite for the pretty printer (to test | the improvement), and found two bugs and some misconceptions/ | ambiguities in the documentation. Additionally, there is a | microbenchmark for the suggested improvement. | Both are available at http://code.haskell.org/~bhuber/Text/ | PrettyPrint/. Note that the QuickCheck tests need access to all top- | level names in HughesPJ (i.e. ignore the export list). | | In summary, I propose to | * fix a bug in fillNB and one in fillNB/sepNB | * correct documentation on laws and invariants. | * add more efficient implementations of vcat,hsep,hcat | | More specifically: | | (1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys) | | (2) Bugfix [f](cat|sep): do not allow overlapping ($$) in vertical | composition | | (3) Lazy implementations of vcat,hcat and hsep | | (4) Law <t2> isn't always true | | (5) Invariant 5 should be made stronger | | (6) Change the comment about negative indentation |
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Hello Neil, Tuesday, June 24, 2008, 10:09:20 PM, you wrote:
working on it? I am currently limited to Hugs only (not enough disk space to install GHC and no SSH access...)
is it prison or university? ;) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Neil Mitchell schrieb:
Hi Benedikt,
Great work, and many thanks!
One ticket that might be interesting, while you are in the area, is:
http://hackage.haskell.org/trac/ghc/ticket/1217
The code is attached, I went through all the API review months and months ago, and everyone agreed that the function should be added, but under the name "zeroWidthText". Hi Neil,
Thanks ! Concerning #1217: http://www.haskell.org/pipermail/libraries/2007-March/007028.html I'm not quite sure what you are asking for - I'm not the one to apply the patches ;) From a technical point of view, zeroWidthText wouldn't cause any bugs in the current implementation, as far as I can see. Two things should be considered though: 1) when defining zeroTextWidth as (textBeside_ (Str s) 0 Empty), the client looses the information that (Str s) is a zero-width TextDetail. When combining the text again (see string_txt for example), it might be good to know that. I do not suggest it, but it /might/ be worth to consider using a different constructor for zero width texts. Of course, this modifies the API, as TextDetails is exported. Btw, is there any use for the (PStr :: String -> TextDetails) constructor ? 2) One has to take precautions that future versions of the library do not use the text length for optimizations. For example the law
text "" <> s = elide_nests s must not be used by testing against the text length (which is the obvious and fasted way to do so). Some regression tests would be a good idea. --
Btw: the people commenting in the thread linked above seemed to be using different pretty printer libraries, so maybe some comments from actual users would be good too. cheers, benedikt
I'm guessing it isn't too much effort to tack this on while you are working on it? I am currently limited to Hugs only (not enough disk space to install GHC and no SSH access...), so am unable to patch any libraries myself.
Thanks
Neil
On 6/24/08, Simon Peyton-Jones
wrote: Thank you for doing this Benedict. I've added your more detailed comments to ticket #2393 so that they are preserved.
Ian: would you like to apply? I'm not sure how to integrate the QuickCheck tests, but I bet you know.
Benedict: while you are in the area, would you like to take a swing at http://hackage.haskell.org/trac/ghc/ticket/1337, and 1176?
Simon
| -----Original Message----- | From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On Behalf Of Benedikt Huber | Sent: 24 June 2008 13:12 | To: libraries@haskell.org | Subject: [Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes, performance improvement | | Hello, | | I'd like to propose bugfixes, documentation fixes and a performance | improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't | effect the expected behaviour of the PP library. | | I've written a QuickCheck test suite for the pretty printer (to test | the improvement), and found two bugs and some misconceptions/ | ambiguities in the documentation. Additionally, there is a | microbenchmark for the suggested improvement. | Both are available at http://code.haskell.org/~bhuber/Text/ | PrettyPrint/. Note that the QuickCheck tests need access to all top- | level names in HughesPJ (i.e. ignore the export list). | | In summary, I propose to | * fix a bug in fillNB and one in fillNB/sepNB | * correct documentation on laws and invariants. | * add more efficient implementations of vcat,hsep,hcat | | More specifically: | | (1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys) | | (2) Bugfix [f](cat|sep): do not allow overlapping ($$) in vertical | composition | | (3) Lazy implementations of vcat,hcat and hsep | | (4) Law <t2> isn't always true | | (5) Invariant 5 should be made stronger | | (6) Change the comment about negative indentation |
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Hi Benedikt,
I'm not quite sure what you are asking for - I'm not the one to apply the patches ;)
The patch needs redoing with zeroWidthText, rather than zeroWidth. I can't build a patch, but you are probably at the right point to do so, and include it into the other patches you are doing. Also, since you seem to understand the low-level details very well, it was a general request for comment.
Two things should be considered though:
1) when defining zeroTextWidth as (textBeside_ (Str s) 0 Empty), the client looses the information that (Str s) is a zero-width TextDetail. When combining the text again (see string_txt for example), it might be good to know that.
I do not suggest it, but it /might/ be worth to consider using a different constructor for zero width texts. Of course, this modifies the API, as TextDetails is exported.
That seems like a bad idea, as I certainly don't know enough details about the library to write the correct implementation in every case, and to be sure there are no pattern-match errors.
2) One has to take precautions that future versions of the library do not use the text length for optimizations.
For example the law
text "" <> s = elide_nests s must not be used by testing against the text length (which is the obvious and fasted way to do so). Some regression tests would be a good idea.
True, but I guess that it is currently safe?
Btw: the people commenting in the thread linked above seemed to be using different pretty printer libraries, so maybe some comments from actual users would be good too.
I think the discussion proceeded enough to be fairly sure that no one objected to it, and lots of people wanted it as part of a general pretty printing library. I think it also showed that lots of serious pretty-printing users had moved away from HughesPJ - its possible your fixes and maintenance might tempt them back. Thanks Neil
I'm guessing it isn't too much effort to tack this on while you are working on it? I am currently limited to Hugs only (not enough disk space to install GHC and no SSH access...), so am unable to patch any libraries myself.
Thanks
Neil
On 6/24/08, Simon Peyton-Jones
wrote: Thank you for doing this Benedict. I've added your more detailed
Ian: would you like to apply? I'm not sure how to integrate the
QuickCheck tests, but I bet you know.
Benedict: while you are in the area, would you like to take a swing at
http://hackage.haskell.org/trac/ghc/ticket/1337, and 1176?
Simon
| -----Original Message----- | From: libraries-bounces@haskell.org
[mailto:libraries-bounces@haskell.org] On Behalf Of Benedikt Huber
| Sent: 24 June 2008 13:12 | To: libraries@haskell.org | Subject: [Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes,
comments to ticket #2393 so that they are preserved. performance improvement
| | Hello, | | I'd like to propose bugfixes, documentation fixes and a performance | improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't | effect the expected behaviour of the PP library. | | I've written a QuickCheck test suite for the pretty printer (to test | the improvement), and found two bugs and some misconceptions/ | ambiguities in the documentation. Additionally, there is a | microbenchmark for the suggested improvement. | Both are available at http://code.haskell.org/~bhuber/Text/ | PrettyPrint/. Note that the QuickCheck tests need access to all top- | level names in HughesPJ (i.e. ignore the export list). | | In summary, I propose to | * fix a bug in fillNB and one in fillNB/sepNB | * correct documentation on laws and invariants. | * add more efficient implementations of vcat,hsep,hcat | | More specifically: | | (1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys) | | (2) Bugfix [f](cat|sep): do not allow overlapping ($$) in vertical | composition | | (3) Lazy implementations of vcat,hcat and hsep | | (4) Law <t2> isn't always true | | (5) Invariant 5 should be made stronger | | (6) Change the comment about negative indentation |
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Wed, 2008-06-25 at 16:03 +0200, Benedikt Huber wrote:
Btw, is there any use for the (PStr :: String -> TextDetails) constructor ?
Previously it was used for PackedString (or some predecessor of it) but at some point it got changed to String. There is really very little advantage to PackedString over String because PackedString is very slow. However we should not get rid of the PStr constructor, we should change it to use the new Unicode string type that is in the works. That will be fast and memory efficient (using a similar underlying representation as ByteStrings). So yeah, I suggest we leave it for now and change it to the new fast unicode text type at some point in the future when that is ready. Duncan

Thank you for doing this Benedict. I've added your more detailed comments to ticket #2393 so that they are preserved.
Ian: would you like to apply? I'm not sure how to integrate the QuickCheck tests, but I bet you know. I'm glad this is useful - and I'm impressed by the complexity of the
Simon Peyton-Jones schrieb: pretty printer library. I'm tackling the other tickets now and adding more QuickCheck test.
Benedict: while you are in the area, would you like to take a swing at http://hackage.haskell.org/trac/ghc/ticket/1337, and 1176? In my opinion, ticket #667 can be marked invalid (see comments).
Concerning ticket #1337, we have to change the formal specification of fill (it doesn't match the implementation): -- Current Specification: -- fill [] = empty -- fill [p] = p -- fill (p1:p2:ps) = oneLiner p1 <#> nest (length p1) -- (fill (oneLiner p2 : ps)) -- `union` -- p1 $$ fill ps Problem 1: We want to `unnest' the second argument of (p1 $$ fill ps), but not the first one In the definition above we have e.g.
getSecondLayout $ fillDef False [text "a", text "b", text "a"]
text "ab"; nilabove; nest -1; text "a"; empty |ab| |.a|
Problem 2: The overlapping $$ should only be used for those layouts of p1 which aren't one liners (otherwise violating the invariant "Left union arg has shorter first line"). I suggest the following specification (i believe it almost matches the current implementation, modulo [fillNB: fix bug #1337] (see below): -- Revised Specification: -- fill g docs = fill' 0 docs -- gap g = if g then 1 else 0 -- fill' n [] = [] -- fill' n [p] = [p] -- fill' n (p1:p2:ps) = -- oneLiner p1 <g> (fill' (n+length p1+gap g) (oneLiner p2 : ps)) -- `union` -- (p1 $*$ nest (-n) (fill' g ps)) -- -- $*$ is defined for layouts (One-Layout Documents) as -- -- layout1 $*$ layout2 | isOneLiner layout1 = layout1 $+$ layout2 -- | otherwise = layout1 $$ layout2 I've also implemented the specification in HughesPJQuickCheck.hs, and checked them against the patched pretty printer. Concerning Bug #1337: ~~~~~~~~~~~~~~~~~~~~~ If the above formal specification is fine, it is easy to fix: elide the nests of (oneLiner p2) [see attached patch, record bug #1337].
PrettyPrint(0) $ ./Bug1337 ....ab ...c
The (long) explanation follows below. I'll look into the other tickets too. thorkil: Can you help me with a simplified test case (pretty printer only) of Bug #1176 ? Also, I didn't find any regressions tests for the pretty printer - I'll create some if someone points me in the right direction. best regards, benedikt =========================================================== Explanation of Bug #1337: Consider
fcat [ nest 1 $ text "a", nest 2 $ text "b", text "c"]
--> expected: (nest 1; text "a"; text "b"; nest -3; "c") --> actual : (nest 1; text "a"; text "b"; nest -5; "c") Reduction: === (nest 1; text a) <> (fill (-2) (p2:ps)) ==> (nest 2 (text "b") $+$ text "c") ==> (nest 2 (text "b")) `nilabove` (nest (-3) (text "c")) ==> (nest 1; text a; text b; nest -5 c) The problem is that if we decide to layout (p1:p2:ps) as | p1 p2 | ps (call it layout A), then we want to have
(p1 <> p2) $+$ ps.
But following law <n6> this means that
fcat_A [p1:nest k p2:ps]
is equivalent to
fcat_A [p1,p2,ps]
so the nest of p2 has to be removed. This is somewhat similar to bug #667, but easier to fix from a semantic point of view: p1,p2 and ps are distinct layouts - we only have to preserve the individual layouts, and no combinations of them.
Simon
| -----Original Message----- | From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On Behalf Of Benedikt Huber | Sent: 24 June 2008 13:12 | To: libraries@haskell.org | Subject: [Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes, performance improvement | | Hello, | | I'd like to propose bugfixes, documentation fixes and a performance | improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't | effect the expected behaviour of the PP library. | | I've written a QuickCheck test suite for the pretty printer (to test | the improvement), and found two bugs and some misconceptions/ | ambiguities in the documentation. Additionally, there is a | microbenchmark for the suggested improvement. | Both are available at http://code.haskell.org/~bhuber/Text/ | PrettyPrint/. Note that the QuickCheck tests need access to all top- | level names in HughesPJ (i.e. ignore the export list). | | In summary, I propose to | * fix a bug in fillNB and one in fillNB/sepNB | * correct documentation on laws and invariants. | * add more efficient implementations of vcat,hsep,hcat | | More specifically: | | (1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys) | | (2) Bugfix [f](cat|sep): do not allow overlapping ($$) in vertical | composition | | (3) Lazy implementations of vcat,hcat and hsep | | (4) Law <t2> isn't always true | | (5) Invariant 5 should be made stronger | | (6) Change the comment about negative indentation |
New patches:
[fillNB bug, lazy vcat
benedikt.huber@gmail.com**20080624113715] {
hunk ./Text/PrettyPrint/HughesPJ.hs 410
+
+** because of law n6, t2 only holds if x doesn't
+** start with `nest'.
+
hunk ./Text/PrettyPrint/HughesPJ.hs 429
-<m1> (text s <> x) $$ y = text s <> ((text "" <> x)) $$
- nest (-length s) y)
+<m1> (text s <> x) $$ y = text s <> ((text "" <> x) $$
+ nest (-length s) y)
hunk ./Text/PrettyPrint/HughesPJ.hs 490
+-- lazy list versions
+hcat = reduceAB . foldr (beside_' False) empty
+hsep = reduceAB . foldr (beside_' True) empty
+vcat = reduceAB . foldr (above_' True) empty
hunk ./Text/PrettyPrint/HughesPJ.hs 495
-hcat = foldr (<>) empty
-hsep = foldr (<+>) empty
-vcat = foldr ($$) empty
+beside_' :: Bool -> Doc -> Doc -> Doc
+beside_' _ p Empty = p
+beside_' g p q = Beside p g q
+
+above_' :: Bool -> Doc -> Doc -> Doc
+above_' _ p Empty = p
+above_' g p q = Above p g q
+
+reduceAB :: Doc -> Doc
+reduceAB (Above Empty _ q) = q
+reduceAB (Beside Empty _ q) = q
+reduceAB doc = doc
hunk ./Text/PrettyPrint/HughesPJ.hs 556
- * The arugment of @TextBeside@ is never @Nest@.
+ * The argument of @TextBeside@ is never @Nest@.
hunk ./Text/PrettyPrint/HughesPJ.hs 564
- * The right argument of a union cannot be equivalent to the empty set
- (@NoDoc@). If the left argument of a union is equivalent to the
- empty set (@NoDoc@), then the @NoDoc@ appears in the first line.
+ * A @NoDoc@ may only appear on the first line of the left argument of an
+ union. Therefore, the right argument of an union can never be equivalent
+ to the empty set (@NoDoc@).
hunk ./Text/PrettyPrint/HughesPJ.hs 578
- -- Arg of a NilAbove is always an RDoc
-nilAbove_ :: Doc -> Doc
+-- Invariant: Args to the 4 functions below are always RDocs
+nilAbove_ :: RDoc -> RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 583
-textBeside_ :: TextDetails -> Int -> Doc -> Doc
+textBeside_ :: TextDetails -> Int -> RDoc -> RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 586
- -- Arg of Nest is always an RDoc
-nest_ :: Int -> Doc -> Doc
+nest_ :: Int -> RDoc -> RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 589
- -- Args of union are always RDocs
-union_ :: Doc -> Doc -> Doc
+union_ :: RDoc -> RDoc -> RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 765
- nilAboveNest False k (reduceDoc (vcat ys))
+ nilAboveNest True k (reduceDoc (vcat ys))
hunk ./Text/PrettyPrint/HughesPJ.hs 810
-fillNB g Empty k (y:ys) = nilBeside g (fill1 g (oneLiner (reduceDoc y)) k1 ys)
+fillNB g Empty k (Empty:ys) = fillNB g Empty k ys
+fillNB g Empty k (y:ys) = fillNBE g k y ys
+fillNB g p k ys = fill1 g p k ys
+
+fillNBE g k y ys = nilBeside g (fill1 g (oneLiner (reduceDoc y)) k1 ys)
hunk ./Text/PrettyPrint/HughesPJ.hs 816
- nilAboveNest False k (fill g (y:ys))
+ nilAboveNest True k (fill g (y:ys))
hunk ./Text/PrettyPrint/HughesPJ.hs 821
-fillNB g p k ys = fill1 g p k ys
-
-
hunk ./Text/PrettyPrint/HughesPJ.hs 1039
--- It should never be called with 'n' < 0, but that can happen for reasons I don't understand
--- Here's a test case:
--- ncat x y = nest 4 $ cat [ x, y ]
--- d1 = foldl1 ncat $ take 50 $ repeat $ char 'a'
--- d2 = parens $ sep [ d1, text "+" , d1 ]
--- main = print d2
--- I don't feel motivated enough to find the Real Bug, so meanwhile we just test for n<=0
+-- returns the empty string on negative argument.
+--
hunk ./Text/PrettyPrint/HughesPJ.hs 1045
-{- Comments from Johannes Waldmann about what the problem might be:
+{-
+Concerning negative indentation:
+If we compose a <> b, and the first line of b is deeply nested, but other lines of b are not,
+then, because <> eats the nest, the pretty printer will try to layout some of b's lines with
+negative indentation:
hunk ./Text/PrettyPrint/HughesPJ.hs 1051
- In the example above, d2 and d1 are deeply nested, but `text "+"' is not,
- so the layout function tries to "out-dent" it.
-
- when I look at the Doc values that are generated, there are lots of
- Nest constructors with negative arguments. see this sample output of
- d1 (obtained with hugs, :s -u)
-
- tBeside (TextDetails_Chr 'a') 1 Doc_Empty) (Doc_NilAbove (Doc_Nest
- (-241) (Doc_TextBeside (TextDetails_Chr 'a') 1 Doc_Empty)))))
- (Doc_NilAbove (Doc_Nest (-236) (Doc_TextBeside (TextDetails_Chr 'a') 1
- (Doc_NilAbove (Doc_Nest (-5) (Doc_TextBeside (TextDetails_Chr 'a') 1
- Doc_Empty)))))))) (Doc_NilAbove (Doc_Nest (-231) (Doc_TextBeside
- (TextDetails_Chr 'a') 1 (Doc_NilAbove (Doc_Nest (-5) (Doc_TextBeside
- (TextDetails_Chr 'a') 1 (Doc_NilAbove (Doc_Nest (-5) (Doc_TextBeside
- (TextDetails_Chr 'a') 1 Doc_Empty))))))))))) (Doc_NilAbove (Doc_Nest
+doc |0123345
+------------------
+d1 |a
+d2 | b
+ |c
+d1<>d2 |ab
+ c|
}
[fillNB: fix bug #1337
benedikt.huber@gmail.com**20080625114659] {
hunk ./Text/PrettyPrint/HughesPJ.hs 814
-fillNBE g k y ys = nilBeside g (fill1 g (oneLiner (reduceDoc y)) k1 ys)
+fillNBE g k y ys = nilBeside g (fill1 g (elideNest . oneLiner . reduceDoc $ y) k1 ys)
hunk ./Text/PrettyPrint/HughesPJ.hs 820
-
+elideNest (Nest k d) = d
+elideNest d = d
}
Context:
[Fix warnings
Ian Lynagh

Hello Benedikt, Thanks for all this. From memory, notes, and (last resort) trac search, the HughesPJ-related trac tickets on the GHC trac are: #669 negative indentation in Text.PrettyPrint.HughesPJ #1176 Infinite loop when printing error message #1217 Add zeroText to Text.PrettyPrint.HughesPJ #1337 Fix wrong indentation from Text.PrettyPrint.HughesPJ fill (fcat and fsep) I have added comments to #669 and #1176 that I hope speak for themselves. The problem reported in #1337 has a fix (Fix of wrong indentation from HughesPJ fill (fcat and fsep)), it may still provide some inspiration. On Wednesday 25 June 2008 14:41, Benedikt Huber wrote:
... thorkil: Can you help me with a simplified test case (pretty printer only) of Bug #1176 ?
#1176 is a case of #1337 for the internal GHC version of the HughesPJ library and I expect a fix of #1337 to be applicable to the internal GHC library, hence fixing #1176. So I expect the failing cases for #1337 to fail for #1176 as well, but being, unfortunately, somewhat less easily exercised.
Also, I didn't find any regressions tests for the pretty printer - I'll create some if someone points me in the right direction.
The patch (Patch with HughesPJ tests with automated high-level coverage check) attached to #1337 has some tests, but also some rather bombastic things that attempts to ensure some coverage of the tests. This was created at a point in time where hpc was not available.
...
Thanks and best regards Thorkil

Thorkil Naur schrieb:
Hello Benedikt,
Thanks for all this. From memory, notes, and (last resort) trac search, the HughesPJ-related trac tickets on the GHC trac are:
#669 negative indentation in Text.PrettyPrint.HughesPJ #1176 Infinite loop when printing error message #1217 Add zeroText to Text.PrettyPrint.HughesPJ #1337 Fix wrong indentation from Text.PrettyPrint.HughesPJ fill (fcat and fsep)
Hello, I've added a comment to #2393 commenting those bugs: #669 wontfix, #1217 and #1337 fixed in the patch bundle, #1176 propably fixed. So, there is one open issue: the specification of fill. Note that the old specification is flawed and needs to be fixed anyway. Here's the cleaned up, documented version. It looks implementation oriented, but is far simpler than the actual implementation. -- Revised Specification: -- {- start paragraph fill at column 0 -} -- fill g docs = fill' 0 docs -- -- {- base cases -} -- fill' col [] = [] -- fill' col [p] = [p] -- -- {- either put p2 beside p1, or below p1 -} -- fill' col (p1:p2:ps) = -- {- precondition for p1 beside p2: p1,p2 span only one line -} -- {- As we build (p1 <> p2), remove the nesting of p2 -} -- oneLiner p1 -- <g> -- fill' (col + length p1 + gap g) -- (elideNests (oneLiner p2) : ps) -- `union` -- {- put p2 below p1; p2 should be aligned with the first -- argument of fill, which is col columns to the left of p1 -} -- p1 -- $*$ -- nest (-col) (fill' 0 (p2:ps)) -- -- {- width of space -} -- gap g = if g then 1 else 0 -- -- -- $*$ is defined for layouts (One-Layout Documents) as -- -- layout1 $*$ layout2 | isOneLiner layout1 = layout1 $+$ layout2 -- | otherwise = layout1 $$ layout2 -- -- without the the first case, we would violate the one-line lookahead -- invariant The specification of fill' - left alone elideNests and $*$ - does not add any complexity, it is exactly what a paragraph fill should do. elideNests is an improvement - it makes it possible to have nesting in the arguments (otherwise, fill fails with nested arguments). $*$ is a "feature" in the current implementation: It allows overlapping in certain situations. Consider fill |a| , |...c| |b| |...d| Without $*$ ($+$) we would get |a| |b| |...c| |...d| Using $*$ we get |a| |b..c| |...d| I do not know wheter this might be useful - it is an extra feature rarely used I suppose ?? Note that the specification of fill has to be changed anyway, either with $+$ or with $*$.
I have added comments to #669 and #1176 that I hope speak for themselves. The problem reported in #1337 has a fix (Fix of wrong indentation from HughesPJ fill (fcat and fsep)), it may still provide some inspiration. Thank you thorkil, I think your patch was correct; the patches in #2393 achieve the same effect.
On Wednesday 25 June 2008 14:41, Benedikt Huber wrote:
... thorkil: Can you help me with a simplified test case (pretty printer only) of Bug #1176 ?
#1176 is a case of #1337 for the internal GHC version of the HughesPJ library and I expect a fix of #1337 to be applicable to the internal GHC library, hence fixing #1176. So I expect the failing cases for #1337 to fail for #1176 as well, but being, unfortunately, somewhat less easily exercised.
Could you have a look at Bug1176a.hs attached to #1337 ? Is this what happened ?
Also, I didn't find any regressions tests for the pretty printer - I'll create some if someone points me in the right direction.
The patch (Patch with HughesPJ tests with automated high-level coverage check) attached to #1337 has some tests, but also some rather bombastic things that attempts to ensure some coverage of the tests. This was created at a point in time where hpc was not available.
Ah yes, hpc, thanks for the pointer. Using the quickcheck test suite I found some unused functions and pattern matches. I've added comments to the most important ones. As you seem to know the pretty printer library well, it would be great if you could comment on the revised specification of fill. best regards, benedikt

Thorkil Naur schrieb:
Hello Benedikt,
Thanks for all this. From memory, notes, and (last resort) trac search,
Hello Benedikt, On Thursday 26 June 2008 20:40, Benedikt Huber wrote: the
HughesPJ-related trac tickets on the GHC trac are:
#669 negative indentation in Text.PrettyPrint.HughesPJ #1176 Infinite loop when printing error message #1217 Add zeroText to Text.PrettyPrint.HughesPJ #1337 Fix wrong indentation from Text.PrettyPrint.HughesPJ fill (fcat and fsep)
Hello, I've added a comment to #2393 commenting those bugs: #669 wontfix,
I suggest to add documentation as suggested by Simon PJ and then close it.
... #1176 propably fixed.
I don't think so: The infinite loop is gone, but the GHC version of the pretty-printing library still has the error reported as #1337 against the external library (see the comments that I added to #1176.).
So, there is one open issue: the specification of fill. Note that the old specification is flawed and needs to be fixed anyway.
Here's the cleaned up, documented version. It looks implementation oriented, but is far simpler than the actual implementation.
-- Revised Specification:
-- {- start paragraph fill at column 0 -} -- fill g docs = fill' 0 docs -- -- {- base cases -} -- fill' col [] = [] -- fill' col [p] = [p] -- -- {- either put p2 beside p1, or below p1 -} -- fill' col (p1:p2:ps) = -- {- precondition for p1 beside p2: p1,p2 span only one line -} -- {- As we build (p1 <> p2), remove the nesting of p2 -} -- oneLiner p1 -- <g> -- fill' (col + length p1 + gap g) -- (elideNests (oneLiner p2) : ps) -- `union` -- {- put p2 below p1; p2 should be aligned with the first -- argument of fill, which is col columns to the left of p1 -} -- p1 -- $*$ -- nest (-col) (fill' 0 (p2:ps)) -- -- {- width of space -} -- gap g = if g then 1 else 0 -- -- -- $*$ is defined for layouts (One-Layout Documents) as -- -- layout1 $*$ layout2 | isOneLiner layout1 = layout1 $+$ layout2 -- | otherwise = layout1 $$ layout2 -- -- without the the first case, we would violate the one-line lookahead -- invariant
The specification of fill' - left alone elideNests and $*$ - does not add any complexity, it is exactly what a paragraph fill should do.
elideNests is an improvement - it makes it possible to have nesting in the arguments (otherwise, fill fails with nested arguments).
$*$ is a "feature" in the current implementation: It allows overlapping in certain situations.
Consider fill |a| , |...c| |b| |...d| Without $*$ ($+$) we would get |a| |b| |...c| |...d| Using $*$ we get |a| |b..c| |...d|
I do not know wheter this might be useful - it is an extra feature rarely used I suppose ??
Note that the specification of fill has to be changed anyway, either with $+$ or with $*$. ...
On Wednesday 25 June 2008 14:41, Benedikt Huber wrote:
... thorkil: Can you help me with a simplified test case (pretty printer only) of Bug #1176 ?
#1176 is a case of #1337 for the internal GHC version of the HughesPJ
library
and I expect a fix of #1337 to be applicable to the internal GHC library, hence fixing #1176. So I expect the failing cases for #1337 to fail for #1176 as well, but being, unfortunately, somewhat less easily exercised. Could you have a look at Bug1176a.hs attached to #1337 ? Is this what happened ?
This looks like the same thing, yes, so a correstion to the GHC version of the pretty-printing library like the one you supplied for #1337 seems called for.
... As you seem to know the pretty printer library well, it would be great if you could comment on the revised specification of fill.
The new specification certainly explained things for me that I had not understood earlier, so it is definitely an improvement.
...
Best regards Thorkil

On Tue, Jun 24, 2008 at 02:11:34PM +0200, Benedikt Huber wrote:
I'd like to propose bugfixes, documentation fixes and a performance improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't effect the expected behaviour of the PP library.
Thanks Benedikt; I've applied your patch. Thanks Ian
participants (7)
-
Benedikt Huber
-
Bulat Ziganshin
-
Duncan Coutts
-
Ian Lynagh
-
Neil Mitchell
-
Simon Peyton-Jones
-
Thorkil Naur