
This is another Haskell style question. I had some trouble with the pretty printer that comes with GHC, so I translated one written in Standard ML. I have already translated the program into C, so rewriting it in Haskell was quick and easy for me. The Standard ML version uses a reference cell to keep track of the space available on a line. I threaded the value of the reference cell through the computation using a where clause to define two mutually recursive equations. The fixed point implicit in the where clause ties the knot in the circular definitions in a way that allows the output string to be efficiently computed front to back. I showed the code to a colleague, who found the circular definitions opaque. He suggested a better style is to use monads, and describe the computation in a mode that is closer to its origin form in Standard ML. What style do to you prefer, a knot-tying or a monad-based style? I have enclosed the pretty printer. The printing function is the subject of the controversy. John A simple pretty printer The alogithm is by Lawrence C. Paulson, who simplified an algorithm by Derek C. Oppen. Derek C. Oppen, Prettyprinting, ACM Transactions on Programming Languages and Systems, Vol 2, No. 4, October 1980, Pages 465-483. A pretty printer based on ML programs with the following copyright (**** ML Programs from Chapter 8 of ML for the Working Programmer, 2nd edition by Lawrence C. Paulson, Computer Laboratory, University of Cambridge. (Cambridge University Press, 1996) Copyright (C) 1996 by Cambridge University Press. Permission to copy without fee is granted provided that this copyright notice and the DISCLAIMER OF WARRANTY are included in any copy. DISCLAIMER OF WARRANTY. These programs are provided `as is' without warranty of any kind. We make no warranties, express or implied, that the programs are free of error, or are consistent with any particular standard of merchantability, or that they will meet your requirements for any particular application. They should not be relied upon for solving a problem whose incorrect solution could result in injury to a person or loss of property. If you do use the programs or functions in such a manner, it is at your own risk. The author and publisher disclaim all liability for direct, incidental or consequential damages resulting from your use of these programs or functions. ****)
module Pretty(Pretty, pr, blo, str, brk) where
data Pretty = Str !String | Brk !Int -- Int is the number of breakable spaces | Blo ![Pretty] !Int !Int -- First int is the indent, second int -- is the number of chars and spaces for strings and breaks in block
Constructors Strings
str :: String -> Pretty str = Str
Break points
brk :: Int -> Pretty brk = Brk
Indentation blocks
blo :: Int -> [Pretty] -> Pretty blo indent es = Blo es indent (sum es 0) where sum [] k = k sum (e:es) k = sum es (size e + k) size (Str s) = length s size (Brk n) = n size (Blo _ _ n) = n
Pretty prints the constructed object
pr :: Int -> Pretty -> ShowS pr margin e s = s1 where (_, s1) = printing margin [e] margin 0 (margin, s)
The state of the computation is maintained as a pair consisting of an integer and a string. The integer is the number of unused character positions in the current line of output. The printer adds content to the front of the given string.
printing :: Int -> [Pretty] -> Int -> Int -> (Int, String) -> (Int, String) printing _ [] _ _ p = p printing margin (e:es) blockspace after (space, s) = (space1, s1) where (space2, s1) = -- Result of first item case e of Str str -> -- Place a string (space - length str, showString str s2) Brk n -> -- Place breakable space if n + breakdist es after <= space then blanks n (space, s2) -- Don't break else (space3, showChar '\n' s3) -- Break where (space3, s3) = blanks (margin - blockspace) (margin, s2) Blo bes indent _ -> -- Place a block printing margin bes (space - indent) (breakdist es after) (space, s2) (space1, s2) = -- Result of the remaining items printing margin es blockspace after (space2, s)
Find the distance to the nearest breakable space.
breakdist :: [Pretty] -> Int -> Int breakdist (Str s : es) after = length s + breakdist es after breakdist (Brk _ : _) _ = 0 breakdist (Blo _ _ n : es) after = n + breakdist es after breakdist [] after = after
Place spaces
blanks :: Int -> (Int, String) -> (Int, String) blanks n (space, s) | n <= 0 = (space, s) | otherwise = blanks (n - 1) (space - 1, showChar ' ' s)

John D. Ramsdell wrote:
This is another Haskell style question.
I had some trouble with the pretty printer that comes with GHC, so I translated one written in Standard ML. I have already translated the program into C, so rewriting it in Haskell was quick and easy for me.
Concerning the choice of a pretty printer, the one bundled in GHC is close to John Hughes. The Design of a Pretty-printing Library. http://citeseer.ist.psu.edu/hughes95design.html but there's also Philip Wadler. A prettier printer. http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf (probably available as a library on hackage). Btw, both papers are marvelous introductions to the derivation of programs from their specification. Compared to that, I'm missing the specification part for your pretty printer. How's it supposed to lay out?
The Standard ML version uses a reference cell to keep track of the space available on a line. I threaded the value of the reference cell through the computation using a where clause to define two mutually recursive equations. The fixed point implicit in the where clause ties the knot in the circular definitions in a way that allows the output string to be efficiently computed front to back.
I showed the code to a colleague, who found the circular definitions opaque. He suggested a better style is to use monads, and describe the computation in a mode that is closer to its origin form in Standard ML.
What style do to you prefer, a knot-tying or a monad-based style? I have enclosed the pretty printer. The printing function is the subject of the controversy.
Neither, I think that the code mixes too many concerns. You need neither knot tying nor monads for efficient string concatenation, a simple difference list type DString = Data.DList String = String -> String will do. (There's a small difference list library Data.DList available on hackage). If ++ is too inefficient, then simply switch to a different String implementation with a faster ++. Introducing a difference list means to replace the output type (Int, String) -> (Int, String) of printing not by Int -> (String -> (Int, String)) -- state monad with state String but by Int -> (Int, String -> String) -- difference list Furthermore, I guess that this can probably be replaced by Int -> (String -> String) (Int -> Int, String -> String) or made entirely abstract type X = (Int, String) -> (Int, String) blanks :: Int -> X
blanks n (space, s) | n <= 0 = (space, s) | otherwise = blanks (n - 1) (space - 1, showChar ' ' s)
string :: String -> X string s (space,t) = (space - length s, s ++ t) or something like that. I don't know what your printer is supposed to do, so I can't say for sure.
module Pretty(Pretty, pr, blo, str, brk) where
data Pretty = Str !String | Brk !Int -- Int is the number of breakable spaces | Blo ![Pretty] !Int !Int -- First int is the indent, second int -- is the number of chars and spaces for strings and breaks in block
Drop those strictness annotations from !String and ![Pretty], they won't do any good. The !Int are only useful if they will be unboxed, but I wouldn't bother right now.
Indentation blocks
blo :: Int -> [Pretty] -> Pretty blo indent es = Blo es indent (sum es 0) where sum [] k = k sum (e:es) k = sum es (size e + k) size (Str s) = length s size (Brk n) = n size (Blo _ _ n) = n
size is of independent value, I'd make it a top-level function. Oh, and the sum won't be tail-recursive (until ghc's strictness analyzer figures it out). I'd like to point you to http://haskell.org/haskellwiki/Performance/Accumulating_parameter for an explanation of why, but the information there is rather inaccurate. For the moment, I could only find http://monad.nfshost.com/wordpress/?p=19 last section of http://blog.interlinked.org/tutorials/haskell_laziness.html but isn't there a short text that describes in detail why foldl' is different from foldl and why foldr is "better" in many cases? I thought this faq would have been cached already :) In any case, I'd simply write blo indent es = Blo es indent . sum . map size $ es ( sum is a function from the Prelude.) Regards, apfelmus

but isn't there a short text that describes in detail why foldl' is different from foldl and why foldr is "better" in many cases? I thought this faq would have been cached already :)
Perhaps you're thinking of http://haskell.org/haskellwiki/Stack_overflow ? -Brent

Brent Yorgey wrote:
but isn't there a short text that describes in detail why foldl' is different from foldl and why foldr is "better" in many cases? I thought this faq would have been cached already :)
Perhaps you're thinking of http://haskell.org/haskellwiki/Stack_overflow ?
Ah, that looks better, although it's a bit messy for my taste. I've scribbled a hopefully gentler explanation at http://en.wikibooks.org/wiki/Haskell/Performance_Introduction . Regards, apfelmus

Thank you for your interesting reply. I found it enlightening. Compared to that, I'm missing the specification part for your pretty
printer. How's it supposed to lay out?
The specification is in Paulson's book. The pretty printer is used with S-Expressions, and the block layout generates compact, indented output that is good when there is much data to be displayed. I played around with the Hughes pretty printer, but it wasn't obvious how to get the output I knew I could from the Paulson pretty printer. I confess I did not spend much time tinkering with the Hughes pretty printer. I failed to mention in my original note that the code was written so that the Haskell version lines up with the Standard ML version. The close connection between the Haskell and Standard ML versions is part of the reason I was able to quickly generate the code, and be confident in its correctness. It also explains why I did not use the sum function in the Prelude, or your map trick when writing the blo function.
What style do to you prefer, a knot-tying or a monad-based style? I have enclosed the pretty printer. The printing function is the subject of the controversy.
... a simple difference list ... will do. Hmm. Doesn't the output type (Int, String) -> (Int, String) show the implementation is using the essence of a difference list? Remember, the resulting function prepends something the string it is given in the second element of the pair, and returns that string. Introducing a difference list means to replace the output type
(Int, String) -> (Int, String)
of printing by
Int -> (Int, String -> String) -- difference list
My first attempt at writing the printing function had a type similar to this one. I found myself composing difference lists of type ShowS. The performance was noticabily slow, specially as compared with the implementation displayed in my message. Perhaps the use of Data.DList would repair this performance problem. I don't mean to suggest that ShowS style difference lists cannot be used to make the function easier to understand--all I'm saying is my attempt to do so did not work out well.
module Pretty(Pretty, pr, blo, str, brk) where
data Pretty = Str !String | Brk !Int -- Int is the number of breakable spaces | Blo ![Pretty] !Int !Int -- First int is the indent, second int -- is the number of chars and spaces for strings and breaks in block
Drop those strictness annotations from !String and ![Pretty], they won't
do any good. The !Int are only useful if they will be unboxed, but I wouldn't bother right now.
I thought that the annotations ensure the first element of the list is evaluated. The Char and Pretty lists are generated with seqrev, so everything gets evaluated before constructor is applied to data. -- A reverse that evaluates the list elements. seqrev :: [a] -> [a] seqrev = foldl (\xs x -> x `seq` xs `seq` (x:xs)) [] The trouble is the constructors are not exported directly, but instead through str, brk, and blo, functions which are not strict. It's the best I could do, as near as I can tell. It seems rather hard to avoid lazyness in the current version of Haskell when it's not wanted. I hope one of the proposals for deep strictness makes it into Haskell prime. In my application, there is one datastructure, such that if every value tracable from an instance of the datastructure is evaluated, I am quite sure my program will be free from memory leaks due to dragging. I wish I could tell compilers to make it so. John

On Sat, 2007-11-17 at 13:30 -0500, John D. Ramsdell wrote: ...
It seems rather hard to avoid lazyness in the current version of Haskell when it's not wanted. I hope one of the proposals for deep strictness makes it into Haskell prime. In my application, there is one datastructure, such that if every value tracable from an instance of the datastructure is evaluated, I am quite sure my program will be free from memory leaks due to dragging. I wish I could tell compilers to make it so.
Use strict constructors. Using strict data types is most probably the appropriate way to deal with many cases where you would want a deep seq. In particular, head strict lists would not be a bad addition. There is a library out there with several strict data types.

John D. Ramsdell wrote:
Compared to that, I'm missing the specification part for your pretty
printer. How's it supposed to lay out?
The specification is in Paulson's book. The pretty printer is used with S-Expressions, and the block layout generates compact, indented output that is good when there is much data to be displayed. ... The close connection between the Haskell and Standard ML versions is part of the reason I was able to quickly generate the code, and be confident in its correctness.
Unfortunately, I don't have Paulson's book (or any other ML book :) at home. I'm too lazy to figure out the specification from the source code, can you provide a link or an explanation? I'm asking because I'd be astonished if one couldn't write an elegant Haskell version that's clearly correct and efficient at the same time. And such things are easiest to produce from scratch.
.... a simple difference list ... will do.
Hmm. Doesn't the output type (Int, String) -> (Int, String) show the implementation is using the essence of a difference list? Remember, the resulting function prepends something the string it is given in the second element of the pair, and returns that string.
Yes, of course. But the true virtue is to disentangle it from the rest, i.e. to use an abstract string type with fast concatenation.
Int -> (Int, String -> String) -- difference list
My first attempt at writing the printing function had a type similar to this one. I found myself composing difference lists of type ShowS. The performance was noticabily slow, specially as compared with the implementation displayed in my message. Perhaps the use of Data.DList would repair this performance problem.
I don't mean to suggest that ShowS style difference lists cannot be used to make the function easier to understand--all I'm saying is my attempt to do so did not work out well.
Dlist a = [a] -> [a] so this would be no different from ShowS.
Drop those strictness annotations from !String and ![Pretty], they won't
do any good. The !Int are only useful if they will be unboxed, but I wouldn't bother right now.
I thought that the annotations ensure the first element of the list is evaluated. The Char and Pretty lists are generated with seqrev, so everything gets evaluated before constructor is applied to data.
-- A reverse that evaluates the list elements. seqrev :: [a] -> [a] seqrev = foldl (\xs x -> x `seq` xs `seq` (x:xs)) []
The trouble is the constructors are not exported directly, but instead through str, brk, and blo, functions which are not strict. It's the best I could do, as near as I can tell.
O_O, everything strict? But why would you ever want that? Well if it's for "performance" reasons, I'd have to point out that the pretty printer has an algorithmic flaw: there are two calls to (breakdist es after), one from the Brk case and one from the Blo case. The former one is safe since breakdist doesn't look further than to the next Brk in es . But the latter one will lead to a quadratic run-time in the worst case, for example on the following input replicate n (Blk [Brk _] _ _) = [Blk [Brk _] _ _, Blk [Brk _] _ _, ..] -- n elements (Fill in any numbers you like for the placeholders _ ). That's a bit degenerate but you can spice it up with as many Str as you like, just don't add any additional Brk . Of course, a memoization scheme will fix this but I'd rather develop a new algorithm from scratch.
It seems rather hard to avoid lazyness in the current version of Haskell when it's not wanted. I hope one of the proposals for deep strictness makes it into Haskell prime. In my application, there is one datastructure, such that if every value tracable from an instance of the datastructure is evaluated, I am quite sure my program will be free from memory leaks due to dragging. I wish I could tell compilers to make it so.
As Derek said, strict data types are probably the easiest way to go here. Or you can use custom strict constructors, like str s = s `deepSeq` Str s or something. But again, I don't know why you would want that at all. Regards, apfelmus

On Nov 17, 2007 3:04 PM, apfelmus
Unfortunately, I don't have Paulson's book (or any other ML book :) at home. I'm too lazy to figure out the specification from the source code,
I guess the code is too opaque, as my colleague claimed. The layout the algorithm generates condensed indented blocks. Within a block, it inserts a newline when the distance to the next break point plus the current position is greater than the space remaining on the current line. Thus if S-Expression lists are rendered as blocks with indent two, and every element in a list is separated by a break point of length one, with the proper margin, you would see: (defthingy name-of-thingy (one thing) (two thing) (a-big-thing-made-bigger) (three thing) (four thing)) As an exercise, the book asks you to implement group indent, where if any break point in a group inserts a newline, they all do. So with that layout, one would get: (defthingy name-of-thingy (one thing) (two thing) (a-big-thing-made-bigger) (there thing) (four thing)) The C version I wrote supports this layout, but I didn't bother with that extension for the Haskell version. On the strictness annotations, my reasons for them are the usual ones, primarily to prevent memory leaks due to dragging, but a performance boost is always welcome. At some point, I plan to profile the code with and without the annotations, and find out where they are needed. John

John D. Ramsdell wrote:
On Nov 17, 2007 3:04 PM, apfelmus
{-# LANGUAGE BangPatterns #-}
-- Author: Chris Kuklewicz -- -- This is a rewrite of John D. Ramsdell Pretty.hs code on -- haskell-cafe mailing list
-- Changelog from Pretty.hs -- All Pretty elements have a length field (lazy at the moment) -- Inlined logic of 'blanks' and used new 'prepend' instead -- Replaced blocksize by startColumn == margin - blocksize -- Replaced space by colunmIn == margin-space -- Documented what 'after' means module Blocks(Pretty,str,brk,spc,blo,cat,pr) where
-- All of the len's are non-negative, guaranteed by smart constructors data Pretty = Str { len :: Int, string :: String} | Brk { len :: Int } | Blo { len :: Int, indentBy :: Int, parts :: [Pretty] }
str s = Str (length s) s
brk n | n < 0 = error ("Cannot have negative width brk! n = " ++ show n) | otherwise = Brk n
spc = brk 1
blo indent es | indent < 0 = error ("Cannot have negative width blo! indent = " ++ show indent) | otherwise = Blo (sum (map len es)) indent es
cat = blo 0
{-# INLINE pr #-} pr :: Int -> Pretty -> (String->String) pr margin e sRest = let {startColumn = 0; after = 0; columnIn = 0} in snd (printing margin startColumn after [e] (columnIn, sRest))
{-# INLINE printing #-} printing :: Int -> Int -> Int -> [Pretty] -> (Int,String) -> (Int,String) -- margin is the desired maximum line length and must be non-negative -- startColumn, columnIn, column', and columnOut are all non-negative, -- but any of them may be greater than margin printing margin | margin < 0 = error ("Cannot have non-positive margin! margin == "++show margin) | otherwise = block where
-- startColumn is the "current amount of indent after newline" -- after is how much must be laid after the (e:es) and before the next break, non-negative block !startColumn !after = layout where
-- (e:es) are the items to layout before 'sIn' -- columnIn is the starting column for laying out 'e' -- columnOut is the column after the (e:es) have been laid out layout [] columnIn'sIn'pair = columnIn'sIn'pair layout (e:es) (!columnIn,sIn) = (columnOut,sOut) where
(columnOut,s') = layout es (column',sIn)
-- column' is the column to use after when laying out es, after laying out e (column',sOut) = case e of Str n str -> (columnIn+n, showString str s') Brk n | columnIn + n + breakDist es after <= margin -> (columnIn+n, prepend n ' ' s') | 0 <= startColumn -> (startColumn, '\n':prepend startColumn ' ' s') | otherwise -> (0, '\n':s') Blo _n indent es' -> let startColumn' = indent + columnIn after' = breakDist es after in block startColumn' after' es' (columnIn,s')
-- Trivial helper function to prepend 'n' copies of character 'c' {-# INLINE prepend #-} prepend n c s | n < 0 = error ("prepend called with "++show n++" < 0 !") | otherwise = loop n where loop 0 = s loop n = c : loop (pred n)
-- after >=0 implies breakDist _ after >= 0 -- Note that contained Blo's are assumed to layout in one line without using any internal breaks. breakDist :: [Pretty] -> Int -> Int breakDist esIn !after = accum esIn 0 where accum [] beforeBrk = beforeBrk + after accum (Brk {}:_) beforeBrk = beforeBrk accum (e : es) beforeBrk = accum es (beforeBrk + len e)
test1 = putStrLn $ pr 5 (blo 3 [str "Hello",spc,str "World!" ,blo 3 [str "Goodbye",spc,str "Space!"] ,spc,cat [str "The",spc,str "End"]]) ""
test2 = putStrLn $ pr 12 (blo 3 [str "Hello",spc,str "World!",spc ,blo 3 [str "Goodbye",spc,str "Space!"] ,spc,cat [str "The",spc,str "End"]]) ""
test3 = putStrLn $ pr 12 (blo 3 [str "Hello",spc,str "World!" ,blo 3 [str "Goodbye",spc,str "Space!"] ,spc,cat [str "The",spc,str "End"]]) ""
{- *Blocks> test1 Hello World!Goodbye Space! The End *Blocks> test2 Hello World! Goodbye Space! The End *Blocks> test3 Hello World!Goodbye Space! The End -}

The data dependency is circular. The "case e of" Str and Brk are not-circular: layout examines the input parameters to determine column'. Then column' is used to compute columnOut and s'. Then the current data is prepended to s'. The Blo case is the circular one. Pushing the circular definitions in to this case (and using the mapSnd helper) results in:
layout :: [Pretty] -> (Int,String) -> (Int,String) layout [] columnIn'sIn'pair = columnIn'sIn'pair layout (e:es) (!columnIn,sIn) = case e of Str n str -> mapSnd (showString str) $ layout es (columnIn+n,sIn) Brk n | columnIn + n + breakDist es after <= margin -> mapSnd (prepend n ' ') $ layout es (columnIn+n,sIn) | 0 <= startColumn -> mapSnd (('\n':).prepend startColumn ' ') $ layout es (startColumn,sIn) | otherwise -> mapSnd ('\n':) $ layout es (0,sIn) Blo _n indent es' -> let startColumn' = indent + columnIn after' = breakDist es after (columnOut,s') = layout es (column',sIn) (column',sOut) = block startColumn' after' es' (columnIn,s') in (columnOut,sOut)
mapSnd f (a,b) = (a,f b)
The circular usage of column' and s' can be unwound by "importing Control.Monad.Fix(fix)" and writing a definition with "fix" that explicitly feeds back the s':
Blo _n indent es' -> let startColumn' = indent + columnIn after' = breakDist es after withS ~(_,s') = let (column',sOut) = block startColumn' after' es' (columnIn,s') (columnOut,s'') = layout es (column',sIn) in ((columnOut,sOut),s'') in fst (fix withS)
In withS above, the column' is created by the call to block and consumed by the call to layout. The s'' is fed back to become s' by the use of fix. The actual answer is the fst component. It is also possible to avoid the lazy '~' matching by using "snd":
Blo _n indent es' -> let startColumn' = indent + columnIn after' = breakDist es after withS ans's' = let (column',sOut) = block startColumn' after' es' (columnIn,snd ans's') (columnOut,s'') = layout es (column',sIn) in ((columnOut,sOut),s'') in fst (fix withS)
Whether any of these three versions is clearer to the previous message is a matter of taste. Cheers, Chris

Chris,
You answer was quite a bit more than I expected for a simple style
question. Thanks.
On Nov 19, 2007 12:27 PM, ChrisK
The data dependency is circular.
Yes, thus the need for the knot. I gather your answer to my style question is you prefer knot tying over monads for this particular problem. By the way, it seems that the second line of your code was garbled, but it's easy to figure out what you meant. John

ChrisK wrote:
The data dependency is circular.
Yes and no. The input and outputs pairs are dependent on each other, but the integer doesn't depend on the string. Thus, I'm pretty sure that (Int, String) -> (Int, String) can be refactored into Int -> (Int, String -> String) This is related to attribute grammars, I finally found the reference Designing and Implementing Combinator Languages. S. Doaitse Swierstra, Pablo R. Azero Alcocer, João Saraiva http://people.cs.uu.nl/doaitse/Papers/1999/AFP3.pdf I'd even add after to the result of the functions in order to avoid the O(n^2) degenerate case. In any case, I prefer Wadler's combinators. With line being more rigid than Brk , nest and group basically factor the monolithic Blk which makes more laws and available and hence gives a more elegant implementation. Regards, apfelmus

John D. Ramsdell wrote:
On Nov 17, 2007 3:04 PM, apfelmus
wrote: Unfortunately, I don't have Paulson's book (or any other ML book :) at home. I'm too lazy to figure out the specification from the source code,
I guess the code is too opaque, as my colleague claimed.
The layout the algorithm generates condensed indented blocks. Within a block, it inserts a newline when the distance to the next break point plus the current position is greater than the space remaining on the current line. Thus if S-Expression lists are rendered as blocks with indent two, and every element in a list is separated by a break point of length one, with the proper margin, you would see:
(defthingy name-of-thingy (one thing) (two thing) (a-big-thing-made-bigger) (three thing) (four thing))
As an exercise, the book asks you to implement group indent, where if any break point in a group inserts a newline, they all do. So with that layout, one would get:
(defthingy name-of-thingy (one thing) (two thing) (a-big-thing-made-bigger) (there thing) (four thing))
The C version I wrote supports this layout, but I didn't bother with that extension for the Haskell version.
Thanks. The interesting case of nested blocks still needs to be specified, but with this description in mind and judging from the code, I guess it behaves as follows: either a block fits entirely on the remaining line (no line breaks inside), or it begins a new line. Now, the quest of programming is to make this description executable by computers while keeping it understandable by humans. This is straightforward to do with Wadler's pretty printer combinators (package "wl-pprint" on http://hackage.haskell.org ) data S = Atom String | List [S] -- S-expressions layout :: Int -> [S] -> Doc layout indent [] = empty layout indent (x:xs) = foldr1 (<>) (render x : map f xs) where f x@(Atom _) = group line <> render x f x@(List _) = group (line <> render x) render (Atom s ) = text s render (List xs) = nest indent $ parens $ layout xs The semantics of Doc are (for more, see Wadler's paper): Doc is a document with a set of different layouts, where the only difference between them is that some line primitives are rendered as ("\n" ++ replicate currentIndentation ' ') and some are rendered as a single space. Now, group x adds a new layout to the set x , namely the layout where all line in x have been flattened to a single space. This way, the definition of f directly reflects the alternative "either a block fits entirely on the remaining line or it begins a new line". Your group indent extension is even easier, namely just layout2 :: Int -> [S] -> Doc layout2 indent = sep . map render where render (Atom s ) = text s render (List xs) = nest indent $ parens $ layout2 xs with the functions sep = group . foldr (<$>) empty x <$> y = x <> line <> y from the library.
On the strictness annotations, my reasons for them are the usual ones, primarily to prevent memory leaks due to dragging, but a performance boost is always welcome. At some point, I plan to profile the code with and without the annotations, and find out where they are needed.
That seems excessive. Can you really prove that this will prevent space leaks? I doubt that. Laziness is still "useful" (example: early bail-out in your breakdist ) if only the data structures are fully evaluated as opposed to every function being strict, so it's an interesting idea. But that doesn't guarantee zero space leaks, since sumFromTo :: Int -> Int -> Int sumFromTo a b = f a b 0 where f a b k = if a == b then k else f (a+1) b (k+a) is one. Also, you won't be able to conveniently use lists as "suspended loops" which would be a pity. Regards, apfelmus

On Nov 19, 2007 11:42 AM, apfelmus
Thanks. The interesting case of nested blocks still needs to be specified, but with this description in mind and judging from the code, I guess it behaves as follows: either a block fits entirely on the remaining line (no line breaks inside), or it begins a new line.
Yeah, breakdist does not look inside a block when computing the break distance.
On the strictness annotations, my reasons for them are the usual ones, primarily to prevent memory leaks due to dragging, but a performance boost is always welcome. At some point, I plan to profile the code with and without the annotations, and find out where they are needed.
That seems excessive. Can you really prove that this will prevent space
leaks? I doubt that.
Ooh, I think I overstated my case. I meant to say that for my application, there are just a few data structures, such that if data traceable from them is strictly evaluated, I'm pretty sure I will have no space leaks. Currently it's just an intuition, but when the application is mature, I'll profile and validate this intuition. All I know is it was dog slow without any annotations, and spaying them on the suspect data structures cured that problem. Only careful profiling will tell which of those annotations can be removed. John
participants (5)
-
apfelmus
-
Brent Yorgey
-
ChrisK
-
Derek Elkins
-
John D. Ramsdell