
John D. Ramsdell wrote:
On Nov 17, 2007 3:04 PM, apfelmus
wrote: Unfortunately, I don't have Paulson's book (or any other ML book :) at home. I'm too lazy to figure out the specification from the source code,
I guess the code is too opaque, as my colleague claimed.
The layout the algorithm generates condensed indented blocks. Within a block, it inserts a newline when the distance to the next break point plus the current position is greater than the space remaining on the current line. Thus if S-Expression lists are rendered as blocks with indent two, and every element in a list is separated by a break point of length one, with the proper margin, you would see:
(defthingy name-of-thingy (one thing) (two thing) (a-big-thing-made-bigger) (three thing) (four thing))
As an exercise, the book asks you to implement group indent, where if any break point in a group inserts a newline, they all do. So with that layout, one would get:
(defthingy name-of-thingy (one thing) (two thing) (a-big-thing-made-bigger) (there thing) (four thing))
The C version I wrote supports this layout, but I didn't bother with that extension for the Haskell version.
Thanks. The interesting case of nested blocks still needs to be specified, but with this description in mind and judging from the code, I guess it behaves as follows: either a block fits entirely on the remaining line (no line breaks inside), or it begins a new line. Now, the quest of programming is to make this description executable by computers while keeping it understandable by humans. This is straightforward to do with Wadler's pretty printer combinators (package "wl-pprint" on http://hackage.haskell.org ) data S = Atom String | List [S] -- S-expressions layout :: Int -> [S] -> Doc layout indent [] = empty layout indent (x:xs) = foldr1 (<>) (render x : map f xs) where f x@(Atom _) = group line <> render x f x@(List _) = group (line <> render x) render (Atom s ) = text s render (List xs) = nest indent $ parens $ layout xs The semantics of Doc are (for more, see Wadler's paper): Doc is a document with a set of different layouts, where the only difference between them is that some line primitives are rendered as ("\n" ++ replicate currentIndentation ' ') and some are rendered as a single space. Now, group x adds a new layout to the set x , namely the layout where all line in x have been flattened to a single space. This way, the definition of f directly reflects the alternative "either a block fits entirely on the remaining line or it begins a new line". Your group indent extension is even easier, namely just layout2 :: Int -> [S] -> Doc layout2 indent = sep . map render where render (Atom s ) = text s render (List xs) = nest indent $ parens $ layout2 xs with the functions sep = group . foldr (<$>) empty x <$> y = x <> line <> y from the library.
On the strictness annotations, my reasons for them are the usual ones, primarily to prevent memory leaks due to dragging, but a performance boost is always welcome. At some point, I plan to profile the code with and without the annotations, and find out where they are needed.
That seems excessive. Can you really prove that this will prevent space leaks? I doubt that. Laziness is still "useful" (example: early bail-out in your breakdist ) if only the data structures are fully evaluated as opposed to every function being strict, so it's an interesting idea. But that doesn't guarantee zero space leaks, since sumFromTo :: Int -> Int -> Int sumFromTo a b = f a b 0 where f a b k = if a == b then k else f (a+1) b (k+a) is one. Also, you won't be able to conveniently use lists as "suspended loops" which would be a pity. Regards, apfelmus