
John D. Ramsdell wrote:
Compared to that, I'm missing the specification part for your pretty
printer. How's it supposed to lay out?
The specification is in Paulson's book. The pretty printer is used with S-Expressions, and the block layout generates compact, indented output that is good when there is much data to be displayed. ... The close connection between the Haskell and Standard ML versions is part of the reason I was able to quickly generate the code, and be confident in its correctness.
Unfortunately, I don't have Paulson's book (or any other ML book :) at home. I'm too lazy to figure out the specification from the source code, can you provide a link or an explanation? I'm asking because I'd be astonished if one couldn't write an elegant Haskell version that's clearly correct and efficient at the same time. And such things are easiest to produce from scratch.
.... a simple difference list ... will do.
Hmm. Doesn't the output type (Int, String) -> (Int, String) show the implementation is using the essence of a difference list? Remember, the resulting function prepends something the string it is given in the second element of the pair, and returns that string.
Yes, of course. But the true virtue is to disentangle it from the rest, i.e. to use an abstract string type with fast concatenation.
Int -> (Int, String -> String) -- difference list
My first attempt at writing the printing function had a type similar to this one. I found myself composing difference lists of type ShowS. The performance was noticabily slow, specially as compared with the implementation displayed in my message. Perhaps the use of Data.DList would repair this performance problem.
I don't mean to suggest that ShowS style difference lists cannot be used to make the function easier to understand--all I'm saying is my attempt to do so did not work out well.
Dlist a = [a] -> [a] so this would be no different from ShowS.
Drop those strictness annotations from !String and ![Pretty], they won't
do any good. The !Int are only useful if they will be unboxed, but I wouldn't bother right now.
I thought that the annotations ensure the first element of the list is evaluated. The Char and Pretty lists are generated with seqrev, so everything gets evaluated before constructor is applied to data.
-- A reverse that evaluates the list elements. seqrev :: [a] -> [a] seqrev = foldl (\xs x -> x `seq` xs `seq` (x:xs)) []
The trouble is the constructors are not exported directly, but instead through str, brk, and blo, functions which are not strict. It's the best I could do, as near as I can tell.
O_O, everything strict? But why would you ever want that? Well if it's for "performance" reasons, I'd have to point out that the pretty printer has an algorithmic flaw: there are two calls to (breakdist es after), one from the Brk case and one from the Blo case. The former one is safe since breakdist doesn't look further than to the next Brk in es . But the latter one will lead to a quadratic run-time in the worst case, for example on the following input replicate n (Blk [Brk _] _ _) = [Blk [Brk _] _ _, Blk [Brk _] _ _, ..] -- n elements (Fill in any numbers you like for the placeholders _ ). That's a bit degenerate but you can spice it up with as many Str as you like, just don't add any additional Brk . Of course, a memoization scheme will fix this but I'd rather develop a new algorithm from scratch.
It seems rather hard to avoid lazyness in the current version of Haskell when it's not wanted. I hope one of the proposals for deep strictness makes it into Haskell prime. In my application, there is one datastructure, such that if every value tracable from an instance of the datastructure is evaluated, I am quite sure my program will be free from memory leaks due to dragging. I wish I could tell compilers to make it so.
As Derek said, strict data types are probably the easiest way to go here. Or you can use custom strict constructors, like str s = s `deepSeq` Str s or something. But again, I don't know why you would want that at all. Regards, apfelmus