Hey Ben,

thanks for that. I didn’t realise I opened pretty much the Pandora’s box, hehe. I have now found (whilst reading Bartosz’s notes) the numerous tickets & discussions around the library, but what you say in this email neatly summarise the crux of the problem, as far as I could see. Rather than saying silly things, I would rather spend a bit of time reading everything that has been written by folks way smarter than me on the subject and get back to you folks later ;)

Alfredo

On 18 April 2017 at 14:21, Ben Gamari <ben@smart-cactus.org> wrote:
Alfredo Di Napoli <alfredo.dinapoli@gmail.com> writes:

> Hey Simon,
>
> thanks for chiming in. I had the same suspect as well, and in fact my first
> temptation was to use dlists to avoid costly concatenation (I guess a
> Builder shares pretty much the same idea, which is avoid right-appends). I
> eventually bailed out as that Pretty module had some functions like sepX or
> fillNBE (I might be wrong, going by memory only here) which needed to
> *inspect* the current [Doc] we were carrying around, and thus dlists
> couldn’t accomplish this in O(1), but forcing it back to a list was
> necessary. Am I right in thinking that using a Builder here will suffer the
> same malady? Ideally I would like constant time for both appends, left
> concat & inspection (as in pattern-matching inspection) but I don’t think
> what I’m asking exists, at least not in its functional declination anyway ;)
>
> That’s why I was thinking to give Deques (or more generally, Sequences) a
> go: they don’t have O(1) appends but at least can be inspected in O(1).
> Question has to be answered whether this will yield improvement or not,
> which I guess depends upon the ratio of inspection / appending we do in the
> pretty printing. In that case using dlists or builders might be better.
> Last but not least, although the current implementation doesn’t backtrack
> anyway, I’m wondering wether switching to a different representation for a
> Doc (namely a huge function composition of Token(s), as described in the
> paper) could be beneficial as well.
>
> Do you guys have any opinions? Yesterday I extracted Pretty.hs from the
> sourcetree and I’m now planning to produce a criterion benchmark and
> compare different implementations, althought it’s a bit hard to predict the
> real world usage as I don’t have access to a representative Doc document as
> produced by GHC, so my benchs could be all ill-founded.
>
Note that GHC's `Pretty` module is just a slightly modified version of
the `pretty` package. The differences are fairly minimal (although
important for performance):

 * It uses FastString in place of String, giving us fast `length` (in
   https://ghc.haskell.org/trac/ghc/ticket/8809#comment:60
   I propose that we extend `pretty` with a typeclass for string types)

 * GHC's variant still has a known stack overflow bug that was fixed
   upstream. Unfortunately, compiler performance regressed when we
   attempted to port upstream's patch (#10735)

Ideally we would fix these and just use the `pretty` library itself.

In addition to these issues, it would be quite helpful if `pretty`
gained a special-case for the infinite band-width case (which is what we
use in the code generator). The point here is that we should need to do
no layout computation in the infinite band case: merely place line
breaks precisely where the user asks. This should result in a noticeable
improvement in code generation performance (IIRC Bartosz noted rather
significant amounts of time spent pretty-printing assembler).

Cheers,

- Ben