Code generation and optimisation for compiling Haskell

Although I'm far from being an expert Haskell programmer, I think I'm ready to look into some of the details of how it's compiled. I've a copy of Modern Compiler Design (Grune, Bal, Jacobs and Langendoen) - I first learned a lot of lexical analysis and parsing from it quite a few years ago. Today, I started looking through the functional languages section - I've read it before, but never absorbed much of it.

Graph reduction, lambda lifting, etc. - it seems pretty simple. Far too simple. It's hard to believe that decent performance is possible if all the work is done by a run-time graph reduction engine.

Simon Peyton Jones has written a couple of books on implementing functional languages which are available for free download. At a glance, they seem to cover similar topics in much more detail. However, they're from 1987 and 1992. Considering SPJ's "period of despair" when he couldn't get practical performance for monadic I/O, they seem very dated.

Some time ago, I made a note to look up the book "Functional Programming and Parallel Graph Rewriting" (I forget why), but again that's from the early '90s. I've also got a note to look up Urban Boquist's thesis.

SPJ also has some papers on compilation - http://research.microsoft.com/en-us/um/people/simonpj/papers/papers.html#com... - and the papers on optimisation by program transformation have caught my eye.

Are there any current textbooks that describe the techniques used by compilers like GHC to generate efficient code from functional languages? It's OK to assume some knowledge of basic compiler theory - the important stuff is code generation and optimisation techniques for lazy functional languages in particular.

Also, what papers should I read? Am I on the right lines with the ones I've mentioned above?

Based on your stated background, the best start would be the (longer)
paper on the Spineless Tagless G-machine [1]. It describes how graph
reduction is actually implemented efficiently. Since then there have
been two major changes to this basic implementation: Use of eval/apply
(a different calling convention) [2] and constructor tagging [3]
(which drastically reduces the number of indirect branches from the
original STG approach).
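The key intuition behind making graph reduction efficient is sharing: a let-bound expression becomes a single heap node (a thunk), and after its first evaluation the node is updated in place with the result, so every other reference sees the value for free. A minimal sketch of this in Haskell itself (using Debug.Trace only to make the evaluation order observable):

```haskell
import Debug.Trace (trace)

-- Graph (rather than tree) reduction: both uses of x below point at
-- the same heap node, so the thunk is forced once and then overwritten
-- with its value.  The trace message therefore fires only once.
main :: IO ()
main = do
  let x = trace "evaluating x" (2 + 3 :: Int)
  print (x + x)  -- "evaluating x" appears once (on stderr), then 10
```

If reduction worked on the syntax tree instead of a shared graph, the trace would fire twice; the update-after-evaluation step is exactly what the STG machine implements efficiently.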
In addition to this fairly low-level stuff, there are very powerful
optimisations performed at a higher level. For a taste see stream
fusion [4].
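For a rough idea of what stream fusion buys you: pipelines over a coinductive stream type (a step function plus a seed) collapse, after inlining, into a single loop with no intermediate list. A toy sketch with invented names (the paper's real Data.Stream library is considerably more elaborate, with a Skip constructor and rewrite rules):

```haskell
{-# LANGUAGE ExistentialQuantification, BangPatterns #-}

-- A stepper either finishes or yields an element plus a new state.
data Step s a = Done | Yield a s

-- A stream is a step function plus a current state (the seed).
data Stream a = forall s. Stream (s -> Step s a) s

enumFromToS :: Int -> Int -> Stream Int
enumFromToS lo hi = Stream step lo
  where step i | i > hi    = Done
               | otherwise = Yield i (i + 1)

-- mapS merely rewrites the step function: no intermediate structure.
mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream step s0) = Stream step' s0
  where step' s = case step s of
          Done       -> Done
          Yield a s' -> Yield (f a) s'

-- A consumer: a tight, accumulator-passing loop.
sumS :: Stream Int -> Int
sumS (Stream step s0) = go 0 s0
  where go !acc s = case step s of
          Done       -> acc
          Yield a s' -> go (acc + a) s'

main :: IO ()
main = print (sumS (mapS (* 2) (enumFromToS 1 10)))  -- prints 110
```

Once GHC inlines mapS into sumS, the case-of-case transformation leaves one loop over machine integers; the paper's rewrite rules arrange for ordinary list code to be converted to and from this form automatically.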
If you're done with these, feel free to ask for more. :)
[1]: http://research.microsoft.com/apps/pubs/default.aspx?id=67083
[2]: http://research.microsoft.com/en-us/um/people/simonpj/papers/eval-apply/
[3]: http://research.microsoft.com/en-us/um/people/simonpj/papers/ptr-tag/index.h...
[4]: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.7401
On 10 January 2012 17:25, Steve Horne wrote:
[quoted text trimmed]
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Push the envelope. Watch it bend.

On 11/01/2012 15:20, Thomas Schilling wrote:
Based on your stated background, the best start would be the (longer) paper on the Spineless Tagless G-machine [1].
Thanks for the tips. I haven't read much yet, but considering [1], I guess I shouldn't have dismissed SPJ's early-'90s stuff so quickly.
Should be interesting.

On Tue, Jan 10, 2012 at 9:25 AM, Steve Horne wrote:
Also, what papers should I read? Am I on the right lines with the ones I've mentioned above?
Thomas Schilling gave you a good response with papers, so I will give you a different perspective on where to look.

Most of the Haskell implementations were written by academics studying languages and compilers. This is good, but it also implies that the implementors are likely to share biases and assumptions. I know of one Haskell compiler in particular that was written by someone who did not know Haskell when starting the project, and whose compiler was developed deliberately to be different from GHC. That person was John Meacham. He created JHC, a work in progress, so you might want to study his compiler and implementation notes, as they should provide a different perspective on how to tackle Haskell implementation and optimization.

http://repetae.net/computer/jhc/

I hope that helps,
Jason

On 1/12/12 8:50 PM, Jason Dagit wrote:
[quoted text trimmed]
JHC is also notable as a point of contrast because GHC strives to have a uniform representation in order to simplify adding high-level optimizations, whereas JHC is especially focused on the low-level optimizations obtainable by using non-uniform representations. More and more of these representational issues have been creeping into GHC over the years, so you should definitely take a look at JHC to get a different perspective on the space of possibilities than just those illuminated by the trajectory of GHC.

On the other end of things, if your heart lies in the compiler itself rather than the generated code per se, you should definitely take a look at UHC. We often talk about "Haskell" as if it were a language, when in fact it is a family of related languages with subtly different features and properties. One of the principal goals of EHC/UHC is to design a compiler as a series of small passes in order to better disentangle the issues surrounding trying to compile an entire family of languages. They also have some novel code for dealing with the parsing end of the compiler, which is worth exploring separately from the overall design.

--
Live well,
~wren

JHC itself is based upon Boquist's GRIN language described in his PhD
thesis: Code Optimization Techniques for Lazy Functional Languages
http://mirror.seize.it/papers/Code%20Optimization%20Techniques%20for%20Lazy%...
On 13 January 2012 01:50, Jason Dagit wrote:
[quoted text trimmed]
participants (4)
- Jason Dagit
- Steve Horne
- Thomas Schilling
- wren ng thornton