Learning about haskell compilers

older
Re: [Haskell-cafe] Killer pickler...

Creighton Hogg

20 Dec 2005 20 Dec '05

4:36 p.m.

Hi guys, I was wondering where I should get started in learing about how to implement a haskell compiler? Are there papers, wiki entries, or other things people think would be helpful or should I just start looking at the source of one of the compilers?

Show replies by date

Jared Updike

20 Dec 20 Dec

4:59 p.m.

You are braver than me, but I must confess I've had the same desire. Here's a great place to start: Simon Peyton Jones, David Lester, Implementing functional languages: a tutorial http://research.microsoft.com/Users/simonpj/Papers/pj-lester-book/ It's long, (sort of old) and written in Miranda (TM of Research Ltd.) which is sort of a precursor to Haskell, but it should get you a lot closer to understanding how lazy, pure functional languages work inside. You could also learn from the code and documentation of the various implementations of Haskell: GHC, hugs, nhc, and - jhc (http://repetae.net/john/computer/jhc/) - YHC (http://www-users.cs.york.ac.uk/~ndm/yhc/) (an nhc derivatibe) including a portable bytecode compiler Or you could use GHC and hs-plugins to sort of embed a haskell compiler into any Haskell program. Jared. -- jupdike@gmail.com http://www.updike.org/~jared/ reverse ")-:" On 12/20/05, Creighton Hogg wrote:

...

Hi guys, I was wondering where I should get started in learing about how to implement a haskell compiler? Are there papers, wiki entries, or other things people think would be helpful or should I just start looking at the source of one of the compilers? _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Neil Mitchell

7:40 p.m.

...

You could also learn from the code and documentation of the various implementations of Haskell: GHC, hugs, nhc, and

- YHC (http://www-users.cs.york.ac.uk/~ndm/yhc/) (an nhc derivatibe) including a portable bytecode compiler

With Yhc, there is also a quite useful wiki at http://www.haskell.org/hawiki/Yhc. In particular the runtime system is documented reasonably thorougly in http://www.haskell.org/hawiki/Yhc/RTS . There are also C, Python and Java implementations of the runtime system, which might help you figure out how it works. Thanks Neil

John Meacham

21 Dec 21 Dec

12:58 a.m.

On Tue, Dec 20, 2005 at 10:36:36AM -0600, Creighton Hogg wrote:

...

I was wondering where I should get started in learing about how to implement a haskell compiler?

Well, the way I learned was by just making the decision that one way or another, I was going to write a haskell compiler and then reading and reading and coding and tossing code and recoding until I succeeded. :) it is sort of how I learn, choose something I have no idea how to do, and do it anyway learning what I need to along the way out of neccesity. in any case, assuming you know something about basic compilers, lexing, parsing, and code generation and are interested in how you get parsed haskell source into a form that you can generate code from, I'd highly recommend the following (in order) (of course, this is just my experience) * The haskell report itself: this will give the rules on how to desugar 'complicated' haskell into simple haskell. things like getting rid of list comprehensions and do notation and turning complicated pattern matching into simple pattern matching (the most complicated of the desugaring transformations). I'd recommend desugaring and expanding type synonyms as soon as possible for your first attempt even though it gives worse error messages because it _greatly_ simplifies the later steps and you will likely go through several iterations of the typechecker before you get one you are happy with. you can always extend your typechecker later and postpone certain desugarings until you reach a balance of good error messages you are comfortable with. * typing haskell in haskell: http://www.cse.ogi.edu/~mpj/thih/ this will tell you how to go from haskell source to explicitly typed haskell source, which is now in a form suitable for transforming into an intermediate language. the explicitly typed, desugared haskell can be converted directly to core. It may take several readings to fully understand and looking at the source to hatchet could help considerably (it did for me) * then you have a couple choices, basically, You want to learn about ghc core. there are a lot of papers on ghc optimizations that are great because they give examples of how core is used which really helps get an intuitive understanding of why core is a good intermediate language: recommend are: Secrets of the GHC inliner. http://www.research.microsoft.com/~simonpj/Papers/inlining/index.htm this is by far the most important, the simplifier is the key to any other optimizations and this paper goes into a lot of the details other things leave out. http://haskell.org/ghc/docs/papers/unboxed-values.ps.gz very good paper. my main regret about jhc development is that I didn't incorperate unboxed values right from the start thinking of them as an optimization rather than an essential part of expressing program properties. I just accumulated hack-after-hack which would have gone away with unboxed values in the core. http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/float.ps.gz&pub=ACM let floating, not only an important optimization, but will get you thinking about what it means to preserve lazyness and some of the trickiness of not breaking performance by accident. http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/comp-by-trans.ps.gz&pub=18 great overview, either read this, or all of the above. (or ideally everything) starting out with this one is probably the best idea because it will give you an idea what you are in for. * next you have your simplified optimized core and you want to turn it into machine code, there are various choices you can make, the most well understood is G-machine based implementations like nhc or the more advanced STG machine used in ghc. the lester book mentioned in another email is a great introduction and highly recommended even if the source there is a lot of practical info out there on how to implement these and you have pretty much a straight shot from core to G-machine code either to be compiled directly or interpreted. for jhc I used boquist's GRIN: http://www.cs.chalmers.se/~boquist/phd/ which is not the simplest way of getting things working right away, but could be very interesting. there are various other abstract machines out there, the Lazy Virtual Machine used by Helium described in Daan Leijen's Phd thesis is quite interesting, and might make a better first target than G-machine code. so, I'd recommend, at least _understand_ the G-machine, enough to write an interpreter for it. (if you already have core, translating and interpreting G-machine code shouldn't be more than a few hundreds of lines of code). then read about the LVM, STG, and GRIN, and any other ones anyone recommends? and decide where you want to go from there. whichever one inspires you. Hopefully this helps you, this is just the route I would recommend based on my experience with jhc, I read a whole lot of papers and books, and there are a ton more great ones out there but I think this would be the easiest path to a simple haskell compiler that is based on concepts extensible to a full-scale non-toy implementation. I left out details that should be covered in any traditional compiler book and just mentioned the issues specific to a haskell compiler. I am sure someone can fill in the holes if needed. I wish you good luck, I started jhc a month after hearing about haskell and boy was it a great crash course in the language having to learn how to program in it and its details at the same time. The great news is that Haskell is a _superlative_ language to write compilers for any language in and is quite fun to boot. writing a compiler you won't find anywhere that haskell itself is lacking so can concentrate on the task at hand. John -- John Meacham - ⑆repetae.net⑆john⑈

Bernard Pope

4:58 a.m.

On Tue, 2005-12-20 at 16:58 -0800, John Meacham wrote:

...

On Tue, Dec 20, 2005 at 10:36:36AM -0600, Creighton Hogg wrote:

...
I was wondering where I should get started in learing about how to implement a haskell compiler?

Warning: a whole Haskell compiler is a LOT of work. Nonetheless there are examples of mostly-single-person compilers and interpreters out there, so it is possible to do one on your own. Though I don't think reading their source code is necessarily the best way to get started. I agree with what John said, especially this:

...

there are various other abstract machines out there, the Lazy Virtual Machine used by Helium described in Daan Leijen's Phd thesis is quite interesting, and might make a better first target than G-machine code.

If you want to write a compiler, targeting LVM is (I believe) the easiest way to get something working. You could get the source code for hatchet from him to give you a front end. Another approach is to write a simple interpreter for a small functional language and add features in bit-by-bit, as your enthusiasm dictates. That way, you get the satisfaction of having something work early on. If you write a compiler it might take weeks or months before it does anything interesting. Then you can custom build your language with whatever features take your fancy. For instance you can add a better record system, or play with meta-programming facilities. I started a little project like this a while ago, called baskell, which you can get from here: http://www.cs.mu.oz.au/~bjpop/code.html It has a rudimentary type checker, and a little REPL interface. Feel free to hack it to pieces. Cheers, Bernie.

Creighton Hogg

5:14 a.m.

On Wed, 21 Dec 2005, Bernard Pope wrote:

...

On Tue, 2005-12-20 at 16:58 -0800, John Meacham wrote:

...
On Tue, Dec 20, 2005 at 10:36:36AM -0600, Creighton Hogg wrote:

...
I was wondering where I should get started in learing about how to implement a haskell compiler?

Warning: a whole Haskell compiler is a LOT of work.

Oh I figured it would be, but I'm not really planning on implementing all of Haskell 98 by myself.

...

Nonetheless there are examples of mostly-single-person compilers and interpreters out there, so it is possible to do one on your own. Though I don't think reading their source code is necessarily the best way to get started.

I agree with what John said, especially this:

...
there are various other abstract machines out there, the Lazy Virtual Machine used by Helium described in Daan Leijen's Phd thesis is quite interesting, and might make a better first target than G-machine code.

If you want to write a compiler, targeting LVM is (I believe) the easiest way to get something working. You could get the source code for hatchet from him to give you a front end.

Another approach is to write a simple interpreter for a small functional language and add features in bit-by-bit, as your enthusiasm dictates. That way, you get the satisfaction of having something work early on. If you write a compiler it might take weeks or months before it does anything interesting. Then you can custom build your language with whatever features take your fancy. For instance you can add a better record system, or play with meta-programming facilities. I started a little project like this a while ago, called baskell, which you can get from here:

http://www.cs.mu.oz.au/~bjpop/code.html

It has a rudimentary type checker, and a little REPL interface. Feel free to hack it to pieces.

I really like your idea of implementing an interpreter for just some kindof functional language. That sounds like it'd be pretty instructive and have less frustration factor than a whole compiler. Thanks!

Bulat Ziganshin

3:38 p.m.

New subject: Re[2]: Learning about haskell compilers

Hello Creighton, Wednesday, December 21, 2005, 8:14:22 AM, you wrote: CH> I really like your idea of implementing an interpreter for CH> just some kindof functional language. That sounds like it'd CH> be pretty instructive and have less frustration factor than CH> a whole compiler. Thanks! see at the Parsec library. it contains as examples implementations of several functional and imperative languages and all implementations is just about 10-30kb Parsec, parsing combinators, and monads is very Haskellish things http://www.nomaware.com/monads/monad_tutorial.zip http://members.chello.nl/hjgtuyl/tourdemonad.html http://www.cs.nott.ac.uk/~gmh//pearl.hs http://www.cs.nott.ac.uk/~gmh//monparsing.pdf http://www.cs.nott.ac.uk/~gmh//parsing.pdf http://www.cs.nott.ac.uk/~gmh//pearl.pdf -- Best regards, Bulat mailto:bulatz@HotPOP.com

Creighton Hogg

5:11 a.m.

On Tue, 20 Dec 2005, John Meacham wrote:

...

On Tue, Dec 20, 2005 at 10:36:36AM -0600, Creighton Hogg wrote:

...
I was wondering where I should get started in learing about how to implement a haskell compiler? <Snip Absolute Awesomeness>

Wow! That was a great response, with more references than I could have ever hoped for. I am totally saving that e-mail. I have learned the basics of compilers. I just finished a class on it that I took basically for the purpose of being able to work on a functional language like haskell. I really want to be able to help with all the work in Haskell and help improve both its performance and capabilities. Haskell & lisp are the two languages I love most and I really want to be able to use them in my own native land of physics.

7137

Age (days ago)

7138

Last active (days ago)

List overview

Download

7 comments

7 participants

participants (7)

Bernard Pope
Bulat Ziganshin
Creighton Hogg
Creighton Hogg
Jared Updike
John Meacham
Neil Mitchell