
OK, thank you all for your responses. I'm impressed how fast you were.
2009/3/8 Rick R
http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours
I have it. However, Scheme is strict, which is the reason I didn't complete my reading. Maybe I am wrong.
Or go for the Real Thing: http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-35.html#%_sec_5.5
I just opened the page. I'll see what I can take. Thank you.
2009/3/8 Jeremy Shaw
I can say from experience that the spineless-tagless-gmachine paper you referenced does contain all the information needed to implement an STG->C compiler. But, you should expect to spend a lot of time alternating between trying to implement the thing and rereading the paper.
So I just didn't tried hard enough. Time for a re-read, then!
Also, you may find that it is easier to do STG->Assembly at first (via harpy or something). With assembly it is easier to express exactly what you want to happen. With C you have to try to figure out how to trick the C compiler into doing your bidding.
Scary. I know only C.
Also, maybe it is more satisfying to output assembly, because then you really feel like you are writing a compiler ;)
Well, maybe later.
I am sure others will say that LLVM is a better option than C or assembly. I know almost nothing about LLVM, so they may be right.
Neither do I. But I definitely try this. One day.
-- jeremy
2009/3/8 Austin Seipp
Excerpts from Loup Vaillant's message of Sat Mar 07 17:32:48 -0600 2009:
Ideally, I would very much like to compile to C. [...] -> have some kind of FFI with C (I suppose it imply some support for unboxed values).
Unfortunately, after working on LHC, I can verifiably say (like all the researchers who wrote the papers which I *didn't* read beforehand,) that for a lot of purposes, C is an unsuitable fit for a compilers' target language. It works pretty well if you can make the source language execution model fit well enough, but if you can't, you've a price to pay (both in sanity and/or performance.)
Yep.
However, since your goal is *not* efficiency, you will be happy to know that the issues with compiling to C (like garbage collection and exception handling) can be worked around viably.
I hope so.
For garbage collection, please see.
"Accurate Garbage Collection in an Uncooperative Environment" - http://citeseer.ist.psu.edu/538613.html [...] http://code.haskell.org/ddc/ddc-head/runtime/
Thank you, I'll look into that.
You can find a lot of info on C-- here; I recommend the paper at the bottom to start off:
http://research.microsoft.com/en-us/um/people/simonpj/papers/c--/
I'll look at that. But I am not sure it is the simplest path for me: I know C, and I don't know C-- —yet.
http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/
I have it. Maybe I should study it a bit more.
As for the FFI, I don't really have any papers on 'how to implement an FFI', but Matthias Blume's paper might be of interest:
"No-Longer-Foreign: Teaching an ML compiler to speak c "natively"" http://ttic.uchicago.edu/~blume/papers/nlffi-entcs.pdf
libffi may also be worth investigating: http://sourceware.org/libffi/
I didn't know them, thank you.
I have chosen the STG machine because I thought i would be easier to get an FFI and case exprs out of it. Maybe I am wrong, and template instantiation is really easier. There is also the BRISK machine, and GRIN. But the paper I read were less readable to me than the three mentioned.
How far into GRIN have you looked?
Not far enough, it seems.
It is one of the intermediate languages for lhc/jhc, and for a low-level intermediate representation it works quite well I think; it is a strict, monadic intermediate language - being strict, we must still be able to model closures ('suspendeded applications' or thunks,) so we model closures/partial applications in a sort of 'defunctionalized' way (making grin a sort of 'defunctionalized intermediate language' really,) by turning them into something along the lines of an algebraic data type in GRIN. A good benefit of this is that you actually /don't/ have to write eval/apply yourself, because the compiler is supposed to generate it *for* you.
Here is my biggest problem with GRIN: it is not STG by far. I formatted my brain one way, so I find difficult to reformat it another way
In fact, this is a core part of GRIN - the generation of eval by the compiler is critical for many important optimizations to take place. It also makes the runtime a bit simpler.
If I find GRIN is overall simpler, I am happy.
This comes at the price that GRIN is by design a whole-program intermediate form; in order to pull off any of these sophisticated transformations everything must be in memory at once. But as we have seen with LHC/JHC, it can make a *huge* difference (apps that are 10x smaller *and* 10x faster than ones created by GHC.)
I intend my final language to be quite simple. That imply a relatively small standard library. So, maybe having all the program in memory will not be such a big deal.
Having dealt with GRIN as I work on LHC (when time now permits...) I can say that it's a reasonable strategy and in practice it turns out pretty well, but if you're not into optimizing the code, then STG might be a better fit.
OK, maybe I will choose STG for now, then.
I can't really give you that much advice on this area so much as you'll just have to choose. But GRIN does have at least two excellent references:
Code Optimisation Techniques for Lazy Functional Languages - U. Boquist, http://www.cs.chalmers.se/~boquist/phd/index.html
The GRIN Project: A Highly Optimising Back End For Lazy Functional Languages, U. Boquist, http://www.cs.chalmers.se/~boquist/ifl96-abstract.html
Oh, I forgot about the second paper, thank you. I find it a bit clearer than the thesis. Having the GRIN BNF helps a lot.
Then again you said you're not interested in optimization, so perhaps I'm just preaching to the totally wrong choir... :)
No, you're not. A long (looong) term goal of mine is having crazy optimizations. I will definitely explore that path.
Austin
2009/3/8 wren ng thornton
Loup Vaillant wrote:
-> support algebraic data types and case expressions (unless I can get away with encoding them as functions),
Which you always can, [snip]
The only trick is that you need to have closures (in order to build the Foos) and you need to have first-class functions (in order to pass the destructors in). If you're familiar with the STG machine, this is what they do under the covers anyways. At the core level, all you need is some primitive to force evaluation.
Ah, that is what I was missing. I thought I had to implement seq on top of case, not the other way around. Maybe that can make Template instantiation viable
~wren
Anyway, thank you all very much. Time for some reading. Cheers, Loup