Google Summer of Code idea of project & application

Dear members of Haskell Cafe, My name is Damien Desfontaines, and I'm currently following a Theoretical Computer Science Major at the École Normale Supérieure in Paris, which is one of the most selective universities in France. To complete my curriculum, I am to find a three-month internship related to something I have learnt, including lambda-calculus, compilers, graph theory, languages theory, and complexity & calculability theories. I wish to make it useful for the open-source community, for example by working for the Haskell group through the Google Summer of Code Program. The project I suggest is mainly inspired by Ticket #1555 [1] : I think that would be a great idea to make it possible to call some Haskell code into OCamL. In particular, this would contribute to the spreading of Haskell in countries where OCamL is proeminent, mainly France and Italy. The idea would be the following : building a translator which would turn Haskell code into (purely functional) OCamL code, in order to enable the use of Haskell functions and libraries within OCamL programs, in a "human-readable" way (the OCamL source code generated would ideally be understandable enough to be manually modified). I am well aware that one of the main issues lies in dealing with the conversion between the two types systems, especially because of Haskell's lazy evaluation and typeclasses. However, such a challenge really motivates me to spend hours looking for the most efficient and usable solution, I would really enjoy spending three months on such problems. However, I sincerely think I have the skills needed to go through such a project. Indeed, I have a long experience with OCamL. I used this language to automatically write the undecidable Gödel proposition in Peano arithmetics, using Tarski's original proof [2]. I have written a mini-Java compiler with it [3], a MISP microprocessor simulator along with three friends [4], and many other things for studies and personal entertainment. My experience of Haskell is shorter, I learnt it with two books, "Learn You a Haskell for Great Good !" and "Real World Haskell" (you are the co-author of this last book, aren't you ? Well, congratulations, it's really well-written and precise !). I use it mainly for mathematical-oriented works, and for coding competitions such as Google Jam or Prologin. I also have a little experience with open-source software : I have been using Debian Linux for 4 years, so I know the basics of UNIX administration. I know how to write a bug report, a man-page, a README, etc. I spend a lot of time helping others on french IRC channels, Linux- or programmation-oriented. I contribute to the Weboob project [5] by writing documentation and new modules. This, in my opinion, makes me quite autonomous, and able to solve issues by myself, by reading documentation, manuals or language specifications. Apart from that, I have never been engaged in development of "large" projects such as Haskell - but I am ready and excited to start ! I am confident I can interact efficiently with the Haskell community: if you would like me to create a blog syndicated to Planet Haskell, in which I would explain my work every one or two weeks, I can do that. I know how to read a man-page, a language specification or a system documentation, so I consider myself as quite autonomous to answer questions I could ask myself during a project. Furthermore, please let me know if you think my skills could be useful to another Haskell project. I look forward to the opportunity to discuss this idea with one or several developers, Damien Desfontaines [1] http://hackage.haskell.org/trac/summer-of-code/ticket/1555 [2] http://www.eleves.ens.fr/home/desfonta/godel/godel.ml [3] http://www.eleves.ens.fr/home/desfonta/ProjetCompil/ [4] https://github.com/kyoDralliam/CircuitSimulator [5] http://weboob.org/

On 19/03/2012, at 8:01 AM, Damien Desfontaines wrote:
The project I suggest is mainly inspired by Ticket #1555 [1] : I think that would be a great idea to make it possible to call some Haskell code into OCamL. In particular, this would contribute to the spreading of Haskell in countries where OCamL is proeminent, mainly France and Italy. The idea would be the following : building a translator which would turn Haskell code into (purely functional) OCamL code, in order to enable the use of Haskell functions and libraries within OCamL programs, in a "human-readable" way (the OCamL source code generated would ideally be understandable enough to be manually modified).
You might want to consider targeting F# as well as (or instead of) OCaml. I've had nothing but trouble with GODI, to the point where I gave up on OCaml entirely. On the other hand, F# came with Mono... F# has built-in support for lazy evaluation (although it is not the default), so this might simplify your task. Indeed, F# has comprehensions too, so the main impedance mismatch would be typeclasses. This would make an F# target a sensible half-way point for an OCaml target.

2012/3/19 Richard O'Keefe
On 19/03/2012, at 8:01 AM, Damien Desfontaines wrote:
The project I suggest is mainly inspired by Ticket #1555 [1] : I think that would be a great idea to make it possible to call some Haskell code into OCamL. In particular, this would contribute to the spreading of Haskell in countries where OCamL is proeminent, mainly France and Italy. The idea would be the following : building a translator which would turn Haskell code into (purely functional) OCamL code, in order to enable the use of Haskell functions and libraries within OCamL programs, in a "human-readable" way (the OCamL source code generated would ideally be understandable enough to be manually modified).
You might want to consider targeting F# as well as (or instead of) OCaml. I've had nothing but trouble with GODI, to the point where I gave up on OCaml entirely. On the other hand, F# came with Mono...
Thank you for answering that fast and for your advices. I'm afraid I have absolutely no experience with F#. I guess I can learn it in several months, I heard it is derived from OCaml, but I think I would be really less efficient working with a brand new language such as F# instead of OCaml, which I already master. But the real question is : what would be the most useful ? I am quite convinced that I could work faster and most efficiently with OCaml than with F#, but if the community considers that such a project would be far more useful if I would "translate" Haskell into F# instead of OCaml, I can adapt.
F# has built-in support for lazy evaluation (although it is not the default), so this might simplify your task. Indeed, F# has comprehensions too, so the main impedance mismatch would be typeclasses. This would make an F# target a sensible half-way point for an OCaml target.
OCaml has a built-in module for lazy evaluation as well (even if it is not in
the Pervasives (= default) module, and a syntax for list comprehensions as well.
So, in my opinion, the main challenge would be dealing with typeclasses, exactly
like in F#.
2012/3/19 Stephen Tetley
Hi Damien
A translator might be a lot of work.
Matthew Naylor had a translator between Haskell and Clean [1], which performed well according to [2]. The translator was his Master project in the UK so I think that means it would represent approximately a years work.
Thanks for your answer. I must admit that I do not really realize how much work such a project represents. I will probably need the help of someone who is more experienced than me to decide my timeline, and perhaps to restrict the final goal of my work (perhaps to a syntaxic subset of Haskell ?). If someone is interested in mentoring me for this work, I would be glad to discuss those technical details with him. Damien D.

Damien Desfontaines
Thanks for your answer. I must admit that I do not really realize how much work such a project represents. I will probably need the help of someone who is more experienced than me to decide my timeline, and perhaps to restrict the final goal of my work (perhaps to a syntaxic subset of Haskell ?).
I'll be a bit blunt, in the interest of encouraging you to be realistic before going too far down a doomed path. I can't imagine anyone at all thinking that a translator from a toy subset of Haskell into a different language would be useful in any way whatsoever. The goal of GSoC is to find a well-defined project that's reasonable for a summer, and is USEFUL to a language community. Restricting the project to some syntactic subset of Haskell is what people are *afraid* will happen, and why you've gotten some not entirely enthusiastic answers. It just won't do us any good, especially when there's no visible community of people ready to pick up the slack and finish the project later. One possible way out of this trap would be if, perhaps, the variant of Haskell you picked were actually GHC's core language. That could actually have a lot of advantages, such as avoid parsing entirely, removing type classes, laziness (I think... GHC did make the swap to strict core, didn't it?), and many other advanced type system features entirely, and being at least a potentially useful result that works with arbitrary code and all commonly used Haskell language extensions on top of the entire language. At least you are back into plausible territory. It still seems far too ambitious for GSoC, though. And I remain unconvinced how useful it really is likely to be. I'll grant there are other people that care a lot more about ML than I do. -- Chris Smith

On Mon, Mar 19, 2012 at 17:06, Chris Smith
One possible way out of this trap would be if, perhaps, the variant of Haskell you picked were actually GHC's core language. That could
(...)
It still seems far too ambitious for GSoC, though. And I remain unconvinced how useful it really is likely to be. I'll grant there are other people that care a lot more about ML than I do.
Core to F# gets GHC on the CLR, which if nothing else makes the Windows support situation for GHC look quite a lot better. (I am still bemused by MSR not being interested in better Windows support for GHC....) -- brandon s allbery allbery.b@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms

On scoping the project: be clear about the actual goal. If you want to take existing Haskell libraries and use them in OCaml, then you pretty much have to deal with the full language. You should start by using as much as you can of an existing compiler, or by getting an unmodified compiler to convert source code to "Core" syntax. As just one example, a recent thread concerned implementing lock-free containers. I don't expect converting one of those to OCaml to be easy... If, however, you want to make it possible for someone to write code in a sublanguage of Haskell that is acceptable to a Haskell compiler and convert just *that* to OCaml, you might be able to produce something useful much quicker. It has long been my opinion that the TV series "Butt-Ugly Martians" was inspired by OCaml syntax. I keep being attracted by other features of OCaml, but unable to bring myself to use its syntax. (I have much the same problem with F#, and if SML.NET -- www.cl.cam.ac.uk/research/tsg/SMLNET/ -- had been updated in the last nearly 6 years I would not be looking at F# at all.) Never mind popularising Haskell to OCaml users; this could be a way of popularising OCaml to Haskell users. One of the reasons Harrop gives for preferring F# to OCaml is that F# supports true multicore concurrency, while OCaml apparently still doesn't.

On Mon, Mar 19, 2012 at 7:52 PM, Richard O'Keefe
As just one example, a recent thread concerned implementing lock-free containers. I don't expect converting one of those to OCaml to be easy...
If you translate to core first, then the only missing bit is the atomic compare-and-swap primop that these structures will depend on. Maybe that exists in OCaml, or maybe not... I wouldn't know. If not, it would be perfectly okay to refuse to translate the atomic compare-and-swap primop that lockless data structures will use. That said, though, there are literally *hundreds* of GHC primops for tiny little things like comparing different sized integers and so forth, that would need to be implemented.... all on top of the interesting task of doing language translation. That should be kept in mind when estimating the task.
If, however, you want to make it possible for someone to write code in a sublanguage of Haskell that is acceptable to a Haskell compiler and convert just *that* to OCaml, you might be able to produce something useful much quicker.
I'm quite sure, actually, that implementing a usable sublanguage of Haskell in this way would be a much larger project even than translating core. A usable sublanguage of Haskell would need a parser, which could be a summer project all on its own if done well with attention to errors and a sizeable test suite. It would need an implementation of lazy evaluation, which can be quite tricky to get right in a thread-safe and efficient way. It would need type checking and type inference that's just different enough from OCaml that you'd probably have to write a new HM+extensions type checker and inference engine on your own, and *that* could again be far more than a summer project on its own, if you plan to build something of production quality. It would need a whole host of little picky features that involve various kinds of desugarings that represent man-decades worth of work just on their own. After a bit of thought, I'm pretty confident that the only reasonable way to approach this project is to let an existing compiler tackle the task of converting from Haskell proper to a smaller language that's more reasonable to think about (despite the problems with lots of primops... at least those are fairly mechanical). Not because of all the advanced language features or libraries, but just because re-implementing the whole front end of a compiler for even a limited but useful subset of Haskell is a ludicrously ambitious and risky project for GSoC. -- Chris Smith

On Mon, Mar 19, 2012 at 9:12 PM, Chris Smith
On Mon, Mar 19, 2012 at 7:52 PM, Richard O'Keefe
wrote: As just one example, a recent thread concerned implementing lock-free containers. I don't expect converting one of those to OCaml to be easy...
If you translate to core first, then the only missing bit is the atomic compare-and-swap primop that these structures will depend on. Maybe that exists in OCaml, or maybe not... I wouldn't know. If not, it would be perfectly okay to refuse to translate the atomic compare-and-swap primop that lockless data structures will use. That said, though, there are literally *hundreds* of GHC primops for tiny little things like comparing different sized integers and so forth, that would need to be implemented.... all on top of the interesting task of doing language translation. That should be kept in mind when estimating the task.
If, however, you want to make it possible for someone to write code in a sublanguage of Haskell that is acceptable to a Haskell compiler and convert just *that* to OCaml, you might be able to produce something useful much quicker.
I'm quite sure, actually, that implementing a usable sublanguage of Haskell in this way would be a much larger project even than translating core. A usable sublanguage of Haskell would need a parser, which could be a summer project all on its own if done well with attention to errors and a sizeable test suite. It would need an implementation of lazy evaluation, which can be quite tricky to get right in a thread-safe and efficient way. It would need type checking and type inference that's just different enough from OCaml that you'd probably have to write a new HM+extensions type checker and inference engine on your own, and *that* could again be far more than a summer project on its own, if you plan to build something of production quality. It would need a whole host of little picky features that involve various kinds of desugarings that represent man-decades worth of work just on their own.
After a bit of thought, I'm pretty confident that the only reasonable way to approach this project is to let an existing compiler tackle the task of converting from Haskell proper to a smaller language that's more reasonable to think about (despite the problems with lots of primops... at least those are fairly mechanical). Not because of all the advanced language features or libraries, but just because re-implementing the whole front end of a compiler for even a limited but useful subset of Haskell is a ludicrously ambitious and risky project for GSoC.
One can get a prototype "up and running" with a relatively low amount of effort by translating either GHC's core or stg. While core isn't fully strict, it is a much easier input language than Haskell. Stg is even lower level and easier to translate to imperative machines. I've read two papers where translators were built on or in GHC using this approach in a period that I would assume to be similar to what GSoC provides. In the case of the JVM, performance was an issue and may not be with CLR. The JVM lacking tailcalls and having a GC tuned to the wrong use patterns made optimization hard. I guess the way closures/thunks are implemented on the JVM side can be problematic for GC too; making it easy to run out of permgen space. After reading some papers and talking to the relevant folks a bit, I think the hardest part of translating Haskell to the JVM is building the RTS support. I assume the same will be true, but with different details, in the case of .NET/CLR. Both of the projects I'm thinking of just worked on Haskell 98, but to be good for "real programs" you want to support lots of RTS features. Once you've solved the RTS problems well enough to get people's attention the hurdle becomes the semantics of the FFI. You'll want to be compatible with the other VM languages. To mirror the critiques of others on this thread, I too have concerns about the community impact of the proposed translator. I'd also like to see this proposal written against an existing FOSS project. For example, if the proposal was to dust off LambdaVM (http://wiki.brianweb.net/LambdaVM/LambdaVM), update it to ghc HEAD and make reasonable progress on the implementation that seems much more useful to me. With that said, actually finishing a Haskell -> .NET or Haskell -> JVM implementation to the point of usable seems to be a PhD worth of work, not a single summer of work even if F# or Scala is the target language. I could also imagine the proposal being tweaked to talk about some improvement in the internals of GHC that make targeting JVM/CLR easier, although I don't personally know enough about GHC internals to suggest anything. I hope that helps, Jason

Hi Damien A translator might be a lot of work. Matthew Naylor had a translator between Haskell and Clean [1], which performed well according to [2]. The translator was his Master project in the UK so I think that means it would represent approximately a years work. [1] http://www-users.cs.york.ac.uk/~mfn/hacle/hacle.pdf [2] http://www.st.cs.ru.nl/papers/2010/groj10-Haskell_front_end_Clean.pdf The paper [2] covers the Clean groups own interop between Haskell and Clean and points to more related work alongside Matthew Naylor's. Best wishes Stephen
participants (6)
-
Brandon Allbery
-
Chris Smith
-
Damien Desfontaines
-
Jason Dagit
-
Richard O'Keefe
-
Stephen Tetley