
On Mon, Mar 19, 2012 at 9:12 PM, Chris Smith wrote:
On Mon, Mar 19, 2012 at 7:52 PM, Richard O'Keefe wrote:
As just one example, a recent thread concerned implementing lock-free containers. I don't expect converting one of those to OCaml to be easy...
If you translate to core first, then the only missing bit is the atomic compare-and-swap primop that these structures will depend on. Maybe that exists in OCaml, or maybe not... I wouldn't know. If not, it would be perfectly okay to refuse to translate that primop. That said, though, there are literally *hundreds* of GHC primops for tiny little things like comparing different-sized integers and so forth that would need to be implemented, all on top of the interesting task of doing language translation. That should be kept in mind when estimating the task.
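For a rough picture of what the compare-and-swap dependency looks like on the OCaml side, here is a minimal sketch of a Treiber-style lock-free stack. It assumes a modern OCaml (4.12 or later), whose standard library has an `Atomic` module; that module postdates this thread, and the names `create`, `push`, and `pop` are purely illustrative.

```ocaml
(* Sketch only: assumes OCaml >= 4.12 (Stdlib.Atomic). *)

(* The whole stack is an immutable list held in one atomic cell. *)
type 'a stack = 'a list Atomic.t

let create () : 'a stack = Atomic.make []

(* Retry until the cell still holds the list we read. *)
let rec push (s : 'a stack) (x : 'a) : unit =
  let old = Atomic.get s in
  (* compare_and_set compares against [old] by physical equality *)
  if not (Atomic.compare_and_set s old (x :: old)) then push s x

let rec pop (s : 'a stack) : 'a option =
  match Atomic.get s with
  | [] -> None
  | (x :: rest) as old ->
    if Atomic.compare_and_set s old rest then Some x else pop s
```

Everything here hangs off the single `Atomic.compare_and_set` call, which is the role the compare-and-swap primop would play in translated code.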
If, however, you want to make it possible for someone to write code in a sublanguage of Haskell that is acceptable to a Haskell compiler and convert just *that* to OCaml, you might be able to produce something useful much quicker.
I'm quite sure, actually, that implementing a usable sublanguage of Haskell in this way would be a much larger project even than translating core. A usable sublanguage of Haskell would need a parser, which could be a summer project all on its own if done well with attention to errors and a sizeable test suite. It would need an implementation of lazy evaluation, which can be quite tricky to get right in a thread-safe and efficient way. It would need type checking and type inference that's just different enough from OCaml's that you'd probably have to write a new HM+extensions type checker and inference engine on your own, and *that* could again be far more than a summer project on its own, if you plan to build something of production quality. It would need a whole host of little picky features that involve various kinds of desugarings that represent man-decades' worth of work just on their own.
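To illustrate the lazy-evaluation point, here is a small sketch of how Haskell-style suspended computations might be encoded with OCaml's built-in `Lazy` module. The `stream` type and the `iterate` and `take` functions are hypothetical names chosen to mirror their Haskell counterparts; this is one possible encoding, not what any existing translator does.

```ocaml
(* A lazy stream: the head is evaluated, the tail is a suspended thunk. *)
type 'a stream = Cons of 'a * 'a stream Lazy.t

(* Rough analogue of Haskell's  iterate f x : an infinite stream. *)
let rec iterate (f : 'a -> 'a) (x : 'a) : 'a stream =
  Cons (x, lazy (iterate f (f x)))

(* Rough analogue of Haskell's  take n xs , producing a strict list.
   Only the thunks that are actually needed get forced. *)
let rec take (n : int) (Cons (x, rest) : 'a stream) : 'a list =
  if n <= 0 then [] else x :: take (n - 1) (Lazy.force rest)

(* Usage: the first five powers of two, drawn from an infinite stream. *)
let powers : int list = take 5 (iterate (fun x -> x * 2) 1)
```

A forced thunk caches its result, so repeated `Lazy.force` calls do not recompute it. As far as I know, `Lazy` is not designed for several threads forcing the same thunk at once, which is one reason the thread-safe case is the tricky part.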
After a bit of thought, I'm pretty confident that the only reasonable way to approach this project is to let an existing compiler tackle the task of converting from Haskell proper to a smaller language that's more reasonable to think about (despite the problems with lots of primops... at least those are fairly mechanical). Not because of all the advanced language features or libraries, but just because re-implementing the whole front end of a compiler for even a limited but useful subset of Haskell is a ludicrously ambitious and risky project for GSoC.
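As a sketch of why a small intermediate language is easier to reason about, here is a toy call-by-need evaluator in OCaml. The `expr` type is a made-up miniature and is not GHC's actual Core data type; `Add` stands in for a single arithmetic primop, and all names are illustrative assumptions.

```ocaml
(* Toy Core-like language; NOT GHC's real Core representation. *)
type expr =
  | Lit of int
  | Var of string
  | Lam of string * expr
  | App of expr * expr
  | Let of string * expr * expr   (* non-recursive let, bound lazily *)
  | Add of expr * expr            (* stand-in for one arithmetic primop *)

type value =
  | Int of int
  | Closure of string * expr * env
and env = (string * value Lazy.t) list

let rec eval (env : env) (e : expr) : value =
  match e with
  | Lit n -> Int n
  | Var x -> Lazy.force (List.assoc x env)   (* force the thunk on use *)
  | Lam (x, body) -> Closure (x, body, env)
  | App (f, arg) ->
    (match eval env f with
     | Closure (x, body, cenv) ->
       (* the argument is suspended at the call site, not evaluated *)
       eval ((x, lazy (eval env arg)) :: cenv) body
     | Int _ -> failwith "applied a non-function")
  | Let (x, rhs, body) ->
    eval ((x, lazy (eval env rhs)) :: env) body
  | Add (a, b) ->
    (match eval env a, eval env b with
     | Int m, Int n -> Int (m + n)
     | _ -> failwith "Add expects integers")
```

With only a handful of constructors, each case is a few lines; the bulk of a real translator would be the long tail of primops and the runtime support, which this sketch leaves out entirely.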
One can get a prototype "up and running" with a relatively low amount of effort by translating either GHC's core or STG. While core isn't fully strict, it is a much easier input language than Haskell. STG is even lower level and easier to translate to imperative machines. I've read two papers where translators were built on or in GHC using this approach, in a period that I would assume to be similar to what GSoC provides.
In the case of the JVM, performance was an issue; it may not be with the CLR. The JVM's lack of tail calls and its GC being tuned to the wrong use patterns made optimization hard. I guess the way closures/thunks are implemented on the JVM side can be problematic for the GC too, making it easy to run out of permgen space.
After reading some papers and talking to the relevant folks a bit, I think the hardest part of translating Haskell to the JVM is building the RTS support. I assume the same will be true, but with different details, in the case of .NET/CLR. Both of the projects I'm thinking of just worked on Haskell 98, but to be good for "real programs" you want to support lots of RTS features. Once you've solved the RTS problems well enough to get people's attention, the hurdle becomes the semantics of the FFI. You'll want to be compatible with the other VM languages.
To mirror the critiques of others on this thread, I too have concerns about the community impact of the proposed translator. I'd also like to see this proposal written against an existing FOSS project. For example, if the proposal were to dust off LambdaVM (http://wiki.brianweb.net/LambdaVM/LambdaVM), update it to GHC HEAD, and make reasonable progress on the implementation, that would seem much more useful to me. With that said, actually finishing a Haskell -> .NET or Haskell -> JVM implementation to the point of being usable seems to be a PhD's worth of work, not a single summer of work, even if F# or Scala is the target language.
I could also imagine the proposal being tweaked to talk about some improvement in the internals of GHC that makes targeting the JVM/CLR easier, although I don't personally know enough about GHC internals to suggest anything.
I hope that helps,
Jason