
I have also been looking into LLVM, and I wondered whether this had been looked at before.

Michael T. Richter wrote:
I've been eyeing LLVM [1] (http://llvm.org) as interesting technology [snip] and couldn't help but immediately think of the possibility of one of the Haskell compiler projects providing an LLVM code generator. I think this would help in several areas:
* it could make porting the compiler to other architectures -- including oddball ones that would be too small to otherwise support -- easier;
C is far more ubiquitous than LLVM. If GCC adopts LLVM as its back-end then that could change, but for now C is still the most common "portable assembler".
* it could leverage some of the really interesting work that's going on in optimisation technology by letting one VM's optimiser do the work for any number of languages;
I am especially interested in the global optimizations for a static LLVM program. I would be curious to see if there is any improvement in the performance of the final executables. However, with GHC emitting the generated C into a single file, I think that GCC is already able to perform most of the same kinds of optimizations.
* it could improve interaction between source code written in multiple languages.
LLVM is really low level, so you still have most of the same data-representation problems that you have when using C as the common language. However, there is one advantage over C: since LLVM lets you represent tail calls directly, you can support languages like Scheme and Haskell that depend on recursion.
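To make the tail-call point concrete, here is a small Haskell sketch (the function names are made up for illustration). Every recursive call below is in tail position, so a back-end that can express tail calls directly can run it in constant stack space. A naive translation to C function calls would instead add a stack frame per call unless the C compiler happened to optimise it away.

```haskell
-- Mutually recursive functions where every call is a tail call.
-- Assumes a non-negative argument; a negative one would never terminate.
isEven :: Int -> Bool
isEven 0 = True
isEven n = isOdd (n - 1)

isOdd :: Int -> Bool
isOdd 0 = False
isOdd n = isEven (n - 1)
```

For example, `isEven 1000000` bounces between the two functions a million times before it returns, which is exactly the pattern that needs tail calls to be cheap.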
Is this me opening up a Pandora's Box of ignorance here? Or is LLVM potentially interesting? (And were someone motivated into perhaps trying to make an LLVM back-end, where would one start to poke around in, say, the GHC codebase to even begin to implement this? And how insane would they be driven by the process?)
After looking into it, I thought that generating the LLVM bytecode or text representation would be much easier than trying to use the FFI to link against the LLVM libraries. So I thought a good starting project would be to write a module that lets you manipulate the LLVM representation in Haskell, including reading and writing it. From there you would be in a good position to try to make GHC emit LLVM bytecode instead of C or C--. I haven't looked at the GHC internals, but C-- and LLVM are very similar, so I expect that writing the LLVM back-end would be a fairly mechanical transformation of the C-- back-end. Another nice side benefit of an LLVM library in Haskell is that you could use Haskell to write stand-alone optimization passes for the LLVM compiler.

-- Alan Falloon
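As a rough illustration of the "module that lets you manipulate the LLVM representation" idea, here is a minimal sketch covering only the writing half. All the type and function names are hypothetical, it models just a few instructions, and the output follows my understanding of LLVM's textual assembly syntax, which may differ between LLVM versions. A reader (parser) for the same types is left out.

```haskell
import Data.List (intercalate)

-- Hypothetical, deliberately tiny model of LLVM's textual form.
data Type = I32 | I64
  deriving (Eq, Show)

data Value = Local String | ConstInt Integer
  deriving (Eq, Show)

data Instr
  = Add String Type Value Value                   -- %dst = add ty a, b
  | Call Bool String Type String [(Type, Value)]  -- tail?, dst, return type, callee, args
  | Ret Type Value                                -- ret ty v
  deriving (Eq, Show)

data Function = Function
  { fnName   :: String
  , fnRet    :: Type
  , fnParams :: [(Type, String)]
  , fnBody   :: [Instr]   -- a single basic block, labelled "entry"
  } deriving (Eq, Show)

renderType :: Type -> String
renderType I32 = "i32"
renderType I64 = "i64"

renderValue :: Value -> String
renderValue (Local n)    = '%' : n
renderValue (ConstInt i) = show i

renderInstr :: Instr -> String
renderInstr (Add dst ty a b) =
  "%" ++ dst ++ " = add " ++ renderType ty ++ " "
    ++ renderValue a ++ ", " ++ renderValue b
renderInstr (Call isTail dst ty callee args) =
  "%" ++ dst ++ " = " ++ (if isTail then "tail " else "") ++ "call "
    ++ renderType ty ++ " @" ++ callee
    ++ "(" ++ intercalate ", " (map renderArg args) ++ ")"
  where
    renderArg (t, v) = renderType t ++ " " ++ renderValue v
renderInstr (Ret ty v) = "ret " ++ renderType ty ++ " " ++ renderValue v

-- Render a whole function definition, one instruction per line.
renderFunction :: Function -> String
renderFunction f = unlines $
  [ "define " ++ renderType (fnRet f) ++ " @" ++ fnName f
      ++ "(" ++ params ++ ") {"
  , "entry:"
  ]
  ++ map (("  " ++) . renderInstr) (fnBody f)
  ++ ["}"]
  where
    params = intercalate ", " [renderType t ++ " %" ++ n | (t, n) <- fnParams f]

-- Example value: a function that adds one to its argument.
addOne :: Function
addOne = Function "add_one" I32 [(I32, "x")]
  [ Add "r" I32 (Local "x") (ConstInt 1)
  , Ret I32 (Local "r")
  ]
```

Calling `putStr (renderFunction addOne)` prints a `define i32 @add_one(i32 %x)` definition with an `entry:` block containing the `add` and `ret` lines. Because the program is an ordinary Haskell value, an optimization pass in this style would just be a function of type `Function -> Function`.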