GSoC idea: Haskell JVM bytecode library

Hi,

I've got an idea for a Summer of Code project and I'd really appreciate some feedback on it. If people generally find it interesting, I'll go into more detail.

GSoC: Haskell JVM bytecode library
==================================

What
----

I'm thinking of writing a library for analyzing/generating/manipulating JVM bytecode. To be clear, this library would allow one to load and work with JVM classfiles; it wouldn't be a compiler, interpreter or a GHC backend.

Motivation
----------

Over the past 20 or so years, the JVM has become a very popular platform. It's used in industry, on the web, and on mobile devices. The main programming language used is Java, but recently quite a few new ones have turned up (including some functional ones): Scala, Clojure, Jython, Rhino, etc. All of these languages compile to a single, universal, and very well documented format, namely classfiles.

The problem with classfiles is that they're black boxes. Once written, they can't be modified and they can only be read by the JVM. It's quite a shame, really, since we could do all sorts of interesting things with them.

Possible uses
-------------

If we had a well-documented and reliable way of working with classfiles, we could:

* perform static analysis to identify bottlenecks, common errors, etc.
* do further optimization passes over the binary; maybe we could optimize for size more aggressively (UPX does this for win32 exes, why shouldn't it be done for classfiles?)
* generate interfaces for the libraries directly from the binaries
* visualize the structure of complex programs
* modify existing programs to route around errors
* generate JVM code directly from Haskell programs
* etc.

This project aims to provide exactly this -- an easy way of working with JVM bytecode.

Previous work
-------------

Quite a few similar projects have existed in the past; most of them are mentioned on this page:

http://www.haskell.org/haskellwiki/GHC:FAQ#Why_isn.27t_GHC_available_for_.NE...

None of them have tried exactly this (some were more ambitious, some never bothered with providing a library), and most of them are undocumented and unmaintained.

References
----------

JVM Spec: http://java.sun.com/docs/books/jvms/second_edition/html/VMSpecTOC.doc.html

Cheers,
Alex
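To make "load and work with JVM classfiles" a little more concrete, here is a minimal sketch of the very first step such a library would need: checking the 0xCAFEBABE magic number and reading the big-endian minor and major version fields that open every classfile. The names `ClassHeader`, `be16` and `parseHeader` are hypothetical and chosen only for illustration; they are not taken from the proposal or from any existing library, and the sketch works on a plain list of bytes to stay within `base`.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word16)

-- | The fixed-size prefix of a classfile (hypothetical type, for illustration).
data ClassHeader = ClassHeader
  { minorVersion :: Word16
  , majorVersion :: Word16
  } deriving (Eq, Show)

-- | Combine two bytes into one big-endian 16-bit value.
be16 :: Word8 -> Word8 -> Word16
be16 hi lo = (fromIntegral hi `shiftL` 8) .|. fromIntegral lo

-- | Check the magic number and read the version fields.
-- On success, return the header together with the unparsed remainder
-- (which would start with the constant pool count).
parseHeader :: [Word8] -> Either String (ClassHeader, [Word8])
parseHeader (0xCA : 0xFE : 0xBA : 0xBE : mi1 : mi2 : ma1 : ma2 : rest) =
  Right (ClassHeader (be16 mi1 mi2) (be16 ma1 ma2), rest)
parseHeader bytes
  | length (take 8 bytes) < 8 = Left "truncated classfile header"
  | otherwise                 = Left "bad magic number (expected 0xCAFEBABE)"
```

In a real library the bytes would more likely come from something like `Data.ByteString.unpack <$> Data.ByteString.readFile path`, and the rest of the file (constant pool, fields, methods, attributes) would be parsed from the remainder in the same style.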

This is certainly something I could use.
John
--
John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/

On Fri, Mar 26, 2010 at 08:01:57PM +0000, Alexandru Scvortov wrote:
I'm thinking of writing a library for analyzing/generating/manipulating JVM bytecode. To be clear, this library would allow one to load and work with JVM classfiles; it wouldn't be a compiler, interpreter or a GHC backend.
I wrote a JVM classfile library as part of LambdaVM (which is actually a JVM backend for GHC, it is a bit bit-rotted though, I need to get back into it):
http://darcs.brianweb.net/hsjava/
It requires my hsutils library (also available on darcs.brianweb.net). I haven't touched the code in a while and it might need a tweak or two with recent GHC versions. Drop me a line if you have any trouble getting it working (and patches are certainly welcome).
There isn't much documentation, besides the code and a few example programs that dump a pretty printed version of the data structure. I'd be more than happy to give anybody who is interested in using it some pointers though.
-Brian

We've used this library to generate a prototype JVM backend for UHC about a year ago, and it Just Worked. That was probably on 6.10 or 6.8.
-chris
On 26 Mar 2010, at 21:33, Brian Alliet wrote:
On Fri, Mar 26, 2010 at 08:01:57PM +0000, Alexandru Scvortov wrote:
I'm thinking of writing a library for analyzing/generating/manipulating JVM bytecode. To be clear, this library would allow one to load and work with JVM classfiles; it wouldn't be a compiler, interpreter or a GHC backend.
I wrote a JVM classfile library as part of LambdaVM (which is actually a JVM backend for GHC, it is a bit bit-rotted though, I need to get back into it):
http://darcs.brianweb.net/hsjava/
It requires my hsutils library (also available on darcs.brianweb.net). I haven't touched the code in a while and it might need a tweak or two with recent GHC versions. Drop me a line if you have any trouble getting it working (and patches are certainly welcome).
There isn't much documentation, besides the code and a few example programs that dump a pretty printed version of the data structure. I'd be more than happy to give anybody who is interested in using it some pointers though.
-Brian
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

How stable is it?
Was it easy to use?
Did it have enough documentation?
Do you think it could use a rewrite? If so, what should be done differently?
Could it be extended into something more?
(sorry for the barrage of questions, but you're the one person I've seen so far, apart from the original programmer, that has experience with this)
Alex
On Friday 26 March 2010 21:27:26 you wrote:
We've used this library to generate a prototype JVM backend for UHC about a year ago, and it Just Worked. That was probably on 6.10 or 6.8.
-chris

On 26 Mar 2010, at 22:37, Alexandru Scvortov wrote:
How stable is it?
I don't know. I remember that we didn't have to change anything and that everything just worked.
Was it easy to use?
Actually yes, because:
Did it have enough documentation?
I think we used the Java documentation. The nice thing about the hsjava library is that it provides datatypes that closely match assembly.
Do you think it could use a rewrite? If so, what should be done differently?
Could it be extended into something more?
I guess it could be double-checked and released on hackage. Maybe add tail-call support to the library?
(sorry for the barrage of questions, but you're the one person I've seen so far, apart from the original programmer, that has experience with this)
Only very little experience. It was a project of a couple of weeks.
-chris
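The remark above that datatypes closely matching the assembly make such a library easy to use can be illustrated with a small sketch. This is not hsjava's actual API: the `Instr` type and the `encode` and `assemble` functions are hypothetical names invented here, and only a handful of integer instructions are covered. The opcode values are the ones given in the JVM specification.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- | A tiny, hypothetical subset of JVM instructions, one constructor per mnemonic.
data Instr
  = IConst0        -- iconst_0: push the int constant 0
  | BIPush Int8    -- bipush:   push a signed byte as an int
  | ILoad Word8    -- iload:    push an int from a local variable slot
  | IStore Word8   -- istore:   pop an int into a local variable slot
  | IAdd           -- iadd:     add the two ints on top of the stack
  | IMul           -- imul:     multiply the two ints on top of the stack
  | IReturn        -- ireturn:  return the int on top of the stack
  deriving (Eq, Show)

-- | Encode one instruction as its opcode followed by its operand bytes.
encode :: Instr -> [Word8]
encode IConst0    = [0x03]
encode (BIPush n) = [0x10, fromIntegral n]
encode (ILoad i)  = [0x15, i]
encode (IStore i) = [0x36, i]
encode IAdd       = [0x60]
encode IMul       = [0x68]
encode IReturn    = [0xAC]

-- | Encode a straight-line instruction sequence as a method body's raw bytes.
assemble :: [Instr] -> [Word8]
assemble = concatMap encode
```

For example, `assemble [ILoad 0, ILoad 1, IAdd, IReturn]` would produce the body of a static method that adds its two int arguments. Because each constructor corresponds to one mnemonic, the Java documentation can serve as the reference for the Haskell side as well, which fits the experience described above.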

Alexandru Scvortov wrote:
I'm thinking of writing a library for analyzing/generating/manipulating JVM bytecode. To be clear, this library would allow one to load and work with JVM classfiles; it wouldn't be a compiler, interpreter or a GHC backend.
You might be interested in http://semantic.org/jvm-bridge/. It's a bit bit-rotted, though.
-- Ashley
participants (5)
- Alexandru Scvortov
- Ashley Yakeley
- Brian Alliet
- Chris Eidhof
- John Meacham