External representation of GHC bytecode

Friends,

I'd like to ask a possibly quite ignorant question. Can GHC produce an external representation of its bytecode, and load that bytecode back up? If it's only ever an internal representation, is there any good reason for that? It seems to me that an external representation would be useful for at least a few reasons:

1. It could solve the Template Haskell cross-compiler dilemma. Currently, the problem is that a cross compiler produces object code for the target machine, which cannot be run on the host machine. But if this were bytecode, the problem would be trivial to solve.
2. You could transmit Haskell programs over the network to run on arbitrary machines.
3. As it stands, GHC sometimes has long-standing bugs where it produces incorrect machine code on some architectures for various (fairly boring) reasons. Interpreting bytecode might be less efficient, but it would at least ensure that you could, in principle, run GHC on a variety of different platforms until those bugs are finally squashed.

Sean

On Mon, Dec 8, 2014 at 11:41 PM, Sean Seefried wrote:
I'd like to ask a possible quite ignorant question. Can GHC produce an external representation of GHC byte code, and load that byte code back up? If it's just an internal representation is there any good reason for this?
As I understand it, the only reason is that nobody has written the necessary code. I think there may even be an open bug mouldering on the GHC Trac, looking for someone to adopt it.

--
brandon s allbery kf8nh
sine nomine associates
allbery.b@gmail.com
ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad
http://sinenomine.net

On Tue, Dec 9, 2014 at 5:41 AM, Sean Seefried wrote:
1. This could solve the Template Haskell on cross compiler's dilemma. Currently, the problem is that a cross compiler will produce object code for the target machine which cannot be run on the host machine. But if this were bytecode the problem would be trivial to solve.
It still wouldn't be trivial, since all kinds of platform assumptions are baked into the code: the word size, the sizes and alignment of foreign data types, endianness, and all the error constants in the base library. If you compile a program with the wrong constants, you get very weird error messages and problems (I've had this problem). Bytecode also doesn't support certain constructs, like unboxed tuples, so it would need to be extended if we wanted to run a whole program from bytecode.

To do it properly would involve configuring packages for two different platforms, which would mean either supporting multiple versions/targets in a single GHC and package database, or a massive overhaul of how platform specifics are handled in libraries. Either way, it would be a lot more work than compiling Template Haskell code for the target and shipping it there to run [1]. Eventually I'd still like to see multiple targets properly supported, for example for heterogeneous Cloud Haskell environments, but given the changes required in Cabal and GHC, this looks like a strictly longer-term option.

luite

[1] https://www.haskell.org/pipermail/ghc-devs/2014-December/007555.html
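To make the host-baked constants concrete, here is a small, hypothetical Haskell sketch (not GHC internals) of the kind of platform facts that get fixed at compile time. A Template Haskell splice evaluated on the host observes the host's values for all of these, even when the resulting object code is destined for a different target:

```haskell
import Foreign.Storable (alignment, sizeOf)
import Foreign.C.Types (CLong)
import Data.Bits (finiteBitSize)

main :: IO ()
main = do
  -- Each of these is a property of the machine running this code,
  -- not of whatever machine the final program is compiled for.
  print (finiteBitSize (0 :: Int))        -- host word size in bits
  print (sizeOf (undefined :: CLong))     -- e.g. 8 on LP64, 4 on 32-bit platforms
  print (alignment (undefined :: CLong))  -- host alignment of a foreign type
```

If a splice bakes any of these numbers into generated code while running on a 64-bit host, a 32-bit target silently gets the wrong constants.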

Thanks Luite,

You've obviously looked into this a lot more deeply than I have. In a perfect world, though, wouldn't it be good for the bytecode to be platform independent? We know this to be possible in principle, because the JVM has been around for a long time now. If the bytecode could be made platform independent -- and I know this would be a *lot* of work -- then we'd have a pretty good solution for Template Haskell with cross compilers, wouldn't we?

Sean

On Tue, Dec 9, 2014 at 5:27 AM, Sean Seefried wrote:
If the bytecode could be made to be platform independent -- and I know this would be a *lot* of work -- then we'd have a pretty good solution for Template Haskell with cross compilers, wouldn't we?
I think you're missing the point a bit: such a setup would work for runghc, but TH needs to be aware of both the host (which, in this case, would be a platform-independent bytecode VM) and the target (since it is generating AST splices for a specific target). The latter is much harder than the former.

--
brandon s allbery kf8nh
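A tiny, hypothetical sketch of the host/target split described above: the splice below runs on the host at compile time, yet the literal it produces would end up in the target's code. `runQ` is used here only so the generated expression can be inspected from `main`; in real use the splice would appear in another module:

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH
import Data.Bits (finiteBitSize)

-- Runs on the *host* at compile time; the literal it yields
-- would be baked into code compiled for the *target*.
hostWordSize :: Q Exp
hostWordSize = litE (integerL (fromIntegral (finiteBitSize (0 :: Int))))

main :: IO ()
main = do
  e <- runQ hostWordSize  -- IO has a Quasi instance, so we can run the generator here
  print e                 -- e.g. LitE (IntegerL 64) on a 64-bit host
```

Cross compiling to a 32-bit target, this splice would still say 64 unless the generator were somehow evaluated with the target's view of the world.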

If what you say is true, then I think I have some holes in my understanding of TH. Aren't AST splices, by definition, just generated Haskell source code? Isn't the whole idea that TH produces source code as if a programmer had written it in the first place? How is this platform dependent?

My understanding of Template Haskell was that something like the following happens:

1. Code-generating functions in a module (call it module M) are compiled to an object file.
2. A module, call it P, that wishes to generate code imports module M and runs that code at compile time.

But as far as I can tell, this only generates *source code*; P then needs to be compiled to object code. The problem in the cross-compiler situation is that the dynamic linker can't load the code in M.o, since that has been compiled for the target machine, not the host.

Apart from that, though, I don't see what is platform specific about the code *generated* in module P by the functions imported from module M. It's just source code, isn't it? Not object code.

Sean
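For what it's worth, the two-module setup described above can be sketched in one file (hypothetical names; a real module P would have to import M because of the stage restriction). `runQ` stands in for the compiler running the generator while building P, and `pprint` shows that what comes out really is source-level Haskell:

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- "Module M": a generator that builds the AST of  \x -> x + n
addN :: Integer -> Q Exp
addN n = [| \x -> x + $(litE (integerL n)) |]

main :: IO ()
main = do
  e <- runQ (addN 3)   -- what the compiler would do while building "module P"
  putStrLn (pprint e)  -- prints the generated expression as source-like Haskell
```

The output is indeed source code, which is the crux of the question: the subtlety is not in the splice's *output* but in the fact that the generator itself (M.o) is compiled object code that must run on the host.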
Participants (3):
- Brandon Allbery
- Luite Stegeman
- Sean Seefried