
On Tue, 27 Aug 2002 17:19:15 +0100
"Simon Marlow"
Right, now what would it take to implement this? As Duncan points out, this is almost possible already using the GHCi dynamic linker, which is available to any program compiled with GHC via the FFI. The interface is fairly straightforward, e.g.:
[snip] This is what Andre Pang has done, modulo any changes between ghc 5.03 & 5.04. http://www.algorithm.com.au/wiki-files/hacking/haskell/chiba-0.2.tar.gz Andre says: The actual runtime loader itself is in a runtime_loader/ directory in the tarball. The best example of how to use it is in the tests/ChibaTest* files.
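Since the interface example above was snipped, here is a rough sketch of what binding the RTS linker via the FFI looks like. The C entry points (initLinker, loadObj, resolveObjs, lookupSymbol) are the real names from the RTS's Linker.h, but treat the exact Haskell types and the loadAndFind wrapper as approximations, not the snipped code itself:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, withCString)
import Foreign.Ptr (Ptr, nullPtr)

-- Bindings to the RTS linker (declared in the RTS's Linker.h).
foreign import ccall unsafe "initLinker"   initLinker     :: IO ()
foreign import ccall unsafe "loadObj"      c_loadObj      :: CString -> IO Int
foreign import ccall unsafe "resolveObjs"  c_resolveObjs  :: IO Int
foreign import ccall unsafe "lookupSymbol" c_lookupSymbol :: CString -> IO (Ptr ())

-- Load an object file and look up one of its (mangled) symbols.
loadAndFind :: FilePath -> String -> IO (Maybe (Ptr ()))
loadAndFind obj sym = do
  initLinker
  ok <- withCString obj c_loadObj      -- loadObj returns 0 on failure
  if ok == 0
    then return Nothing
    else do
      _ <- c_resolveObjs               -- resolve references in loaded objects
      p <- withCString sym c_lookupSymbol
      return (if p == nullPtr then Nothing else Just p)

main :: IO ()
main = do
  initLinker
  -- With no extra objects loaded, a made-up symbol name should not resolve:
  p <- withCString "definitely_not_a_symbol" c_lookupSymbol
  putStrLn (if p == nullPtr then "lookup failed as expected" else "found?!")
```

The pointer returned by lookupSymbol would then be cast to a FunPtr and invoked via a "dynamic" foreign import, which is essentially what GHCi does internally.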
but the main problem is that the dynamic linker can't link new modules to symbols in the currently running binary. So, in order to link a new Haskell module, you first have to load up a fresh copy of the 'base' and 'haskell98' packages, just like GHCi does. It *almost* works to do this, except that you get strange effects, one of which is that you have two copies of stdout, each with its own buffer.
This is exactly what Andre does in Chiba: he has to load extra copies of certain interface modules, but this is OK since the Haskell modules are stateless.
Going the final step and allowing linkage to the current binary is possible, it just means the linker has to know how to read the symbol table out of the binary, and you have to avoid running 'strip'. I believe reading the symbol table is quite straightforward, the main problem being that on Unix you don't actually know where the binary lives, so you have to wire it in or search the PATH.
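On the "where does the binary live" problem: a sketch of one approach, assuming a Linux-style /proc filesystem, with a fall-back that searches $PATH for the program name as described above (all function names here are hypothetical):

```haskell
import Control.Exception (IOException, try)
import Control.Monad (filterM)
import System.Directory (doesFileExist)
import System.Environment (getEnv, getProgName)
import System.Posix.Files (readSymbolicLink)

-- Split a $PATH-style string on ':'.
splitSearchPath' :: String -> [FilePath]
splitSearchPath' s = case break (== ':') s of
  (d, [])       -> [d]
  (d, _ : rest) -> d : splitSearchPath' rest

-- Find the running binary: try /proc/self/exe (Linux-specific), then
-- fall back to searching $PATH for the program name.
findSelf :: IO (Maybe FilePath)
findSelf = do
  r <- try (readSymbolicLink "/proc/self/exe")
         :: IO (Either IOException FilePath)
  case r of
    Right p -> return (Just p)
    Left _  -> do
      prog <- getProgName
      path <- getEnv "PATH"
      hits <- filterM doesFileExist
                [ d ++ "/" ++ prog | d <- splitSearchPath' path, not (null d) ]
      return (case hits of h : _ -> Just h; [] -> Nothing)

main :: IO ()
main = findSelf >>= print
```

This still isn't bulletproof (argv[0] can lie, and /proc is not portable), which is presumably why wiring the path in is mentioned as an option.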
Would it be easier/better to explicitly specify a symbol map file for the linker to use to locate the appropriate points in the current binary? Then perhaps we need a flag to ask ghc to spit out a symbol map along with the .o. Alternatively, there may be tools to extract the map from a .o - I don't know, I'm not a binutils guru!
Another problem is that you don't normally link a *complete* copy of the base package into your binary, you only link the bits you need. Linking the whole lot would mean every binary would be about 10M; but doing this on the basis of a flag which you turn on when you want to do dynamic linking maybe isn't so bad.
The only bit that I would want to include completely is the API module, which would likely be quite small as it would only re-export other parts of the program through a smaller, simpler interface. Ah, I see what you're saying now: we'd have to include the whole of the standard library, or indeed any library that we wanted the plugins to be able to use. The system's dynamic linker doesn't have this problem because it always has all of the libraries available and just loads them on demand. With static linking we have to predict what will be wanted beforehand. Aaarg! Perhaps linking all of the standard library wouldn't be so bad (behind a special flag, of course), since only the bits that are used get loaded into memory, leaving just the large disk overhead.
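To make the "small API module" idea concrete, here is a sketch of what such an interface might look like - all of the names here are hypothetical, and in a real program this would live in its own module (e.g. `module PluginAPI (Plugin(..), logMsg) where ...`) with logMsg re-exporting an existing host function rather than being a stub:

```haskell
-- What a plugin must provide to the host.
data Plugin = Plugin
  { pluginName :: String   -- identifier reported by the plugin
  , pluginRun  :: IO ()    -- entry point the host invokes
  }

-- A host service re-exported through the narrow API (a stub here).
logMsg :: String -> IO ()
logMsg msg = putStrLn ("[host] " ++ msg)

main :: IO ()
main = do
  -- A plugin value as a dynamically loaded module might construct it:
  let p = Plugin "demo" (logMsg "hello from plugin")
  pluginRun p
```

The point is that plugins compile against only this module, so only it (plus whatever it transitively drags in) needs to be fully linked into the binary.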
There are a couple of other options:
- make your program into a collection of dynamically-linked libraries itself. i.e. have a little stub main() which links with the RTS, and loads up 'base' followed by your program when it starts. The startup cost would be high (we don't do lazy linking in Haskell), but you'd only get one copy of the base package and this is possible right now.
I don't understand this - would you mind explaining a bit more?
Summary: extending GHC's dynamic linker to be able to slurp in the symbol table from the currently running binary would be useful, and is a good bite-sized GHC hacker task. I can't guarantee that we'll get around to it in a timely fashion, but contributions are, as always, entirely welcome...
Having made the suggestion, it's only right that I contribute my (limited) skills. I have done some gdb hacking before (not out of choice, you understand!) so I ought to know a bit about .o files, ELF sections and such.

Duncan