
On Tue, 27 Aug 2002 17:19:15 +0100
"Simon Marlow"
Right, now what would it take to implement this? As Duncan points out, this is almost possible already using the GHCi dynamic linker, which is available to any program compiled with GHC via the FFI. The interface is fairly straightforward, e.g.:
[snip] This is what Andre Pang has done, modulo any changes between ghc 5.03 & 5.04. http://www.algorithm.com.au/wiki-files/hacking/haskell/chiba-0.2.tar.gz Andre says: The actual runtime loader itself is in a runtime_loader/ directory in the tarball. The best example of how to use it is in the tests/ChibaTest* files.
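Since the interface example above was snipped, here is a rough sketch of what binding the RTS linker via the FFI looks like. The C entry points (initLinker, loadObj, resolveObjs, lookupSymbol) are the real names from the RTS's Linker.h, but treat the exact Haskell types and the loadAndFind wrapper as approximations, not the snipped code itself:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, withCString)
import Foreign.Ptr (Ptr, nullPtr)

-- Bindings to the RTS linker (declared in the RTS's Linker.h).
foreign import ccall unsafe "initLinker"   initLinker     :: IO ()
foreign import ccall unsafe "loadObj"      c_loadObj      :: CString -> IO Int
foreign import ccall unsafe "resolveObjs"  c_resolveObjs  :: IO Int
foreign import ccall unsafe "lookupSymbol" c_lookupSymbol :: CString -> IO (Ptr ())

-- Load an object file and look up one of its (mangled) symbols.
loadAndFind :: FilePath -> String -> IO (Maybe (Ptr ()))
loadAndFind obj sym = do
  initLinker
  ok <- withCString obj c_loadObj      -- loadObj returns 0 on failure
  if ok == 0
    then return Nothing
    else do
      _ <- c_resolveObjs               -- resolve references in loaded objects
      p <- withCString sym c_lookupSymbol
      return (if p == nullPtr then Nothing else Just p)

main :: IO ()
main = do
  initLinker
  -- With no extra objects loaded, a made-up symbol name should not resolve:
  p <- withCString "definitely_not_a_symbol" c_lookupSymbol
  putStrLn (if p == nullPtr then "lookup failed as expected" else "found?!")
```

The pointer returned by lookupSymbol would then be cast to a FunPtr and invoked via a "dynamic" foreign import, which is essentially what GHCi does internally.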
but the main problem is that the dynamic linker can't link new modules to symbols in the currently running binary. So, in order to link a new Haskell module, you first have to load up a fresh copy of the 'base' and 'haskell98' packages, just like GHCi does. It *almost* works to do this, except that you get strange effects, one of which is that you have two copies of stdout, each with its own buffer.
This is exactly what Andre does in Chiba: he has to load extra copies of certain interface modules, but this is OK since the Haskell modules are stateless.
Going the final step and allowing linkage to the current binary is possible, it just means the linker has to know how to read the symbol table out of the binary, and you have to avoid running 'strip'. I believe reading the symbol table is quite straightforward, the main problem being that on Unix you don't actually know where the binary lives, so you have to wire it in or search the PATH.
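On the "where does the binary live" problem: a sketch of one approach, assuming a Linux-style /proc filesystem, with a fall-back that searches $PATH for the program name as described above (all function names here are hypothetical):

```haskell
import Control.Exception (IOException, try)
import Control.Monad (filterM)
import System.Directory (doesFileExist)
import System.Environment (getEnv, getProgName)
import System.Posix.Files (readSymbolicLink)

-- Split a $PATH-style string on ':'.
splitSearchPath' :: String -> [FilePath]
splitSearchPath' s = case break (== ':') s of
  (d, [])       -> [d]
  (d, _ : rest) -> d : splitSearchPath' rest

-- Find the running binary: try /proc/self/exe (Linux-specific), then
-- fall back to searching $PATH for the program name.
findSelf :: IO (Maybe FilePath)
findSelf = do
  r <- try (readSymbolicLink "/proc/self/exe")
         :: IO (Either IOException FilePath)
  case r of
    Right p -> return (Just p)
    Left _  -> do
      prog <- getProgName
      path <- getEnv "PATH"
      hits <- filterM doesFileExist
                [ d ++ "/" ++ prog | d <- splitSearchPath' path, not (null d) ]
      return (case hits of h : _ -> Just h; [] -> Nothing)

main :: IO ()
main = findSelf >>= print
```

This still isn't bulletproof (argv[0] can lie, and /proc is not portable), which is presumably why wiring the path in is mentioned as an option.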
Would it be easier/better to explicitly specify a symbol map file for the linker to use to locate the appropriate points in the current binary? Then perhaps we need a flag to ask ghc to spit out a symbol map along with the .o. Alternatively, there may be tools to extract the map from a .o - I don't know, I'm not a binutils guru!
Another problem is that you don't normally link a *complete* copy of the base package into your binary, you only link the bits you need. Linking the whole lot would mean every binary would be about 10M; but doing this on the basis of a flag which you turn on when you want to do dynamic linking maybe isn't so bad.
The only bit that I would want to include completely is the API module, which would likely be quite small as it would only re-export other parts of the program through a smaller, simpler interface. Ah, I see what you're saying now: we'd have to include the whole of the standard library, or indeed any library that we wanted the plugins to be able to use. The system's dynamic linker doesn't have this problem because it always has all of the libraries available and just loads them on demand. With static linking we have to predict what will be wanted beforehand. Aaarg! Perhaps linking all of the standard library wouldn't be so bad (behind a special flag, of course), since only the bits that are used get loaded into memory, leaving just the large disk overhead.
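To make the "small API module" idea concrete, here is a sketch of what such an interface might look like - all of the names here are hypothetical, and in a real program this would live in its own module (e.g. `module PluginAPI (Plugin(..), logMsg) where ...`) with logMsg re-exporting an existing host function rather than being a stub:

```haskell
-- What a plugin must provide to the host.
data Plugin = Plugin
  { pluginName :: String   -- identifier reported by the plugin
  , pluginRun  :: IO ()    -- entry point the host invokes
  }

-- A host service re-exported through the narrow API (a stub here).
logMsg :: String -> IO ()
logMsg msg = putStrLn ("[host] " ++ msg)

main :: IO ()
main = do
  -- A plugin value as a dynamically loaded module might construct it:
  let p = Plugin "demo" (logMsg "hello from plugin")
  pluginRun p
```

The point is that plugins compile against only this module, so only it (plus whatever it transitively drags in) needs to be fully linked into the binary.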
There are a couple of other options:
- make your program into a collection of dynamically-linked libraries itself. i.e. have a little stub main() which links with the RTS, and loads up 'base' followed by your program when it starts. The startup cost would be high (we don't do lazy linking in Haskell), but you'd only get one copy of the base package and this is possible right now.
I don't understand this - would you mind explaining a bit more?
Summary: extending GHC's dynamic linker to be able to slurp in the symbol table from the currently running binary would be useful, and is a good bite-sized GHC hacker task. I can't guarantee that we'll get around to it in a timely fashion, but contributions are, as always, entirely welcome...
Having made the suggestion, it's only right that I contribute my (limited) skills. I have done some gdb hacking before (not out of choice, you understand!) so I ought to know a bit about .o files, ELF sections and such.

Duncan