
On Tue, Aug 27, 2002 at 05:19:15 +0100, Simon Marlow wrote:
'haskell98' packages, just like GHCi does. It *almost* works to do this, except that you get strange effects, one of which is that you have two copies of stdout each with their own buffer.
If it's not too much trouble, do you mind explaining why this is so? It's just to satisfy my curiosity; don't worry if it's too long-winded or contains really heavy wizardry :).
Going the final step and allowing linkage to the current binary is possible, it just means the linker has to know how to read the symbol table out of the binary, and you have to avoid running 'strip'. I believe reading the symbol table is quite straightforward, the main problem being that on Unix you don't actually know where the binary lives, so you have to wire it in or search the PATH.
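The "search the PATH" fallback mentioned above could look roughly like the sketch below. It is illustrative only: `candidates` and `findOnPath` are made-up names, not anything in GHC's RTS or linker, and the sketch assumes the program was started by bare name (so that `getProgName` is what the shell looked up) and that a plain file-existence check is good enough. Newer versions of base also provide `System.Environment.getExecutablePath`, which avoids the guessing on platforms that support it.

```haskell
import Data.Maybe (fromMaybe)
import System.Directory (doesFileExist)
import System.Environment (getProgName, lookupEnv)
import System.FilePath ((</>), splitSearchPath)

-- | Hypothetical helper: every place a program called 'prog' could live,
-- given the raw value of the PATH variable.
candidates :: String -> String -> [FilePath]
candidates pathVar prog = [dir </> prog | dir <- splitSearchPath pathVar]

-- | Hypothetical helper: the first candidate that exists as a file, if any.
-- This does not check the executable bit, so it can be fooled.
findOnPath :: String -> IO (Maybe FilePath)
findOnPath prog = do
  pathVar <- fromMaybe "" <$> lookupEnv "PATH"
  firstExisting (candidates pathVar prog)
  where
    firstExisting [] = return Nothing
    firstExisting (p : ps) = do
      exists <- doesFileExist p
      if exists then return (Just p) else firstExisting ps

main :: IO ()
main = do
  prog <- getProgName
  found <- findOnPath prog
  putStrLn (maybe ("could not locate " ++ prog ++ " on PATH")
                  ("binary found at " ++)
                  found)
```

If the program was started as ./prog or via an absolute path, this search can come up empty or find a different binary of the same name, which is presumably why wiring the location in is the other option.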
You've already got the symbol table in memory though, right? Is it absolutely necessary to re-read the binary? BTW, I tried using objcopy (part of binutils) to 'merge' together several plugin modules by copying over all the symbols in a bunch of files to a single .o file. Loading that up using the GHCI linker didn't work :(. If there's no reason why it shouldn't work, I'll try again ... it's entirely possible that I stuffed up somewhere.
Another problem is that you don't normally link a *complete* copy of the base package into your binary, you only link the bits you need. Linking the whole lot would mean every binary would be about 10M; but doing this on the basis of a flag which you turn on when you want to do dynamic linking maybe isn't so bad.
How about a feature (maybe a tool separate to GHC) which can find the dependencies required for a particular symbol, and removes all the excess baggage? e.g. You have a program called, uhh, "Program", and a plugin called, uhh, "Plugin", with Program containing the symbols 1, 2, 3, and Plugin containing symbols A, B, C. Symbol "1" in Program uses the "head" function from the standard library, so you need to compile that into Program, and symbol "B" in Plugin uses the "tail" function, so you need to compile that in:

    Program: 1 head 2 3
    Plugin:  A B tail C

That should work, no? Maybe it's even possible to do this right now using a combination of evil GHC hacks and binutils?

However, then you have the problem that the RTS doesn't _know_ that it has to load the "tail" symbol when it loads the plugin. Program will just load symbols A, B, C, and then die a sad death when it realises it can't resolve the symbols (since the tail symbol required for B is missing). I guess you could work around this by using some "stub" function (like "dependentSymbols") which the linker first loads.

In Plugin.hs:

    dependentSymbols = ["tail"]

In Program.hs:

    loadModule "plugin"
    -- Load the symbols which A, B, C require
    loadFunction "dependentSymbols"
    resolveFunctions
    mapM_ (loadFunction) dependentSymbols
    -- Load A, B, C themselves
    mapM_ (loadFunction) ["A", "B", "C"]

Hopefully I'm not describing non-issues here ...
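To make the Program/Plugin example above concrete, here is a small sketch of the bookkeeping such a tool would have to do: follow each symbol's references transitively, then load whatever the host doesn't already have before loading the plugin's own symbols. Everything in it (`DepGraph`, `closure`, `pluginDependencies`, `loadOrder`, `exampleGraph`) is hypothetical and works on a hand-written table; a real tool would have to extract that table from the object files, and none of this is the GHCi linker's API.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Symbol = String

-- | Hypothetical dependency table: each symbol maps to the symbols it
-- references directly. Symbols with no entry reference nothing.
type DepGraph = Map.Map Symbol [Symbol]

-- | All symbols reachable from the given roots, roots included.
closure :: DepGraph -> [Symbol] -> Set.Set Symbol
closure graph = go Set.empty
  where
    go seen [] = seen
    go seen (s : rest)
      | s `Set.member` seen = go seen rest
      | otherwise =
          go (Set.insert s seen) (Map.findWithDefault [] s graph ++ rest)

-- | What a plugin needs beyond its own exports and beyond what the host
-- program already contains (the role played by "dependentSymbols").
pluginDependencies :: DepGraph -> Set.Set Symbol -> [Symbol] -> [Symbol]
pluginDependencies graph alreadyLoaded exports =
  Set.toAscList
    ((closure graph exports `Set.difference` Set.fromList exports)
       `Set.difference` alreadyLoaded)

-- | Dependencies first, then the plugin's own symbols.
loadOrder :: DepGraph -> Set.Set Symbol -> [Symbol] -> [Symbol]
loadOrder graph alreadyLoaded exports =
  pluginDependencies graph alreadyLoaded exports ++ exports

-- | The example from the text: "1" uses head, "B" uses tail.
exampleGraph :: DepGraph
exampleGraph = Map.fromList [("1", ["head"]), ("B", ["tail"])]

main :: IO ()
main = do
  let inProgram = closure exampleGraph ["1", "2", "3"]
  print (Set.toAscList inProgram)                         -- ["1","2","3","head"]
  print (loadOrder exampleGraph inProgram ["A", "B", "C"]) -- ["tail","A","B","C"]
```

Passing the set of symbols already in the host means a dependency that both sides use (say, if B had needed head as well) wouldn't be loaded a second time, which is the duplicate-copy problem described earlier in the thread.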
- make your program into a collection of dynamically-linked libraries itself. i.e. have a little stub main() which links with the RTS, and loads up 'base' followed by your program when it starts. The startup cost would be high (we don't do lazy linking in Haskell), but you'd only get one copy of the base package and this is possible right now.
I was thinking of doing this when I started my own project. However, I don't think it's really acceptable, because:

1. You still need the base Haskell libraries on the system, which means that you either ship them with your application, or the user needs GHC installed on their system. (I'm a big fan of the "it should just work" principle when a user downloads and installs applications.) If the user has GHC installed on their system, it probably also needs to be the same version of GHC, otherwise you will probably run into Bad Problems.

2. As you say, startup cost (time) is high. This is fine for some applications, but my next project will be invoked as a CGI, where the ~2 second overhead involved at startup really kills performance (to the point where it won't scale to handle lots of users).

Of course, the big advantage is that you can do this right now.
- make GHC generate objects that play nicely with the standard dynamic linker on your system. This is entirely non-trivial, I believe. See previous discussions on this list. However, it might get easier in the future; I'm currently working on removing the need to distinguish code from data in GHC's RTS, which will eliminate some of the problems.
Just a comment: it's, well, interesting how GHC has this
fantastic method of importing modules at runtime, which is
similar (at least in what it achieves) to the dynamic linker.
I dunno, it feels like reinvent-wheel syndrome. Not saying
that's a bad or good thing, just an observation.
--
#ozone/algorithm