
Is there some reason Haskell binaries have to be statically linked? I can't seem to find a way to make them otherwise, at least with GHC.
The ELF dynamic linking format seems to be designed on the assumption of poor code reuse (e.g., C code) where calls from one module to another are rare and can therefore be expensive.
Haskell code has very high levels of reuse, and calls from one module to another (especially calls into Prelude, List, Monad, etc.) are very common, so dynamic linking imposes a very high overhead.
This isn't a reason to not support dynamic linking but it's a reason to make it a low priority. Incidentally, it avoids confusion between the performance overhead of lazy evaluation and the performance overhead of dynamic linking of modular code.
Yes, and there are various other reasons too. There's an entry on this in GHC's (rather well hidden) FAQ:

  http://www.haskell.org/ghc/docs/latest/html/users_guide/faq.html

and a search through the glasgow-haskell-users archives will turn up previous discussions. Also, we do (or did) have support for dynamic libraries on Windows.

Cheers,
Simon

Is there some reason Haskell binaries have to be statically linked?
It would not be entirely fair to lay all the blame for large Haskell binaries at the door of static vs. dynamic linking. After all, the Haskell version is dynamically linked against exactly the same shared libraries as the C version, at least on my machine:

  ldd Hello (Hello.hs)
    libm.so.6 => /lib/libm.so.6 (0x40022000)
    libc.so.6 => /lib/libc.so.6 (0x40044000)
    /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

Of course, it is static linking against the *Haskell* runtime system, Prelude and Libraries that is the cause of binary bloat. Quite simply, lots of extra stuff is dragged in that isn't visible in the apparently simple source program. For instance, I can find all the following symbols in the binary for "hello world" (compiled with nhc98):

  putStr, shows, showChar, showParen, showString, fromCString, toCString,
  hGetFileName, hPutChar, hPutStr, error, flip, id, init, length, not,
  putChar, putStrLn, seq, show, subtract, exitWith,
  instance Bounded Int (maxBound, minBound),
  instance Enum Ordering (succ, pred, toEnum, fromEnum, enumFrom, enumFromThen, enumFromTo, enumFromThenTo),
  instance Enum ErrNo (succ, pred, toEnum, fromEnum, enumFrom, enumFromThen, enumFromTo, enumFromThenTo),
  instance Monad IO (>>=, >>, return, fail),
  instance Eq ErrNo (==, /=),
  instance Eq Int (==, /=),
  instance Eq Ordering (==, /=),
  instance Num Int (+, -, *, negate, abs, signum, fromInteger),
  instance Ord Int (compare, <, <=, >=, >, max, min),
  instance Show ErrNo (show, showsPrec, showList),
  instance Show IOError (show, showsPrec, showList),
  instance Show Int (show, showsPrec, showList)

This is not the fault of any particular implementation (the ghc-built binary has a similar collection); rather, it is dictated by the nature of the language and its standard libraries.

Because Prelude functions are small and re-usable, they do get used all over the place in the implementation of other parts of the Prelude, so you end up with a huge dependency graph hiding underneath the simplest of calls. In fact, most of the extra stuff in "Hello World" is there purely to handle all possible error conditions in the I/O monad.

Several years ago, Colin Runciman and I did the experiment of removing all the nice error-handling stuff from the prelude (and eliminating a few classes too, I think), to see just how small we could squash "Hello World". The idea was to target embedded systems where memory is a scarce resource, and fancy error-reporting is pointless (a single red LED would do). IIRC, we managed to achieve a size of 25kb, compiled with nhc98, which, don't forget, includes a bytecode interpreter in the runtime system.

Regards,
Malcolm
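[Editorial illustration] The "huge dependency graph hiding underneath the simplest of calls" can be pictured with a toy reachability computation. This is only a sketch: the edges in `toyDeps` are invented for illustration and are not taken from nhc98 or GHC, and `reachable` is a hypothetical helper, not part of any compiler or linker. It simply shows how a one-line `main` can transitively pull in error-handling and `show` machinery, while anything unreachable (here `unusedFn`) could in principle be left out.

```haskell
module Main where

import Data.List (sort)
import Data.Maybe (fromMaybe)

-- A symbol dependency graph: each symbol maps to the symbols it calls.
type Graph = [(String, [String])]

-- HYPOTHETICAL edges, made up for this example; they do not describe
-- the real Prelude of nhc98 or GHC.
toyDeps :: Graph
toyDeps =
  [ ("main",        ["putStrLn"])
  , ("putStrLn",    ["putStr", "putChar"])
  , ("putStr",      ["hPutStr"])
  , ("putChar",     ["hPutChar"])
  , ("hPutStr",     ["hPutChar", "ioError"])
  , ("hPutChar",    ["ioError"])
  , ("ioError",     ["showIOError"])
  , ("showIOError", ["showString", "showInt"])
  , ("showInt",     ["showChar"])
  , ("showString",  [])
  , ("showChar",    [])
  , ("unusedFn",    ["showParen"])
  , ("showParen",   [])
  ]

-- All symbols reachable from the given roots (simple worklist traversal).
reachable :: Graph -> [String] -> [String]
reachable g = go []
  where
    go seen [] = sort seen
    go seen (x:xs)
      | x `elem` seen = go seen xs
      | otherwise     = go (x : seen) (fromMaybe [] (lookup x g) ++ xs)

main :: IO ()
main = do
  let live = reachable toyDeps ["main"]
  putStrLn ("live symbols: " ++ show (length live))
  mapM_ (putStrLn . ("  " ++)) live
```

In this toy graph, most of what `main` drags in sits on the error-reporting path, which mirrors the observation above about the I/O monad's error handling.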

Malcolm Wallace
For instance, I can find all the following symbols in the binary for "hello world" (compiled with nhc98):
putStr, shows, showChar, showParen, showString, fromCString, toCString, hGetFileName, hPutChar, hPutStr, error, flip, id, init, length, not, putChar, putStrLn, seq, show, subtract, exitWith, instance Bounded Int (maxBound, minBound), instance Enum Ordering (succ, pred, toEnum, fromEnum, enumFrom, enumFromThen, enumFromTo, enumFromThenTo), instance Enum ErrNo (succ, pred, toEnum, fromEnum, enumFrom, enumFromThen, enumFromTo, enumFromThenTo), instance Monad IO (>>=, >>, return, fail), instance Eq ErrNo (==, /=), instance Eq Int (==, /=), instance Eq Ordering (==, /=), instance Num Int (+, -, *, negate, abs, signum, fromInteger), instance Ord Int (compare, <, <=, >=, >, max, min), instance Show ErrNo (show, showsPrec, showList), instance Show IOError (show, showsPrec, showList), instance Show Int (show, showsPrec, showList)
[...]
In fact, most of the extra stuff in "Hello World" is there purely to handle all possible error conditions in the I/O monad.
How many of those class methods could actually be executed and how many are included just because the dictionary that contains them is included? And how many of those dictionaries are included just because they are reachable from the dictionary of a subclass?

Some compilers for C++ and Java are claimed to discard member functions which cannot possibly be executed even if the corresponding vtable is kept. The simplest (least effective) way of doing this is to discard all enumFromThenTo methods if none of the live code refers to enumFromThenTo. More effective is to discard the Int instance of enumFromThenTo if all enumFromThenTo references can be shown to involve other types.

--
Alastair Reid  alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited
http://www.reid-consulting-uk.ltd.uk/alastair/
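[Editorial illustration] The two pruning strategies described above can be sketched over a toy model. Everything here is an assumption made for illustration: `impls`, `uses`, `keepByName` and `keepByType` are hypothetical names, the data is invented, and no existing compiler is claimed to work this way. A real analysis would also have to deal with polymorphic call sites where the instance type is not known statically, which this sketch ignores.

```haskell
module Main where

-- One method implementation inside an instance dictionary:
-- (method name, instance type).
type Impl = (String, String)

-- HYPOTHETICAL instance methods linked into a binary.
impls :: [Impl]
impls =
  [ ("enumFromThenTo", "Int"), ("enumFromThenTo", "Ordering")
  , ("succ", "Int"),           ("succ", "Ordering")
  , ("pred", "Int"),           ("pred", "Ordering")
  , ("show", "Int")
  ]

-- HYPOTHETICAL references found in live code: the method used and the
-- type it is used at (assumed to be known for every call site).
uses :: [Impl]
uses = [("succ", "Int"), ("show", "Int"), ("enumFromThenTo", "Ordering")]

-- Simplest strategy: keep a method if its *name* is referenced anywhere,
-- regardless of the type it is used at.
keepByName :: [Impl] -> [Impl] -> [Impl]
keepByName us = filter (\(m, _) -> m `elem` map fst us)

-- More effective strategy: keep a method only if it is referenced
-- at that particular instance type.
keepByType :: [Impl] -> [Impl] -> [Impl]
keepByType us = filter (`elem` us)

main :: IO ()
main = do
  putStrLn ("whole dictionaries kept: " ++ show (length impls))
  putStrLn ("pruned by method name:   " ++ show (length (keepByName uses impls)))
  putStrLn ("pruned by name and type: " ++ show (length (keepByType uses impls)))
```

With this data, pruning by name only removes the two `pred` methods, whereas pruning by name and type also removes the Int instance of `enumFromThenTo`, since its only use is at `Ordering`.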
participants (3): Alastair Reid, Malcolm Wallace, Simon Marlow