
#5987: Too many symbols in ghc package DLL
---------------------------------+----------------------------------------
        Reporter:  igloo         |                Owner:  Phyx-
            Type:  bug           |               Status:  new
        Priority:  normal        |            Milestone:
       Component:  Compiler      |              Version:  7.5
      Resolution:                |             Keywords:
Operating System:  Windows       |         Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown  |            Test Case:
      Blocked By:                |             Blocking:  5355
 Related Tickets:                |  Differential Rev(s):
       Wiki Page:                |
---------------------------------+----------------------------------------
Changes (by Phyx-):

 * owner:   => Phyx-

Comment:

I have recently taken a look at this and have an almost working version that should solve the problem once and for all.

First, we don't actually have enough symbols to go over the limit, or at least it seems we don't. If I measure the number of symbols in the input object files going into the link and the number coming out, the difference is huge.

Looking at it further, this is because of two things. We never explicitly use `__declspec`, we just change the names of the functions to match the conventions that `__declspec` would use. This is fine, but it means that binutils' default of `--export-all-symbols` is still enabled, which means we'll re-export any symbol we link from archives as well.

Given that `-dynamic` on Windows always produces an import library `.dll.a` and the search order for `ld` is

{{{
libxxx.dll.a
xxx.dll.a
libxxx.a
cygxxx.dll (*)
libxxx.dll
xxx.dll
}}}

we always end up picking the import lib. This is recursive: we link against `gmp`, `base` etc. By the time it gets to `GHC` the resulting import lib is huge and hence we blow past the symbol limit. Also `kernel` and `gdi32` and `mingwex` etc. are all import libraries for GCC, so we accumulate a ton of symbols from there as well.

So the first thing my changes do is only export symbols defined in the input object files.
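As a rough illustration of "only export symbols defined in the input object files", here is a minimal Python sketch. It is not the actual build change: the function names `defined_symbols` and `def_file` are hypothetical, and it assumes the default BSD-style output of `nm -g` (`<address> <type> <name>` for defined symbols, `U <name>` for undefined ones). The idea is to collect the defined symbols and list them explicitly in a `.def` file, since binutils only falls back to exporting everything when no explicit exports are given.

```python
from __future__ import annotations


def defined_symbols(nm_output: str) -> list[str]:
    """Return the external symbols that are *defined* in the nm listing.

    Assumes BSD-style `nm -g` output. Defined symbols have three fields
    (address, type, name); undefined ones have no address and type `U`.
    """
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # blank lines, "file.o:" headers, undefined symbols
        _address, sym_type, name = fields
        if sym_type.isupper() and sym_type != "U":
            names.append(name)
    return names


def def_file(dll_name: str, symbols: list[str]) -> str:
    """Render a module-definition (.def) file that exports only `symbols`."""
    lines = [f'LIBRARY "{dll_name}"', "EXPORTS"]
    lines += [f'    "{s}"' for s in sorted(set(symbols))]
    return "\n".join(lines) + "\n"
```

Feeding the text of `nm -g` for each input object through `defined_symbols` and writing the result of `def_file` next to the objects gives the linker an explicit export list, so symbols pulled in from `gmp`, `mingwex` and friends are not re-exported.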
Exporting only those symbols not only drastically reduces the size of the resulting DLLs and import libraries, it also pushes the number of symbols way, way below the limit. In fact I got rid of `dll-split` altogether and allow all symbols to go into the same dll, and we end up with

{{{
$ nm -g "R:\ghc\libHSghc-8.1-ghc8.1.20160617.dll" | wc -l
49610
}}}

This is down from ~240,000 (mingwex and mingw32 are huge, for instance).

The second thing my build changes do, in order to prevent this from happening again, is implement an automatic partitioning scheme which requires no special treatment of the split dlls. In case we hit the limit again, the build script will automatically detect this and do the following:

It will split the symbols up per input object file, so that all symbols of the same object file are in the same DLL. Like @rassilon suggested before, I'm using `import libraries` to break the dependencies, so the specific grouping doesn't matter. The import libraries point to the location of the dll which contains the symbol:

{{{
LIBRARY "libHSCabal-1.25.0.0-ghc8.1.20160617-pt2.dll"
EXPORTS
    "__stginit_Cabalzm1zi25zi0zi0_DistributionziCompatziBinary"
    "__stginit_Cabalzm1zi25zi0zi0_DistributionziCompatziCopyFile"
    ...
}}}

And these are used to break the dependencies. We then end up with smaller dlls with the suffix `-pt<num>.dll` and their import libraries.

The next step is to produce one large, merged import library with the name of the dll we were originally supposed to create, `libHSCabal-1.25.0.0-ghc8.1.20160617.dll.a`, which is just a merging of the different `-pt` import files.

This has the effect that when `-lHSCabal-1.25.0.0-ghc8.1.20160617` is used as the link argument (which we do), the import lib is found and the linker puts in a reference to the right dlls. No extra or special handling is needed by any other tool. Using the import libraries essentially removes the limit, since each symbol is an object file in the archive.
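The partitioning step can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the real build script: `partition_objects` and `symbol_owners` are hypothetical names, the grouping is a plain greedy pass in input order, the limit is passed in by the caller, and numbering the parts from 1 is a guess (the ticket only shows a `-pt2` example).

```python
from __future__ import annotations


def partition_objects(symbols_by_object: dict[str, list[str]],
                      limit: int) -> list[list[str]]:
    """Greedily group object files so each group has at most `limit` symbols.

    All symbols of one object file always land in the same group.
    """
    groups: list[list[str]] = []
    current: list[str] = []
    count = 0
    for obj, syms in symbols_by_object.items():
        if len(syms) > limit:
            raise ValueError(f"{obj} alone exceeds the symbol limit")
        if current and count + len(syms) > limit:
            groups.append(current)  # close the full group, start a new one
            current, count = [], 0
        current.append(obj)
        count += len(syms)
    if current:
        groups.append(current)
    return groups


def symbol_owners(base: str, groups: list[list[str]],
                  symbols_by_object: dict[str, list[str]]) -> dict[str, str]:
    """Map each symbol to the `-pt<num>.dll` that defines it.

    This mapping is the information a merged import library has to carry.
    """
    owners = {}
    for index, group in enumerate(groups, start=1):  # start=1 is an assumption
        dll = f"{base}-pt{index}.dll"
        for obj in group:
            for sym in symbols_by_object[obj]:
                owners[sym] = dll
    return owners
```

Because the import library records which DLL owns each symbol, the grouping itself can be arbitrary, which is why a simple greedy pass is enough for this sketch.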
(Note that while I recently added support for import libraries to GHCi, this support only extends to single dll import libraries. It needs some minor modifications to support this too but LD should work fine.)

This works fine, and I can successfully compile a dynamic version of GHC and the program runs (but segfaults due to a piece of bit-rotted code I'm looking at).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/5987#comment:54
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler