
Hi Ian, Simon,

I have ghc-6.6 (darcs version from 20070405) running registerized on
FreeBSD/amd64. The FreeBSD version is 6.2.

The problem with the compiler crash turned out to be simple. In the
FreeBSD header file regex.h, regex_t is defined as

    typedef struct {
            int re_magic;
            size_t re_nsub;         /* number of parenthesized subexpressions */
            __const char *re_endp;  /* end pointer for REG_PEND */
            struct re_guts *re_g;   /* none of your business :-) */
    } regex_t;

The problem is that the "re_magic" field is defined as an int. When
building the .hc files on the i386 host, the re_nsub field is at an
offset of 4. On the amd64 target, it is at an offset of 8. In the ghc
binding to the regex functions, re_nsub is used to compute how much
memory to allocate in a call to allocaBytes. This leads to garbage
being passed to newPinnedByteArray#.

The fix is to patch libraries/base/Text/Regex/Posix.hs on the amd64
target:

    --- libraries/base/Text/Regex/Posix.hs.sav  Thu Apr  5 12:05:22 2007
    +++ libraries/base/Text/Regex/Posix.hs      Thu Apr  5 12:05:45 2007
    @@ -106,7 +106,7 @@
     regexec (Regex regex_fptr) str = do
       withCString str $ \cstr -> do
         withForeignPtr regex_fptr $ \regex_ptr -> do
    -      nsub <- ((\hsc_ptr -> peekByteOff hsc_ptr 4)) regex_ptr
    +      nsub <- ((\hsc_ptr -> peekByteOff hsc_ptr 8)) regex_ptr
     {-# LINE 109 "Posix.hsc" #-}
           let nsub_int = fromIntegral (nsub :: CSize)
           allocaBytes ((1 + nsub_int) * (16)) $ \p_match -> do

With this patch, we are pretty close. However, there still seems to be
something wrong with the splitter. I can make a working registerized
compiler if I set splitObjs=NO in build.mk, but it seems as if whatever
is wrong with ghc-split shouldn't be too hard to fix.

The splitting problem shows up as a linking failure. Some variables
defined in the text section are changed from global symbols to local
symbols by the splitter. An example (just one of several hundred
symbols that are changed from global to local):

From building ghc-6.6-20070405 on i386:
nm --defined-only libHSbase.a | grep "D "
<snip>
00000000 D base_TextziReadziLex_zdLr3bklvl122_closure

and from building ghc-6.6-20070405 on amd64:
nm --defined-only libHSbase.a | grep "d "
<snip>
0000000000000000 d base_TextziReadziLex_zdLr3bklvl122_closure

The "D" on i386 indicates a global symbol, the "d" on amd64 a local
symbol.

I've glanced at ghc-split.lprl, but on what files is it invoked? Can I
run it from the command line on a file and check what comes out? The
file itself doesn't say what it expects as input, and the section of
the Commentary on the splitter is more than terse.

The linker is still broken (so no ghci):

    greenhouse-george> ghci
       ___         ___ _
      / _ \ /\  /\/ __(_)
     / /_\// /_/ / /  | |      GHC Interactive, version 6.6.20770405, for Haskell 98.
    / /_\\/ __  / /___| |      http://www.haskell.org/ghc/
    \____/\/ /_/\____/|_|      Type :? for help.

    ghc-6.6.20770405: internal error: R_X86_64_PC32 relocation out of range: __isthreaded = 0xfffffff800122aad
        (GHC version 6.6.20070405 for x86_64_unknown_freebsd)
        Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
    Abort trap: 6 (core dumped)

but I think I understand this. On FreeBSD, mmap does not have the
MAP_32BIT option that Linux has to guarantee a mapping in the first
2 GB of the address space. But by supplying a hint address in the lower
part of the address space we can get the effect of the MAP_32BIT
option. I thought I had this fixed in the patch I applied to Linker.c,
but I have obviously overlooked something.

I'm continuing to work on the linker, and expect that it will be
working soon. I'd appreciate some guidance on the splitter question, as
I am entirely unfamiliar with it.

Best Wishes,
Greg