
Gregory Wright wrote:
I have ghc-6.6 (darcs version from 20070405) running registerized on FreeBSD/amd64. The FreeBSD version is 6.2.
The problem with the compiler crash turned out to be simple. In the FreeBSD header file regex.h, regex_t is defined as
typedef struct {
        int re_magic;
        size_t re_nsub;         /* number of parenthesized subexpressions */
        __const char *re_endp;  /* end pointer for REG_PEND */
        struct re_guts *re_g;   /* none of your business :-) */
} regex_t;
The problem is that the "re_magic" field is defined as an int, so the offset of the size_t field that follows it depends on the platform. When building the .hc files on the i386 host, the re_nsub field is at an offset of 4. On the amd64 target, where size_t is 8 bytes wide and 8-byte aligned, it is at an offset of 8. In the ghc binding to the regex functions, re_nsub is used to compute how much memory to allocate in a call to allocaBytes. Reading it from the wrong offset leads to garbage being passed to newPinnedByteArray#.
The fix is to patch libraries/base/Text/Regex/Posix.hs on the amd64 target:
--- libraries/base/Text/Regex/Posix.hs.sav	Thu Apr  5 12:05:22 2007
+++ libraries/base/Text/Regex/Posix.hs	Thu Apr  5 12:05:45 2007
@@ -106,7 +106,7 @@
 regexec (Regex regex_fptr) str = do
   withCString str $ \cstr -> do
     withForeignPtr regex_fptr $ \regex_ptr -> do
-      nsub <- ((\hsc_ptr -> peekByteOff hsc_ptr 4)) regex_ptr
+      nsub <- ((\hsc_ptr -> peekByteOff hsc_ptr 8)) regex_ptr
 {-# LINE 109 "Posix.hsc" #-}
       let nsub_int = fromIntegral (nsub :: CSize)
       allocaBytes ((1 + nsub_int) * (16)) $ \p_match -> do
With this patch, we are pretty close.
Aha. Text/Regex/Posix.hs is generated from Text/Regex/Posix.hsc by hsc2hs, but when bootstrapping this is done on the *host* rather than the *target*, and thus generates the wrong results.
If you had run hsc2hs on the target, then Text/Regex/Posix.hs would have been correct, but you can't do this because hsc2hs is a Haskell program. You could take the .c file generated by hsc2hs on the host and compile/run it on the target, but that's a hassle, so instead our policy is that we don't rely on any hsc2hs-generated code for bootstrapping.
Unfortunately I broke the rules by accident when I introduced the dependency on regex. I can't think of an easy way to enforce the rule, at least at the moment, since there are other hsc2hs-processed modules that we happen not to depend on in GHC (System.Time and System.CPUTime).
This will be fixed as a side effect of http://hackage.haskell.org/trac/ghc/ticket/1160. Also, after the base reorg we might find we have no hsc2hs-generated code left in base, and we can then disable hsc2hs to prevent this happening again.
Cheers,
Simon