RE: stg_ap_v_ret porting crash: solved?

but
------------------------------------------------------------------------
===fptools== Recursively making `boot' in base haskell98 network haskell-src unix ...
PWD = /scratch/users/eden/ghc-6.0.1/libraries
------------------------------------------------------------------------
------------------------------------------------------------------------
==fptools== gmake boot -r;
 in /scratch/users/eden/ghc-6.0.1/libraries/base
------------------------------------------------------------------------
../../ghc/utils/ghc-pkg/ghc-pkg-inplace --update-package

Have you done 'make boot' in ghc/driver?

Cheers,
Simon

On Fri, 12 Sep 2003, Simon Marlow wrote:
---------- ../../ghc/utils/ghc-pkg/ghc-pkg-inplace --update-package
Have you done 'make boot' in ghc/driver?
mips-sgi-irix65:
----------------

1.- OK. "distrib/hc-build" might not handle the dependencies soundly. Now I build on the target T as on the host H:

cd hslibs/ && gmake boot && gmake
cd ghc && gmake boot && gmake

The "touch config.h" patch seems to solve the stg_ap_v_ret bug...

2.- Now the problem seems to be another one:

bash-2.05$ ghc/compiler/ghc-inplace hello.hs

crash, and tracing the core it seems a problem having to do with gmp software:

bash-2.05a$ gdb ghc-6.0.1 core
# 0x1153a208 in __decodeFloat ()

if I apply the "-v" flag, then

bash-2.05$ ghc/compiler/ghc-inplace -v hello.hs
bash-2.05a$ gdb ghc-6.0.1 core
# 0x11554ae4 in __gmpn_tdiv_qr / ( qp=0x454ed518 ...)

Any idea ?

3.- Simon M.: For the time being, this is the patch you can safely integrate into the CVS repository. I have read-only permissions. There is no doubt about it.

rmartine:
On Fri, 12 Sep 2003, Simon Marlow wrote:
---------- ../../ghc/utils/ghc-pkg/ghc-pkg-inplace --update-package
Have you done 'make boot' in ghc/driver?
Yes, I had this error too. I've solved it by adding:
[] into driver/package.conf.inplace and touch driver/stamp-pkg-conf-rts
It is not so nice that ghc-inplace dumps core if package.conf.inplace is an empty file... I think Ian has mentioned this too.
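The workaround above can be sketched as a few shell commands. This is only an illustration under stated assumptions: DRIVER_DIR is a made-up variable naming the driver/ directory of the build tree (it falls back to a scratch directory so the sketch runs anywhere), and the remark about what the stamp file does is a guess from its name, not something stated in the thread.

```shell
# Assumption: DRIVER_DIR points at the driver/ directory of the GHC build
# tree. When unset, a scratch directory is used so the sketch is harmless.
DRIVER_DIR="${DRIVER_DIR:-$(mktemp -d)/driver}"
mkdir -p "$DRIVER_DIR"

conf="$DRIVER_DIR/package.conf.inplace"

# Seed the package list with an empty list, but only when the file is
# missing or empty, so an existing package list is left alone.
if [ ! -s "$conf" ]; then
    echo "[]" > "$conf"
fi

# Touch the stamp file named in the message (presumably so that make
# treats the rts package-config step as already done).
touch "$DRIVER_DIR/stamp-pkg-conf-rts"
```

Run from the directory that contains driver/ with DRIVER_DIR=driver to apply it to a real tree.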
2.- Now the problem seems to be another one:
bash-2.05$ ghc/compiler/ghc-inplace hello.hs
crash, and tracing the core it seems a problem having to do with gmp software:
bash-2.05a$ gdb ghc-6.0.1 core
# 0x1153a208 in __decodeFloat ()
if I apply the "-v" flag, then
bash-2.05$ ghc/compiler/ghc-inplace -v hello.hs
bash-2.05a$ gdb ghc-6.0.1 core
# 0x11554ae4 in __gmpn_tdiv_qr / ( qp=0x454ed518 ...)
Any idea ?
This is where I am now. It looks like a bug in the gmp software. For me this happens when I try to rebuild the rts, near the end of distrib/hc-build (and I guess it would happen if you tried to just run hello.hs without rebuilding the libraries).

This GDB was configured as "mips-sgi-irix6.5"...
Core was generated by `ghc-6.0.1'.
Program terminated with signal 10, Bus error.
Reading symbols from /import/pill0/1/dons/lib/libgmp.so.4...done.
Loaded symbols for /import/pill0/1/dons/lib/libgmp.so.4
Reading symbols from /usr/lib32/libm.so...done.
Loaded symbols for /usr/lib32/libm.so
Reading symbols from /usr/lib32/libdl.so...done.
Loaded symbols for /usr/lib32/libdl.so
Reading symbols from /usr/lib32/libc.so.1...done.
Loaded symbols for /usr/lib32/libc.so.1
#0 0x00430768 in __gmpn_tdiv_qr (qp=0x41c0480, rp=0x41c0494, qxn=68975248, np=0x41c7a90, nn=1, dp=0x41c7aec, dn=1) at tdiv_qr.c:65
#1 0x0041ffc0 in __gmpz_tdiv_qr (quot=0x7fff2d70, rem=0x7fff2d80, num=0x7fff2d50, den=0x41c0480) at tdiv_qr.c:100
(gdb) list
100 }
101 TMP_FREE (marker);
102 return;
103 }
104
105 default:
106 {
107 int adjust;
108 TMP_DECL (marker);
109 TMP_MARK (marker);

I'm looking into this now. Note that I am using an external gmp: 4.1.2, along with gcc 3.3 and gnu binutils 2.14 (though using Irix tools doesn't seem to matter anymore).

Cheers,
Don

dons:
rmartine:
2.- Now the problem seems to be another one:
bash-2.05$ ghc/compiler/ghc-inplace hello.hs
crash, and tracing the core it seems a problem having to do with gmp software:
bash-2.05a$ gdb ghc-6.0.1 core
# 0x1153a208 in __decodeFloat ()
if I apply the "-v" flag, then
bash-2.05$ ghc/compiler/ghc-inplace -v hello.hs
bash-2.05a$ gdb ghc-6.0.1 core
# 0x11554ae4 in __gmpn_tdiv_qr / ( qp=0x454ed518 ...)
Any idea ?
Well, I had an idea :)

When you build an external gmp, you might just notice whizzing by some statements about gmp linking to mips64 directories. At least, I saw that. So I went and rebuilt libgmp with --build=mips-sgi-irix, so that it would explicitly use the mips32 assembly. A patch to GHC to make the internal gmp do this will appear soon.

GMP ./configure assumes you want 64bit mips, even if you are building a 32bit mips GHC, using 32 bit tools. Actually building mips64 GHC is going to have to wait till 32 bit works (unless I get stuck).

Anyway, this seems to solve the __gmpn_tdiv_qr problems... And we continue. For rmartine, I'm now at:

../../ghc/compiler/ghc-inplace -optc-O -optc-L/import/pill0/1/dons/lib -optc-Wall -optc-W -optc-Wstrict-prototypes -optc-Wmissing-prototypes -optc-Wmissing-declarations -optc-Winline -optc-Waggregate-return -optc-Wbad-function-cast -optc-I../includes -optc-I. -optc-Iparallel -optc-DCOMPILING_RTS -optc-fomit-frame-pointer -H16m -O -L/import/pill0/1/dons/lib -O2 -static -c GC.c -o GC.o
gmake: *** [GC.o] Bus error (core dumped)
gmake: Leaving directory `/import/pill0/1/dons/ghc/ghc-6.0.1/ghc/rts'

$ gdb -c core ../compiler/stage1/ghc-6.0.1
GNU gdb 5.3
This GDB was configured as "mips-sgi-irix6.5"...
Core was generated by `ghc-6.0.1'.
Program terminated with signal 10, Bus error.
Reading symbols from /usr/lib32/libm.so...done.
Loaded symbols for /usr/lib32/libm.so
Reading symbols from /usr/lib32/libdl.so...done.
Loaded symbols for /usr/lib32/libdl.so
Reading symbols from /usr/lib32/libc.so.1...done.
Loaded symbols for /usr/lib32/libc.so.1
#0 0x11543374 in __decodeDouble ()

__decodeDouble is back in GHC code, so that at least moves us 1 step closer.

Cheers,
Don
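The gmp rebuild described in this message might look roughly like the sketch below. Only the --build=mips-sgi-irix flag comes from the message; GMP_SRC, PREFIX, DRY_RUN and the helper names run and rebuild_gmp32 are placeholders invented for illustration. By default the sketch only prints the commands, so it is safe to run on a machine without a gmp source tree.

```shell
# Assumptions (not from the thread): GMP_SRC is an unpacked gmp source
# tree and PREFIX is where the rebuilt library should be installed.
GMP_SRC="${GMP_SRC:-gmp-4.1.2}"
PREFIX="${PREFIX:-/usr/local}"

# With DRY_RUN=1 (the default here) each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Configure gmp with an explicit 32 bit mips build triplet, as in the
# message, then build and install it with GNU make.
rebuild_gmp32() {
    run cd "$GMP_SRC" &&
    run ./configure --build=mips-sgi-irix --prefix="$PREFIX" &&
    run gmake &&
    run gmake install
}

plan="$(rebuild_gmp32)"
echo "$plan"
```

Setting DRY_RUN=0 and calling rebuild_gmp32 directly would execute the steps instead of listing them.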

dons:
rmartine:
On Fri, 12 Sep 2003, Simon Marlow wrote:
---------- ../../ghc/utils/ghc-pkg/ghc-pkg-inplace --update-package
Have you done 'make boot' in ghc/driver?
Yes, I had this error too. I've solved it by adding:
[] into driver/package.conf.inplace and touch driver/stamp-pkg-conf-rts
It is not so nice that ghc-inplace dumps core if package.conf.inplace is an empty file... I think Ian has mentioned this too.
As a final note on this bug, the core dumps disappear on mips-sgi-irix if I start from the beginning with 64bit code, i.e. mips64-sgi-irix. By setting -mabi=64 in CFLAGS, longs become 8 bytes, and bugs disappear. No need for my hack of tricking gmp into using 32 bit mips asm. This solves the package.conf bug, the __decodeFloat bug and the gmp div bug that mips people have encountered. So we'll see how far we get using 64 bits right from the start on this funny Irix thing. -- Don
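A minimal sketch of the 64 bit setup, for anyone trying to reproduce it: the CFLAGS export is what this message describes, and the two mk/build.mk lines are the ones given in the longer write-up later in this thread. BUILD_MK and the add_line helper are names made up for this sketch; BUILD_MK falls back to a scratch file so nothing real is modified unless it is set.

```shell
# Assumption: BUILD_MK is the path to mk/build.mk in the GHC tree.
# A scratch file is used when it is unset.
BUILD_MK="${BUILD_MK:-$(mktemp -d)/build.mk}"
touch "$BUILD_MK"

# Append a line only if an identical line is not already present,
# so rerunning the sketch does not duplicate settings.
add_line() {
    grep -qxF -- "$1" "$BUILD_MK" || printf '%s\n' "$1" >> "$BUILD_MK"
}

# Tell the gcc used by ./configure to use the 64 bit ABI.
export CFLAGS="-mabi=64"

# Pass the same ABI flag to the C compiler and to the gcc that ghc drives.
add_line 'SRC_CC_OPTS+=-mabi=64'
add_line 'SRC_HC_OPTS+=-optc-mabi=64'

cat "$BUILD_MK"
```

The export has to be in effect in the shell that later runs ./configure.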

On Sat, 11 Oct 2003, Donald Bruce Stewart wrote:
As a final note on this bug, the core dumps disappear on mips-sgi-irix if I start from the beginning with 64bit code, i.e. mips64-sgi-irix.
By setting -mabi=64 in CFLAGS, longs become 8 bytes, and bugs disappear. No need for my hack of tricking gmp into using 32 bit mips asm. This solves the package.conf bug, the __decodeFloat bug and the gmp div bug that mips people have encountered.
After editing the configure script by hand to accept the "mips64-sgi-irix" platform, I got the following failure:

gcc -x c GHC/IOBase.hc -o GHC/IOBase.o -c -O -DNO_REGS -DUSE_MINIINTERPRETER -D__GLASGOW_HASKELL__=600 -O -DNO_REGS -DUSE_MINIINTERPRETER -I/usr/users/eden/scratch/6.0.1/ghc-6.0.1/ghc/includes -I/usr/users/eden/scratch/6.0.1/ghc-6.0.1/libraries/base/include -I/usr/users/eden/scratch/6.0.1/ghc-6.0.1/libraries/unix/include -mabi=64 -I. `echo | sed 's/^$/-DSTOLEN_X86_REGS=4/'`
gcc -x c GHC/Int.hc -o GHC/Int.o -c -O -DNO_REGS -DUSE_MINIINTERPRETER -D__GLASGOW_HASKELL__=600 -O -DNO_REGS -DUSE_MINIINTERPRETER -I/usr/users/eden/scratch/6.0.1/ghc-6.0.1/ghc/includes -I/usr/users/eden/scratch/6.0.1/ghc-6.0.1/libraries/base/include -I/usr/users/eden/scratch/6.0.1/ghc-6.0.1/libraries/unix/include -mabi=64 -I. `echo | sed 's/^$/-DSTOLEN_X86_REGS=4/'`
GHC/Int.hc: In function `sbsy_ret':
GHC/Int.hc:3093: `int64ToIntegerzh_fast' undeclared (first use in this function)
GHC/Int.hc:3093: (Each undeclared identifier is reported only once
GHC/Int.hc:3093: for each function it appears in.)
GHC/Int.hc: In function `sbst_ret':
GHC/Int.hc:3170: `int64ToIntegerzh_fast' undeclared (first use in this function)
gmake[1]: *** [GHC/Int.o] Error 1
gmake: *** [all] Error 1
gmake: Leaving directory `/scratch/users/eden/6.0.1/ghc-6.0.1/libraries'
So we'll see how far we get using 64 bits right from the start on this funny Irix thing.
-- Don

Hey Rafaelh,

I will describe what I have done to reach the point I am at on mips-sgi-irix. The machine I am using reports as an "IRIX64 6.5 IP30 mips" machine. The host machine for all these builds was an i386-*-openbsd machine.

Before you can start you need to install GNU tools. I used
    gmake 3.80
    gcc 3.3
Also, as a precaution, I installed the latest libgmp: gmp-4.1.2.

On this machine at least I can set a compiler flag and get either 64 or 32 bit longs, corresponding to a 64 or 32 bit compiler. I tried 32 bits first. This is the default mode on the machine, and requires no special flags.

---------------
On the .hc host
---------------

* be careful on the host machine to write the TARGET_ARCH variable as mipseb_TARGET_ARCH when generating hc files, or you will get strange errors

-----------
32 bit mips
-----------

* apply rmartine's latest CVS patch to MBlock.c

* The build will go through once, but when you then have to rebuild the rts, pkg-conf will crash creating driver/package.conf.inplace. The solution is to manually
    echo "[]" > driver/package.conf.inplace
  and to touch driver/stamp-pkg-conf-rts

* The build will dump core in the rts. Fire up gdb and you'll see that it died in gmp code around "__gmpn_tdiv_qr". The solution is to build an external libgmp, and override the target architecture in the gmp build with
    ./configure --build=mips-sgi-irix
  This forces gmp to use 32 bit assembly, rather than going for 64 bit code, which it does by default.

* The build then crashes in __decodeDouble (), in GC.c, I think it was. Using an external gmp, and reconfiguring it for a variety of mips machines, didn't solve this.

So at this point I decided to try to build a 64 bit ghc.

------
64 bit
------

* you have to add the same 64 bit fixes in cvs that were needed for alpha, and that were discussed a few weeks ago. The patches are attached, for MBlock.h, MBlock.c, PrimOps.h and RtsUtils.h

* before you build, you have to tell ./configure's gcc that you want to use the 64 bit abi:
    export CFLAGS="-mabi=64"
  Note that I still let the machine be detected as mips-sgi-irix. I haven't seen a problem with this yet, as there is so little mips code in ghc in the first place.

* but we also have to tell the gcc that compiles ghc how to get 64 bit longs. So in mk/build.mk add:
    SRC_CC_OPTS+=-mabi=64
    SRC_HC_OPTS+=-optc-mabi=64

The libraries and compiler go through nicely, although you may see warnings of the kind:
    ghc/Num.hc:5303: warning: this decimal constant is unsigned only in ISO C90
I am not sure if these are significant. Also, the bugs in the 32 bit build, with pkg-conf, gmp and decodeDouble, do not occur.

* My current bug: following distrib/hc-build, after bootstrapping ghc/ we rebuild the rts and libraries with the ghc binary. AutoApply.hc is generated. ghc then can't compile this file. If you look carefully at the bottom of AutoApply.hc you'll see some strange characters:
    [ARG_8??] MK_SMALL_BITMAP(3,6),
    [ARG_8?8] MK_SMALL_BITMAP(3,2),
    [ARG_88?] MK_SMALL_BITMAP(3,4),
    [ARG_888] MK_SMALL_BITMAP(3,0),
    [ARG_8888] MK_SMALL_BITMAP(4,0),
    [ARG_88888] MK_SMALL_BITMAP(5,0),

This is a problem! The ARG_xxx characters should be either N or P or some other characters, not the strange ones we see here. This file is generated by utils/genapply. If you run genapply after it has been built from .hc files, it generates these faulty chars. So, somehow the .hc files generated on the host machine are being miscompiled on the mips machine.

The rest of AutoApply.hc is fine. Only the ARG_xxx symbols are wrong. If you copy in a working AutoApply.hc (e.g. the one on the host machine) then the file compiles, and the rts compiles.

Then we try to compile the libraries with ghc. But ghc generates .hc files with -fvia-C, and these files have this bug too! A faulty ghc has been built that inserts the wrong characters in some places in .hc files, but not all or even most.

I haven't solved this, but am attempting to generate host .hc files on a 64 bit machine (ia64) to see if this is a 32/64 bit bug. Otherwise we may need SimonM.

-- Don
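To check whether a generated .hc file shows the ARG_ corruption described in this message, something like the following sketch could be used. It assumes that a well-formed tag consists only of uppercase letters and underscores; the message only says the characters should be "N or P" or some others, so that rule is a guess. AUTOAPPLY is a made-up variable; when it is unset, a small sample is written that mixes two faulty entries from the message with one hypothetical well-formed entry whose bitmap value is invented.

```shell
# Assumption: AUTOAPPLY names the generated file to inspect. When unset,
# a sample file is created so the sketch can run on its own.
if [ -z "${AUTOAPPLY:-}" ]; then
    AUTOAPPLY="$(mktemp)"
    cat > "$AUTOAPPLY" <<'EOF'
[ARG_NP] MK_SMALL_BITMAP(2,0),
[ARG_8?8] MK_SMALL_BITMAP(3,2),
[ARG_888] MK_SMALL_BITMAP(3,0),
EOF
fi

# Extract one [ARG_...] tag per line, then keep only the tags that
# contain something other than uppercase letters or underscores.
bad_tags="$(sed -n 's/.*\(\[ARG_[^]]*\]\).*/\1/p' "$AUTOAPPLY" \
    | grep -v '^\[ARG_[A-Z_]*\]$' || true)"

if [ -n "$bad_tags" ]; then
    echo "suspicious ARG_ tags in $AUTOAPPLY:"
    printf '%s\n' "$bad_tags"
else
    echo "no suspicious ARG_ tags found"
fi
```

Pointing AUTOAPPLY at the AutoApply.hc from the host and from the mips machine in turn would show whether only the target-side file is affected.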
participants (3)
- dons@cse.unsw.edu.au
- Rafael Martinez Torres
- Simon Marlow