
That's the bug. Fix coming!
Simon
On 02/09/13 05:46, Austin Seipp wrote:
I think I see the problem, but maybe I'm just tired and shooting in the dark.
The only place checkUnload calls free iteratively is in CheckUnload.c. (I say 'iteratively' because the fact that you're touching blocks inside already-freed blocks makes me suspicious.) The relevant code is:
---------------------------------------------------------------------------
  // Look through the unloadable objects, and any object that is still
  // marked as unreferenced can be physically unloaded, because we
  // have no references to it.
  prev = NULL;
  for (oc = unloaded_objects; oc; prev = oc, oc = oc->next) {
      if (oc->referenced == 0) {
          if (prev == NULL) {
              unloaded_objects = oc->next;
          } else {
              prev->next = oc->next;
          }
          IF_DEBUG(linker, debugBelch("Unloading object file %s\n",
                                      oc->fileName));
          freeObjectCode(oc);
      } else {
          IF_DEBUG(linker, debugBelch("Object file still in use: %s\n",
                                      oc->fileName));
      }
  }
---------------------------------------------------------------------------
Note that we iterate over oc->next in order to check every unloadable object. If the object can be unloaded, we call freeObjectCode:
---------------------------------------------------------------------------
void freeObjectCode (ObjectCode *oc)
{
    ....
    stgFree(oc->fileName);
    stgFree(oc->archiveMemberName);
    stgFree(oc);
}
---------------------------------------------------------------------------
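So it would seem we free the object that oc points to, and then the loop's increment step (prev = oc, oc = oc->next) reads oc->next out of the block we just freed. prev is also left pointing at freed memory. That is a use-after-free, which would match valgrind's report of an invalid read inside a block freed by checkUnload.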
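As a rough sketch of one way such a traversal could avoid touching freed memory (this is not the actual RTS fix; Obj, free_obj, push_obj and sweep_unloaded are hypothetical stand-ins for ObjectCode, freeObjectCode and the loop in checkUnload), the idea is to read the next pointer before freeing, and to advance prev only when a node is kept:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the RTS ObjectCode: only the fields the loop needs. */
typedef struct Obj {
    struct Obj *next;
    int referenced;
} Obj;

/* Stand-in for freeObjectCode(); counts frees so the sketch can be checked. */
static int freed_count = 0;

static void free_obj(Obj *oc)
{
    freed_count++;
    free(oc);
}

/* Helper (assumption, not RTS code): push a new node on the front of a list. */
static Obj *push_obj(Obj *head, int referenced)
{
    Obj *oc = malloc(sizeof *oc);
    assert(oc != NULL);
    oc->next = head;
    oc->referenced = referenced;
    return oc;
}

/* Unlink and free every unreferenced node without reading freed memory. */
static void sweep_unloaded(Obj **head)
{
    Obj *prev = NULL;
    Obj *next;

    assert(head != NULL);
    for (Obj *oc = *head; oc != NULL; oc = next) {
        next = oc->next;            /* read the link before oc may be freed */
        if (oc->referenced == 0) {
            if (prev == NULL) {
                *head = next;       /* removing the current head */
            } else {
                prev->next = next;  /* splice oc out of the list */
            }
            free_obj(oc);           /* prev stays on the last kept node */
        } else {
            prev = oc;              /* only kept nodes become prev */
        }
    }
}
```

With a list built by push_obj, calling sweep_unloaded(&list) should leave only the nodes whose referenced field is non-zero, and neither oc nor prev is dereferenced after being freed.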
Ryan, can you do one final thing? When you run that program, be sure to specify `+RTS -Dl` (must be linked with -debug.) This will enable all the debug output where the linker is concerned. There will be a few hundred lines just for initialization (based on my machine.) If my theory is correct, you'll probably see stuff like 'Unloading object file ...' right as the invalid read/segfault occurs.
On Sun, Sep 1, 2013 at 11:28 PM, Ryan Newton wrote:
Ah, yes I see. Well, giving it the proper arguments when running via valgrind puts me back to an "Invalid read" segfault. I confirmed that the linker_unload executable itself is 64 bit:
$ file linker_unload
linker_unload: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
==72103== Command: ./linker_unload /home/beehive/ryan_scratch/ghc-working/libraries/base/dist-install/build/libHSbase-4.7.0.0.a /home/beehive/ryan_scratch/ghc-working/libraries/ghc-prim/dist-install/build/libHSghc-prim-0.3.1.0.a /home/beehive/ryan_scratch/ghc-working/libraries/integer-gmp/dist-install/build/libHSinteger-gmp-0.5.1.0.a
==72103==
==72103== Invalid read of size 8
==72103==    at 0x479F9F: checkUnload (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x4689DA: GarbageCollect (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x4621F0: scheduleDoGC (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x462314: performGC_ (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x403341: main (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==  Address 0xf45ed70 is 80 bytes inside a block of size 120 free'd
==72103==    at 0x4A063F0: free (vg_replace_malloc.c:446)
==72103==    by 0x479F9E: checkUnload (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x4689DA: GarbageCollect (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x4621F0: scheduleDoGC (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x462314: performGC_ (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==    by 0x403341: main (in /home/beehive/ryan_scratch/ghc-working/testsuite/tests/rts/linker_unload)
==72103==
On Sun, Sep 1, 2013 at 11:01 PM, Austin Seipp wrote:
Oops, should have said this: if you check out the Makefile for testsuite/tests/rts, at the very bottom, you'll see the linker_unload target. When run, the executable needs some arguments so it knows what to try and load:
--- ./linker_unload $(BASE) $(GHC_PRIM) $(INTEGER_GMP) ---
So you also need to provide the right arguments. Sorry about that!
On Sun, Sep 1, 2013 at 9:54 PM, Ryan Newton wrote:
Hi Austin,
Should have said -- this is 64-bit RHEL 6 (my academic department's standardized configuration).
$ uname -a
Linux 2.6.32-358.14.1.el6.x86_64 #1 SMP Mon Jun 17 15:54:20 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
Weirdly it seems to have a different behavior when run by "make" and by hand. When I run the make command you provided it segfaults with error code 2:
cd . && $MAKE -s --no-print-directory linker_unload
linker_unload.run.stdout 2>linker_unload.run.stderr
Wrong exit code (expected 0 , actual 2 )
Stdout:
Stderr:
make[1]: *** [linker_unload] Segmentation fault (core dumped)
*** unexpected failure for linker_unload(normal)
Unexpected results from:
TEST="linker_unload"
But then when I run it by hand with "./linker_unload" or "valgrind ./linker_unload" I get an unknown symbol error with exit code 1:
==70613==
linker_unload: Test.o: unknown symbol `base_GHCziNum_zdfNumInt_closure'
linker_unload: resolveObjs failed
==70613==
==70613== HEAP SUMMARY:
-Ryan
On Sun, Sep 1, 2013 at 10:46 PM, Austin Seipp wrote:
I have also not seen this test fail on amd64/Linux since Simon committed it. From the valgrind output, it looks like your machine is 32bit, correct Ryan? Edward told me yesterday on IRC he saw this fail on 64bit Linux, so I'm a little confused.
Can you please try this?
$ cd testsuite/tests/rts
$ make TEST="linker_unload" EXTRA_HC_OPTS="-debug"
$ valgrind ./linker_unload
This will link you with a debug copy of the RTS, so Valgrind/GDB can relate errors back to the relevant source code. Perhaps this will help shed light on your problem.
On Sun, Sep 1, 2013 at 9:39 PM, Edward Z. Yang wrote:
However, as far as I can tell, it is not 100% reproducible. In a recent validate of 5f98d44d8617756971cf47c040f2556de4e98f63, this test did not fail.
Edward
Excerpts from Edward Z. Yang's message of Fri Aug 30 21:55:29 -0700 2013:
> Yes, this one is failing for me too. Probably related to the
> recent object unload patch for
> http://ghc.haskell.org/trac/ghc/ticket/8039
>
> Excerpts from Ryan Newton's message of Fri Aug 30 21:51:24 -0700 2013:
>> That test builds an executable named 'linker_unload' which segfaults for
>> me. Valgrind says this:
>>
>> ==42800== Invalid read of size 8
>> ==42800==    at 0x66945F: checkUnload (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x657F7A: GarbageCollect (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x651790: scheduleDoGC (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x6518B4: performGC_ (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x403BB1: main (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==  Address 0x5bfdd20 is 80 bytes inside a block of size 120 free'd
>> ==42800==    at 0x4C273F0: free (vg_replace_malloc.c:446)
>> ==42800==    by 0x66945E: checkUnload (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x657F7A: GarbageCollect (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x651790: scheduleDoGC (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x6518B4: performGC_ (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>> ==42800==    by 0x403BB1: main (in /home/beehive/ryan_scratch/validate14/testsuite/tests/rts/linker_unload)
>>
>> This went the same across a couple different independent checkouts.
>>
>> -Ryan
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
-- Regards, Austin - PGP: 4096R/0x91384671