
All,
While compiling the new release (6.8.1, with both the primary source and the
extra libraries) I experienced a gcc error; gcc was, of course, being invoked
by ghc. I'm compiling 6.8.1 with ghc 6.6.1. The gcc version is 4.1.1.
I've seen similar behavior in the past where some sort of heap corruption
occurs within gcc and triggers an internal compiler error. So this may not
be a true ghc error at all, but one can't completely rule it out.
This type of error, especially appearing at random, is suggestive of a
memory hardware problem. To eliminate this possibility I ran memtest86 for
several hours on the machine. No errors were detected.
I'll rerun the build from a fresh directory to see whether the problem is
repeatable. Then I plan to run a build after upgrading gcc from 4.1.1 to
4.2.1. I'll report the results.
The environment is Linux, kernel 2.6.21.
The compilation command and the resulting error:
/usr/local/bin/ghc -H16m -O -istage1/utils -istage1/basicTypes
-istage1/types -istage1/hsSyn -istage1/prelude -istage1/rename
-istage1/typecheck -istage1/deSugar -istage1/coreSyn -istage1/vectorise
-istage1/specialise -istage1/simplCore -istage1/stranal -istage1/stgSyn
-istage1/simplStg -istage1/codeGen -istage1/main -istage1/profiling
-istage1/parser -istage1/cprAnalysis -istage1/ndpFlatten -istage1/iface
-istage1/cmm -istage1/nativeGen -Wall -fno-warn-name-shadowing
-fno-warn-orphans -Istage1 -cpp -fglasgow-exts -fno-generics -Rghc-timing
-I. -Iparser -package unix -ignore-package lang -recomp -Rghc-timing -H16M
'-#include "cutils.h"' -DUSING_COMPAT -i../compat -ignore-package Cabal
-c rename/RnSource.lhs -o stage1/rename/RnSource.o -ohi
stage1/rename/RnSource.hi
/tmp/ghc1316_0/ghc1316_0.hc: In function 'raVb_entry':
/tmp/ghc1316_0/ghc1316_0.hc:1983:0:
internal compiler error: in referenced_var_lookup, at tree-dfa.c:578
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
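If the error recurs, one way to capture the intermediate file gcc chokes on is to rerun the failing module's compilation with two extra flags (a sketch; the '...' stands for the long flag list above):

/usr/local/bin/ghc -v -keep-hc-files ... -c rename/RnSource.lhs

-v makes ghc print the exact gcc command it runs, and -keep-hc-files retains the generated .hc file instead of leaving it in a temporary directory, so the failing input can be attached to a gcc bug report.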

Seth Kurtzberg wrote:
While compiling the new release (6.8.1, with both the primary source and the extra libraries) I experienced a gcc error; gcc was, of course, being invoked by ghc. I’m compiling 6.8.1 with ghc 6.6.1. The gcc version is 4.1.1.
I’ve seen similar behavior in the past where some sort of heap corruption occurs within gcc and triggers an internal compiler error. So this may not be a true ghc error at all, but one can’t completely rule it out.
This type of error, especially appearing at random, is suggestive of a memory hardware problem. To eliminate this possibility I ran memtest86 for several hours on the machine. No errors were detected.
Dedicated memory test programs are notoriously inadequate for finding bad memory, whereas both GHC and GCC are actually quite good at it :-)
I’ll rerun the build from a fresh directory to see whether the problem is repeatable. Then I plan to run a build after upgrading gcc from 4.1.1 to 4.2.1. I’ll report the results.
Thanks, I look forward to hearing the results. Cheers, Simon

Hi,

I'm using heap profiling on AMD64, and I am getting some slightly strange results. Running under profiling, 'top' shows about 600Mb in use, but the resulting profile shows ~80Mb. Rerunning with -M200M results in an out-of-memory error.

Could it be that the profile is calculated incorrectly for 64-bit systems?

-k

--
If I haven't seen further, it is by standing in the footprints of giants
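For reference, a typical heap-profiling run with GHC of this vintage looks roughly like the following ('Main.hs' and 'myprog' are placeholders, not names from the thread):

ghc -prof -auto-all -o myprog Main.hs
./myprog +RTS -hc -p -RTS
hp2ps -c myprog.hp

-prof builds with profiling support, -auto-all adds cost centres to all top-level bindings, +RTS -hc writes the heap profile to myprog.hp, and hp2ps renders it as a PostScript graph. The -M200M flag mentioned above caps the heap at 200Mb.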

Ketil Malde wrote:
I'm using heap profiling on AMD64, and I am getting some slightly strange results. Running under profiling, 'top' shows about 600Mb in use, but the resulting profile shows ~80Mb. Rerunning with -M200M results in an out-of-memory error.
Could it be that the profile is calculated incorrectly for 64-bit systems?
It's probably correct. Profiling overheads add 50-100% to the space usage, and that overhead is subtracted from the profile. Copying GC will use about 3x the live data in normal operation: 2x due to the copying, and an extra 1x to allocate into; plus there may be some free memory due to fragmentation (hopefully not much). Running with +RTS -sstderr will give you an accurate figure for how much memory GHC's allocator is using.

Cheers, Simon
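Plugging Ketil's numbers into Simon's multipliers makes the picture concrete. A back-of-the-envelope sketch (the figures are the ones quoted in this thread, not fresh measurements):

-- Rough residency estimate from the figures in this thread.
liveProfiled = 80                -- Mb live data, as shown by the heap profile
liveActual   = liveProfiled * 2  -- upper end of the 50-100% profiling overhead
residency    = liveActual * 3    -- copying GC: 2x for the copy, 1x to allocate into

main :: IO ()
main = print residency  -- 480; fragmentation pushes this toward the ~600Mb in top

On the same arithmetic, the -M200M failure is expected: ~160Mb of live data under a copying collector needs more than 200Mb for the two semispaces alone.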

Simon,

At this point I don't believe the problem that I reported is related to ghc, although I'm repeating things to bolster that conclusion.

(As an aside, except for memory testing, the manufacturing test suite for the product I'm about to discuss is written in Haskell with just a handful of situations that required using the FFI to call C++ or C functions.)

I've done memory hardware testing in manufacturing situations, and until quite recently I would have agreed with your characterization of memory testing programs. (I understand your comment was not intended to be 100% serious, but I think it's worth answering regardless.)

We, of course, keep statistics about the accuracy of the manufacturing line testing. With the most recent version of memtest86, we've found that the rate of false negatives has declined dramatically and is now in the area of 1-2%. The increased accuracy, of course, has a cost; on the current platform a single testing round takes almost four hours, and I consider three rounds to be the minimum required for thorough testing.

The point is that stand-alone memory testing is no longer useless, although of course it is not perfect.

Seth Kurtzberg
Software Engineer
Specializing in Security, Reliability, and the Hardware/Software Interface
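The FFI pattern Seth alludes to is small; a minimal sketch, with C's sin standing in for the real C/C++ functions (which aren't named in the thread):

{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble)

-- Bind the C library's sin directly. 'unsafe' skips the safe-call
-- bookkeeping, which is fine for a pure, non-blocking function.
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -> CDouble

main :: IO ()
main = print (c_sin 0.5)

C++ functions need an extern "C" wrapper, since the FFI speaks the C calling convention.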

Seth Kurtzberg wrote:
At this point I don't believe the problem that I reported is related to ghc, although I'm repeating things to bolster that conclusion.
(As an aside, except for memory testing, the manufacturing test suite for the product I'm about to discuss is written in Haskell with just a handful of situations that required using the FFI to call C++ or C functions.)
I've done memory hardware testing in manufacturing situations, and until quite recently I would have agreed with your characterization of memory testing programs. (I understand your comment was not intended to be 100% serious, but I think it's worth answering regardless.)
We, of course, keep statistics about the accuracy of the manufacturing line testing. With the most recent version of memtest86, we've found that the rate of false negatives has declined dramatically and is now in the area of 1-2%. The increased accuracy, of course, has a cost; on the current platform a single testing round takes almost four hours, and I consider three rounds to be the minimum required for thorough testing.
Interesting... I might actually use memtest86 now, thanks! Simon

For those (if any) following my latest build saga :)

After installing gcc 4.2.1, and dispensing with extralibs, I was able to build 6.8.1 from source. (This is on an x86 Linux box running the 2.6.21 kernel with the preemptive scheduler.)

(I mention the scheduler because I have a lurking suspicion that it is related to the fact that I see more seg faults and internal compiler errors than people I've communicated with who run the same kernel and compiler but the default scheduler.)

I did experience one seg fault during linking, near the end of the build process. I restarted the build and it completed. I'm going to run the build on one of my other Linux boxes today with the same tools (gcc 4.2.1 and ghc 6.6.1) and see if there are any linker seg faults. I've tried to eliminate my memory hardware as a factor; of course, the only way to truly eliminate hardware is to get the same behavior on more than one box.
participants (3)
- Ketil Malde
- Seth Kurtzberg
- Simon Marlow