
Several people have mentioned that LLVM 3.2 works with GHC. That has not been my experience. Using the quick-llvm BuildFlavour and LLVM 3.2 on Linux 64-bit, the stage2 compiler fails for me. Running the test suite with the stage 1 compiler unfortunately does not reveal any likely culprits (only the tests that rely on stuff built as part of the stage 2 compiler fail). Has anyone gotten LLVM 3.2 to work with the quick-llvm BuildFlavour? Thanks, Geoff

On 03/14/2013 02:01 PM, Jan Stolarek wrote:
Has anyone gotten LLVM 3.2 to work with the quick-llvm BuildFlavour? Yes. I can do both quick-llvm and perf-llvm builds with LLVM 3.2. Surprisingly, yesterday I couldn't build GHC with LLVM 3.0 and I don't know why (it used to work).
Janek
Hm, you're sure that LLVM 3.2 is in your path when you configure GHC? This is HEAD? Linux x64? What is in your build.mk file? Thanks, Geoff

Hm, you're sure that LLVM 3.2 is in your path when you configure GHC? I removed LLVM 3.0 from my system so there's no possibility of mistaking 3.2 with 3.0. I'm also getting lots of compilation warnings about untested LLVM version - this didn't happen with 3.0.
This is HEAD? Yes. Commit 56353e3da9d5718dfd25e25ccf61c78b25deefe8
Linux x64? Yes:
[killy@xerxes : ~] uname -a
Linux xerxes.discovery 2.6.37.6-24-desktop #1 SMP PREEMPT 2012-10-18 22:36:08 +0200 x86_64 x86_64 x86_64 GNU/Linux
What is in your build.mk file? The relevant parts are:
BuildFlavour = quick-llvm
ifeq "$(BuildFlavour)" "quick-llvm"
SRC_HC_OPTS = -H64m -O0 -fllvm
GhcStage1HcOpts = -O -fllvm
GhcStage2HcOpts = -O0 -fllvm
GhcLibHcOpts = -O -fllvm
SplitObjs = NO
HADDOCK_DOCS = NO
BUILD_DOCBOOK_HTML = NO
BUILD_DOCBOOK_PS = NO
BUILD_DOCBOOK_PDF = NO
endif
As you can see I'm not building the documentation. It's because it fails to build on my system and I don't care much about resolving that.
Janek

On 03/14/2013 02:15 PM, Jan Stolarek wrote:
[...]
You don't have the following line?
GhcLibWays = $(if $(filter $(DYNAMIC_BY_DEFAULT),YES),dyn,v)
I ask because I am using a stock build.mk copied from build.mk.sample with BuildFlavour = quick-llvm, GHC HEAD, and LLVM 3.2, and my stage2 compiler crashes. It would be good to know *exactly* what the contents of your build.mk are.
What version of GHC are you using to perform the build? Are you using parallel make?
Thanks, Geoff

You don't have the following line?
GhcLibWays = $(if $(filter $(DYNAMIC_BY_DEFAULT),YES),dyn,v)
I do. Sorry, I didn't notice it. I'm attaching my build.mk.
What version of GHC are you using to perform the build?
[killy@xerxes : ~] ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.6.2.20130129
Are you using parallel make? Yes, I'm using make -j4
Janek

I was able to reproduce Geoffrey's failure on Mac OS X 10.8, with LLVM
3.2. The stage2 compiler eventually segfaults ("Segmentation Fault
11") during the build process after being compiled successfully with
stage1.
Something recently happened, because I was bootstrapping fine with
LLVM 3.2 recently after David's fixes landed (I filed a small raft of
tickets.) It's times like these I really wish we had a reliable 'git
bisect'...
I unfortunately haven't had time to dig into this, but I'll file a
ticket to track it this morning. I can also reproduce this on my
ARM/Linux machine. Previously, I got it to at least get done with
stage2, and fail later in DPH. Now it seems to fail earlier in the
same way the OS X build does.
OS's:
- 32bit ARM/Linux, Ubuntu 12.10 Linaro derivative; GCC 4.6.3.
Bootstrapping compiler is GHC 7.4.1, LLVM 3.2.
- 64bit OS X 10.8 Mountain Lion, llvm-gcc (XCode 4.6.) Bootstrapping
compiler is GHC 7.6.2, LLVM 3.2.
Also, @Jan: the warnings during the build process probably come from
your bootstrap compiler. The built compiler (stage1/stage2) both
support LLVM 3.2 directly and have correct version checks, but the
bootstrap compiler you're using won't. In practice this mismatch never
proved a problem in the past; just weeks ago the entire tree validated
with the LLVM build on OS X with no failures and I was working on ARM
things.
I'll file a ticket and dig in soon when I get a chance.
On Thu, Mar 14, 2013 at 9:29 AM, Geoffrey Mainland wrote:
[...]
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
-- Regards, Austin

Glad to know I'm not crazy... There is already a ticket #7694 about the failure bootstrapping with LLVM 3.2.
Also, Jan, could you send a fingerprint of your build tree? You can use utils/fingerprint/fingerprint.py to generate one. If your tree works and ours doesn't, that might help us narrow down the bug.
Thanks, Geoff
On 03/14/2013 02:40 PM, Austin Seipp wrote:
[...]

I'm attaching a fingerprint - is this OK? I'm quite puzzled about this, mostly because yesterday I couldn't build GHC using LLVM 3.0 - I'm attaching error messages in a separate file. It used to work about two weeks ago when I used GHC 7.4.2 + LLVM 3.0 to build myself an optimized version of GHC 7.6.2 (perf-llvm, official source snapshot from GHC download page). Janek

Where are all the fingerprints for the libraries? You only seem to have the submodule libraries in there...
Geoff
On 03/14/2013 03:00 PM, Jan Stolarek wrote:
[...]

I just tried building your fingerprinted tree here two different ways, and both failed:
- GHC 7.4.2 as bootstrap compiler + LLVM 3.2
- GHC 7.6.2 as bootstrap compiler + LLVM 3.2
If you type llc -version at the command line, it really says it's 3.2?
Geoff
On 03/14/2013 03:06 PM, Jan Stolarek wrote:
Where are all the fingerprints for the libraries? You only seem to have the submodule libraries in there...
Whoops, I ran the fingerprint script in the build tree which doesn't have symlinks to .git directories. Is this version of the fingerprint correct?
Janek

If you type llc -version at the command line, it really says it's 3.2? You don't seem to believe me :)
[killy@xerxes : ~] llc --version
LLVM (http://llvm.org/):
  LLVM version 3.2svn
  Optimized build with assertions.
  Built Mar 14 2013 (09:02:06).
  Default target: x86_64-unknown-linux-gnu
  Host CPU: corei7
  Registered Targets:
    arm - ARM
    cellspu - STI CBEA Cell SPU [experimental]
    cpp - C++ backend
    hexagon - Hexagon
    mblaze - MBlaze
    mips - Mips
    mips64 - Mips64 [experimental]
    mips64el - Mips64el [experimental]
    mipsel - Mipsel
    msp430 - MSP430 [experimental]
    nvptx - NVIDIA PTX 32-bit
    nvptx64 - NVIDIA PTX 64-bit
    ppc32 - PowerPC 32
    ppc64 - PowerPC 64
    sparc - Sparc
    sparcv9 - Sparc V9
    thumb - Thumb
    x86 - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64
    xcore - XCore
[killy@xerxes : ~] opt --version
LLVM (http://llvm.org/):
  LLVM version 3.2svn
  Optimized build with assertions.
  Built Mar 14 2013 (09:02:06).
  Default target: x86_64-unknown-linux-gnu
  Host CPU: corei7
So at this point we are clearly dealing with a system-specific problem. The possible differences that come to my mind are:
- I'm using LLVM 3.2 compiled from source, while you might be using a pre-built version from the repository
- And I'm also using GHC 7.6.2 that I compiled by myself, instead of pre-built binaries available at GHC web site.
Are you using the binaries or did you also compile your GHC from sources?
Janek

On 03/14/2013 04:40 PM, Jan Stolarek wrote:
If you type llc -version at the command line, it really says it's 3.2? You don't seem to believe me :)
Given that Austin and I have the stage 2 compiler failure and you don't, I think it is reasonable to double check :)
[...]
I built LLVM 3.2 from source, but from the release tarball, not subversion. Does your svn checkout correspond exactly to the source in the 3.2 release tarball?
I also built both GHC 7.4.2 and 7.6.2 from source (release tarballs), both using the native back end. Since it's the stage 2 compiler that is failing, it's difficult to see why this would matter.
Geoff

The LLVM 3.2 tarball has an annoying bug: it specifies the version as
'3.2svn' and not 3.2. So it's kind of difficult to distinguish them.
You can verify this by downloading the 3.2 tarball from their website
and looking at autoconf's AC_INIT line:
$ pwd
/Users/a/Downloads/llvm-3.2.src
$ grep 3.2svn autoconf/configure.ac
AC_INIT([LLVM],[3.2svn],[http://llvm.org/bugs/])
It's likely Jan is using the right version. It's annoying as hell this
bug is there, though* and LLVM developers don't generally do
point-releases or update the tarballs. It's probably stuck like this
until LLVM 3.3.
* Any interested parties can find a patch for the 3.2 tarball here,
but you'll of course have to apply manually and rebuild:
https://github.com/thoughtpolice/homebrew/blob/35d39a504e619a3443abae0e249b3...
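To make the version ambiguity concrete, here is a minimal Python sketch that pulls the version out of either the AC_INIT line or the `llc --version` banner quoted in this thread. The helper name `parse_llvm_version` and the regular expression are assumptions made for illustration; neither is part of LLVM or of GHC's configure machinery.

```python
import re

# Hypothetical helper for illustration only (not part of LLVM or GHC).
# Matches "<major>.<minor>" with an optional "svn" suffix.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)(svn)?")


def parse_llvm_version(text):
    """Return (major, minor, has_svn_suffix) for the first version in text,
    or None when no version-like token is present."""
    match = _VERSION_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3) is not None


# Both strings are copied from this thread.
ac_init_line = "AC_INIT([LLVM],[3.2svn],[http://llvm.org/bugs/])"
llc_banner = "LLVM (http://llvm.org/): LLVM version 3.2svn"

print(parse_llvm_version(ac_init_line))
print(parse_llvm_version(llc_banner))
```

Both inputs yield the same tuple, including the svn flag, which is why the reported version alone cannot tell a release-tarball build from a subversion checkout.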
On Thu, Mar 14, 2013 at 11:47 AM, Geoffrey Mainland wrote:
[...]
-- Regards, Austin

At least they didn't re-roll the release tarball a second time :)
Would be good to confirm that we built from the same source tree. I am building LLVM HEAD right now and will try that with GHC.
Geoff
On 03/14/2013 04:54 PM, Austin Seipp wrote:
[...]

urgh... really need to get a LLVM build bot up and running.
I'm tied up for next week or two so won't be able to address this
soon. Thanks though Austin for your work here and everyone else, great
to have the pain shared :).
Cheers,
David
On 14 March 2013 10:00, Geoffrey Mainland wrote:
[...]

Austin is correct - I have built LLVM from the tarball.
I also built both GHC 7.4.2 and 7.6.2 from source (release tarballs), both using the native back end.
I used the LLVM backend (perf-llvm) for my GHC 7.6.2.
I'm attaching some notes I made yesterday and today when I was investigating my build failures with LLVM 3.0. Perhaps they will be helpful. Janek

Geoff, Austin, do the build errors always happen with the same module or do they occur randomly? If the former, which modules do you have problems with? Janek

My stage 2 compiler was crashing the first time it was invoked. I just finished building GHC HEAD using LLVM compiled from HEAD, and that worked, so perhaps this was just a 3.2 bug. I have yet to run the testsuite though. Geoff On 03/14/2013 07:16 PM, Jan Stolarek wrote:
[...]

My stage2 compiler got built and also fails on any compilation, no matter
how trivial. After linking stage2, my build fails with:
$ make
===--- building phase 0
make -r --no-print-directory -f ghc.mk phase=0 phase_0_builds
make[1]: Nothing to be done for `phase_0_builds'.
===--- building phase 1
make -r --no-print-directory -f ghc.mk phase=1 phase_1_builds
make[1]: Nothing to be done for `phase_1_builds'.
===--- building final phase
make -r --no-print-directory -f ghc.mk phase=final all
"inplace/bin/ghc-stage2" -static -H64m -O0 -fllvm -package-name
old-time-1.1.0.1 -hide-all-packages -i -ilibraries/old-time/.
-ilibraries/old-time/dist-install/build
-ilibraries/old-time/dist-install/build/autogen
-Ilibraries/old-time/dist-install/build
-Ilibraries/old-time/dist-install/build/autogen
-Ilibraries/old-time/include -optP-include
-optPlibraries/old-time/dist-install/build/autogen/cabal_macros.h -package
base-4.7.0.0 -package old-locale-1.0.0.5 -XHaskell98 -XCPP
-XForeignFunctionInterface -O -fllvm -no-user-package-db -rtsopts
-odir libraries/old-time/dist-install/build -hidir
libraries/old-time/dist-install/build -stubdir
libraries/old-time/dist-install/build -hisuf hi -osuf o -hcsuf hc -c
libraries/old-time/dist-install/build/System/Time.hs -o
libraries/old-time/dist-install/build/System/Time.o
make[1]: *** [libraries/old-time/dist-install/build/System/Time.o]
Segmentation fault: 11
Just compiling 'hello world' is enough to trigger it, however:
$ cat hi.hs
main = putStrLn "hello world"
$ ./inplace/bin/ghc-stage2 -v3 -fforce-recomp hi.hs
Glasgow Haskell Compiler, Version 7.7.20130313, stage 2 booted by GHC
version 7.6.2
Using binary package database:
/Users/a/ghc/ghc-pristine/inplace/lib/package.conf.d/package.cache
wired-in package ghc-prim mapped to ghc-prim-0.3.1.0-inplace
wired-in package integer-gmp mapped to integer-gmp-0.5.1.0-inplace
wired-in package base mapped to base-4.7.0.0-inplace
wired-in package rts mapped to builtin_rts
wired-in package template-haskell mapped to template-haskell-2.9.0.0-inplace
wired-in package dph-seq not found.
wired-in package dph-par not found.
Hsc static flags:
wired-in package ghc-prim mapped to ghc-prim-0.3.1.0-inplace
wired-in package integer-gmp mapped to integer-gmp-0.5.1.0-inplace
wired-in package base mapped to base-4.7.0.0-inplace
wired-in package rts mapped to builtin_rts
wired-in package template-haskell mapped to template-haskell-2.9.0.0-inplace
wired-in package dph-seq not found.
wired-in package dph-par not found.
*** Chasing dependencies:
Chasing modules from: *hi.hs
Stable obj: []
Stable BCO: []
Ready for upsweep
[NONREC
ModSummary {
ms_hs_date = 2013-03-14 19:58:30 UTC
ms_mod = main:Main,
ms_textual_imps = [import (implicit) Prelude]
ms_srcimps = []
}]
*** Deleting temp files:
Deleting:
compile: input file hi.hs
Created temporary directory:
/var/folders/f6/rjtvxfp92j3ffvm3zs7hv7vh0000gn/T/ghc39205_0
*** Checking old interface for main:Main:
[1 of 1] Compiling Main ( hi.hs, hi.o )
*** Parser:
*** Renamer/typechecker:
*** Desugar:
Result size of Desugar (after optimization)
= {terms: 7, types: 5, coercions: 0}
*** Simplifier:
[1] 39205 segmentation fault ./inplace/bin/ghc-stage2 -v3
-fforce-recomp hi.hs
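The "Segmentation fault: 11" reported by make and the shell means the stage2 process was killed by signal 11 rather than exiting with an ordinary compile error. A minimal Python sketch of how a wrapper script could tell the two apart follows; `describe_exit` and `run_and_describe` are hypothetical helper names, not part of GHC's build system or testsuite, and the sketch assumes a POSIX system where `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Describe a subprocess return code in one short phrase."""
    if returncode < 0:
        # On POSIX, a negative return code -N means "killed by signal N".
        number = -returncode
        try:
            name = signal.Signals(number).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (number, name)
    return "exited with status %d" % returncode


def run_and_describe(argv):
    """Run argv, discard its output, and describe how it terminated."""
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return describe_exit(proc.returncode)


# Return code a crashed compiler would produce when killed by SIGSEGV.
print(describe_exit(-signal.SIGSEGV))
```

Calling `run_and_describe(["./inplace/bin/ghc-stage2", "-fforce-recomp", "hi.hs"])` from the build tree would then report the signal for a crash like the one above, and an exit status for a normal failure.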
It says the segfault occurs near the simplifier, but I'm a little skeptical
this is the actual cause. Digging further with LLDB on my Mac OS X machine,
I can see this:
$ lldb /Users/a/ghc/ghc-pristine/inplace/lib/bin/ghc-stage2 --
-B/Users/a/ghc/ghc-pristine/inplace/lib -v3 -fforce-recomp hi.hs
Current executable set to
'/Users/a/ghc/ghc-pristine/inplace/lib/bin/ghc-stage2' (x86_64).
(lldb) r
Process 39787 launched:
'/Users/a/ghc/ghc-pristine/inplace/lib/bin/ghc-stage2' (x86_64)
Glasgow Haskell Compiler, Version 7.7.20130313, stage 2 booted by GHC
version 7.6.2
Using binary package database:
/Users/a/ghc/ghc-pristine/inplace/lib/package.conf.d/package.cache
wired-in package ghc-prim mapped to ghc-prim-0.3.1.0-inplace
wired-in package integer-gmp mapped to integer-gmp-0.5.1.0-inplace
wired-in package base mapped to base-4.7.0.0-inplace
wired-in package rts mapped to builtin_rts
wired-in package template-haskell mapped to template-haskell-2.9.0.0-inplace
wired-in package dph-seq not found.
wired-in package dph-par not found.
Hsc static flags:
wired-in package ghc-prim mapped to ghc-prim-0.3.1.0-inplace
wired-in package integer-gmp mapped to integer-gmp-0.5.1.0-inplace
wired-in package base mapped to base-4.7.0.0-inplace
wired-in package rts mapped to builtin_rts
wired-in package template-haskell mapped to template-haskell-2.9.0.0-inplace
wired-in package dph-seq not found.
wired-in package dph-par not found.
*** Chasing dependencies:
Chasing modules from: *hi.hs
Stable obj: []
Stable BCO: []
Ready for upsweep
[NONREC
ModSummary {
ms_hs_date = 2013-03-14 19:58:30 UTC
ms_mod = main:Main,
ms_textual_imps = [import (implicit) Prelude]
ms_srcimps = []
}]
*** Deleting temp files:
Deleting:
compile: input file hi.hs
Created temporary directory:
/var/folders/f6/rjtvxfp92j3ffvm3zs7hv7vh0000gn/T/ghc39787_0
*** Checking old interface for main:Main:
[1 of 1] Compiling Main ( hi.hs, hi.o )
*** Parser:
*** Renamer/typechecker:
*** Desugar:
Result size of Desugar (after optimization)
= {terms: 7, types: 5, coercions: 0}
*** Simplifier:
Process 39787 stopped
* thread #1: tid = 0x1c03, 0x0000000101f9a153 ghc-stage2`threadPaused +
163, stop reason = EXC_BAD_ACCESS (code=1, address=0xfffffffffffffff8)
frame #0: 0x0000000101f9a153 ghc-stage2`threadPaused + 163
ghc-stage2`threadPaused + 163:
-> 0x101f9a153: movl -8(%rax), %ecx
0x101f9a156: addl $-31, %ecx
0x101f9a159: cmpl $7, %ecx
0x101f9a15c: ja 0x101f9a2ed ; threadPaused + 573
So the crash is actually happening in the RTS. Looking at the disassembly
of the current frame, we see:
(lldb) disassemble -f
ghc-stage2`threadPaused:
0x101f9a0b0: pushq %rbp
0x101f9a0b1: pushq %r15
0x101f9a0b3: pushq %r14
0x101f9a0b5: pushq %r13
0x101f9a0b7: pushq %r12
0x101f9a0b9: pushq %rbx
0x101f9a0ba: subq $56, %rsp
0x101f9a0be: movq %rsi, %rbx
0x101f9a0c1: movq %rdi, 48(%rsp)
0x101f9a0c6: movq %rbx, %rsi
0x101f9a0c9: callq 0x101f8f350 ;
maybePerformBlockedException
0x101f9a0ce: cmpw $3, 32(%rbx)
0x101f9a0d3: je 0x101f9a540 ; threadPaused + 1168
0x101f9a0d9: movq 24(%rbx), %rax
0x101f9a0dd: movl 8(%rax), %ecx
0x101f9a0e0: leaq 24(%rax,%rcx,8), %rcx
0x101f9a0e5: movq %rcx, 40(%rsp)
0x101f9a0ea: movq 16(%rax), %r14
0x101f9a0ee: xorl %eax, %eax
0x101f9a0f0: movl %eax, 12(%rsp)
0x101f9a0f4: leaq 1109(%rip), %r15 ; threadPaused + 1184
0x101f9a0fb: leaq 102230(%rip), %r12 ; stg_WHITEHOLE_info
0x101f9a102: movl %eax, 8(%rsp)
0x101f9a106: movl %eax, 32(%rsp)
0x101f9a10a: movl %eax, %r13d
0x101f9a10d: jmpq 0x101f9a305 ; threadPaused + 597
0x101f9a112: movslq (%r15,%rcx,4), %rcx
0x101f9a116: addq %r15, %rcx
0x101f9a119: jmpq *%rcx
0x101f9a11b: cmpq 1097902(%rip), %rax ; (void
*)0x0000000101fb4a28: stg_marked_upd_frame_info
0x101f9a122: jne 0x101f9a164 ; threadPaused + 180
0x101f9a124: testl %r13d, %r13d
0x101f9a127: je 0x101f9a310 ; threadPaused + 608
0x101f9a12d: movl 32(%rsp), %r13d
0x101f9a132: addl %r13d, 8(%rsp)
0x101f9a137: addl $2, 12(%rsp)
0x101f9a13c: jmpq 0x101f9a310 ; threadPaused + 608
0x101f9a141: nopw %cs:(%rax,%rax)
0x101f9a150: movq (%r14), %rax
-> 0x101f9a153: movl -8(%rax), %ecx
0x101f9a156: addl $-31, %ecx
0x101f9a159: cmpl $7, %ecx
0x101f9a15c: ja 0x101f9a2ed ; threadPaused + 573
... lots more code
The segfaulting instruction loads ECX from the address RAX - 8, but RAX is
null, so the load hits 0xfffffffffffffff8 and raises the access violation:
(lldb) register read
General Purpose Registers:
rax = 0x0000000000000000
rbx = 0x00000001061d4000
rcx = 0x00000000fffffff0
rdx = 0x00000001024e10a0 ghc-stage2`large_alloc_lim
rdi = 0x00000001024d8540 ghc-stage2`MainCapability
rsi = 0x00000001061d4000
rbp = 0x00000001020c6760 ghczm7zi7zi20130313_Demand_cprProdSig_closure + 16
rsp = 0x00007fff5fbfb3f0
r8 = 0x00000001020c8789 ghczm7zi7zi20130313_IdInfo_MayHaveCafRefs_closure + 1
r9 = 0x00000001020c8779 ghczm7zi7zi20130313_IdInfo_NoLBVarInfo_closure + 1
r10 = 0x0000000100685920 ghc-stage2`s5or_info
r11 = 0x00000001051e98c8
r12 = 0x0000000101fb3058 ghc-stage2`stg_WHITEHOLE_info
r13 = 0x0000000000000000
r14 = 0x00000001020c67d8 ghczm7zi7zi20130313_Demand_worthSplittingThunk_closure + 8
r15 = 0x0000000101f9a550 ghc-stage2`threadPaused + 1184
rip = 0x0000000101f9a153 ghc-stage2`threadPaused + 163
rflags = 0x0000000000010287
cs = 0x000000000000002b
fs = 0x0000000000000000
gs = 0x0000000000000000
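The faulting address follows directly from the register dump: with rax = 0, the memory operand -8(%rax) wraps around to 0xfffffffffffffff8. A minimal C sketch of that arithmetic, where the helper name effective_address is made up for illustration and is not part of the RTS or of lldb:

```c
#include <stdint.h>

/* Effective address of a `disp(%reg)` memory operand on x86-64:
 * the base register plus the sign-extended displacement, modulo 2^64.
 * Hypothetical helper, for illustration only. */
uint64_t effective_address(uint64_t base, int32_t disp)
{
    /* Sign-extend to 64 bits first, then add with unsigned wraparound. */
    return base + (uint64_t)(int64_t)disp;
}
```

Calling effective_address(0, -8) gives the address reported by EXC_BAD_ACCESS, which is why a null pointer in RAX shows up as a fault near the very top of the address space rather than at 0.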
So there's still something weird going on. Looking at rts/ThreadPaused.c in
threadPaused, we see some code like:
while ((P_)frame < stack_end) {
info = get_ret_itbl(frame);
switch (info->i.type) {
case UPDATE_FRAME:
// If we've already marked this frame, then stop here.
if (frame->header.info == (StgInfoTable *)&stg_marked_upd_frame_info) {
if (prev_was_update_frame) {
words_to_squeeze += sizeofW(StgUpdateFrame);
weight += weight_pending;
weight_pending = 0;
}
goto end;
}
SET_INFO(frame, (StgInfoTable *)&stg_marked_upd_frame_info);
bh = ((StgUpdateFrame *)frame)->updatee;
bh_info = bh->header.info;
Which I believe roughly corresponds to this assembly:
0x101f9a11b: cmpq 1097902(%rip), %rax ; (void *)0x0000000101fb4a28: stg_marked_upd_frame_info
0x101f9a122: jne 0x101f9a164 ; threadPaused + 180 # check if frame->header.info == stg_marked_upd_frame_info
0x101f9a124: testl %r13d, %r13d ; Check if prev_was_update_frame == 0
0x101f9a127: je 0x101f9a310 ; threadPaused + 608; if prev_was_update_frame == 0
0x101f9a12d: movl 32(%rsp), %r13d ; increment words_to_squeeze, etc
0x101f9a132: addl %r13d, 8(%rsp) ; same as above
0x101f9a137: addl $2, 12(%rsp) ; same as above
0x101f9a13c: jmpq 0x101f9a310 ; threadPaused + 608; exit if frame->header.info == stg_marked_upd_frame_info
0x101f9a141: nopw %cs:(%rax,%rax) ; Crap opcodes for alignment
0x101f9a150: movq (%r14), %rax ; Load info into RAX
-> 0x101f9a153: movl -8(%rax), %ecx ; deref info->i.type
Because of the optimization settings, the code has been reorganized and
coalesced to suit the processor, but the segfault actually occurs on this
line:
while ((P_)frame < stack_end) {
info = get_ret_itbl(frame);
switch (info->i.type) { <- SEGFAULT HERE
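The instructions right after the faulting load (addl $-31, cmpl $7, ja) look like the usual range check a compiler emits in front of a jump table for a dense switch. The sketch below is one reading of that pattern, not something stated in the thread: the constants are copied from the listing, while the names FIRST_CASE, CASE_SPAN and jump_table_index are hypothetical, and which closure types the eight table slots stand for is not shown here.

```c
#include <stdint.h>

/* Constants taken from the optimized listing (addl $-31 / cmpl $7).
 * The names are hypothetical; the closure types behind them are unknown. */
enum { FIRST_CASE = 31, CASE_SPAN = 7 };

/* Returns the jump-table slot for `type`, or -1 when the value falls
 * outside the table and control would go to the default branch. */
int jump_table_index(uint32_t type)
{
    uint32_t idx = type - FIRST_CASE;  /* addl $-31, %ecx (wraps if type < 31) */
    if (idx > CASE_SPAN)               /* cmpl $7, %ecx ; ja <default>         */
        return -1;
    return (int)idx;                   /* index used for the indirect jump     */
}
```

The single unsigned comparison rejects values both below and above the range, because anything smaller than FIRST_CASE wraps around to a very large unsigned number. The switch therefore never gets that far in the crashing run: the load of info->i.type that feeds it is what faults.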
So this seems to be some interaction between the compiler and info table
layout, possibly? If we rebuild the stage2 compiler with a debug RTS and
disassemble with source, we see the same thing:
$ lldb ...
...
...
*** Desugar:
Result size of Desugar (after optimization)
= {terms: 7, types: 5, coercions: 0}
*** Simplifier:
Process 42406 stopped
* thread #1: tid = 0x1c03, 0x0000000101fb79c1 ghc-stage2`threadPaused(cap=0x0000000102521980, tso=0x00000001060d0000) + 177 at ThreadPaused.c:223, stop reason = EXC_BAD_ACCESS (code=1, address=0xfffffffffffffff8)
frame #0: 0x0000000101fb79c1 ghc-stage2`threadPaused(cap=0x0000000102521980, tso=0x00000001060d0000) + 177 at ThreadPaused.c:223
220 while ((P_)frame < stack_end) {
221 info = get_ret_itbl(frame);
222
-> 223 switch (info->i.type) {
224
225 case UPDATE_FRAME:
226
(lldb) disassemble -f -m
ghc-stage2`threadPaused + 157 at ThreadPaused.c:221
220 while ((P_)frame < stack_end) {
221 info = get_ret_itbl(frame);
222
0x101fb79ad: movq -32(%rbp), %rax
0x101fb79b1: movq %rax, %rdi
0x101fb79b4: callq 0x101fb7730 ; get_ret_itbl at ClosureMacros.h:88
0x101fb79b9: movq %rax, -40(%rbp)
ghc-stage2`threadPaused + 173 at ThreadPaused.c:223
222
223 switch (info->i.type) {
224
0x101fb79bd: movq -40(%rbp), %rax
-> 0x101fb79c1: movl 16(%rax), %eax
0x101fb79c4: leal -37(%rax), %ecx
0x101fb79c7: cmpl $2, %ecx
0x101fb79ca: movl %eax, -96(%rbp)
0x101fb79cd: jb 0x101fb7c56 ; threadPaused + 838 at ThreadPaused.c:342
0x101fb79d3: movl -96(%rbp), %eax
0x101fb79d6: cmpl $35, %eax
0x101fb79d9: jne 0x101fb7c58 ; threadPaused + 840 at ThreadPaused.c:352
ghc-stage2`threadPaused + 207 at ThreadPaused.c:228
227 // If we've already marked this frame, then stop here.
228 if (frame->header.info == (StgInfoTable *)&stg_marked_upd_frame_info) {
229 if (prev_was_update_frame) {
0x101fb79df: movq -32(%rbp), %rax
0x101fb79e3: movq (%rax), %rax
0x101fb79e6: leaq 210779(%rip), %rcx ; stg_marked_upd_frame_info
0x101fb79ed: leaq (%rcx), %rcx
0x101fb79f0: cmpq %rcx, %rax
0x101fb79f3: jne 0x101fb7a1d ; threadPaused + 269 at ThreadPaused.c:237
...
...
...
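The two listings agree on the faulting address even though one loads from -8(%rax) and the other from 16(%rax). That fits a layout where the info table sits immediately before the code the info pointer refers to, so get_ret_itbl steps back by the size of the table. The structs below are simplified stand-ins chosen to reproduce the offsets seen above. They are an assumption, not the RTS definitions, and the names InfoTable, RetInfoTable and both helper functions are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, NOT the real RTS structs: field names and sizes
 * are assumptions picked to match the offsets in the disassembly. */
typedef struct {
    uint64_t layout;
    uint32_t type;        /* the field read by switch (info->i.type) */
    uint32_t srt_bitmap;
} InfoTable;

typedef struct {
    int64_t   srt_offset;
    InfoTable i;
} RetInfoTable;

/* Offset of i.type from the start of the table (debug build: 16(%rax)). */
ptrdiff_t type_offset_in_table(void)
{
    return (ptrdiff_t)(offsetof(RetInfoTable, i) + offsetof(InfoTable, type));
}

/* Offset of i.type from the info pointer, assuming the table ends exactly
 * where the info pointer points (optimized build: -8(%rax)). */
ptrdiff_t type_offset_from_info_ptr(void)
{
    return type_offset_in_table() - (ptrdiff_t)sizeof(RetInfoTable);
}
```

Under these assumptions a null info pointer gives 0 - 24 + 16 = -8 in both builds, i.e. 0xfffffffffffffff8. So the crash is consistent with the frame's info pointer being null, rather than with the table layout itself being miscompiled.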
Geoffrey, if you have time, can you confirm this behavior on your Linux
machine with LLVM 3.2? I think we should really fix this; it would be rather
unfortunate to have to tell users to use some specific LLVM 3.3 SVN
revision or stay on 3.1 (and it's a pain to keep multiple LLVM installs
synchronized). On that note, we should probably test this with 3.1 as well.
You can rebuild a stage2 GHC with the debug RTS very easily - that'll give
you RTS source and extra sanity checks, etc. Just run it under GDB instead
and look at the trace. To rebuild stage2, run the following from the
top-level source directory:
$ cd ghc
$ make re2 GhcDebugged=YES
And the new inplace/bin/ghc-stage2 compiler will have the debug runtime
enabled. I don't have much more time to dig at this exact moment. I'll
look more tonight when I have time.
On Thu, Mar 14, 2013 at 2:23 PM, Geoffrey Mainland wrote:
> My stage 2 compiler was crashing the first time it was invoked.
> I just finished building GHC HEAD using LLVM compiled from HEAD, and that worked, so perhaps this was just a 3.2 bug. I have yet to run the testsuite though.
> Geoff
> On 03/14/2013 07:16 PM, Jan Stolarek wrote:
> > Geoff, Austin,
> > do the build errors always happen with the same module or do they occur randomly? If the former, which modules do you have problems with?
> > Janek
--
Regards,
Austin

On 03/14/2013 09:36 PM, Austin Seipp wrote:
> Geoffrey, if you have time, can you confirm this behavior on your
> Linux machine with LLVM 3.2?

Yes, I'm seeing exactly this failure. I have been building stage2 with the debug RTS. The current TSO's stack is getting corrupted, but I haven't had time to dig in and find out why this is happening. I defined the following macro and sprinkled it everywhere I saw the current TSO's stack being modified:

#define CHECK_STACK(stackobj) \
    ASSERT((StgWord*) (stackobj)->sp > (stackobj)->stack); \
    ASSERT((StgWord*) (stackobj)->sp <= (stackobj)->stack + (stackobj)->stack_size);

Make the following check the first statement in the function threadPaused in ThreadPaused.c and I suspect you will get a failed assertion quite quickly:

CHECK_STACK(tso->stackobj);

FWIW, LLVM 3.3 built from subversion seems fine...

Geoff
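For readers without an RTS build at hand, the bounds check in CHECK_STACK can be tried outside GHC. The following is a minimal sketch under stated assumptions: ToyStack is a hypothetical stand-in that models only the three fields the macro touches (sp, stack, stack_size), and the two conditions are returned as a boolean instead of being passed to the RTS ASSERT macro.

```c
#include <stdint.h>

#define TOY_STACK_WORDS 8

/* Hypothetical stand-in for the RTS stack object; only the fields used
 * by CHECK_STACK are modeled. */
typedef struct {
    uint32_t   stack_size;               /* size of `stack`, in words */
    uintptr_t *sp;                       /* current stack pointer     */
    uintptr_t  stack[TOY_STACK_WORDS];   /* the stack words           */
} ToyStack;

/* Same two conditions as CHECK_STACK: sp lies strictly above the base
 * of the array and no further than one past its last word. */
int stack_pointer_in_bounds(const ToyStack *s)
{
    return s->sp > s->stack && s->sp <= s->stack + s->stack_size;
}

/* Builds a toy stack with sp placed `words_from_base` words above the
 * base (callers keep this within 0..TOY_STACK_WORDS) and runs the check. */
int check_offset(uint32_t words_from_base)
{
    ToyStack s = { .stack_size = TOY_STACK_WORDS };
    s.sp = s.stack + words_from_base;
    return stack_pointer_in_bounds(&s);
}
```

Note that the lower bound is strict, as in the macro above: an sp equal to the base of the array is rejected, while an sp one past the last word is accepted.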
participants (4): Austin Seipp, David Terei, Geoffrey Mainland, Jan Stolarek