[GHC] #8965: bootstrapping failure on Linux/ppc64el

#8965: bootstrapping failure on Linux/ppc64el
-----------------------------+--------------------------------------------
Reporter: cjwatson | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.1-rc2
Keywords: | Operating System: Linux
Architecture: powerpc64 | Type of failure: GHC doesn't work at all
Difficulty: Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: |
-----------------------------+--------------------------------------------
A few distributions, including Ubuntu which I work on, have a new little-
endian Linux ppc64 port (known variously as ppc64el, ppc64le, powerpc64le,
etc.). As well as the obvious endianness difference, this varies from
traditional ppc64 in that it uses a new version of the ELF ABI.
https://bugs.openjdk.java.net/browse/JDK-8035647 has a useful set of links
explaining the changes.
I've been trying to bootstrap GHC on this architecture, but I've been
running into failures, which I explained here along with my procedure:
https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2014-April/014922.html
Things that I believe I have ruled out so far:
* It's not libffi. In the above procedure I was missing unpacking
libffi6:ppc64el in the sysroot and configuring using
`--with-system-libffi` (our libffi is newer), but that makes no
difference; furthermore, breakpoints set on all exposed `ffi_*` symbols
and on `createAdjustor` never fire.
* It's not signal handling. I replaced unlit with a shell script that
sleeps 5 and then execs the real unlit, and ghc-stage2 segfaulted before
it returned; strace shows that SIGSEGV is the first signal received.
* It's not buggy native code generation. Although getting an NCG up on
this platform might be tractable later, for now I added a
`powerpc64le*) $2="powerpc64le" ;;` case before `powerpc64*` to
`GHC_CONVERT_CPU` and added powerpc64le to the `ArchUnknown` list in
`FPTOOLS_SET_HASKELL_PLATFORM_VARS`. Everything should be going via GCC.
* I don't think I'm taking a fundamentally wrong bootstrapping approach;
I'm using the exact same procedure and in fact I started from the exact
same git tree that I just used to successfully bootstrap GHC on arm64.
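The reason the `powerpc64le*` case has to come before `powerpc64*` is that a shell `case` takes the first pattern that matches. A minimal sketch of that first-match behaviour, written in C rather than the actual m4/shell of `GHC_CONVERT_CPU`; the `cpu_rule` table, the `convert_cpu` function and the exact set of rules are assumptions made up for illustration:

```c
#include <assert.h>
#include <string.h>

/* One (prefix, canonical name) pair, standing in for a shell `case` arm
 * such as  powerpc64*) $2="powerpc64" ;;  (hypothetical, for illustration). */
struct cpu_rule {
    const char *prefix;
    const char *canonical;
};

/* Hypothetical rule table. Order matters: the first matching prefix wins,
 * just as the first matching pattern wins in a shell `case`. */
static const struct cpu_rule rules[] = {
    { "powerpc64le", "powerpc64le" },  /* must precede the broader rule */
    { "powerpc64",   "powerpc64"   },
    { "powerpc",     "powerpc"     },
};

/* Return the canonical name for `cpu`, or NULL if no rule matches. */
const char *convert_cpu(const char *cpu)
{
    assert(cpu != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = strlen(rules[i].prefix);
        if (strncmp(cpu, rules[i].prefix, n) == 0)
            return rules[i].canonical;
    }
    return NULL;
}
```

If the first two rows were swapped, `convert_cpu("powerpc64le")` would return `"powerpc64"`, which is the kind of misclassification the ordering is there to avoid.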
Here's a gdb trace showing the location of the segfault:
{{{
*** Literate pre-processor:
../inplace/lib/unlit -h HelloWorld.lhs HelloWorld.lhs
/tmp/ghc6597_0/ghc6597_1.lpp
Breakpoint 1, 0x000000001507daac in runInteractiveProcess ()
(gdb) bt
#0 0x000000001507daac in runInteractiveProcess ()
#1 0x000000001507d968 in c8z6_entry ()
#2 0x00003fffb7804a20 in ?? ()
Cannot access memory at address 0xf
(gdb) finish
Run till exit from #0 0x000000001507daac in runInteractiveProcess ()
0x000000001507d968 in c8z6_entry ()
(gdb) stepi
0x000000001507d96c in c8z6_entry ()
(gdb)
0x000000001507d970 in c8z6_entry ()
(gdb)
0x000000001507d974 in c8z6_entry ()
(gdb)
0x000000001507d978 in c8z6_entry ()
(gdb)
0x000000001507d97c in c8z6_entry ()
(gdb)
0x000000001507d980 in c8z6_entry ()
(gdb)
0x000000001507d984 in c8z6_entry ()
(gdb)
0x000000001507d988 in c8z6_entry ()
(gdb)
0x000000001507d98c in c8z6_entry ()
(gdb)
0x000000001507d990 in c8z6_entry ()
(gdb)
0x000000001507d994 in c8z6_entry ()
(gdb)
0x000000001507d998 in c8z6_entry ()
(gdb)
0x000000001507d99c in c8z6_entry ()
(gdb)
0x000000001507d9a0 in c8z6_entry ()
(gdb)
0x000000001507d9a4 in c8z6_entry ()
(gdb)
0x000000001507d9a8 in c8z6_entry ()
(gdb)
0x000000001507d9ac in c8z6_entry ()
(gdb)
0x000000001507d9b0 in c8z6_entry ()
(gdb)
0x000000001507d9e8 in c8z6_entry ()
(gdb)
Cannot access memory at address 0xf
(gdb) disas /rm
Dump of assembler code for function c8z6_entry:
0x000000001507d8ec <+0>: 6f 16 40 3c lis r2,5743
0x000000001507d8f0 <+4>: 20 9c 42 38 addi r2,r2,-25568
0x000000001507d8f4 <+8>: a6 02 08 7c mflr r0
0x000000001507d8f8 <+12>: 10 00 01 f8 std r0,16(r1)
0x000000001507d8fc <+16>: c1 ff 21 f8 stdu r1,-64(r1)
0x000000001507d900 <+20>: 00 00 00 60 nop
0x000000001507d904 <+24>: 60 92 22 39 addi r9,r2,-28064
0x000000001507d908 <+28>: 68 03 49 e9 ld r10,872(r9)
0x000000001507d90c <+32>: 10 00 4a 39 addi r10,r10,16
0x000000001507d910 <+36>: 68 03 49 f9 std r10,872(r9)
0x000000001507d914 <+40>: 58 03 69 e9 ld r11,856(r9)
0x000000001507d918 <+44>: 08 00 0b e8 ld r0,8(r11)
0x000000001507d91c <+48>: 70 03 29 e9 ld r9,880(r9)
0x000000001507d920 <+52>: 40 48 aa 7f cmpld cr7,r10,r9
0x000000001507d924 <+56>: 90 00 9d 41 bgt cr7,0x1507d9b4
}}}

Comment (by cjwatson):
I finally managed to figure this out, thanks in part to some debugging tips
from `slyfox` on `#ghc`. The two necessary patches are attached, and I'd
appreciate review. With this, I've been able to completely bootstrap GHC 7.8
on this architecture, albeit without GHCi for now.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:1
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Comment (by ezyang):
Wow, nice catch! Are there any other places where we are improperly
declaring null argument lists?
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:2

Changes (by thoughtpolice):
* status: new => patch
* milestone: => 7.8.3
Comment:
Excellent work Colin! I'll put this in for 7.8.3.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:3

Comment (by cjwatson):
Replying to [comment:2 ezyang]:
Wow, nice catch! Are there any other places where we are improperly declaring null argument lists?
I wouldn't like to categorically say no, because I don't know GHC anywhere
near well enough for that. :-) The best I can do is to say that nothing else
impeded the bootstrap on this architecture ...
I did find a couple of related problems:
* While using an empty parameter list is better than a wrong parameter
list, it's technically an obsolescent feature in C11, and the proper fix is
to generate a correct prototype. The compiler hacking for this is beyond me.
* There are a couple of uses of (at least) `debugBelch` in `rts/*.cmm`,
which have a similar problem: if you try to run the compiler on ppc64el with
`-Da`, for instance, it crashes because the call to `debugBelch` in
`stg_ap_0_fast` corrupts the caller's stack, as it didn't realise it was
calling a varargs function and so didn't allocate enough stack space.
There's a `debugBelch2` workaround in `libraries/base` for the same kind of
problem; the RTS probably needs to do something similar. Generating correct
prototypes in the compiler would fix this problem too.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:4
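To make the wrapper idea above concrete, here is a minimal C sketch of a fixed-arity function in front of a variadic one. It is not the actual `debugBelch2` code from `libraries/base`; `log_belch`, `log_belch2`, `log_last_message` and the capture buffer are hypothetical names invented for illustration. The point is only that the wrapper is compiled with the variadic prototype in scope, so the variadic call is set up correctly, while callers that cannot express `...` see a plain two-argument signature.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Captures the most recent message so the example can be checked
 * without reading stderr (hypothetical helper state). */
static char last_message[128];

/* Hypothetical variadic logger, standing in for something like
 * debugBelch. The "..." in the prototype is what tells a C caller to
 * use the variadic calling convention. Returns the formatted length. */
int log_belch(const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(last_message, sizeof last_message, fmt, ap);
    va_end(ap);
    return n;
}

/* Fixed-arity wrapper: the variadic call happens here, where the correct
 * prototype is visible. Callers only need this non-variadic signature. */
int log_belch2(const char *fmt, const char *arg)
{
    return log_belch(fmt, arg);
}

/* Explicit (void) rather than an empty parameter list, so this
 * declaration is a real prototype. */
const char *log_last_message(void)
{
    return last_message;
}
```

A caller would then use `log_belch2("stage: %s", "stage2")` and never touch the variadic function directly. This only works around the problem for one arity at a time, which is presumably why generating correct prototypes is described above as the proper fix.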

Changes (by slyfox):
* cc: slyfox@… (added)
Comment:
Great catch indeed!
As for prototype mismatches, there is a very heavy hammer I tried a while
ago:
./configure --enable-unregisterised CFLAGS=-flto LDFLAGS=-flto
It dies in Cmm on things like 'memset' being 'const'-incompatible with the
stdlib.h declaration, but in general it is a very nice way to check for bugs
repository-wide, and to find real problems.
I'll try that once again and post the most relevant bits.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:5

Comment (by slyfox):
http://code.haskell.org/~slyfox/ghc-7.9.20140412-lto.log.txt:
error: variable 'nocldstop' redeclared as function
warning: type of 'stg_MUT_ARR_PTRS_CLEAN_info' does not match original declaration [enabled by default]
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:6

Comment (by Austin Seipp

Changes (by thoughtpolice):
* status: patch => merge
Comment:
Merged, thank you!
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:8

Changes (by thoughtpolice):
* status: merge => closed
* resolution: => fixed
Comment:
Merged in 7.8, thanks!
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:9

Comment (by simonpj):
See [http://www.chiark.greenend.org.uk/ucgi/~cjwatson/blosxom/2014-04-15-porting-ghc-a-tale-of-two-architectures.html Colin's excellent blog post]
for more details.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8965#comment:10

Comment (by Simon Peyton Jones