[GHC] #9391: LLVM 3.2 crash (AVX messes up GHC calling convention)

#9391: LLVM 3.2 crash (AVX messes up GHC calling convention)
-------------------------------------+-------------------------------------
Reporter: scpmw | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler (LLVM) | Version: 7.8.2
Keywords: | Operating System: MacOS X
Architecture: Unknown/Multiple | Type of failure: Runtime
Difficulty: Easy (less than 1 | crash
hour) | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Revisions:
-------------------------------------+-------------------------------------
 I stumbled across the problem of LLVM 3.2 builds seemingly randomly
 crashing on my Mac. After quite a bit of investigation I found that the
 source of the problem was in the `base` library (`span` to be precise),
 where it encountered Cmm like the following:

 {{{
 cp7n:
     if ((Sp + -56) < SpLim) goto cp8c; else goto cp8d;
 ...
 cp8d:
     I64[Sp - 40] = block_cp7g_info;
     _smKL::P64 = P64[R1 + 7];
     _smKO::P64 = P64[R1 + 15];
     _smKP::P64 = P64[R1 + 23];
     _smL0::P64 = P64[R1 + 31];
     R1 = R2;
     P64[Sp - 32] = _smKL::P64;
     P64[Sp - 24] = _smKO::P64;
     P64[Sp - 16] = _smKP::P64;
     P64[Sp - 8] = _smL0::P64;
     Sp = Sp - 40;
     if (R1 & 7 != 0) goto up8R; else goto cp7h;
 up8R:
     call block_cp7g_info(R1) args: 0, res: 0, upd: 0;
 ...
 }}}

 which leads LLVM 3.2 to produce the following assembly:

 {{{
 _smL1_info:                             ## @smL1_info
 ## BB#0:                                ## %cp7n
         pushq   %rbp
         movq    %rsp, %rbp
         movq    %r14, %rax
         movq    %rbp, %rcx
         leaq    -56(%rcx), %rdx
         cmpq    %r15, %rdx
         jae     LBB160_1
 ...
 LBB160_1:                               ## %cp8d
         leaq    _cp7g_info(%rip), %rdx
         movq    %rdx, -40(%rcx)
         vmovups 7(%rbx), %ymm0
         vmovups %ymm0, -32(%rcx)
         addq    $-40, %rcx
         testb   $7, %al
         je      LBB160_4
 ## BB#2:                                ## %up8R
         movq    %rcx, %rbp
         movq    %rax, %rbx
         popq    %rbp
         vzeroupper
         jmp     _cp7g_info              ## TAILCALL
 }}}

 So here LLVM has figured out that it can use `vmovups` in order to move 4
 words at the same time. However, there is a puzzling side effect: All of
 a sudden we have a `pushq %rbp` at the start of the function with a
 matching `popq %rbp` at the very end. This overwrites the stack pointer
 update (`movq %rcx, %rbp`) and, unsurprisingly, causes the program to
 crash rather quickly.

 My interpretation is that LLVM 3.2 erroneously thinks that AVX
 instructions are incompatible with frame pointer elimination. The
 reasoning is that this is exactly the kind of code LLVM generates if we
 disable this "optimisation" (`--disable-fp-elim`). Furthermore, disabling
 AVX instructions (`-mattr=-avx`) fixes the problem: LLVM falls back to
 the less efficient `movups`, with the `pushq %rbp` vanishing as well.
 Finally, this bug seems to happen exactly with LLVM 3.2, with 3.3 upwards
 generating correct code.

 My proposed fix would be to add `-mattr=-avx` to the `llc` command line
 by default for LLVM 3.2. This issue might be related to #7694.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9391
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
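
The clobbering described in the report can be illustrated with a toy register model. This is only a minimal sketch, not real x86 semantics. It assumes that GHC's x86-64 calling convention keeps the STG stack pointer `Sp` in `%rbp`, tracks only `%rbp`, `%rcx` and a list standing in for the C stack, and leaves out the `movq %rsp, %rbp` from the prologue. The names `Machine`, `Instr`, `step` and `run` are hypothetical, and the start value 1000 is arbitrary.

```haskell
-- Toy model of the push/pop problem; NOT real x86 semantics.
-- Assumption: GHC's x86-64 convention keeps the STG stack pointer Sp in %rbp.

-- | Only the state that matters for the epilogue.
data Machine = Machine
  { rbp   :: Int    -- holds GHC's Sp under the assumed register mapping
  , rcx   :: Int    -- scratch register in which the new Sp is computed
  , stack :: [Int]  -- C stack, top first, modelled as a list of words
  } deriving (Eq, Show)

-- | The handful of instructions needed for the illustration.
data Instr
  = PushRbp       -- pushq %rbp
  | AddRcx Int    -- addq $n, %rcx
  | MovRcxToRbp   -- movq %rcx, %rbp
  | PopRbp        -- popq %rbp
  deriving (Eq, Show)

-- | Execute one instruction.
step :: Machine -> Instr -> Machine
step m PushRbp     = m { stack = rbp m : stack m }
step m (AddRcx n)  = m { rcx = rcx m + n }
step m MovRcxToRbp = m { rbp = rcx m }
step m PopRbp      = case stack m of
  (w : ws) -> m { rbp = w, stack = ws }
  []       -> error "popq on an empty model stack"

-- | Execute a sequence of instructions from left to right.
run :: Machine -> [Instr] -> Machine
run = foldl step

-- | Shape of the code with the frame-pointer save and restore:
-- the restore happens after Sp has been updated.
withFramePointer :: [Instr]
withFramePointer = [PushRbp, AddRcx (-40), MovRcxToRbp, PopRbp]

-- | Shape of the code once the save and restore are gone.
withoutFramePointer :: [Instr]
withoutFramePointer = [AddRcx (-40), MovRcxToRbp]

-- | Arbitrary starting state: Sp = 1000, copied into %rcx.
start :: Machine
start = Machine { rbp = 1000, rcx = 1000, stack = [] }
```

In this model `rbp (run start withFramePointer)` is still 1000, because the final `PopRbp` restores the saved value over the `Sp = Sp - 40` update, while `rbp (run start withoutFramePointer)` is 960.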

Changes (by rwbarton):

 * status: new => patch

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9391#comment:1

Changes (by thoughtpolice):

 * status: patch => infoneeded

Comment:

 For the check on line 1444 for `ver == 32`, why is this last? If we only
 issue a warning, then it will simply match on `isAvxEnabled dflags`
 (line 1443) and still pass in the right arguments. Would it not be more
 correct to have this branch checked first and instead turn off AVX
 completely?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9391#comment:2

Comment (by scpmw):

 My reasoning was that if the user explicitly requests AVX using `-mavx`,
 it makes sense not to override it. After all, it is a relatively uncommon
 bug, and there might be ways to work around it for a given module.

 I might be overthinking this...

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9391#comment:3
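
The guard order under discussion in the two comments above can be sketched as follows. This is a hedged illustration and not the code from the patch: `Flags` is a hypothetical stand-in for GHC's `DynFlags`, `avxRequested` stands in for `isAvxEnabled dflags`, and the LLVM version is assumed to be encoded as an integer where 32 means 3.2 (inferred from the `ver == 32` check mentioned in the review). Both function names are made up for the example.

```haskell
-- Sketch only: hypothetical stand-ins, not GHC's actual driver code.

-- | Hypothetical stand-in for the relevant part of GHC's DynFlags.
newtype Flags = Flags { avxRequested :: Bool }

-- | Assumed encoding: major * 10 + minor, so 32 stands for LLVM 3.2.
type LlvmVersion = Int

-- | Order described by scpmw: an explicit AVX request wins,
-- and AVX is only switched off for LLVM 3.2 when nothing was requested.
avxOptsUserFirst :: Flags -> LlvmVersion -> [String]
avxOptsUserFirst flags ver
  | avxRequested flags = ["-mattr=+avx"]
  | ver == 32          = ["-mattr=-avx"]
  | otherwise          = []

-- | Order suggested in the review: LLVM 3.2 always gets AVX disabled,
-- even if the user asked for it.
avxOptsVersionFirst :: Flags -> LlvmVersion -> [String]
avxOptsVersionFirst flags ver
  | ver == 32          = ["-mattr=-avx"]
  | avxRequested flags = ["-mattr=+avx"]
  | otherwise          = []
```

The two functions differ only when AVX is requested on LLVM 3.2: `avxOptsUserFirst (Flags True) 32` yields `["-mattr=+avx"]`, whereas `avxOptsVersionFirst (Flags True) 32` yields `["-mattr=-avx"]`.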

Changes (by thomie):

 * status: infoneeded => patch
 * milestone: => 7.10.1

Comment:

 scpmw's explanation sounds reasonable.

 Does GHC actually support LLVM 3.2? For example, in
 https://ghc.haskell.org/trac/ghc/ticket/9555#comment:7, rwbarton says:

 > There are known problems with LLVM 3.2 and GHC, can you try with 3.3 or
 > 3.4?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9391#comment:4

Comment (by scpmw):

 LLVM 3.2 isn't blacklisted, so I think we are still "supporting" it, at
 least on paper. The comment might be referring to #7694? It's quite
 possible that this is actually the fix for that issue - crashes at
 virtually random places sounds about right.

 All I can say is that after applying the patch I was able to run the full
 test suite using LLVM 3.2. Without it, we get at least a dozen segfaults
 in there.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9391#comment:5

Comment (by Austin Seipp

Changes (by thoughtpolice):

 * status: patch => closed
 * resolution: => fixed

Comment:

 OK, merged. Thanks Peter.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9391#comment:7