[GHC] #7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails --------------------------------+------------------------------------------- Reporter: thoughtpolice | Owner: Type: bug | Status: new Priority: normal | Component: Compiler (LLVM) Version: 7.7 | Keywords: Os: MacOS X | Architecture: Unknown/Multiple Failure: Building GHC failed | Blockedby: Blocking: | Related: #7571, #7575 --------------------------------+------------------------------------------- Using LLVM 3.2 (release,) or LLVM 3.3svn, on my Mac OS X 10.8 machine, the stage1 compiler fails to compile ```PrimOps.cmm``` with this error: {{{ $ make ===--- building phase 0 make -r --no-print-directory -f ghc.mk phase=0 phase_0_builds make[1]: Nothing to be done for `phase_0_builds'. ===--- building phase 1 make -r --no-print-directory -f ghc.mk phase=1 phase_1_builds make[1]: Nothing to be done for `phase_1_builds'. ===--- building final phase make -r --no-print-directory -f ghc.mk phase=final all "inplace/bin/ghc-stage1" -static -optc-DDEBUG -H64m -O0 -fllvm -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist- ghcconstants/header -Irts -Irts/dist/build -DCOMPILING_RTS -package-name rts -dcmm-lint -DDTRACE -i -irts -irts/dist/build -irts/dist/build/autogen -Irts/dist/build -Irts/dist/build/autogen -O0 -c rts/PrimOps.cmm -o rts/dist/build/PrimOps.debug_o /usr/local/bin/opt: /var/folders/f6/rjtvxfp92j3ffvm3zs7hv7vh0000gn/T/ghc99887_0/ghc99887_0.ll:6800:17: error: invalid cast opcode for cast from 'i64' to 'i64' %ln1N5 = zext i64 1 to i64 }}} This seems to occur inside ```stg_retryzh```, which looks like: {{{ ==================== LLVM Code ==================== define cc 10 void @stg_retryzh(i64* noalias nocapture %Base_Arg, i64* noalias nocapture %Sp_Arg, i64* noalias nocapture %Hp_Arg, i64 %R1_Arg, i64 %R2_Arg, i64 %R3_Arg, i64 %R4_Arg, i64 %R5_Arg, i64 %R6_Arg, i64 %SpLim_Arg) align 8 nounwind { { ... ... ccD: ... %ln1MS = call ccc i8* (i8*,i8*)* @stmStartTransaction( i8* %ln1MP, i8* %ln1MR ) nounwind %ln1MT = ptrtoint i8* %ln1MS to i64 store i64 %ln1MT, i64* %lccV %ln1MU = load i64** %Base_Var %ln1MV = getelementptr inbounds i64* %ln1MU, i32 25 %ln1MW = bitcast i64* %ln1MV to i64* %ln1MX = load i64* %ln1MW, !tbaa !4 %ln1MY = add i64 %ln1MX, 8 %ln1MZ = add i64 %ln1MY, 72 %ln1N0 = load i64* %lccV %ln1N1 = inttoptr i64 %ln1MZ to i64* store i64 %ln1N0, i64* %ln1N1, !tbaa !5 %ln1N2 = load i64* %lccU %ln1N3 = add i64 %ln1N2, 8 %ln1N4 = add i64 %ln1N3, 0 %ln1N5 = zext i64 1 to i64 <------ Invalid %ln1N6 = inttoptr i64 %ln1N4 to i64* store i64 %ln1N5, i64* %ln1N6, !tbaa !5 %ln1N7 = load i64* %lccU %ln1N8 = add i64 %ln1N7, 8 %ln1N9 = add i64 %ln1N8, 16 %ln1Na = inttoptr i64 %ln1N9 to i64* %ln1Nb = load i64* %ln1Na, !tbaa !5 store i64 %ln1Nb, i64* %R1_Var %ln1Nc = load i64** %Base_Var %ln1Nd = load i64** %Sp_Var %ln1Ne = load i64** %Hp_Var %ln1Nf = load i64* %R1_Var %ln1Ng = load i64* %SpLim_Var tail call cc 10 void (i64*,i64*,i64*,i64,i64,i64,i64,i64,i64,i64)* @stg_ap_v_fast( i64* %ln1Nc, i64* %ln1Nd, i64* %ln1Ne, i64 %ln1Nf, i64 undef, i ret void ... ... }}} Which would seem to correspond to this in ```PrimOps.cmm```: {{{ stg_retryzh /* no arg list: explicit stack layout */ { W_ frame_type; W_ frame; W_ trec; W_ outer; W_ r; ... ... (r) = ccall stmWait(MyCapability() "ptr", CurrentTSO "ptr", trec "ptr"); if (r != 0) { ... ... } else { // Transaction was not valid: retry immediately ("ptr" trec) = ccall stmStartTransaction(MyCapability() "ptr", outer "ptr"); StgTSO_trec(CurrentTSO) = trec; Sp = frame; R1 = StgAtomicallyFrame_code(frame); jump stg_ap_v_fast [R1]; } } }}} I'm still investigating. I think this is related to #7571 and #7575. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails --------------------------------+------------------------------------------- Reporter: thoughtpolice | Owner: Type: bug | Status: new Priority: normal | Component: Compiler (LLVM) Version: 7.7 | Keywords: Os: MacOS X | Architecture: Unknown/Multiple Failure: Building GHC failed | Blockedby: Blocking: | Related: #7571, #7575 --------------------------------+------------------------------------------- Comment(by thoughtpolice): The CMM produced for ```PrimOps.cmm` in this case looks like this: {{{ stg_retryzh() // [] { info_tbl: [] stack_info: arg_space: 0 updfr_space: Nothing } {offset ... ... ccg: (_ccl::I64) = call "ccall" ... stmStartTransaction(BaseReg - 24, _ccm::I64); I64[CurrentTSO + 8 + 72] = _ccl::I64; Sp = _cck::I64; R1 = I64[_cck::I64 + 8 + 0]; call stg_ap_v_fast(R1) args: 8, res: 0, upd: 8; } }}} So it looks like the store to `Sp` is introducing the zero-extended one & store, but I'm still looking as to why. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails --------------------------------+------------------------------------------- Reporter: thoughtpolice | Owner: Type: bug | Status: new Priority: normal | Component: Compiler (LLVM) Version: 7.7 | Keywords: Os: MacOS X | Architecture: Unknown/Multiple Failure: Building GHC failed | Blockedby: Blocking: | Related: #7571, #7575 --------------------------------+------------------------------------------- Comment(by thoughtpolice): There's logic to handle this in ```LlvmCodeGen.CodeGen``` but I'm not sure why it's not kicking in: {{{ sameConv from ty reduce expand = do x'@(env', vx, stmts, top) <- exprToVar env x let sameConv' op = do (v1, s1) <- doExpr ty $ Cast op vx ty return (env', v1, stmts `snocOL` s1, top) let toWidth = llvmWidthInBits dflags ty -- LLVM doesn't like trying to convert to same width, so -- need to check for that as we do get Cmm code doing it. case widthInBits from of w | w < toWidth -> sameConv' expand w | w > toWidth -> sameConv' reduce _w -> return x' }}} -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: dterei Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: | Blocking: Related: #7571, #7575 | ---------------------------------+------------------------------------------ Changes (by simonmar): * owner: => dterei * difficulty: => Unknown Comment: I don't see where the cast is coming from. The LLVM code looks like it has a lot of loads and stores in it too - where do they come from? This is probably David T's area. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: | Blocking: Related: #7571, #7575 | ---------------------------------+------------------------------------------ Changes (by thoughtpolice): * owner: dterei => thoughtpolice Comment: I have fixed this bug. But I cannot quite answer your question. The bugfix: In a nutshell, if you look at that snipped I posted, the secret is the definition of ```exprToVar```: {{{ exprToVar :: LlvmEnv -> CmmExpr -> UniqSM ExprData exprToVar env = exprToVarOpt env (wordOption (getDflags env)) }}} The ```exprToVarOpt``` call essentially forces the return value of the expression. By default it is the word size of the build (64, here.) In this case, we detected that we needed to convert a i32 to an i64 (the ```w | w < toWidth``` branch.) So we execute ```sameConv'```, which invokes ```exprToVar```, which turns ```vx``` into an ```LlvmVar``` with a width of the *native* word size. Which in this case, is i64. So the check passes, but the variable we substitute for the coercion is of the wrong type. Annoying. The fix looks like this: {{{ sameConv from ty reduce expand = do x'@(env', vx, stmts, top) <- exprToVarOpt env (EOption $ Just $ widthToLlvmInt from) x ... }}} I will create a patch and try to narrow down a test case. The patch I cooked up depends on #7571, which has a prerequisite patch and explanation of this same issue. I'll post it shortly. As for the cast, I am not precisely sure where it comes from. It comes from the ```MO_UU_Conv``` MachOp, which just does an unsigned -> unsigned conversion (optionally extending/truncating.) I don't know why it seems to be inserted for the store to ```Sp```. As for the loads/stores - isn't that natural? It looks fine to me. Before we pass code to ```opt```, the backend generates tons of loads and stores for all variable modification as opposed to using phi nodes for control flow. This is so that we can use the ```mem2reg``` pass in LLVM to essentially do the phi conversion for us, since we're essentially in CPS form at that point. The code generated by ```-ddump-llvm``` is of course before opt is run. Looking at the code, it seems pretty in line with the snippet of Cmm from the ```-ddump-cmm``` bit I posted, no? -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: | Blocking: Related: #7571, #7575 | ---------------------------------+------------------------------------------ Changes (by thoughtpolice): * cc: mad.one@… (added) * status: new => patch Comment: The patch is attached. No testcase yet, but being able to bootstrap the compiler is a pretty good one itself. :) I apologize these raft of tickets (the last 3-4) are so intertwined. I've ended up doing more than I expected this weekend and I realize we have minimal manpower for reviews, so please tell me if I can expedite the review process at all. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: 7590 | Blocking: 7588, 7589 Related: #7571, #7575 | ---------------------------------+------------------------------------------ Changes (by bgamari): * cc: bgamari@… (added) -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: 7590 | Blocking: 7588, 7589 Related: #7571, #7575 | ---------------------------------+------------------------------------------ Comment(by dterei): Argh. I didn't realize all these patches are problems due to cross compilation. Why are you using LLVM 3.3? It is unreleased and so not supported. Sure, we should support it, but you are throwing in two new problems now. 1) LLVM 3.2 and 3.3 are unsupported. 2) Cross compilation has bugs. Is it possible to use LLVM 3.1 or at least 3.2? Or do they not work well enough with ARM? I'm trying to review your patches now but this fact will slow me down a little. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: 7590 | Blocking: 7588, 7589 Related: #7571, #7575 | ---------------------------------+------------------------------------------ Comment(by thoughtpolice): I'm a bit confused by your response. But it is my fault for not communicating better as I should have. For one, my OS X machine is running both 3.2 and 3.3svn, isolated. 3.2 is my primary installation. I mostly run both to narrow down bugs and out of curiosity, in this case it was identical output so there was otherwise no relevance. I should probably just not mentioned 3.3. I only have 3.3 for one personal thing I work on anyway. It's out of $PATH and out of mind. I just use 3.2 for everything else (including GHC.) Two, as for ARM, this ticket isn't about ARM? I imagine you meant #7575 where I do have a bug I found on ARM? And I'm not cross compiling in any of these tickets. I'm compiling on the ARM device itself with an existing 7.4.1 build provided by Ubuntu - so this would (I expect) work the same as any self-hosted 32bit build, modulo whatever ARM bugs I find. My ARM build of LLVM 3.3 is significantly older than what is in trunk now (because it takes a while to build,) and is much closer to the 3.2 branch. I can just put 3.2 on it anyway - I have a script to manually automate LLVM builds (and by default it just built trunk, not a released version. Likewise I am compiling here on OS X using LLVM 3.2. The host compiler is 64bit GHC 7.6.1. I am producing a 64bit compiler. The fix and patch for this issue were tested on that combo with that bootstrap compiler as well. For reference, here's the low down on my machines: Machine 1: 64-bit Quad core Mac OS X 10.8.2. ```xcodebuild -version``` reveals xcode version 4.5.2, build version 4G2008a. Host compiler is GHC 7.6.1, installed from a binary distribution. LLVM version is 3.2. This ticket is relevant to that. Machine 2: 32-bit Quad core ARM Linux. Ubuntu 12.10 derivative. GCC version is 4.6.3. LLVM version is 3.3svn as of about 3 weeks ago, but I'm going to replace that with 3.2 stable now. Bootstrap compiler is GHC 7.4.1 from Ubuntu repositories. Tickets #7571 and #7575 were filed against this machine. There is no cross compilation happening anywhere. All builds are standard with recent copies of HEAD. Karel Gardas has also reported success using my patch in #7571 for his ARM machine using LLVM 3.0, I believe. I reproduced #7571 with a hand-written Cmm file on OS X as well, which you [http://hackage.haskell.org/trac/ghc/attachment/ticket/7571/0001-Test-for- Trac-7571.patch can see in the ticket]. (Please try that test without this patch applied using the LLVM backend, and see if it fails.) I figured #7571 was probably due to the new code generator taking in a higher-level Cmm representation when parsing (because the bug came from the RTS, where a lot of Cmm was rewritten when the code generator went in.) I don't know what else to say - most of these are bugs long before the version of LLVM even matters as much. #7571 #7575 and this ticket are all outright compiler panics before code generation ever happen for me, so the actual version of LLVM doesn't seem as relevant. I unfortunately can't give you direct access to my machine OS X machine for various reasons. You are welcome to use my ARM machines if you'd like once I get them properly DMZ'd. Are there any other tests or examples you can think of? Could you try the testsuite test in #7571 with and without the patch? -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: 7590 | Blocking: 7588, 7589 Related: #7571, #7575 | ---------------------------------+------------------------------------------ Comment(by dterei): OK so now I'm not sure what is causing these bugs. My own machine is very similar to yours. Latest generation 13" Macbook. So OS X 10.8.2, 64 bit. I can bootstrap GHC with LLVM 3.1 fine. As you said, for a lot of these tickets the LLVM version shouldn't matter, so I'm just not sure why you are getting these panics and I'm not. Especially as these panics are things I thought I had already dealt with 2 years ago... (as you've noted, there is code already in the backend to handle these cases, but it doesn't seem to trigger correctly on your machine but does fine on mine). If our machines were vastly different or you were cross compiling... argh! I'll look through your patches again then and try to see if I can figure anything out from them. Are you using homebrew on your machine? Hmmm... I don't have my OS X machine right now but I believe I may be using GHC 7.4.2. I'll check that and then try 7.6.1 if I'm indeed using 7.4.2. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: 7590 | Blocking: 7588, 7589 Related: #7571, #7575 | ---------------------------------+------------------------------------------ Comment(by thoughtpolice): I do use homebrew, but I do not ever manage my GHC installation with it. I just installed the 7.6.1 binary distribution from here: http://www.haskell.org/ghc/download_ghc_7_6_1#macosx_x86_64 I will get LLVM 3.1 working on my machine and see what happens when I bootstrap with it instead. I can also get 7.4.2 working and see if we can narrow a specific configuration down. Also, did you give a shot to the test I wrote in #7571 ? It should just be as easy as `ghc-stage2 -c -fllvm T7571.cmm` - This test should really be the ultimate clear case of whether one of these bugs does in fact manifest or not. It's worth noting that #7571 will ''only'' manifest during a bootstrap when the RTS is built in profiling mode, due to `#ifdef`ery in the Cmm files. I don't think you can avoid a profiling version of the RTS, but it's worth keeping in mind. I believe I also have an extra 'debug' patch for this ticket in my personal repository, that output a lot of helpful debugging info in tracking this down. I might dust it off, attach it, and step through what I see. Also if you'd like to contact me directly via email or through more instantaneous means (like an instant messaging system or IRC or something) to help sort it out quicker, you're more than free to (I'm CC'd on this ticket, so you can see my email there.) -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: 7590 | Blocking: 7588, 7589 Related: #7571, #7575 | ---------------------------------+------------------------------------------ Comment(by dterei): Hmm... OK. So the profiling mode is probably a difference. Do the other tickets depend on this? The other is GHC 7.6.1, I am indeed using 7.4.2. I'll run the test cases you supplied and see. However right now I get a GHC build that is built with -fasm, passing the testsuite for llvmopt 100% and can bootstrap GHC fine with LLVM. This is using the 'prof' build of GHC haven't tested others. I'll try a little more to replicate and if I can't then sure, may be worth chatting more directly. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ---------------------------------+------------------------------------------ Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Keywords: | Os: MacOS X Architecture: Unknown/Multiple | Failure: Building GHC failed Difficulty: Unknown | Testcase: Blockedby: 7590 | Blocking: 7588, 7589 Related: #7571, #7575 | ---------------------------------+------------------------------------------ Comment(by thoughtpolice): I need the patch in #7571 in order to get to this point on my OS X machine. #7575 is ARM (or 32bit) only it seems, so while related I think we can rule it out for the moment. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#7580: Building PrimOps.cmm on OS X with LLVM 3.2 fails ----------------------------------+----------------------------------------- Reporter: thoughtpolice | Owner: thoughtpolice Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.7 Resolution: fixed | Keywords: Os: MacOS X | Architecture: Unknown/Multiple Failure: Building GHC failed | Difficulty: Unknown Testcase: | Blockedby: Blocking: 7588, 7590 | Related: #7571, #7575 ----------------------------------+----------------------------------------- Changes (by dterei): * status: patch => closed * resolution: => fixed Comment: Closing as this bug only manifested due to the fix for #7571 not being correct. A new fix for that and #7575 means that the existing logic for detecting this should fire correctly. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7580#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC