
#15449: Nondeterministic Failure on aarch64 with -jn, n > 1
-------------------------------------+-------------------------------------
        Reporter:  tmobile           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Compiler          |              Version:  8.4.3
      Resolution:                    |             Keywords:
Operating System:  Linux             |         Architecture:  aarch64
 Type of failure:  Compile-time      |            Test Case:
  crash or panic                     |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by tmobile):

If this
[https://en.wikipedia.org/wiki/Memory_ordering#In_symmetric_multiprocessing_(... table]
is to be trusted, it looks like ARMv7 and PPC allow for the same sorts of
load/store reorderings. As for the difference between 32-bit and 64-bit
ARM, the only thing I can guess is that perhaps the smaller ARM chips have
much simpler instruction pipelines and don't necessarily perform the
allowed reorderings in practice? But I might be missing something. As for
x86, it does seem like we're getting away with this because of the stricter
memory model.

If I'm reading [https://llvm.org/docs/Atomics.html#sequentiallyconsistent this bit]
correctly, as a frontend, we shouldn't have to emit any fences to ensure
the semantics of `seq_cst`. If the target machine's memory model would
require a fence to encode a `load atomic ... seq_cst` then it's on `llc` to
emit it, not GHC, right? This seems to be contradicted by this equation in
`genCall`:

{{{
genCall (PrimTarget MO_WriteBarrier) _ _ = do
    platform <- getLlvmPlatform
    if platformArch platform `elem` [ArchX86, ArchX86_64, ArchSPARC]
       then return (nilOL, [])
       else barrier
}}}

Here we implement an arch-specific optimization on our own; I would've
expected LLVM to be responsible for that, not GHC.

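To make the question concrete, here is a rough standalone model of what the `MO_WriteBarrier` equation decides. It is only a sketch: `Arch`, `strongStoreOrder` and `writeBarrierStmts` are hypothetical names made up for illustration, not GHC's real types, and I'm assuming that `barrier` boils down to a single `fence seq_cst` statement in the generated LLVM IR.

```haskell
-- Simplified, hypothetical model of the MO_WriteBarrier equation.
-- None of these names exist in GHC; they only mirror its shape.

-- Stand-in for the architectures that platformArch can report.
data Arch = X86 | X86_64 | SPARC | ARMv7 | AArch64 | PPC
  deriving (Eq, Show)

-- Architectures for which the equation emits nothing at all.
strongStoreOrder :: [Arch]
strongStoreOrder = [X86, X86_64, SPARC]

-- LLVM IR statements (as plain text) emitted for a write barrier.
-- Assumption: `barrier` corresponds to a single `fence seq_cst`.
writeBarrierStmts :: Arch -> [String]
writeBarrierStmts arch
  | arch `elem` strongStoreOrder = []                -- fence elided by the frontend
  | otherwise                    = ["fence seq_cst"] -- explicit fence emitted
```

Loaded into GHCi, `writeBarrierStmts AArch64` gives `["fence seq_cst"]` while `writeBarrierStmts X86_64` gives `[]`. In this model the per-architecture decision sits in the frontend, which is exactly the part I would have expected `llc` to own.
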
I'm also a bit confused by this equation:

{{{
genCall (PrimTarget (MO_AtomicWrite _width)) [] [addr, val] = runStmtsDecls $ do
    addrVar <- exprToVarW addr
    valVar <- exprToVarW val
    let ptrTy = pLift $ getVarType valVar
        ptrExpr = Cast LM_Inttoptr addrVar ptrTy
    ptrVar <- doExprW ptrTy ptrExpr
    statement $ Expr $ AtomicRMW LAO_Xchg ptrVar valVar SyncSeqCst
}}}

I must be missing some trick here; why isn't this implemented with `store
atomic`? There isn't even a constructor for `store atomic` in
`LlvmExpression` in `compiler/llvmGen/Llvm/AbsSyn.hs`.

I think I'll try the sledgehammer approach of sticking fences before each
atomic read and after each atomic write; if the behavior improves at least
that's some evidence this is where the issue lies.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15449#comment:9
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler