[GHC] #12469: Memory fence on writeIORef missing on ARM

#12469: Memory fence on writeIORef missing on ARM ------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Keywords: memory model | Operating System: Unknown/Multiple Architecture: arm | Type of failure: None/Unknown Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: ------------------------------------+------------------------------------- The memory model question has been debated now and again. his thread from ten years back (https://mail.haskell.org/pipermail/haskell-prime/2006-April/001237.html) lays out the basic situation with thunk update, writeIORef, and memory fences. But we recently began experimenting with GHC on ARM platforms, and it seems to lack a memory fence that the participants in the cited thread expect it to have. Here's an attempt to construct a program which writes fields of a data structure, and then writes the pointer to that structure to an IORef, without the proper fence inbetween: ```Haskell import Data.IORef import Control.Concurrent data Foo = Foo Int deriving Show {-# NOINLINE mkfoo #-} mkfoo x = Foo x {-# NOINLINE dowrite #-} dowrite r n = writeIORef r $! mkfoo n main = do r <- newIORef (Foo 3) forkIO (dowrite r 4) x <- readIORef r print x ``` Here's the relevant bits of the CMM that results when compiled on an ARM 64 machine: ```C mkfoo_rn1_entry() // [] { [] } {offset c40i: P64[MainCapability+872] = P64[MainCapability+872] + 16; if (P64[MainCapability+872] > I64[MainCapability+880]) goto c40m; else goto c40l; c40m: I64[MainCapability+928] = 16; P64[MainCapability+24] = mkfoo_rn1_closure; call (I64[MainCapability+16])(R1) args: 16, res: 0, upd: 8; c40l: I64[P64[MainCapability+872] - 8] = Foo_con_info; P64[P64[MainCapability+872]] = P64[I64[MainCapability+856]]; P64[MainCapability+24] = P64[MainCapability+872] - 7; I64[MainCapability+856] = I64[MainCapability+856] + 8; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } ``` ```C dowrite_entry() // [] { [] } {offset c44j: call a_r3Dy_entry() args: 24, res: 0, upd: 8; } } a_r3Dy_entry() // [R1] { [] } {offset c41D: if (I64[MainCapability+856] - 16 < I64[MainCapability+864]) goto c41H; else goto c41I; c41H: P64[MainCapability+24] = a_r3Dy_closure; call (I64[MainCapability+16])(R1) args: 24, res: 0, upd: 8; c41I: I64[I64[MainCapability+856] - 8] = block_c41B_info; P64[I64[MainCapability+856] - 16] = P64[I64[MainCapability+856] + 8]; I64[MainCapability+856] = I64[MainCapability+856] - 16; call mkfoo_rn1_entry() args: 16, res: 8, upd: 8; } } block_c41B_entry() // [R1] { [] } {offset c41B: _s3Ep::P64 = P64[I64[MainCapability+856] + 8]; I64[I64[MainCapability+856] + 8] = block_c41G_info; _s3Es::P64 = P64[MainCapability+24]; P64[MainCapability+24] = _s3Ep::P64; P64[I64[MainCapability+856] + 16] = _s3Es::P64; I64[MainCapability+856] = I64[MainCapability+856] + 8; if (P64[MainCapability+24] & 7 != 0) goto u41S; else goto c41K; u41S: call block_c41G_entry(R1) args: 0, res: 0, upd: 0; c41K: call (I64[I64[P64[MainCapability+24]]])(R1) args: 8, res: 8, upd: 8; } } block_c41G_entry() // [R1] { [] } {offset c41G: _s3Ev::P64 = P64[P64[MainCapability+24] + 7]; P64[_s3Ev::P64 + 8] = P64[I64[MainCapability+856] + 8]; call "ccall" arg hints: [PtrHint, PtrHint] result hints: [] dirty_MUT_VAR(MainCapability+24, _s3Ev::P64); P64[MainCapability+24] = ()_closure+1; I64[MainCapability+856] = I64[MainCapability+856] + 16; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } ``` The fence should happen before the write of the pointer into the IORef. I can't find the fence, and can't find a codepath in the compiler that would insert it (i.e. with MO_WriteBarrier). `dirty_MUT_VAR` is actually too late to perform the fence, but it doesn't either: ```C void dirty_MUT_VAR(StgRegTable *reg, StgClosure *p) { Capability *cap = regTableToCapability(reg); if (p->header.info == &stg_MUT_VAR_CLEAN_info) { p->header.info = &stg_MUT_VAR_DIRTY_info; recordClosureMutated(cap,p); } } ``` (Neither does `recordClosureMutated`.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------ Reporter: rrnewton | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: arm Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------ Description changed by rrnewton: @@ -1,1 +1,1 @@ - The memory model question has been debated now and again. his + The memory model question has been debated now and again. This @@ -15,1 +15,2 @@ - ```Haskell + {{{ + #!haskell @@ -32,1 +33,1 @@ - ``` + }}} @@ -37,1 +38,2 @@ - ```C + {{{ + #!C @@ -59,1 +61,0 @@ - ``` @@ -61,1 +62,0 @@ - ```C @@ -126,1 +126,1 @@ - ``` + }}} @@ -135,1 +135,2 @@ - ```C + {{{ + #!C @@ -145,1 +146,1 @@ - ``` + }}} New description: The memory model question has been debated now and again. This thread from ten years back (https://mail.haskell.org/pipermail/haskell-prime/2006-April/001237.html) lays out the basic situation with thunk update, writeIORef, and memory fences. But we recently began experimenting with GHC on ARM platforms, and it seems to lack a memory fence that the participants in the cited thread expect it to have. Here's an attempt to construct a program which writes fields of a data structure, and then writes the pointer to that structure to an IORef, without the proper fence inbetween: {{{ #!haskell import Data.IORef import Control.Concurrent data Foo = Foo Int deriving Show {-# NOINLINE mkfoo #-} mkfoo x = Foo x {-# NOINLINE dowrite #-} dowrite r n = writeIORef r $! mkfoo n main = do r <- newIORef (Foo 3) forkIO (dowrite r 4) x <- readIORef r print x }}} Here's the relevant bits of the CMM that results when compiled on an ARM 64 machine: {{{ #!C mkfoo_rn1_entry() // [] { [] } {offset c40i: P64[MainCapability+872] = P64[MainCapability+872] + 16; if (P64[MainCapability+872] > I64[MainCapability+880]) goto c40m; else goto c40l; c40m: I64[MainCapability+928] = 16; P64[MainCapability+24] = mkfoo_rn1_closure; call (I64[MainCapability+16])(R1) args: 16, res: 0, upd: 8; c40l: I64[P64[MainCapability+872] - 8] = Foo_con_info; P64[P64[MainCapability+872]] = P64[I64[MainCapability+856]]; P64[MainCapability+24] = P64[MainCapability+872] - 7; I64[MainCapability+856] = I64[MainCapability+856] + 8; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } dowrite_entry() // [] { [] } {offset c44j: call a_r3Dy_entry() args: 24, res: 0, upd: 8; } } a_r3Dy_entry() // [R1] { [] } {offset c41D: if (I64[MainCapability+856] - 16 < I64[MainCapability+864]) goto c41H; else goto c41I; c41H: P64[MainCapability+24] = a_r3Dy_closure; call (I64[MainCapability+16])(R1) args: 24, res: 0, upd: 8; c41I: I64[I64[MainCapability+856] - 8] = block_c41B_info; P64[I64[MainCapability+856] - 16] = P64[I64[MainCapability+856] + 8]; I64[MainCapability+856] = I64[MainCapability+856] - 16; call mkfoo_rn1_entry() args: 16, res: 8, upd: 8; } } block_c41B_entry() // [R1] { [] } {offset c41B: _s3Ep::P64 = P64[I64[MainCapability+856] + 8]; I64[I64[MainCapability+856] + 8] = block_c41G_info; _s3Es::P64 = P64[MainCapability+24]; P64[MainCapability+24] = _s3Ep::P64; P64[I64[MainCapability+856] + 16] = _s3Es::P64; I64[MainCapability+856] = I64[MainCapability+856] + 8; if (P64[MainCapability+24] & 7 != 0) goto u41S; else goto c41K; u41S: call block_c41G_entry(R1) args: 0, res: 0, upd: 0; c41K: call (I64[I64[P64[MainCapability+24]]])(R1) args: 8, res: 8, upd: 8; } } block_c41G_entry() // [R1] { [] } {offset c41G: _s3Ev::P64 = P64[P64[MainCapability+24] + 7]; P64[_s3Ev::P64 + 8] = P64[I64[MainCapability+856] + 8]; call "ccall" arg hints: [PtrHint, PtrHint] result hints: [] dirty_MUT_VAR(MainCapability+24, _s3Ev::P64); P64[MainCapability+24] = ()_closure+1; I64[MainCapability+856] = I64[MainCapability+856] + 16; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } }}} The fence should happen before the write of the pointer into the IORef. I can't find the fence, and can't find a codepath in the compiler that would insert it (i.e. with MO_WriteBarrier). `dirty_MUT_VAR` is actually too late to perform the fence, but it doesn't either: {{{ #!C void dirty_MUT_VAR(StgRegTable *reg, StgClosure *p) { Capability *cap = regTableToCapability(reg); if (p->header.info == &stg_MUT_VAR_CLEAN_info) { p->header.info = &stg_MUT_VAR_DIRTY_info; recordClosureMutated(cap,p); } } }}} (Neither does `recordClosureMutated`.) -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------ Reporter: rrnewton | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: arm Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------ Description changed by rrnewton: @@ -35,1 +35,1 @@ - Here's the relevant bits of the CMM that results when compiled on an + Here are the relevant bits of the CMM that results when compiled on an New description: The memory model question has been debated now and again. This thread from ten years back (https://mail.haskell.org/pipermail/haskell-prime/2006-April/001237.html) lays out the basic situation with thunk update, writeIORef, and memory fences. But we recently began experimenting with GHC on ARM platforms, and it seems to lack a memory fence that the participants in the cited thread expect it to have. Here's an attempt to construct a program which writes fields of a data structure, and then writes the pointer to that structure to an IORef, without the proper fence inbetween: {{{ #!haskell import Data.IORef import Control.Concurrent data Foo = Foo Int deriving Show {-# NOINLINE mkfoo #-} mkfoo x = Foo x {-# NOINLINE dowrite #-} dowrite r n = writeIORef r $! mkfoo n main = do r <- newIORef (Foo 3) forkIO (dowrite r 4) x <- readIORef r print x }}} Here are the relevant bits of the CMM that results when compiled on an ARM 64 machine: {{{ #!C mkfoo_rn1_entry() // [] { [] } {offset c40i: P64[MainCapability+872] = P64[MainCapability+872] + 16; if (P64[MainCapability+872] > I64[MainCapability+880]) goto c40m; else goto c40l; c40m: I64[MainCapability+928] = 16; P64[MainCapability+24] = mkfoo_rn1_closure; call (I64[MainCapability+16])(R1) args: 16, res: 0, upd: 8; c40l: I64[P64[MainCapability+872] - 8] = Foo_con_info; P64[P64[MainCapability+872]] = P64[I64[MainCapability+856]]; P64[MainCapability+24] = P64[MainCapability+872] - 7; I64[MainCapability+856] = I64[MainCapability+856] + 8; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } dowrite_entry() // [] { [] } {offset c44j: call a_r3Dy_entry() args: 24, res: 0, upd: 8; } } a_r3Dy_entry() // [R1] { [] } {offset c41D: if (I64[MainCapability+856] - 16 < I64[MainCapability+864]) goto c41H; else goto c41I; c41H: P64[MainCapability+24] = a_r3Dy_closure; call (I64[MainCapability+16])(R1) args: 24, res: 0, upd: 8; c41I: I64[I64[MainCapability+856] - 8] = block_c41B_info; P64[I64[MainCapability+856] - 16] = P64[I64[MainCapability+856] + 8]; I64[MainCapability+856] = I64[MainCapability+856] - 16; call mkfoo_rn1_entry() args: 16, res: 8, upd: 8; } } block_c41B_entry() // [R1] { [] } {offset c41B: _s3Ep::P64 = P64[I64[MainCapability+856] + 8]; I64[I64[MainCapability+856] + 8] = block_c41G_info; _s3Es::P64 = P64[MainCapability+24]; P64[MainCapability+24] = _s3Ep::P64; P64[I64[MainCapability+856] + 16] = _s3Es::P64; I64[MainCapability+856] = I64[MainCapability+856] + 8; if (P64[MainCapability+24] & 7 != 0) goto u41S; else goto c41K; u41S: call block_c41G_entry(R1) args: 0, res: 0, upd: 0; c41K: call (I64[I64[P64[MainCapability+24]]])(R1) args: 8, res: 8, upd: 8; } } block_c41G_entry() // [R1] { [] } {offset c41G: _s3Ev::P64 = P64[P64[MainCapability+24] + 7]; P64[_s3Ev::P64 + 8] = P64[I64[MainCapability+856] + 8]; call "ccall" arg hints: [PtrHint, PtrHint] result hints: [] dirty_MUT_VAR(MainCapability+24, _s3Ev::P64); P64[MainCapability+24] = ()_closure+1; I64[MainCapability+856] = I64[MainCapability+856] + 16; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } }}} The fence should happen before the write of the pointer into the IORef. I can't find the fence, and can't find a codepath in the compiler that would insert it (i.e. with MO_WriteBarrier). `dirty_MUT_VAR` is actually too late to perform the fence, but it doesn't either: {{{ #!C void dirty_MUT_VAR(StgRegTable *reg, StgClosure *p) { Capability *cap = regTableToCapability(reg); if (p->header.info == &stg_MUT_VAR_CLEAN_info) { p->header.info = &stg_MUT_VAR_DIRTY_info; recordClosureMutated(cap,p); } } }}} (Neither does `recordClosureMutated`.) -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by trommler): * cc: erikd, trommler (added) * architecture: arm => Unknown/Multiple Comment: IIUC, this is also an issue on PowerPC. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: new => patch * differential: => Phab:D2495 Comment: rrnewton, is Phab:D2495 what you were looking for, perhaps? It's just a guess. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonmar): * cc: simonmar (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * cc: RyanGlScott (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Comment (by rrnewton): That looks like the right kind of thing, but it needs to be conditional -- only on the ARM backend. We wouldn't want to slow down x86. Ideally, it would only apply in `-threaded` mode to boot, but my current understanding is that we only make the distinction at the final link phase (which runtime to link), not as a different "way" during compilation. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): `MO_WriteBarrier` compiles to no code on x86, but it prevents writes from being reordered by earlier optimisations, which is exactly what we want. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM
-------------------------------------+-------------------------------------
Reporter: rrnewton | Owner:
Type: bug | Status: patch
Priority: normal | Milestone: 8.2.1
Component: Compiler | Version: 8.0.1
Resolution: | Keywords: memory model
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D2495
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: merge Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => merge * milestone: 8.2.1 => 8.0.2 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: merge => closed * resolution: => fixed Comment: Merged to `ghc-8.0` as 7053019e7b04842dd7364039381d8c4c069489a2. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Comment (by rrnewton): Huh, I think the title of this issue was actually too narrow. We need the same barrier on array writes as well as MutVar writes, don't we? Re-open? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: memory model Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: trommler Type: bug | Status: new Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | -------------------------------------+------------------------------------- Changes (by trommler): * owner: => trommler Comment: It seems there is a write barrier missing here in `StgCmmPrim.hs`: {{{ emit (setInfo addr (CmmLit (CmmLabel mkMAP_DIRTY_infoLabel))) -- the write barrier. We must write a byte into the mark table: -- bits8[a + header_size + StgMutArrPtrs_size(a) + x >> N] emit $ mkStore ( }}} I am going to add it an see what it is doing to #12537. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: trommler Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Changes (by trommler): * status: new => patch * differential: Phab:D2495 => Phab:D2495 Phab:D2525 Comment: With Phab:D2525 I have not seen the panic in `mkFastStringWith` described in #12537 anymore. I rebuilt Stackage nightly twice on a powerpce64le. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

With Phab:D2525 I have not seen the panic in `mkFastStringWith` described in #12537 anymore. I rebuilt Stackage nightly twice on a
#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: trommler Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:16 trommler]: powerpce64le. Another rebuild yielded one panic in `mkFastStringWith`. So it was a bit to early to declare victory :-( The situation has improved though. Previously I saw a handful of packages out of 1900 fail with that panic and now it is only one in around 4000 builds. My theory is that the write barrier is not sufficient in a memory model like PowerPC. The write barrier makes sure that the writes before it were performed in main memory before the barrier. But a different processor might still have stale values in its own caches and thus see an inconsistent state. We could insert read (load-load) barriers between the pointer access and data access but that might not be sufficient either. Think of a stale pointer cached in a processor cache that will be loaded before the read barrier will lead to an inconsistent state if data after the barrier comes from main memory. We need to define a semantics for MutVar/Array and then add the necessary barriers, which might compile to nothing depending on the memory model of the target platform. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: trommler Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Another Stackage Nightly build with ghc 8.0.1 patched with Phab:D2495 and Phab:D2525: {{{ ghc: internal error: MUT_VAR_CLEAN object entered! (GHC version 8.0.1 for powerpc64le_unknown_linux) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug }}} This error seems to be even more difficult to trigger on Open Build Service. I have not seen this locally on my PowerMac (running powerpc64 Linux). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Changes (by trommler): * owner: trommler => Comment: Hmm, and in another package: {{{ ghc: internal error: ARR_WORDS object entered! (GHC version 8.0.1 for powerpc64le_unknown_linux) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug }}} It looks like quite a few more memory barriers are missing. I don't have enough time right now to work on this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by erikd): @trommler, how are you triggering these errors? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

@trommler, how are you triggering these errors? Unfortunately, I cannot directly observe these errors. I see them on Open(SUSE) Build Service where a build runs inside a qemu VM on a POWER8. The issue is very rare, 1700 packages build just fine and around 30 to 50 fail. Most of the failures are in `Setup`, which we compile for each
#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:20 erikd]: package, and very few (less than 3) fail with a panic. Rebuilding the package works with high probability. On my PowerMac quad core I see a segfault in `./Setup build -v -j4` which is what is reported in #12537. In that ticket a compiler panic in `mkFastStringWith` is reported but I have not been able to provoke a panic on my machine. Is it worth trying some more runs and hope to observe a panic to rule out that qemu is doing something odd? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by erikd): @trommler Yes, that may well be worth while. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:22 erikd]:
@trommler Yes, that may well be worth while. OK, I am working on it and will report back.
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:23 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:23 trommler]:
Replying to [comment:22 erikd]:
@trommler Yes, that may well be worth while. OK, I am working on it and will report back. So, I compiled all of Stackage 7.x on a POWER 8 LPAR instance with 8 CPUs, so no qemu involved and saw five packages fail with a panic in `mkFastStringWith`. A few more failed with segfaults in `Setup build -j8`.
I used ghc 8.0.1 patched with the two patches in this ticket. My theory is that `lwsync` is not a strong enough barrier for a write barrier. So far I thought a write barrier is the same as a store-store barrier and hence `lwsync` is the right choice on PowerPC. I could put a `sync` and see what happens but I would like to know what I am doing when using that big a hammer. So what is the semantics of a write barrier with respect to other processors/cores? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by simonmar): If a `sync` fixes it then that would be a useful data point. If not, it means there are barriers missing elsewhere. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:25 simonmar]:
If a `sync` fixes it then that would be a useful data point. If not, it means there are barriers missing elsewhere. Using `sync` the situation improves a lot on my PowerMac G5 quad. `Setup build -j4` still segfaults randomly but only occasionally. With `lwsync` it almost always segfaulted. An improvement but not a fix.
So I read the Power ISA specification again (especially Book II, Appendix B Programming Examples for Sharing Storage): It seems that even a `sync` on processor A is not sufficient to prevent reordering of loads on processor B. A load-load barrier is required when there is no data dependency between the two loads. I will look into that option over the weekend. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:26 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:25 simonmar]:
If a `sync` fixes it then that would be a useful data point. If not, it means there are barriers missing elsewhere. I had not seen your comment on moving the write barrier.
If that was indeed the problem then there is a data dependency and we don't have to worry about a load-load barrier on PowerPC (and ARM). I'll do a test build on a PowerPC 970MP and a POWER8. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:27 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:27 trommler]:
Replying to [comment:25 simonmar]:
If a `sync` fixes it then that would be a useful data point. If not, it means there are barriers missing elsewhere. I had not seen your comment on moving the write barrier.
If that was indeed the problem then there is a data dependency and we don't have to worry about a load-load barrier on PowerPC (and ARM).
I'll do a test build on a PowerPC 970MP and a POWER8. On POWER8 I still see a few panics in `mkFastStringWith` and on build service (qemu on POWER8 hardware) one panic with `ARR_WORDS object entered`. A couple of packages fail with segfaults in `Setup build`.
The PowerMac (PowerPC 970) build is still running. So far no panic but about the same number of packages as before fail in `Setup build`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:28 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by bgamari): trommler, would you like to see Phab:D2525 merged for 8.0.2? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:29 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by erikd): Even if @trommler doesn't, I would like to see it in 8.0.2. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:30 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: patch Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Comment (by trommler): Replying to [comment:29 bgamari]:
trommler, would you like to see Phab:D2525 merged for 8.0.2? Yes, please include it in 8.0.2.
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:31 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM
-------------------------------------+-------------------------------------
Reporter: rrnewton | Owner:
Type: bug | Status: patch
Priority: normal | Milestone: 8.0.2
Component: Compiler | Version: 8.0.1
Resolution: | Keywords: memory model
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking: 12537
Related Tickets: | Differential Rev(s): Phab:D2495
Wiki Page: | Phab:D2525
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12469: Memory fence on writes to MutVar/Array missing on ARM -------------------------------------+------------------------------------- Reporter: rrnewton | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: memory model Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: 12537 Related Tickets: | Differential Rev(s): Phab:D2495 Wiki Page: | Phab:D2525 -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => closed * resolution: => fixed Comment: I think comment:32 should have fixed this. Comment:32 was merged to `ghc-8.0` as 8008d27c09ce36f368ab562f3f821f666d99f519. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:33 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12469: Memory fence on writes to MutVar/Array missing on ARM
-------------------------------------+-------------------------------------
Reporter: rrnewton | Owner: (none)
Type: bug | Status: closed
Priority: normal | Milestone: 8.0.2
Component: Compiler | Version: 8.0.1
Resolution: fixed | Keywords: memory model
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking: 12537
Related Tickets: | Differential Rev(s): Phab:D2495
Wiki Page: | Phab:D2525
-------------------------------------+-------------------------------------
Comment (by Ömer Sinan Ağacan
participants (1)
-
GHC