
#12469: Memory fence on writeIORef missing on ARM -------------------------------------+------------------------------------ Reporter: rrnewton | Owner: Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: memory model Operating System: Unknown/Multiple | Architecture: arm Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------ Description changed by rrnewton: @@ -1,1 +1,1 @@ - The memory model question has been debated now and again. his + The memory model question has been debated now and again. This @@ -15,1 +15,2 @@ - ```Haskell + {{{ + #!haskell @@ -32,1 +33,1 @@ - ``` + }}} @@ -37,1 +38,2 @@ - ```C + {{{ + #!C @@ -59,1 +61,0 @@ - ``` @@ -61,1 +62,0 @@ - ```C @@ -126,1 +126,1 @@ - ``` + }}} @@ -135,1 +135,2 @@ - ```C + {{{ + #!C @@ -145,1 +146,1 @@ - ``` + }}} New description: The memory model question has been debated now and again. This thread from ten years back (https://mail.haskell.org/pipermail/haskell-prime/2006-April/001237.html) lays out the basic situation with thunk update, writeIORef, and memory fences. But we recently began experimenting with GHC on ARM platforms, and it seems to lack a memory fence that the participants in the cited thread expect it to have. Here's an attempt to construct a program which writes fields of a data structure, and then writes the pointer to that structure to an IORef, without the proper fence inbetween: {{{ #!haskell import Data.IORef import Control.Concurrent data Foo = Foo Int deriving Show {-# NOINLINE mkfoo #-} mkfoo x = Foo x {-# NOINLINE dowrite #-} dowrite r n = writeIORef r $! mkfoo n main = do r <- newIORef (Foo 3) forkIO (dowrite r 4) x <- readIORef r print x }}} Here's the relevant bits of the CMM that results when compiled on an ARM 64 machine: {{{ #!C mkfoo_rn1_entry() // [] { [] } {offset c40i: P64[MainCapability+872] = P64[MainCapability+872] + 16; if (P64[MainCapability+872] > I64[MainCapability+880]) goto c40m; else goto c40l; c40m: I64[MainCapability+928] = 16; P64[MainCapability+24] = mkfoo_rn1_closure; call (I64[MainCapability+16])(R1) args: 16, res: 0, upd: 8; c40l: I64[P64[MainCapability+872] - 8] = Foo_con_info; P64[P64[MainCapability+872]] = P64[I64[MainCapability+856]]; P64[MainCapability+24] = P64[MainCapability+872] - 7; I64[MainCapability+856] = I64[MainCapability+856] + 8; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } dowrite_entry() // [] { [] } {offset c44j: call a_r3Dy_entry() args: 24, res: 0, upd: 8; } } a_r3Dy_entry() // [R1] { [] } {offset c41D: if (I64[MainCapability+856] - 16 < I64[MainCapability+864]) goto c41H; else goto c41I; c41H: P64[MainCapability+24] = a_r3Dy_closure; call (I64[MainCapability+16])(R1) args: 24, res: 0, upd: 8; c41I: I64[I64[MainCapability+856] - 8] = block_c41B_info; P64[I64[MainCapability+856] - 16] = P64[I64[MainCapability+856] + 8]; I64[MainCapability+856] = I64[MainCapability+856] - 16; call mkfoo_rn1_entry() args: 16, res: 8, upd: 8; } } block_c41B_entry() // [R1] { [] } {offset c41B: _s3Ep::P64 = P64[I64[MainCapability+856] + 8]; I64[I64[MainCapability+856] + 8] = block_c41G_info; _s3Es::P64 = P64[MainCapability+24]; P64[MainCapability+24] = _s3Ep::P64; P64[I64[MainCapability+856] + 16] = _s3Es::P64; I64[MainCapability+856] = I64[MainCapability+856] + 8; if (P64[MainCapability+24] & 7 != 0) goto u41S; else goto c41K; u41S: call block_c41G_entry(R1) args: 0, res: 0, upd: 0; c41K: call (I64[I64[P64[MainCapability+24]]])(R1) args: 8, res: 8, upd: 8; } } block_c41G_entry() // [R1] { [] } {offset c41G: _s3Ev::P64 = P64[P64[MainCapability+24] + 7]; P64[_s3Ev::P64 + 8] = P64[I64[MainCapability+856] + 8]; call "ccall" arg hints: [PtrHint, PtrHint] result hints: [] dirty_MUT_VAR(MainCapability+24, _s3Ev::P64); P64[MainCapability+24] = ()_closure+1; I64[MainCapability+856] = I64[MainCapability+856] + 16; call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8; } } }}} The fence should happen before the write of the pointer into the IORef. I can't find the fence, and can't find a codepath in the compiler that would insert it (i.e. with MO_WriteBarrier). `dirty_MUT_VAR` is actually too late to perform the fence, but it doesn't either: {{{ #!C void dirty_MUT_VAR(StgRegTable *reg, StgClosure *p) { Capability *cap = regTableToCapability(reg); if (p->header.info == &stg_MUT_VAR_CLEAN_info) { p->header.info = &stg_MUT_VAR_DIRTY_info; recordClosureMutated(cap,p); } } }}} (Neither does `recordClosureMutated`.) -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12469#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler