Right, it is compiler effects at this boundary that I'm worried about: values that are not re-read from memory after the changes have been made. I'm not worried about memory effects or data races.

On Fri, Oct 28, 2016 at 3:02 AM, Simon Marlow <marlowsd@gmail.com> wrote:
Hi Ryan, I don't think that's the issue.  Those variables can only be modified in setNumCapabilities, which acquires *all* the capabilities before it makes any changes.  There should be no other threads running RTS code(*) while we change the number of capabilities.  In particular we shouldn't be in releaseGCThreads while enabled_capabilities is being changed.

(*) Well, except for the parts at the boundary with the external world that run without a capability, such as rts_lock(), which acquires a capability.

Cheers
Simon

On 27 Oct 2016 17:10, "Ryan Yates" <fryguybob@gmail.com> wrote:
Briefly looking at the code, it seems like several of the global variables involved should be volatile: n_capabilities, enabled_capabilities, and capabilities.  Perhaps in a loop like the one in scheduleDoGC the compiler moves the reads of n_capabilities or capabilities outside the loop.  After a failed requestSync, that loop would then not see updated values for those globals.  That particular loop isn't being optimized that way for me, but I think it could happen without volatile.
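
To make the concern concrete, here is a minimal sketch of the pattern, not the actual RTS code. The names n_capabilities_plain, n_capabilities_volatile, try_sync_fn, retry_plain, retry_volatile and resize_on_first_attempt are all hypothetical stand-ins invented for this illustration. The only difference between the two loops is whether the global is volatile-qualified, which is what obliges the compiler to issue a fresh load on every iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RTS global; the real declarations differ. */
unsigned int n_capabilities_plain = 2;
volatile unsigned int n_capabilities_volatile = 2;

/* Hypothetical callback standing in for a requestSync-like attempt:
 * returns true once the sync has been acquired. */
typedef bool (*try_sync_fn)(void);

/* Plain global: if the compiler can prove that nothing in the loop body
 * writes the variable (for example after inlining the callee), it is
 * allowed to load it once and reuse that value on every iteration. */
unsigned int retry_plain(try_sync_fn try_sync)
{
    unsigned int n;
    assert(try_sync != NULL);
    do {
        n = n_capabilities_plain;      /* this load may be hoisted */
    } while (!try_sync());
    return n;
}

/* Volatile global: every evaluation is a real load, so a retry after a
 * failed attempt observes whatever value is in memory at that point. */
unsigned int retry_volatile(try_sync_fn try_sync)
{
    unsigned int n;
    assert(try_sync != NULL);
    do {
        n = n_capabilities_volatile;   /* reloaded on every iteration */
    } while (!try_sync());
    return n;
}

/* Demo helper (hypothetical): the first attempt fails and, as a side
 * effect, changes the capability count; later attempts succeed. */
static int attempts = 0;

bool resize_on_first_attempt(void)
{
    if (attempts++ == 0) {
        n_capabilities_plain = 4;
        n_capabilities_volatile = 4;
        return false;
    }
    return true;
}
```

In a single-threaded run like the demo helper above, both loops must return the updated count, because the write happens in the same thread and the compiler has to honor it. The difference only matters when the write comes from somewhere the compiler cannot see. Note also that volatile only constrains the compiler's handling of the loads; it does not add any inter-thread ordering.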

Ryan

On Thu, Oct 27, 2016 at 9:18 AM, Ben Gamari <ben@smart-cactus.org> wrote:
Simon Marlow <marlowsd@gmail.com> writes:

> I haven't been able to reproduce the failure yet. :(
>
Indeed, I've also not seen it in my own local builds. It's quite a
fragile failure.

Cheers,

- Ben


_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs