
#15834: genSym is not thread safe with respect to setNumCapabilities
----------------------------------------+---------------------------------
           Reporter:  NeilMitchell      |             Owner:  (none)
               Type:  bug               |            Status:  new
           Priority:  normal            |         Milestone:
          Component:  Compiler          |           Version:  8.6.1
           Keywords:                    |  Operating System:  Linux
       Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
          Test Case:                    |        Blocked By:
           Blocking:                    |   Related Tickets:
Differential Rev(s):                    |         Wiki Page:
----------------------------------------+---------------------------------

In a large proprietary application using the GHC API, we observe really weird errors (e.g. overlapping instances for {{{Eq Foo}}} and {{{Eq Bar}}}, where {{{Foo}}} and {{{Bar}}} are completely unrelated and come from different modules). The pattern we follow is:

 * Run with the threaded RTS, 1 initial thread
 * Create a new unique supply with {{{mkSplitUniqSupply}}} and put it in an {{{MVar}}}
 * Repeat many times:
   * Set the thread count higher (e.g. 8) using {{{setNumCapabilities}}}
   * On many threads in parallel:
     * Split the supply held in the {{{MVar}}} with {{{splitUniqSupply}}} while holding the {{{MVar}}}, keep one half, and store the other half back in the {{{MVar}}}
     * Use the half we kept to interact with the GHC API
   * Set the thread count back to 1

Our observations are best explained by the unique names not being nearly as unique as one would expect. Reading the code for {{{genSym}}}:

{{{#!c
if (n_capabilities == 1) {
    GenSymCounter = (GenSymCounter + GenSymInc) & UNIQUE_MASK;
    checkUniqueRange(GenSymCounter);
    return GenSymCounter;
} else {
    HsInt n = atomic_inc((StgWord *)&GenSymCounter, GenSymInc) & UNIQUE_MASK;
    checkUniqueRange(n);
    return n;
}
}}}

It only does an {{{atomic_inc}}} if {{{n_capabilities != 1}}}, and it doesn't read {{{n_capabilities}}} atomically, so is it suffering a race? For example, a thread that still sees {{{n_capabilities == 1}}} could take the plain read-modify-write path while other capabilities are already incrementing the counter.

Our workaround was to set the thread count initially, before any interactions with the GHC API, which seems to solve the problem.

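
For illustration only, here is a minimal sketch of a counter that always takes the atomic path, so its correctness does not depend on when the capability count is read. It is not GHC's code and not a proposed patch: it assumes C11 {{{<stdatomic.h>}}} and POSIX threads instead of the RTS's {{{atomic_inc}}}, it omits {{{UNIQUE_MASK}}} and {{{checkUniqueRange}}}, and every name in it ({{{sketch_counter}}}, {{{gen_sym_sketch}}}, {{{sketch_all_unique}}}, and so on) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GenSymCounter and GenSymInc; not GHC's definitions. */
static _Atomic uint64_t sketch_counter = 0;
static const uint64_t sketch_inc = 1;

/* Always use an atomic read-modify-write, however many threads exist. */
uint64_t gen_sym_sketch(void)
{
    /* atomic_fetch_add returns the value held before the addition,
       so add the increment again to obtain the new counter value. */
    return atomic_fetch_add(&sketch_counter, sketch_inc) + sketch_inc;
}

enum { SKETCH_THREADS = 8, SKETCH_PER_THREAD = 10000 };

static uint64_t sketch_results[SKETCH_THREADS][SKETCH_PER_THREAD];

/* Each worker draws SKETCH_PER_THREAD values into its own row. */
void *sketch_worker(void *arg)
{
    uint64_t *out = arg;
    for (int i = 0; i < SKETCH_PER_THREAD; i++)
        out[i] = gen_sym_sketch();
    return NULL;
}

/* Draw values on several threads at once; return 1 if all are distinct, else 0.
   Assumes nothing else calls gen_sym_sketch while this function runs. */
int sketch_all_unique(void)
{
    pthread_t threads[SKETCH_THREADS];
    const size_t total = (size_t)SKETCH_THREADS * SKETCH_PER_THREAD;
    const uint64_t base = atomic_load(&sketch_counter);
    unsigned char *seen = calloc(total, 1);
    int ok = 1;

    assert(seen != NULL);
    for (int t = 0; t < SKETCH_THREADS; t++) {
        int rc = pthread_create(&threads[t], NULL, sketch_worker, sketch_results[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < SKETCH_THREADS; t++)
        pthread_join(threads[t], NULL);

    /* With an increment of 1, the values drawn must be exactly base+1 .. base+total. */
    for (int t = 0; t < SKETCH_THREADS; t++) {
        for (int i = 0; i < SKETCH_PER_THREAD; i++) {
            uint64_t idx = sketch_results[t][i] - base - sketch_inc;
            if (idx >= total || seen[idx])
                ok = 0;   /* out of range or seen twice */
            else
                seen[idx] = 1;
        }
    }
    free(seen);
    return ok;
}
```

Built with something like {{{cc -std=c11 -pthread}}}, {{{sketch_all_unique()}}} should return 1. Always paying for the atomic operation gives up the single-capability fast path, which is presumably why the original code branches on {{{n_capabilities}}}.
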
Alas, we don't have a reproducible test case, and in fact were unable to reproduce it anywhere but our Linux CI, and even then non-deterministically. The problem does not currently impact us (the workaround is robust), but it seemed worth sharing.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15834
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler