Matthew Pickering pushed to branch wip/fix-eventlog-flush-deadlock at Glasgow Haskell Compiler / GHC

Commits:
a6d6024e by Matthew Pickering at 2025-11-17T13:19:15+00:00
rts: Fix a deadlock with eventlog flush interval and RTS shutdown

The ghc_ticker thread attempts to flush at the eventlog tick interval,
this requires waiting to take all capabilities.

At the same time, the main thread is shutting down, the schedule is
stopped and then we wait for the ticker thread to finish.

Therefore we are deadlocked.

The solution is to use `newBoundTask/exitMyTask`, so that flushing can
cooperate with the scheduler shutdown.

Fixes #26573

- - - - -

2 changed files:

- rts/eventlog/EventLog.c
- testsuite/tests/rts/all.T

Changes:

=====================================
rts/eventlog/EventLog.c
=====================================
@@ -491,13 +491,7 @@ endEventLogging(void)
 
     eventlog_enabled = false;
 
-    // Flush all events remaining in the buffers.
-    //
-    // N.B. Don't flush if shutting down: this was done in
-    // finishCapEventLogging and the capabilities have already been freed.
-    if (getSchedState() != SCHED_SHUTTING_DOWN) {
-        flushEventLog(NULL);
-    }
+    flushEventLog(NULL);
 
     ACQUIRE_LOCK(&eventBufMutex);
 
@@ -1626,15 +1620,24 @@ void flushEventLog(Capability **cap USED_IF_THREADS)
         return;
     }
 
+    // N.B. Don't flush if shutting down: this was done in
+    // finishCapEventLogging and the capabilities have already been freed.
+    // This can also race against the shutdown if the flush is triggered by the
+    // ticker thread. (#26573)
+    if (getSchedState() == SCHED_SHUTTING_DOWN) {
+        return;
+    }
+
     ACQUIRE_LOCK(&eventBufMutex);
     printAndClearEventBuf(&eventBuf);
     RELEASE_LOCK(&eventBufMutex);
 
 #if defined(THREADED_RTS)
-    Task *task = getMyTask();
+    Task *task = newBoundTask();
     stopAllCapabilitiesWith(cap, task, SYNC_FLUSH_EVENT_LOG);
     flushAllCapsEventsBufs();
     releaseAllCapabilities(getNumCapabilities(), cap ? *cap : NULL, task);
+    exitMyTask();
 #else
     flushLocalEventsBuf(getCapability(0));
 #endif

=====================================
testsuite/tests/rts/all.T
=====================================
@@ -2,6 +2,11 @@ test('testblockalloc',
      [c_src, only_ways(['normal','threaded1']), extra_run_opts('+RTS -I0')],
      compile_and_run, [''])
 
+test('numeric_version_eventlog_flush',
+     [ignore_stdout],
+     run_command,
+     ['{compiler} --numeric-version +RTS -l --eventlog-flush-interval=1 -RTS'])
+
 test('testmblockalloc',
      [c_src, only_ways(['normal','threaded1']), extra_run_opts('+RTS -I0 -xr0.125T'),
       when(arch('wasm32'), skip)], # MBlocks can't be freed on wasm32, see Note [Megablock allocator on wasm] in rts

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/a6d6024e37fb6a6f50a4418956f7e29a...
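
For readers unfamiliar with the RTS, the early-return guard added to flushEventLog can be illustrated with a small standalone sketch. Everything below is hypothetical and is not RTS code: `sched_state`, `ticker`, `flush_event_log` and `run_shutdown_demo` are toy stand-ins built on POSIX threads and C11 atomics. The toy flush never blocks, so it shows only the ordering "check the shutdown state, then skip the flush"; it does not reproduce the deadlock itself, nor the `newBoundTask/exitMyTask` part of the fix.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the scheduler states; not the RTS definitions. */
enum { SCHED_RUNNING, SCHED_SHUTTING_DOWN };

static atomic_int  sched_state     = SCHED_RUNNING;
static atomic_bool ticker_stop     = false;
static atomic_int  flushes_done    = 0;
static atomic_int  flushes_skipped = 0;

/* Toy flush. A real flush would have to synchronise with every worker,
 * which cannot succeed once shutdown has started, so bail out early. */
static void flush_event_log(void)
{
    if (atomic_load(&sched_state) == SCHED_SHUTTING_DOWN) {
        atomic_fetch_add(&flushes_skipped, 1);
        return;
    }
    atomic_fetch_add(&flushes_done, 1);
}

/* Toy ticker thread: flushes on every iteration until asked to stop. */
static void *ticker(void *arg)
{
    (void)arg;
    while (!atomic_load(&ticker_stop)) {
        flush_event_log();
        sched_yield();
    }
    return NULL;
}

/* Simulated shutdown: mark the scheduler as shutting down, stop the
 * ticker and wait for it. Returns 0 if a flush attempted after shutdown
 * was skipped, 1 if it was wrongly performed, -1 on a thread error. */
int run_shutdown_demo(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, ticker, NULL) != 0) {
        return -1;
    }

    /* Let the ticker perform at least one ordinary flush first. */
    while (atomic_load(&flushes_done) == 0) {
        sched_yield();
    }

    atomic_store(&sched_state, SCHED_SHUTTING_DOWN);
    atomic_store(&ticker_stop, true);
    if (pthread_join(t, NULL) != 0) {
        return -1;
    }

    /* The ticker is gone, so this late flush is the only writer left. */
    int done_before = atomic_load(&flushes_done);
    flush_event_log();
    assert(atomic_load(&flushes_skipped) >= 1);
    return atomic_load(&flushes_done) == done_before ? 0 : 1;
}
```

To try it, call `run_shutdown_demo()` from a `main` function and build with a C11 compiler and `-pthread`. A flush that has already passed the state check when shutdown begins is not stopped by a guard of this kind, which matches the race the new comment in the diff mentions.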