[Git][ghc/ghc][wip/io-manager-deadlock-detection] 27 commits: Adjust the style of notification for I/O manager events

18 Nov 2025

      Duncan Coutts pushed to branch wip/io-manager-deadlock-detection at Glasgow Haskell Compiler / GHC

Commits:
da64c1ab by Duncan Coutts at 2025-11-17T21:11:07+00:00
Adjust the style of notification for I/O manager events

For in-Haskell I/O managers, when we start up the RTS or use
forkProcess, we need to start the I/O manager threads. Similarly, as
part of setNumCapabilities the RTS needs to notify the I/O manager to
let it respond.

Previously, in both cases, we did this synchronously. Doing so starts
the thread and runs the schduler until the thread finishes. In general,
running a thread synchronously in this way, using rts_evalIO, might
change the capability (due to GC, thread migration etc). This makes the
capability be an in/out parameter which is a bit of a wart in the API
for the I/O manager.

Fortunately, it's unnecessary.

During RTS startup and forkProcess, no other threads would be running
at the same time, and thus we can use rts_evalIO with the extra
assertion that in fact the capability does not change.

For setNumCapabilities on the other hand, other threads are running
so we can't use the same trick. But setNumCapabilities already has to
cope with an inherent race condition. So it does not need to be
synchronous, it just needs the I/O manager thread to be started. By
making the notification for setNumCapabilities be async (start the
thread but do not wait), we can again make the capability be a simple
in-param, not in/out.

This makes the I/O manager API simpler and more uniform.

- - - - -
76ebe9d8 by Duncan Coutts at 2025-11-17T22:00:11+00:00
Move THREADED_RTS-conditional struct members to end of Capability

Accessing members of the Capability struct from CMM code rely on
accessor macros. (The macros are generated by deriveConstants).

These macros have a single definition. This means that the offsets of
all struct members must *not* vary based on THREADED_RTS vs
!THREADED_RTS. This requires that any struct members that are
conditional on THREADED_RTS must occur after the unconditional struct
members. Hence we move all the ones that are conditional on
THREADED_RTS to the end.

Add a deriveConstants entry for the iomgr member of the Capability
struct, which was the motivation for this change.

Add warning messages to help our future selves. Debugging this took me
a couple hours in gdb!

- - - - -
1c33adc8 by Duncan Coutts at 2025-11-17T22:10:54+00:00
Make the IOManager API use CapIOManager rather than Capability

This makes the API somewhat more self-contained and more consistent.
Now the IOManager API and each of the backends takes just the I/O
manager structure. Previously we had a bit of a mixture, depending on
whether the function needed access to the Capability or just the
CapIOManager.

We still need access to the cap, so we introduce a back reference to
reach the capability, via iomgr->cap.

Convert all uses in select and poll backends, but not win32 ones.
Convert callers in the scheduler and elsewhere.

Also convert the three CMM primops that call IOManager APIs. They just
need to use Capability_iomgr(MyCapability()).

- - - - -
3f826512 by Duncan Coutts at 2025-11-17T22:13:01+00:00
Split posix/MIO.c out of posix/Signals.c

The MIO I/O manager was secretly living inside the Signals file.

Now it gets its own file, like any other self-respecting I/O manager.

- - - - -
e3d64554 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Rationalise some scheduler run queue utilities

Move them all to the same place in the file.

Make some static that were used only internally.

Also remove a redundant assignment after calling truncateRunQueue that
is already done within truncateRunQueue.

- - - - -
756776f1 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Rename initIOManager{AfterFork} to {re}startIOManager

These are more accurate names, since these actions happen after
initialisation and are really about starting (or restarting) background
threads.

- - - - -
bb775493 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Add a TODO to the MIO I/O manager

The direction of travel is to make I/O managers per-capability and have
all their state live in the struct CapIOManager. The MIO I/O manager
however still has a number of global variables.

It's not obvious how handle these globals however.

- - - - -
32d38f80 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Free per-cap I/O managers during shutdown and forkProcess

Historically this was not strictly necessary. The select and win32
legacy I/O managers did not maintain any dynamically allocated
resources. The new poll one does (an auxillary table), and so this
should be freed.

After forkProcess, all threads get deleted. This includes threads
waiting on I/O or timers. So as of this patch, resetting the I/O
manager is just about tidying things up. For example, for the poll
I/O manager this will reset the size of the AIOP table (which
otherwise grows but never shrinks).

In future however the re-initialising will become neeecessary for
functionality, since some I/O managers will need to re-initialise
wakeup fds that are set CLOEXEC.

- - - - -
731e40c7 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Add an FdWakup module for posix I/O managers

This will be used to implement wakeupIOManager for in-RTS I/O managers.

It provides a notification/wakeup mechanism using FDs, suitable for
situations when I/O managers are blocked on a set of fds anyway.

- - - - -
5c54ddf4 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Add wakeupIOManager support for select I/O manager

Uses the FdWakup mechanism.

- - - - -
27947809 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Add wakeupIOManager support for poll I/O manager

Uses the FdWakup mechanism.

A quirk we have to cope with is that we now need to poll one more fd --
the wakeup_fd_r -- but this fd has no corresponding entry in the
aiop_table. This is awkward since we have set up our aiop_poll_table to
be an auxilliary table with matching indicies.

The solution this patch uses (and described in the comments) is to have
two tables: struct pollfd *aiop_poll_table, *full_poll_table;
and to have the aiop_poll_table alias the tail of the full_poll_table.
The head entry in the full_poll_table is the extra fd. So we poll the
full_poll_table, while the aiop_poll_table still has matching indicies
with the aiop_table.

Hurrah for C aliasing rules.

- - - - -
2dcad70b by Duncan Coutts at 2025-11-17T22:20:33+00:00
Add wakeupIOManager support for win32 legacy I/O manager

- - - - -
d776d0c1 by Duncan Coutts at 2025-11-17T22:20:33+00:00
wakeupIOManager is now required for all I/O managers

We are going to rely on it. Previously it could be a no-op. Update the
docs in the header file.

Also, temporarily disable awaitCompletedTimeoutsOrIO post-condition
assertion. It will become more complicated due to wakeupIOManager, and
it's not yet clear how to express it.

We will re-introduce a post condition after a few more changes.

- - - - -
d45af267 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Make signal handling be a respondibility of the I/O manager(s)

Previously it was scattered between I/O managers and the scheduler, and
especially the scheduler's deadlock detection.

Previously the scheduler would poll for pending signals each iteration
of the scheduler loop. The scheduler also had some hairy signal
functionality in the deadlock detection: in the non-threaded RTS (only)
if there were still no threads running after deadlock detection then it
would block waiting for signals.

But signals can and (in my opinion) should be thought of as just a funny
kind of I/O, and thus should be a responsibility of the I/O manager.

So now we have the I/O managers poll for signals when they are polling
for I/O completion (and removing the separate poll in the scheduler).
And when I/O managers block waiting for I/O then they now also start
signal handlers if they get interrupted by a signal. Crucially, if there
is no pending I/O or timers, the awaitCompletedTimeoutsOrIO will still
block waiting for signals.

This patch puts us into an intermediate state: it temporarily breaks
deadlock detection in the non-threaded RTS. The waiting on I/O currently
happens before deadlock detection. This means we'll now wait forever on
signals before doing deadlock detection. We need to move waiting after
deadlock detection. We'll do that in a later patch.

- - - - -
a8c6d8ac by Duncan Coutts at 2025-11-17T22:20:33+00:00
Clean up signal handling internal API

Now that the I/O manager is responsible for signals, we can simplify the
API we present for signal handling.

We now just need startPendingSignalHandlers, which is called from the
I/O managers. We can get rid of awaitUserSignals. We also don't need
RtsSignals.h to re-export the platform-specific posix/Signals.h or
win32/ConsoleHandler.h

We can also hide more of the implementation of signals. Less has to be
exposed in posix/Signals.h or win32/ConsoleHandler.h. Partly this is
because we don't need inline functions (or macros) in the interface.

Also remove signal_handlers from RTS ABI exported symbols list. It does
not appear to have any users in the core libs, and its really an
internal implementation detail. It should not be exposed unless its
really necessary.

- - - - -
43460e3d by Duncan Coutts at 2025-11-17T22:20:33+00:00
In the scheduler, move I/O blocking after deadlock detection

To make deadlock detection effective in the non-threaded RTS when there
are deadlocked threads and other unrelated threads waiting on I/O, we
need to arrange to do deadlock detection before we block in scheduler
to wait on I/O.

The solution is to:

1. adjust scheduleFindWork, which runs before deadlock detection, to
   only poll for I/O and not block; and
2. add a step after deadlock detection to wait on I/O if there are
   still no threads to run (and there's any I/O or timeouts outstanding)

The scheduleCheckBlockedThreads is now so simple that it made more sense
to inline it into scheduleFindWork.

- - - - -
ddce158e by Duncan Coutts at 2025-11-17T22:20:33+00:00
Remove bogus anyPendingTimeoutsOrIO guard from scheduleDetectDeadlock

The deadlock detection was only invoked if both of these conditions
hold:
1. the run queue is empty
2. there is no pending I/O or timeouts

The second condition is unnecessary. The deadlock detection mechanism
can find deadlocks even if there are other threads waiting on I/O or
timers. Having this extra condition means that we fail to detect
blocked threads if there are any threads waiting on I/O or timers.

Part of fixing issue #26408

- - - - -
ca291ca9 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Don't consider pending I/O for early context switch optimisation

Context switches are normally initiated by the timer signal. If however
the user specifies "context switch as often as possible", with +RTS -C0
then the scheduler arranges for an early context switch (when it's just
about to run a Haskell thread).

Context switching very often is expensive, so as an optimisation there
cases where we do not arrange an early context switch:
1. if there's no other threads to run
2. if there is no pending I/O or timers

This patch eliminates case 2, leaving only case 1.

The rationale is as follows. The use of this was inconsistent across
platforms and threaded/non-threaded RTS ways. It only worked on the
non-threaded RTS and on Windows only worked for the win32-legacy I/O
manager. On all other combinations anyPendingTimeoutsOrIO would always
return false. The fact that nobody noticed and complained about this
inconsistency suggests that the feature is not relied upon.

If however it turns out that applications do rely on this, then the
proper thing to do is not to restore this check, but to add a new I/O
manager hint function that returns if there is any pending events that
are likely to happen *soon*: for example timeouts expiring within one
timeslice, or I/O waits on things likely to complete soon like disk I/O,
but not for example socket/pipe I/O.

The motivation to avoid this use of anyPendingTimeoutsOrIO is to
allow us to eliminate anyPendingTimeoutsOrIO entirely. All other uses
of this are just guards on {await,poll}CompletedTimeoutsOrIO and
the guards can safely be folded into those functions. This will better
cope with some I/O managers having no proper implementation of
anyPendingTimeoutsOrIO.

Ultimately this will let us simplify the scheduler which currently has
to have special #ifdef mingw32_HOST_OS cases to cope with the lack of a
working anyPendingTimeoutsOrIO for some Windows I/O managers

- - - - -
b84107c1 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Remove anyPendingTimeoutsOrIO guarding {poll,await}CompletedTimeoutsOrIO

Previously the API of the I/O manager used a two step process: check
anyPendingTimeoutsOrIO and then call {poll,await}CompletedTimeoutsOrIO.
This was primarily there as a performance thing, to cheaply check if we
need to do anything.

And then because anyPendingTimeoutsOrIO existed, it was used for other
things too. We have now eliminated the other uses, and are just left
with the performance pattern.

But this was problematic because not all I/O managers correctly
implement anyPendingTimeoutsOrIO (specifically the win32 ones), and now
that we also make I/O managers responsible for signals then we need to
poll/await even if there is no pending I/O or timeouts. If there is no
pending I/O or timeouts then poll/await needs to degenerate to just
waiting forever for any signals.

- - - - -
e2683163 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Remove anyPendingTimeoutsOrIO, it is no longer used

And this avoids the problems arising from the win32 I/O managers having
had a bogus implementation.

- - - - -
391574b0 by Duncan Coutts at 2025-11-17T22:20:33+00:00
Remove second scheduler call to awaitCompletedTimeoutsOrIO

Previously awaitCompletedTimeoutsOrIO was called both before and after
deadlock detection in the scheduler. The reason for that was that the
win32 I/O managers had a bogus implementation of anyPendingTimeoutsOrIO
and this was used to guard the call of awaitCompletedTimeoutsOrIO prior
to deadlock detection. This meant the first call site was never actually
called when using the win32 I/O managers. This was the reason for the
second call: the first one was never used. What a mess.

So now we have a simple design in the scheduler:
1. poll for completed I/O, timers or signals
2. if no runnable threads: do deadlock detection
3. if still no runnable threads: block waiting for I/O, timers or
   signals.

- - - - -
a2378e44 by Duncan Coutts at 2025-11-17T22:20:34+00:00
Lift emptyRunQueue guard out of scheduleDetectDeadlock

this improved the clarity of the logic when reading the scheduler code.

- - - - -
1ee37c94 by Duncan Coutts at 2025-11-17T22:20:34+00:00
Make non-threaded deadlock detection also rely on idle GC

Only do deadlock detection GC when idle GC kicks in. This also relies on
using wakeUpRts, so now do this unconditionally. Previously wakeUpRts
was for the threaded rts only.

- - - - -
0422fe30 by Duncan Coutts at 2025-11-17T22:20:34+00:00
Enable idle GC by default on non-threaded RTS.

The behaviour is now uniform between threaded and non-threaded. The
deadlock detection now relies on idle GC for both threaded and
non-threaded ways. Previously deadlock detection did not rely on idle
GC for the non-threaded way.

- - - - -
085ed94a by Duncan Coutts at 2025-11-17T22:20:34+00:00
Add a long Note [Deadlock detection]

It describes the historical and modern designs and their trade-offs.

The point is we've now unified the code for deadlock detection between
the threaded and non-threaded ways, by changing the non-threaded to
follow the same design as the threaded.

- - - - -
4ee93b3d by Duncan Coutts at 2025-11-17T22:20:34+00:00
Add a test for deadlock detection, issue #26408

- - - - -
5ca25224 by Duncan Coutts at 2025-11-17T22:20:34+00:00
Update the user guide with the revised idle GC behaviour

i.e. it's now not just for the threaded RTS, but general.

Also document the fact that disabling idle GC also disables deadlock
detection.

- - - - -

35 changed files:

- docs/users_guide/runtime_control.rst
- rts/Capability.c
- rts/Capability.h
- rts/IOManager.c
- rts/IOManager.h
- rts/IOManagerInternals.h
- rts/PrimOps.cmm
- rts/RaiseAsync.c
- rts/RtsFlags.c
- rts/RtsSignals.h
- rts/RtsStartup.c
- rts/RtsSymbols.c
- rts/Schedule.c
- rts/Schedule.h
- rts/Timer.c
- + rts/posix/FdWakeup.c
- + rts/posix/FdWakeup.h
- + rts/posix/MIO.c
- + rts/posix/MIO.h
- rts/posix/Poll.c
- rts/posix/Poll.h
- rts/posix/Select.c
- rts/posix/Select.h
- rts/posix/Signals.c
- rts/posix/Signals.h
- rts/posix/Timeout.c
- rts/posix/Timeout.h
- rts/rts.cabal
- rts/win32/AwaitEvent.c
- rts/win32/ConsoleHandler.c
- rts/win32/ConsoleHandler.h
- + testsuite/tests/rts/T26408.hs
- + testsuite/tests/rts/T26408.stderr
- testsuite/tests/rts/all.T
- utils/deriveConstants/Main.hs

The diff was not included because it is too large.

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/050e03a4e64a64051a9b0f19cea9934...

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/050e03a4e64a64051a9b0f19cea9934...
You're receiving this email because of your account on gitlab.haskell.org.

[Git][ghc/ghc][wip/io-manager-deadlock-detection] 27 commits: Adjust the style of notification for I/O manager events

Duncan Coutts (＠dcoutts)