
Simon Peyton Jones pushed to branch wip/T23162-spj at Glasgow Haskell Compiler / GHC
Commits:
62ae97de by Duncan Coutts at 2025-09-12T13:23:33-04:00
Handle heap allocation failure in I/O primops
The current I/O managers do not use allocateMightFail, but future ones
will. To support this properly we need to be able to return to the
primop with a failure. We simply use a bool return value.
Currently however, we will just throw an exception rather than calling
the GC because that's what all the other primops do too.
For the general issue of primops invoking GC and retrying, see
https://gitlab.haskell.org/ghc/ghc/-/issues/24105
- - - - -
cb9093f5 by Duncan Coutts at 2025-09-12T13:23:33-04:00
Move (and rename) scheduleStartSignalHandlers into RtsSignals.h
Previously it was a local helper (static) function in Schedule.c.
Rename it to startPendingSignalHandlers and deifine it as an inline
header function in RtsSignals.h. So it should still be fast.
Each (new style) I/O manager is going to need to do the same, so eliminating
the duplication now makes sense.
- - - - -
9736d44a by Duncan Coutts at 2025-09-12T13:23:33-04:00
Reduce detail in printThreadBlockage I/O blocking cases
The printThreadBlockage is used in debug tracing output.
For the cases BlockedOn{Read,Write,Delay} the output previously included
the fd that was being waited on, and the delay target wake time.
Superficially this sounds useful, but it's clearly not that useful
because it was already wrong for the Win32 non-threaded I/O manager. In
that situation it will print garbage (the async_result pointer, cast to
a fd or a time).
So given that it apparently never mattered that the information was
accurate, then it's hardly a big jump to say it doesn't matter if it is
present at all.
A good reason to remove it is that otherwise we have to make a new
API and a per-I/O manager implementation to fetch the information. And
for some I/O manager implementations, this information is not available.
It is not available in the win32 non-threaded I/O manager. And for some
future Linux ones, there is no need for the fd to be stored, so storing
it would be just extra space used for very little gain.
So the simplest thing is to just remove the detail.
- - - - -
bc0f2d5d by Duncan Coutts at 2025-09-12T13:23:33-04:00
Add TimeoutQueue.{c,h} and corresponding tests
A data structure used to efficiently manage a collection of timeouts.
It is a priority queue based on absolute expiry time. It uses 64bit
high-precision Time for the keys. The values are normal closures which
allows for example using MVars for unblocking.
It is common in many applications for timeouts to be created and then
deleted or altered before they expire. Thus the choice of data structure
for timeouts should support this efficiently. The implementation choice
here is a leftist heap with the extra feature that it supports deleting
arbitrary elements, provided the caller retain a pointer to the element.
While the deleteMin operation takes O(log n) time, as in all heap
structures, the delete operation for arbitrary elements /typically/
takes O(1), and only O(log n) in the worst case. In practice, when
managing thousands of timeouts it can be a factor of 10 faster to delete
a random timeout queue element than to remove the minimum element. This
supports the common use case.
The plan is to use it in some of the RTS-side I/O managers to support
their timer functionality. In this use case the heap value will be an
MVar used for each timeout to unblock waiting threads.
- - - - -
d1679c9d by Duncan Coutts at 2025-09-12T13:23:33-04:00
Add ClosureTable.{c,h} and corresponding tests
A table of pointers to closures on the GC heap with stable indexes.
It provides O(1) alloc, free and lookup. The table can be expanded
using a simple doubling strategy: in which case allocation is typically
O(1) and occasionally O(n) for overall amortised O(1). No shrinking is
used.
The table itself is heap allocated, and points to other heap objects.
As such it's necessary to use markClosureTable to ensure the table is
used as a GC root to keep the table entries alive, and maintain proper
pointers to them as the GC moves heap objects about.
It is designed to be allocated and accesses exclusively from a single
capability, enabling it to work without any locking. It is thus similar
to the StablePtr table, but per-capability which removes the need for
locking. It _should_ also provide lower GC pause times with the
non-moving GC by spending only O(1) time in markClosureTable, vs O(n)
for markStablePtrTable.
The plan is to use it in some of the I/O managers to keep track of
in-flight I/O operations (but not timers). This allows the tracking
info to be kept on the (unpinned) GC heap, and shared with Haskell
code, and by putting a pointer to the tracking information in a table,
the index remains stable and can be passed via foreign code (like the
kernel).
- - - - -
78cb8dd5 by Duncan Coutts at 2025-09-12T13:23:33-04:00
Add the StgAsyncIOOp closure type
This is intended to be used by multiple I/O managers to help with
tracking in-flight I/O operations.
It is called asynchronous because from the point of view of the RTS we
have many such operations in progress at once. From the point of view of
a Haskell thread of course it can look synchronous.
- - - - -
a2839896 by Duncan Coutts at 2025-09-12T13:23:33-04:00
Add StgAsyncIOOp and StgTimeoutQueue to tso->block_info
These will be used by new I/O managers, for threads blocked on I/O or
timeouts.
- - - - -
fdc2451c by Duncan Coutts at 2025-09-12T13:23:33-04:00
Add a new I/O manager based on poll()
This is a proof of concept I/O manager, to show how to add new ones
neatly, using the ClosureTable and TimeoutQueue infrastructure.
It uses the old unix poll() API, so it is of course limited in
performance by that, but it should have the benefit of wide
compatibility. Also we neatly avoid a name clash with the existing
select() I/O manager.
Compared to the select() I/O manager:
1. beause it uses poll() it is not limited to 1024 file descriptors
(but it's still O(n) so don't expect great performance);
2. it should have much faster threadDelay (when using it in lots of
threads at once) because it's based on the new TimeoutQueue which is
O(log n) rather than O(n).
Some of the code related to timers/timouts is put into a shared module
rts/posix/Timeout.{h,c} since it is intended to be shared with other
similar I/O managers.
- - - - -
6c273b76 by Duncan Coutts at 2025-09-12T13:23:34-04:00
Document the I/O managers in the user guide
and note the new poll I/O manager in the release notes.
- - - - -
824fab74 by Duncan Coutts at 2025-09-12T13:23:34-04:00
Use the poll() I/O manager by default
That is, for the non-threaded RTS, prefer the poll I/O manager over the
legacy select() one, if both can be enabled.
This patch is primarily for CI testing, so we should probably remove
this patch before merging. We can change defaults later after wider
testing and feedback.
- - - - -
39392532 by Luite Stegeman at 2025-09-12T13:24:16-04:00
Support larger unboxed sums
Change known constructor encoding for sums in interfaces to use
11 bits for both the arity and the alternative (up from 8 and 6,
respectively)
- - - - -
2af12e21 by Luite Stegeman at 2025-09-12T13:24:16-04:00
Decompose padding smallest-first in Cmm toplevel data constructors
This makes each individual padding value aligned
- - - - -
418fa78f by Luite Stegeman at 2025-09-12T13:24:16-04:00
Use slots smaller than word as tag for smaller unboxed sums
This packs unboxed sums more efficiently by allowing
Word8, Word16 and Word32 for the tag field if the number of
constructors is small enough
- - - - -
8d7e912f by Rodrigo Mesquita at 2025-09-12T17:57:24-04:00
ghc-toolchain: Use ByteOrder rather than new Endianness
Don't introduce a duplicate datatype when the previous one is equivalent
and already used elsewhere. This avoids unnecessary translation between
the two.
- - - - -
7d378476 by Rodrigo Mesquita at 2025-09-12T17:57:24-04:00
Read Toolchain.Target files rather than 'settings'
This commit makes GHC read `lib/targets/default.target`, a file with a
serialized value of `ghc-toolchain`'s `GHC.Toolchain.Target`.
Moreover, it removes all the now-redundant entries from `lib/settings`
that are configured as part of a `Target` but were being written into
`settings`.
This makes it easier to support multiple targets from the same compiler
(aka runtime retargetability). `ghc-toolchain` can be re-run many times
standalone to produce a `Target` description for different targets, and,
in the future, GHC will be able to pick at runtime amongst different
`Target` files.
This commit only makes it read the default `Target` configured in-tree
or configured when installing the bindist.
The remaining bits of `settings` need to be moved to `Target` in follow
up commits, but ultimately they all should be moved since they are
per-target relevant.
Fixes #24212
On Windows, the constant overhead of parsing a slightly more complex
data structure causes some small-allocation tests to wiggle around 1 to
2 extra MB (1-2% in these cases).
-------------------------
Metric Increase:
MultiLayerModulesTH_OneShot
T10421
T10547
T12234
T12425
T13035
T18140
T18923
T9198
TcPlugin_RewritePerf
-------------------------
- - - - -
e0780a16 by Rodrigo Mesquita at 2025-09-12T17:57:24-04:00
ghc-toolchain: Move TgtHasLibm to per-Target file
TargetHasLibm is now part of the per-target configuration
Towards #26227
- - - - -
8235dd8c by Rodrigo Mesquita at 2025-09-12T17:57:24-04:00
ghc-toolchain: Move UseLibdw to per-Target file
To support DWARF unwinding, the RTS must be built with the -f+libdw flag
and with the -DUSE_LIBDW macro definition. These flags are passed on
build by Hadrian when --enable-dwarf-unwinding is specified at configure
time.
Whether the RTS was built with support for DWARF is a per-target
property, and as such, it was moved to the per-target
GHC.Toolchain.Target.Target file.
Additionally, we keep in the target file the include and library paths
for finding libdw, since libdw should be checked at configure time (be
it by configure, or ghc-toolchain, that libdw is properly available).
Preserving the user-given include paths for libdw facilitates in the
future building the RTS on demand for a given target (if we didn't keep
that user input, we couldn't)
Towards #26227
- - - - -
d5ecf2e8 by Rodrigo Mesquita at 2025-09-12T17:57:25-04:00
ghc-toolchain: Make "Support SMP" a query on a Toolchain.Target
"Support SMP" is merely a function of target, so we can represent it as
such in `ghc-toolchain`.
Hadrian queries the Target using this predicate to determine how to
build GHC, and GHC queries the Target similarly to report under --info
whether it "Support SMP"
Towards #26227
- - - - -
e07b031a by Rodrigo Mesquita at 2025-09-12T17:57:25-04:00
ghc-toolchain: Make "tgt rts linker only supports shared libs" function on Target
Just like with "Support SMP", "target RTS linker only supports shared
libraries" is a predicate on a `Target` so we can just compute it when
necessary from the given `Target`.
Towards #26227
- - - - -
14123ee6 by Simon Peyton Jones at 2025-09-12T17:58:07-04:00
Solve forall-constraints via an implication, again
In this earlier commit:
commit 953fd8f1dc080f1c56e3a60b4b7157456949be29
Author: Simon Peyton Jones