-
718cd063
by Sylvain Henry at 2026-03-25T16:37:08-04:00
Check that shift values are valid
In GHC's codebase in non-DEBUG builds we silently substitute shiftL/R
with unsafeShiftL/R for performance reasons. However we were not
checking that the shift value was valid for unsafeShiftL/R, leading to
wrong computations, but only in non-DEBUG builds.
This patch adds the necessary checks and reports an error when a wrong
shift value is passed.
-
2ce1fcf2
by Sylvain Henry at 2026-03-25T16:37:08-04:00
Implement basic value range analysis (#25718)
Perform basic value range analysis to try to determine at compile time
the result of the application of some comparison primops (ltWord#, etc.).
This subsumes the built-in rewrite rules used previously to check if one
of the comparison argument was a bound (e.g. (x :: Word8) <= 255 is
always True). Our analysis is more powerful and handles type
conversions: e.g. word8ToWord x <= 255 is now detected as always True too.
We also use value range analysis to filter unreachable alternatives in
case-expressions. To support this, we had to allow case-expressions for
primitive types to not have a DEFAULT alternative (as was assumed before
and checked in Core lint).
-
1351253e
by ARATA Mizuki at 2026-03-25T16:37:15-04:00
rts: Align stack to 64-byte boundary in StgRun on x86
When LLVM spills AVX/AVX-512 vector registers to the stack, it requires
32-byte (__m256) or 64-byte (__m512) alignment. If the stack is not
sufficiently aligned, LLVM inserts a realignment prologue that reserves
%rbp as a frame pointer, conflicting with GHC's use of %rbp as an STG
callee-saved register and breaking the tail-call-based calling convention.
Previously, GHC worked around this by lying to LLVM about the stack
alignment and rewriting aligned vector loads/stores (VMOVDQA, VMOVAPS)
to unaligned ones (VMOVDQU, VMOVUPS) in the LLVM Mangler. This had two
problems:
- It did not extend to AVX-512, which requires 64-byte alignment. (#26595)
- When Haskell calls a C function that takes __m256/__m512 arguments on
the stack, the callee requires genuine alignment, which could cause a
segfault. (#26822)
This patch genuinely aligns the stack to 64 bytes in StgRun by saving
the original stack pointer before alignment and restoring it in
StgReturn. We now unconditionally advertise 64-byte stack alignment to
LLVM for all x86 targets, making rewriteAVX in the LLVM Mangler
unnecessary. STG_RUN_STACK_FRAME_SIZE is increased from 48 to 56 bytes
on non-Windows x86-64 to store the saved stack pointer.
Closes #26595 and #26822
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
-
b923e82f
by Teo Camarasu at 2026-03-25T16:37:16-04:00
ghc-internal: Float Generics to near top of module graph
We remove GHC.Internal.Generics from the critical path of the
`ghc-internal` module graph. GHC.Internal.Generics used to be in the
middle of the module graph, but now it is nearer the top (built later).
This change thins out the module graph and allows us to get rid of the
ByteOrder hs-boot file.
We implement this by moving Generics instances from the module where the
datatype is defined to the GHC.Internal.Generics module. This trades off
increasing the compiled size of GHC.Internal.Generics with reducing the
dependency footprint of datatype modules.
Not all instances are moved to GHC.Internal.Generics. For instance,
`GHC.Internal.Control.Monad.Fix` keeps its instance as it is one of the
very last modules compiled in `ghc-internal` and so inverting the
relationship here would risk adding GHC.Internal.Generics back onto the
critical path.
We also don't change modules that are re-exported from the `template-haskell` or `ghc-heap`.
This is done to make it easy to eventually move `Generics` to `base`
once something like #26657 is implemented.
Resolves #26930
Metric Decrease:
T21839c
-
080c82a0
by sheaf at 2026-03-25T16:37:22-04:00
Avoid infinite loop in deep subsumption
This commit ensures we only unify after we recur in the deep subsumption
code in the FunTy vs non-FunTy case of GHC.Tc.Utils.Unify.tc_sub_type_deep,
to avoid falling into an infinite loop.
See the new Wrinkle [Avoiding a loop in tc_sub_type_deep] in
Note [FunTy vs non-FunTy case in tc_sub_type_deep] in GHC.Tc.Utils.Unify.
Fixes #26823
Co-authored-by: simonpj <simon.peytonjones@gmail.com>
-
96f303dc
by Ian Duncan at 2026-03-25T16:37:25-04:00
AArch64: fix MOVK regUsageOfInstr to mark dst as both read and written
MOVK (move with keep) modifies only a 16-bit slice of the destination
register, so the destination is both read and written. The register
allocator must know this to avoid clobbering live values. Update
regUsageOfInstr to list the destination in both src and dst sets.
No regression test: triggering the misallocation requires specific
register pressure around a MOVK sequence, which is difficult to
reliably provoke from Haskell source.
-
eacbdde9
by Simon Jakobi at 2026-03-25T16:37:26-04:00
Add regression test for #12002
Closes #12002.
-
0284522c
by Simon Jakobi at 2026-03-25T16:37:26-04:00
Add regression test for #12046
Closes #12046.
Co-authored-by: Andreas Klebinger <klebinger.andreas@gmx.at>
-
3421f0d0
by Simon Jakobi at 2026-03-25T16:37:26-04:00
Add regression test for #13180
Closes #13180.
-
42fc7427
by Simon Jakobi at 2026-03-25T16:37:26-04:00
Add regression test for #11141
Closes #11141.
-
4914e16e
by Simon Jakobi at 2026-03-25T16:37:26-04:00
Add regression test for #11505
Closes #11505.
-
3ba65fb9
by Simon Jakobi at 2026-03-25T16:37:26-04:00
Add regression perf test for #13820
Closes #13820.
-
14cdb5ef
by Simon Jakobi at 2026-03-25T16:37:26-04:00
Add regression test for #10381
Closes #10381.
-
fc0fdb99
by Benjamin Maurer at 2026-03-25T16:37:32-04:00
Generate assembly on x86 for word2float (#22252)
We used to emit C function call for MO_UF_Conv primitive.
Now emits direct assembly instead.
Co-Authored-By: Sylvain Henry <sylvain@haskus.fr>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
-
9ad37b90
by Matthew Pickering at 2026-03-25T16:37:34-04:00
rts: forward clone-stack messages after TSO migration
MSG_CLONE_STACK assumed that the target TSO was still owned by the
capability that received the message. This is not always true: the TSO
can migrate before the inbox entry is handled.
When that happened, handleCloneStackMessage could clone a live stack from
the wrong capability and use the wrong capability for allocation and
performTryPutMVar, leading to stack sanity failures such as
checkStackFrame: weird activation record found on stack.
Fix this by passing the current capability into
handleCloneStackMessage, rechecking msg->tso->cap at handling time, and
forwarding the message if the TSO has migrated. Once ownership matches,
use the executing capability consistently for cloneStack, rts_apply, and
performTryPutMVar.
Fixes #27008
-
2e7fcfc4
by mangoiv at 2026-03-25T16:37:35-04:00
release tracking: adopt release tracking ticket from #16816
-
84221fc1
by mangoiv at 2026-03-25T16:37:35-04:00
release tracking: add a release tracking ticket
Brings the information in the release tracking ticket up to date with
https://gitlab.haskell.org/ghc/ghc-hq/-/blob/main/release-management.mkd
Resolves #26691
-
16de84df
by Teo Camarasu at 2026-03-25T16:37:37-04:00
Revert "Set default eventlog-flush-interval to 5s"
Flushing the eventlog forces a synchronisation of all the capabilities
and there was a worry that this might lead to a performance cost for
some highly parallel workloads.
This reverts commit 66b96e2a591d8e3d60e74af3671344dfe4061cf2.
-
c7e85ba3
by Cheng Shao at 2026-03-25T16:37:38-04:00
ghc-boot: move GHC.Data.SmallArray to ghc-boot
This commit moves `GHC.Data.SmallArray` from the `ghc` library to
`ghc-boot`, so that it can be used by `ghci` as well:
- The `Binary` (from `ghc`) instance of `SmallArray` is moved to
`GHC.Utils.Binary`
- Util functions `replicateSmallArrayIO`, `mapSmallArrayIO`,
`mapSmallArrayM_`, `imapSmallArrayM_` , `smallArrayFromList` and
`smallArrayToList` are added
- The `Show` instance is added
- The `Binary` (from `binary`) instance is added
-
4b4999e3
by Cheng Shao at 2026-03-25T16:37:38-04:00
compiler: use `Binary` instance of `BCOByteArray` for bytecode objects
This commit defines `Binary` (from `compiler`) instance of
`BCOByteArray` which serializes the underlying buffer directly, and
uses it directly in bytecode object serialization. Previously we reuse
the `Binary` (from `binary`) instance, and this change allows us to
avoid double-copying via an intermediate `ByteString` when using
`put`/`get` in `binnary`. Also see added comment for explanation.
-
9fb2fe15
by Cheng Shao at 2026-03-25T16:37:38-04:00
ghci: use SmallArray directly in ResolvedBCO
This patch makes ghci use `SmallArray` directly in `ResolvedBCO` when
applicable, making the memory representation more compact and reducing
marshaling overhead. Closes #27058.
-
2c1d9a30
by Wen Kokke at 2026-03-25T16:37:41-04:00
Fix race condition between flushEventLog and start/endEventLogging.
This commit changes `flushEventLog` to acquire/release the `state_change` mutex to prevent interleaving with `startEventLogging` and `endEventLogging`. In the current RTS, `flushEventLog` _does not_ acquire this mutex, which may lead to eventlog corruption on the following interleaving:
- `startEventLogging` writes the new `EventLogWriter` to `event_log_writer`.
- `flushEventLog` flushes some events to `event_log_writer`.
- `startEventLogging` writes the eventlog header to `event_log_writer`.
This causes the eventlog to be written out in an unreadable state, with one or more events preceding the eventlog header.
This commit renames the old function to `flushEventLog_` and defines `flushEventLog` simply as:
```c
void flushEventLog(Capability **cap USED_IF_THREADS)
{
ACQUIRE_LOCK(&state_change_mutex);
flushEventLog_(cap);
RELEASE_LOCK(&state_change_mutex);
}
```
The old function is still needed internally within the compilation unit, where it is used in `endEventLogging` in a context where the `state_change` mutex has already been acquired. I've chosen to mark `flushEventLog_` as static and let other uses of `flushEventLog` within the RTS refer to the new version. There is one use in `hs_init_ghc` via `flushTrace`, where the new locking behaviour should be harmless, and one use in `handle_tick`, which I believe was likely vulnerable to the same race condition, so the new locking behaviour is desirable.
I have not added a test. The behaviour is highly non-deterministic and requires a program that concurrently calls `flushEventLog` and `startEventLogging`/`endEventLogging`. I encountered the issue while developing `eventlog-socket` and within that context have verified that my patch likely addresses the issue: a test that used to fail within the first dozen or so runs now has been running on repeat for several hours.
-
8b83825f
by Phil Hazelden at 2026-03-25T16:37:42-04:00
Fix build with werror on glibc 2.43.
We've been defining `_XOPEN_SOURCE` and `_POSIX_C_SOURCE` to the same
values as defined in glibc prior to 2.43. But in 2.43, glibc changes
them to new values, which means we get a warning when redefining them.
By `#undef`ing them first, we no longer get a warning.
Closes #27076.
-
f3fe26e8
by Tobias Haslop at 2026-03-25T16:37:47-04:00
Fix broken Haddock link to Bifunctor class in description of Functor class