[Git][ghc/ghc][wip/9.10.3-backports] 23 commits: perf: Replace uses of genericLength with strictGenericLength

Zubin pushed to branch wip/9.10.3-backports at Glasgow Haskell Compiler / GHC
Commits:
948735d6 by Matthew Pickering at 2025-07-11T17:39:07+05:30
perf: Replace uses of genericLength with strictGenericLength
genericLength is a recursive function and marked NOINLINE. It is not
going to specialise. In profiles, it can be seen that 3% of total compilation
time when computing bytecode is spend calling this non-specialised
function.
In addition, we can simplify `addListToSS` to avoid traversing the input
list twice and also allocating an intermediate list (after the call to
reverse).
Overall these changes reduce the time spend in 'assembleBCOs' from 5.61s
to 3.88s. Allocations drop from 8GB to 5.3G.
Fixes #25706
- - - - -
2efc27de by Matthew Craven at 2025-07-11T17:39:49+05:30
Fix bytecode generation for `tagToEnum# <LITERAL>`
Fixes #25975.
(cherry picked from commit a00eeaec8f0b98ec2b8c4630f359fdeb3a6ce04e)
- - - - -
5f369799 by sheaf at 2025-07-11T17:39:50+05:30
Use mkTrAppChecked in ds_ev_typeable
This change avoids violating the invariant of mkTrApp according to which
the argument should not be a fully saturated function type.
This ensures we don't return false negatives for type equality
involving function types.
Fixes #25998
(cherry picked from commit 9c6d2b1bf54310b6d9755aa2ba67fbe38feeac51)
- - - - -
53e12e66 by Ben Gamari at 2025-07-11T17:39:50+05:30
rts/linker: Don't fail due to RTLD_NOW
In !12264 we started using the NativeObj machinery introduced some time
ago for loading of shared objects. One of the side-effects of this
change is shared objects are now loaded eagerly (i.e. with `RTLD_NOW`).
This is needed by NativeObj to ensure full visibility of the mappings of
the loaded object, which is in turn needed for safe shared object
unloading.
Unfortunately, this change subtly regressed, causing compilation
failures in some programs. Specifically, shared objects which refer to
undefined symbols (e.g. which may be usually provided by either the
executable image or libraries loaded via `dlopen`) will fail to load
with eager binding. This is problematic as GHC loads all package
dependencies while, e.g., evaluating TemplateHaskell splices. This
results in compilation failures in programs depending upon (but not
using at compile-time) packages with undefined symbol references.
To mitigate this NativeObj now first attempts to load an object via
eager binding, reverting to lazy binding (and disabling unloading) on
failure.
See Note [Don't fail due to RTLD_NOW].
Fixes #25943.
(cherry picked from commit ce6cf240f4b39371777d484b4de30d746b7abd62)
- - - - -
6245a6e0 by Ben Gamari at 2025-07-11T17:39:50+05:30
base: Note strictness changes made in 4.16.0.0
Addresses #25886.
(cherry picked from commit 7722232c6f8f0b57db03d0439d77896d38191bf9)
- - - - -
f8e28144 by kwxm at 2025-07-11T17:39:51+05:30
Fix bugs in `integerRecipMod` and `integerPowMod`
This fixes #26017.
* `integerRecipMod x 1` now returns `(# 1 | #)` for all x; previously it
incorrectly returned `(# | () #)`, indicating failure.
* `integerPowMod 0 e m` now returns `(# | () #)` for e<0 and m>1, indicating
failure; previously it incorrectly returned `(# 0 | #)`.
(cherry picked from commit 8ded23300367c6e032b3c5a635fd506b8915374b)
- - - - -
3562b51b by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/linker: Factor out ProddableBlocks machinery
- - - - -
405b0e71 by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/linker: Improve efficiency of proddable blocks structure
Previously the linker's "proddable blocks" check relied on a simple
linked list of spans. This resulted in extremely poor complexity while
linking objects with lots of small sections (e.g. objects built with
split sections).
Rework the mechanism to instead use a simple interval set implemented
via binary search.
Fixes #26009.
- - - - -
15f5c679 by Ben Gamari at 2025-07-11T17:39:51+05:30
testsuite: Add simple functional test for ProddableBlockSet
- - - - -
43b453c3 by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/linker/PEi386: Drop check for LOAD_LIBRARY_SEARCH_*_DIRS
The `LOAD_LIBRARY_SEARCH_USER_DIRS` and
`LOAD_LIBRARY_SEARCH_DEFAULT_DIRS` were introduced in Windows Vista and
have been available every since. As we no longer support Windows XP we
can drop this check.
Addresses #26009.
- - - - -
d8d62e10 by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/linker/PEi386: Clean up code style
- - - - -
27d38354 by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/Hash: Factor out hashBuffer
This is a useful helper which can be used for non-strings as well.
- - - - -
e5eb65aa by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/linker/PEi386: Fix incorrect use of break in nested for
Previously the happy path of PEi386 used `break` in a double-`for` loop
resulting in redundant calls to `LoadLibraryEx`.
Fixes #26052.
- - - - -
9da7236c by Ben Gamari at 2025-07-11T17:39:51+05:30
rts: Correctly mark const arguments
- - - - -
f5b2359a by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/linker/PEi386: Don't repeatedly load DLLs
Previously every DLL-imported symbol would result in a call to
`LoadLibraryEx`. This ended up constituting over 40% of the runtime of
`ghc --interactive -e 42` on Windows. Avoid this by maintaining a
hash-set of loaded DLL names, skipping the call if we have already
loaded the requested DLL.
Addresses #26009.
- - - - -
65f415d7 by Ben Gamari at 2025-07-11T17:39:51+05:30
rts/linker: Expand comment describing ProddableBlockSet
- - - - -
8fac39c8 by Cheng Shao at 2025-07-11T17:39:51+05:30
rts: fix rts_clearMemory logic when sanity checks are enabled
This commit fixes an RTS assertion failure when invoking
rts_clearMemory with +RTS -DS. -DS implies -DZ which asserts that free
blocks contain 0xaa as the designated garbage value. Also adds the
sanity way to rts_clearMemory test to prevent future regression.
Closes #26011.
ChatGPT Codex automatically diagnosed the issue and proposed the
initial patch in a single shot, given a GHC checkout and the following
prompt:
---
Someone is reporting the following error when attempting to use `rts_clearMemory` with the RTS option `-DS`:
```
test.wasm: internal error: ASSERTION FAILED: file rts/sm/Storage.c, line 1216
(GHC version 9.12.2.20250327 for wasm32_unknown_wasi)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
```
What's the culprit? How do I look into this issue?
---
I manually reviewed & revised the patch, tested and submitted it.
(cherry picked from commit 86406f48659a5ab61ce1fd2a2d427faba2dcdb09)
- - - - -
1273331e by Hécate Kleidukos at 2025-07-11T17:39:51+05:30
Expose all of Backtraces' internals for ghc-internal
Closes #26049
(cherry picked from commit 16014bf84afa0d009b6254b103033bceca42233a)
- - - - -
4eab9fe0 by ARATA Mizuki at 2025-07-11T17:39:51+05:30
AArch64 NCG: Fix sub-word arithmetic right shift
As noted in Note [Signed arithmetic on AArch64], we should zero-extend sub-word values.
Fixes #26061
(cherry picked from commit 265d0024abc95be941f8e4769f24af128eedaa10)
- - - - -
360d5019 by Ben Gamari at 2025-07-11T17:39:51+05:30
base: Expose Backtraces constructor and fields
This was specified in the proposal (CLC #199) yet somehow didn't make it
into the implementation.
Fixes #26049.
(cherry picked from commit 17db44c5b32fff82ea988fa4f1a233d1a27bdf57)
- - - - -
980dd11c by ARATA Mizuki at 2025-07-11T17:39:51+05:30
x86 NCG: Fix code generation of bswap64 on i386
Co-authored-by: sheaf
participants (1)
-
Zubin (@wz1000)