GitLab

Simon Peyton Jones pushed to branch wip/T26315 at Glasgow Haskell Compiler / GHC

Commits:

79816cc4

by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00

cleanup: Move dehydrateCgBreakInfo to Stg2Bc

This no longer has anything to do with Core.

53da94ff

by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00

rts/Disassembler: Fix spacing of BRK_FUN

08c0cf85

by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00

debugger: Fix bciPtr in Step-out

We need to use `BCO_NEXT` to move bciPtr to ix=1, because ix=0 points to
the instruction itself!

I do not understand how this didn't crash before.

e7e021fa

by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00

debugger: Allow BRK_FUNs to head case continuation BCOs

When we start executing a BCO, we may want to yield to the scheduler:
this may be triggered by a heap/stack check, context switch, or a
breakpoint. To yield, we need to put the stack in a state such that
when execution is resumed we are back to where we yielded from.

Previously, a BRK_FUN could only head a function BCO because we only
knew how to construct a valid stack for yielding from one -- simply add
`apply_interp_info` + the BCO to resume executing. This is valid because
the stack at the start of run_BCO is headed by that BCO's arguments.

However, in case continuation BCOs (as per Note [Case continuation BCOs]),
we couldn't easily reconstruct a valid stack that could be resumed
because we dropped the stack frames for the value returned (stg_ret)
and received (stg_ctoi) by that continuation too soon.
This is especially tricky because return frames vary in type and size
(e.g. pointer ret_p/ctoi_R1p vs a tuple ret_t/ctoi_t2).

The trick to being able to yield from a BRK_FUN at the start of a case
cont BCO is to stop removing the ret frame headers eagerly and instead
keep them until the BCO starts executing. The new layout at the start of
a case cont. BCO is described by the new Note [Stack layout when entering run_BCO].

Now, we keep the ret_* and ctoi_* frames when entering run_BCO.
A BRK_FUN is then executed if found, and the stack is yielded as-is with
the preserved ret and ctoi frames.
Then, a case cont BCO's instructions always SLIDE off the headers of the
ret and ctoi frames, in StgToByteCode.doCase, turning a stack like

   |     ....      |
   +---------------+
   |     fv2       |
   +---------------+
   |     fv1       |
   +---------------+
   |     BCO       |
   +---------------+
   | stg_ctoi_ret_ |
   +---------------+
   |    retval     |
   +---------------+
   | stg_ret_..... |
   +---------------+

into

   |     ....      |
   +---------------+
   |     fv2       |
   +---------------+
   |     fv1       |
   +---------------+
   |    retval     |
   +---------------+

for the remainder of the BCO.
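The effect of sliding off the frame headers can be sketched with a toy model. This is only an illustration: the stack is modelled as a list of labelled words (top of stack first, so the bottom row of the diagrams is the head of the list), the return value is assumed to be a single word, and the exact instruction sequence emitted by StgToByteCode.doCase may differ. The names `Stack`, `slide`, `onEntry` and `afterHeaders` are hypothetical.

```haskell
-- Toy model only: a stack is a list of labelled words, top of stack first.
type Stack = [String]

-- SLIDE n by: keep the top n words and drop the `by` words beneath them.
slide :: Int -> Int -> Stack -> Stack
slide n by stk = take n stk ++ drop (n + by) stk

-- Stack on entry to a case continuation BCO, as in the first diagram.
onEntry :: Stack
onEntry = ["stg_ret_", "retval", "stg_ctoi_ret_", "BCO", "fv1", "fv2"]

-- Drop the ret header, then slide the one-word return value over the
-- ctoi header and the BCO pointer, giving the second diagram.
afterHeaders :: Stack
afterHeaders = slide 1 2 (drop 1 onEntry)
```

In this model `afterHeaders` holds only the return value followed by the free variables, which matches the second diagram.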

Moreover, this more uniform approach of keeping the ret and ctoi frames
means we need less ad-hoc logic concerning the variable size of
ret_tuple vs ret_p/np frames in the code generator and interpreter:
Always keep the return to cont. stack intact at the start of run_BCO,
and the statically generated instructions will take care of adjusting
it.

Unlocks BRK_FUNs at the start of case cont. BCOs which will enable a
better user-facing step-out (#26042) which is free of the bugs the
current BRK_ALTS implementation suffers from (namely, using BRK_FUN
rather than BRK_ALTS in a case cont. means we'll never accidentally end
up in a breakpoint "deeper" than the continuation, because we stop at
the case cont itself rather than on the first breakpoint we evaluate
after it).

ade3c1e6

by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00

BRK_FUN with InternalBreakLocs for code-generation time breakpoints

At the start of a case continuation BCO, place a BRK_FUN.
This BRK_FUN uses the new "internal breakpoint location" -- allowing us
to come up with a valid source location for this breakpoint that is not associated with a source-level tick.

For case continuation BCOs, we use the last tick seen before it as the
source location. The reasoning is described in Note [Debugger: Stepout internal break locs].

Note how T26042c, which was broken because it displayed the incorrect
behavior of the previous step out when we'd end up at a deeper level
than the one from which we initiated step-out, is now fixed.

As of this commit, BRK_ALTS is now dead code and is thus dropped.

Note [Debugger: Stepout internal break locs]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Step-out tells the interpreter to run until the current function
returns to where it was called from, and stop there.

This is achieved by enabling the BRK_FUN found on the first RET_BCO
frame on the stack (See Note [Debugger: Step-out]).

Case continuation BCOs (which select an alternative branch) must
therefore be headed by a BRK_FUN. An example:

    f x = case g x of <--- end up here
        1 -> ...
        2 -> ...

    g y = ... <--- step out from here

- `g` will return a value to the case continuation BCO in `f`
- The case continuation BCO will receive the value returned from g
- Match on it and push the alternative continuation for that branch
- And then enter that alternative.

If we step-out of `g`, the first RET_BCO on the stack is the case
continuation of `f` -- execution should stop at its start, before
selecting an alternative. (One might ask, "why not enable the breakpoint
in the alternative instead?" The answer is that the alternative
continuation is only pushed to the stack *after* it is selected by the
case cont. BCO.)
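The step-out rule described above (enable the BRK_FUN on the first RET_BCO frame found on the stack) can be sketched as a pure function over a simplified stack. The `Frame` type and `enableStepOut` are hypothetical names for illustration; the real RTS frames and the interpreter logic carry far more detail.

```haskell
-- Hypothetical frame type, top of stack first in a list.
data Frame
  = RetBCO String Bool   -- BCO label, and whether its leading BRK_FUN is enabled
  | OtherFrame String    -- any frame that is not a RET_BCO
  deriving (Eq, Show)

-- Enable the breakpoint on the first RET_BCO frame only; deeper frames
-- are left untouched, so execution stops where the callee returns to.
enableStepOut :: [Frame] -> [Frame]
enableStepOut [] = []
enableStepOut (RetBCO label _ : rest) = RetBCO label True : rest
enableStepOut (frame : rest) = frame : enableStepOut rest
```

With a stack whose first RET_BCO is the case continuation of `f`, only that frame's breakpoint is enabled, which is why the continuation itself must be headed by a BRK_FUN.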

However, the case cont. BCO is not associated with any source-level
tick; it is merely the glue code which selects alternatives which do
have source-level ticks. Therefore, we have to come up with a breakpoint
location ('InternalBreakLoc') at code generation time to display to the
user when execution is stopped there.

Our solution is to use the last tick seen just before reaching the case
continuation. This is robust because a case continuation will thus
always have a relevant breakpoint location:

    - The source location will be the last source-relevant expression
      executed before the continuation is pushed

    - So the source location will point to the thing you've just stepped
      out of

    - Doing :step-local from there will put you on the selected
      alternative (which at the source level may also be, e.g., the
      next line in a do-block)

Examples, using angle brackets (<<...>>) to denote the breakpoint span:

    f x = case <<g x>> {- step in here -} of
        1 -> ...
        2 -> ...

    g y = <<...>> <--- step out from here

    ...

    f x = <<case g x of <--- end up here, whole case highlighted
        1 -> ...
        2 -> ...>>

    doing :step-local ...

    f x = case g x of
        1 -> <<...>> <--- stop in the alternative
        2 -> ...

A second example based on T26042d2, where the source is a do-block IO
action, optimised to a chain of `case` expressions.

    main = do
      putStrLn "hello1"
      <<f>> <--- step-in here
      putStrLn "hello3"
      putStrLn "hello4"

    f = do
      <<putStrLn "hello2.1">> <--- step-out from here
      putStrLn "hello2.2"

    ...

    main = do
      putStrLn "hello1"
      <<f>> <--- end up here again, the previously executed expression
      putStrLn "hello3"
      putStrLn "hello4"

    doing step/step-local ...

    main = do
      putStrLn "hello1"
      f
      <<putStrLn "hello3">> <--- straight to the next line
      putStrLn "hello4"

Finishes #26042

c66910c0

by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00

debugger: Re-use the last BreakpointId whole in step-out

Previously, to come up with a location to stop at for `:stepout`, we
would store the location of the last BreakpointId surrounding the
continuation, as described by Note [Debugger: Stepout internal break locs].

However, re-using just the location from the last source breakpoint
isn't sufficient to provide the necessary information in the break
location. Specifically, it wouldn't bind any variables at that location.

Really, there is no reason not to re-use the last breakpoint wholesale,
and re-use all the information we had there. Step-out should behave just
as if we had stopped at the call, but such that continuing will not
re-execute the call.

This commit updates the CgBreakInfo to always store a BreakpointId, be
it the original one or the one we're emulating (for step-out).

It makes variable bindings on :stepout work.

e4abed7b

by sheaf at 2025-09-02T12:20:40-04:00

Revert accidental changes to hie.yaml

003b715b

by meooow25 at 2025-09-02T23:48:51+02:00

Adjust the strictness of Data.List.iterate'

* Don't force the next element in advance when generating a (:).
* Force the first element to WHNF like every other element.

Now every element in the output list is forced to WHNF when the (:)
containing it is forced.
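The strictness described above can be sketched as follows. This is a minimal illustration under a local, hypothetical name (`iterateWhnf`); it is not claimed to be the definition in GHC.Internal.List.

```haskell
-- Each element is forced to WHNF when the (:) containing it is forced,
-- and the next element is left unevaluated until its own (:) is demanded.
iterateWhnf :: (a -> a) -> a -> [a]
iterateWhnf f x = x `seq` (x : iterateWhnf f (f x))

-- The first cons cell can be built without evaluating the second element,
-- even though the step function here would diverge if it were forced.
firstOnly :: Int
firstOnly = case iterateWhnf (const undefined) 0 of
  (y : _) -> y
  []      -> error "unreachable: the list is infinite"
```

A definition that forced the next element before producing the (:) would diverge on `firstOnly`, which is the behaviour the first bullet point removes.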

CLC proposal:
https://github.com/haskell/core-libraries-committee/issues/335

b2f6aad0

by Simon Hengel at 2025-09-03T04:36:10-04:00

Refactoring: More consistently use logOutput, logInfo, fatalErrorMsg

60a16db7

by Rodrigo Mesquita at 2025-09-03T10:55:50+01:00

bytecode: Don't PUSH_L 0; SLIDE 1 1

While looking through bytecode I noticed a quite common unfortunate
pattern:

...
PUSH_L 0
SLIDE 1 1

We do this often by generically constructing a tail call from a function
atom that may be somewhere arbitrary on the stack.
However, for the special case that the function can be found directly on
top of the stack, as part of the arguments, it's plain redundant to push
then slide it.

In this commit we add a small optimisation to the generation of
tailcalls in bytecode. Simply: look ahead for the function in the stack.
If it is the first thing on the stack and it is part of the arguments
which would be dropped as we entered the tail call, then don't push then
slide it.
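To see why the pair is redundant, here is a toy stack model of the two instructions. It is only a sketch: `Stack`, `Instr`, `step` and `run` are hypothetical names, the stack is a list of labelled words with the top first, and the actual patch avoids emitting the pair at code generation time rather than simulating anything.

```haskell
-- Toy model only: a stack of labelled words, top of stack first.
type Stack = [String]

data Instr = PushL Int | Slide Int Int

-- PUSH_L i pushes a copy of the word at offset i from the top;
-- SLIDE n by keeps the top n words and drops the `by` words beneath them.
step :: Stack -> Instr -> Stack
step stk (PushL i)    = (stk !! i) : stk
step stk (Slide n by) = take n stk ++ drop (n + by) stk

-- Run a sequence of instructions from left to right.
run :: [Instr] -> Stack -> Stack
run instrs stk = foldl step stk instrs
```

In this model `run [PushL 0, Slide 1 1]` leaves any non-empty stack unchanged, and `run [PushL 0, Slide 1 2]` gives the same result as `run [Slide 1 1]`, which corresponds to the two rewrites visible in the diff of T26042b.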

In a simple example (T26042b), this already produced a drastic
improvement in generated code (left is old, right is with this patch):

```diff
3c3
< 2025-07-29 10:14:02.081277 UTC
---
> 2025-07-29 10:50:36.560949 UTC
160,161c160
<                                            PUSH_L   0
<                                            SLIDE    1 2
---
>                                            SLIDE    1 1
164,165d162
<                                       PUSH_L   0
<                                       SLIDE    1 1
175,176c172
<                             PUSH_L   0
<                             SLIDE    1 2
---
>                             SLIDE    1 1
179,180d174
<                        PUSH_L   0
<                        SLIDE    1 1
206,207d199
<                        PUSH_L   0
<                        SLIDE    1 1
210,211d201
<                   PUSH_L   0
<                   SLIDE    1 1
214,215d203
<              PUSH_L   0
<              SLIDE    1 1
218,219d205
<         PUSH_L   0
<         SLIDE    1 1
222,223d207
<    PUSH_L   0
<    SLIDE    1 1
...
600,601c566
<                                                 PUSH_L   0
<                                                 SLIDE    1 2
---
>                                                 SLIDE    1 1
604,605d568
<                                            PUSH_L   0
<                                            SLIDE    1 1
632,633d594
<                                            PUSH_L   0
<                                            SLIDE    1 1
636,637d596
<                                       PUSH_L   0
<                                       SLIDE    1 1
640,641d598
<                                  PUSH_L   0
<                                  SLIDE    1 1
644,645d600
<                             PUSH_L   0
<                             SLIDE    1 1
648,649d602
<                        PUSH_L   0
<                        SLIDE    1 1
652,653d604
<                   PUSH_L   0
<                   SLIDE    1 1
656,657d606
<              PUSH_L   0
<              SLIDE    1 1
660,661d608
<         PUSH_L   0
<         SLIDE    1 1
664,665d610
<    PUSH_L   0
<    SLIDE    1 1
```

I also compiled lib:Cabal to bytecode and counted the number of bytecode
lines with `find dist-newstyle -name "*.dump-BCOs" -exec wc {} +`:

    with unoptimized core:
    1190689 lines (before) - 1172891 lines (now)
    = 17798 fewer redundant instructions (-1.5% lines)

    with optimized core:
    1924818 lines (before) - 1864836 lines (now)
    = 59982 fewer redundant instructions (-3.1% lines)

8b2c72c0

by L0neGamer at 2025-09-04T06:32:03-04:00

Add Control.Monad.thenM and Control.Applicative.thenA

39e1b7cb

by Teo Camarasu at 2025-09-04T06:32:46-04:00

ghc-internal: invert dependency of GHC.Internal.TH.Syntax on Data.Data

This means that Data.Data no longer blocks building TH.Syntax, which
allows greater parallelism in our builds.

We move the Data.Data.Data instances to Data.Data. Quasi depends on
Data.Data for one of its methods, so we split the definitions of
Quasi, Q, etc. out of GHC.Internal.TH.Syntax into their own module.
This has the added benefit of splitting up this quite large module.

Previously TH.Syntax was a bottleneck when compiling ghc-internal. Now
it is less of a bottleneck and is also slightly quicker to
compile (since it no longer contains these instances) at the cost of
making Data.Data slightly more expensive to compile.
TH.Lift, which depends on TH.Syntax, can also compile quicker and no
longer blocks ghc-internal from finishing compilation.

Resolves #26217

-------------------------
Metric Decrease:
    MultiLayerModulesTH_OneShot
    T13253
    T21839c
    T24471
Metric Increase:
    T12227
-------------------------

bdf82fd2

by Teo Camarasu at 2025-09-04T06:32:46-04:00

compiler: delete unused names in Builtins.Names.TH

returnQ and bindQ are no longer used in the compiler.
There was also a very old comment that referred to them that I have modernized.

41a448e5

by Ben Gamari at 2025-09-04T19:21:43-04:00

hadrian: Pass lib & include directories to ghc `Setup configure`

46bb9a79

by Ben Gamari at 2025-09-04T19:21:44-04:00

rts/IPE: Fix compilation when zstd is enabled

This was broken by the refactoring undertaken in
c80dd91c0bf6ac034f0c592f16c548b9408a8481.

Closes #26312.

138a6e34

by sheaf at 2025-09-04T19:22:46-04:00

Make mkCast assertion a bit clearer

This commit changes the assertion message that gets printed when one
calls mkCast with a coercion whose kind does not match the type of the
inner expression. I always found the assertion message a bit confusing,
as it didn't clearly state what exactly was the error.

9d626be1

by sheaf at 2025-09-04T19:22:46-04:00

Simplifier/rules: fix mistakes in Notes & comments

a71f55a3

by Simon Peyton Jones at 2025-09-05T09:40:41+01:00

Solve forall-constraints via an implication, again

In this earlier commit:

  commit 953fd8f1dc080f1c56e3a60b4b7157456949be29
  Author: Simon Peyton Jones <simon.peytonjones@gmail.com>
  Date:   Mon Jul 21 10:06:43 2025 +0100

  Solve forall-constraints immediately, or not at all

I used an all-or-nothing strategy for quantified constraints
(aka forall-constraints).  But alas that fell foul of #26315
and #26376.

So this MR goes back to solving a quantified constraint by
turning it into an implication; UNLESS we are simplifying
constraints from a SPECIALISE pragma, in which case the
all-or-nothing strategy is great.  See:

   Note [Solving a Wanted forall-constraint]
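For readers unfamiliar with the term, here is a small program in which a Wanted forall-constraint arises. It is a sketch for illustration only (the names `showBoth` and `example` are hypothetical, and it is not taken from #26315 or #26376). At the call site the solver must discharge `forall b. Show b => Show (Maybe b)`; viewed as an implication, that means assuming `Show b` for a fresh `b` and then solving `Show (Maybe b)` from the instance for `Maybe`.

```haskell
{-# LANGUAGE QuantifiedConstraints #-}

-- The caller must supply Show (f b) for every showable b.
showBoth :: (forall b. Show b => Show (f b)) => f Int -> f Bool -> String
showBoth x y = show x ++ " " ++ show y

-- Instantiating f to Maybe gives rise to the Wanted constraint
--   forall b. Show b => Show (Maybe b)
example :: String
example = showBoth (Just 1) (Just True)
```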

Other stuff in this MR:

* TcSMode becomes a record of flags, rather than an enumeration
  type; much nicer.

* Some fancy footwork to avoid error messages worsening again
  (The above MR made them better; we want to retain that.)
  See `GHC.Tc.Errors.Ppr.pprQCOriginExtra`.
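The difference between an enumeration and a record of flags can be sketched as below. Every name here is hypothetical and chosen only for illustration; none of them is claimed to match the actual TcSMode definition or its fields.

```haskell
-- Enumeration style: each combination of behaviours needs its own
-- constructor, and every use site has to pattern match on all of them.
data ModeEnum = Vanilla | EarlyAbort | SpecPragma
  deriving (Eq, Show)

-- Record-of-flags style: behaviours vary independently and use sites
-- consult only the flag they care about.
data ModeFlags = ModeFlags
  { abortEarly      :: Bool
  , allOrNothingQCs :: Bool
  } deriving (Eq, Show)

vanillaMode :: ModeFlags
vanillaMode = ModeFlags { abortEarly = False, allOrNothingQCs = False }

-- A mode for SPECIALISE-like solving, derived by flipping a single flag.
specPragmaMode :: ModeFlags
specPragmaMode = vanillaMode { allOrNothingQCs = True }
```

With the record form, adding a new behaviour means adding one field with a default, rather than revisiting every case expression over the enumeration.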

129 changed files:

compiler/GHC/Builtin/Names/TH.hs
compiler/GHC/ByteCode/Asm.hs
compiler/GHC/ByteCode/Breakpoints.hs
compiler/GHC/ByteCode/Instr.hs
compiler/GHC/Core/Lint.hs
compiler/GHC/Core/Opt/Simplify.hs
compiler/GHC/Core/Opt/SpecConstr.hs
compiler/GHC/Core/Rules.hs
compiler/GHC/Core/Utils.hs
compiler/GHC/CoreToIface.hs
compiler/GHC/Data/IOEnv.hs
compiler/GHC/Driver/CodeOutput.hs
compiler/GHC/Driver/Pipeline/Execute.hs
compiler/GHC/Hs/Expr.hs
compiler/GHC/HsToCore/Binds.hs
compiler/GHC/HsToCore/Quote.hs
compiler/GHC/Iface/Load.hs
compiler/GHC/Linker/Loader.hs
compiler/GHC/Rename/Splice.hs
compiler/GHC/Runtime/Debugger/Breakpoints.hs
compiler/GHC/Runtime/Eval.hs
compiler/GHC/Stg/Lint.hs
compiler/GHC/StgToByteCode.hs
compiler/GHC/Tc/Deriv/Utils.hs
compiler/GHC/Tc/Errors/Ppr.hs
compiler/GHC/Tc/Gen/Sig.hs
compiler/GHC/Tc/Gen/Splice.hs
compiler/GHC/Tc/Gen/Splice.hs-boot
compiler/GHC/Tc/Solver.hs
compiler/GHC/Tc/Solver/Default.hs
compiler/GHC/Tc/Solver/Dict.hs
compiler/GHC/Tc/Solver/Equality.hs
compiler/GHC/Tc/Solver/InertSet.hs
compiler/GHC/Tc/Solver/Monad.hs
compiler/GHC/Tc/Solver/Solve.hs
compiler/GHC/Tc/Solver/Solve.hs-boot
compiler/GHC/Tc/Types/Constraint.hs
compiler/GHC/Tc/Types/Evidence.hs
compiler/GHC/Tc/Types/Origin.hs
compiler/GHC/Tc/Types/TH.hs
compiler/GHC/Tc/Utils/Monad.hs
compiler/GHC/Tc/Zonk/TcType.hs
compiler/GHC/Tc/Zonk/Type.hs
ghc/GHCi/UI.hs
hadrian/src/Settings/Packages.hs
hie.yaml
libraries/base/changelog.md
libraries/base/src/Control/Applicative.hs
libraries/base/src/Control/Monad.hs
libraries/base/src/Data/Array/Byte.hs
libraries/base/src/Data/Fixed.hs
+ libraries/ghc-boot-th/GHC/Boot/TH/Monad.hs
libraries/ghc-boot-th/ghc-boot-th.cabal.in
libraries/ghc-internal/ghc-internal.cabal.in
libraries/ghc-internal/src/GHC/Internal/Base.hs
libraries/ghc-internal/src/GHC/Internal/Control/Monad.hs
libraries/ghc-internal/src/GHC/Internal/Data/Data.hs
libraries/ghc-internal/src/GHC/Internal/List.hs
libraries/ghc-internal/src/GHC/Internal/TH/Lib.hs
libraries/ghc-internal/src/GHC/Internal/TH/Lift.hs
+ libraries/ghc-internal/src/GHC/Internal/TH/Monad.hs
libraries/ghc-internal/src/GHC/Internal/TH/Quote.hs
libraries/ghc-internal/src/GHC/Internal/TH/Syntax.hs
libraries/ghci/GHCi/Message.hs
libraries/ghci/GHCi/Run.hs
libraries/ghci/GHCi/TH.hs
libraries/template-haskell/Language/Haskell/TH/Quote.hs
libraries/template-haskell/Language/Haskell/TH/Syntax.hs
rts/Disassembler.c
rts/IPE.c
rts/Interpreter.c
rts/Profiling.c
rts/include/rts/Bytecodes.h
testsuite/tests/backpack/should_fail/bkpfail11.stderr
testsuite/tests/backpack/should_fail/bkpfail43.stderr
testsuite/tests/count-deps/CountDepsAst.stdout
testsuite/tests/count-deps/CountDepsParser.stdout
testsuite/tests/deriving/should_compile/T14682.stderr
testsuite/tests/deriving/should_compile/drv-empty-data.stderr
testsuite/tests/deriving/should_fail/T12768.stderr
testsuite/tests/deriving/should_fail/T1496.stderr
testsuite/tests/deriving/should_fail/T21302.stderr
testsuite/tests/deriving/should_fail/T22696b.stderr
testsuite/tests/deriving/should_fail/T5498.stderr
testsuite/tests/deriving/should_fail/T7148.stderr
testsuite/tests/deriving/should_fail/T7148a.stderr
testsuite/tests/ghci.debugger/scripts/T26042b.script
testsuite/tests/ghci.debugger/scripts/T26042b.stdout
testsuite/tests/ghci.debugger/scripts/T26042c.script
testsuite/tests/ghci.debugger/scripts/T26042c.stdout
+ testsuite/tests/ghci.debugger/scripts/T26042d2.hs
+ testsuite/tests/ghci.debugger/scripts/T26042d2.script
+ testsuite/tests/ghci.debugger/scripts/T26042d2.stdout
testsuite/tests/ghci.debugger/scripts/T26042e.stdout
testsuite/tests/ghci.debugger/scripts/T26042f.script
testsuite/tests/ghci.debugger/scripts/T26042f1.stdout
testsuite/tests/ghci.debugger/scripts/T26042f2.stdout
testsuite/tests/ghci.debugger/scripts/T26042g.stdout
testsuite/tests/ghci.debugger/scripts/all.T
testsuite/tests/impredicative/T17332.stderr
testsuite/tests/interface-stability/base-exports.stdout
testsuite/tests/interface-stability/base-exports.stdout-javascript-unknown-ghcjs
testsuite/tests/interface-stability/base-exports.stdout-mingw32
testsuite/tests/interface-stability/base-exports.stdout-ws-32
testsuite/tests/interface-stability/template-haskell-exports.stdout
testsuite/tests/plugins/plugins10.stdout
testsuite/tests/profiling/should_run/callstack001.stdout
testsuite/tests/quantified-constraints/T19690.stderr
testsuite/tests/quantified-constraints/T19921.stderr
testsuite/tests/quantified-constraints/T21006.stderr
testsuite/tests/roles/should_fail/RolesIArray.stderr
testsuite/tests/simplCore/should_compile/DsSpecPragmas.hs
testsuite/tests/simplCore/should_compile/DsSpecPragmas.stderr
testsuite/tests/splice-imports/SI29.stderr
testsuite/tests/th/T11452.stderr
testsuite/tests/th/T15321.stderr
testsuite/tests/th/T7276.stderr
testsuite/tests/th/TH_NestedSplicesFail3.stderr
testsuite/tests/th/TH_NestedSplicesFail4.stderr
testsuite/tests/typecheck/should_compile/T14434.hs
+ testsuite/tests/typecheck/should_compile/T26376.hs
testsuite/tests/typecheck/should_compile/all.T
testsuite/tests/typecheck/should_fail/T15801.stderr
testsuite/tests/typecheck/should_fail/T19627.stderr
testsuite/tests/typecheck/should_fail/T20666.stderr
testsuite/tests/typecheck/should_fail/T20666a.stderr
testsuite/tests/typecheck/should_fail/T20666b.stderr
testsuite/tests/typecheck/should_fail/T22912.stderr
testsuite/tests/typecheck/should_fail/T23427.stderr

The diff was not included because it is too large.