[Git][ghc/ghc][wip/romes/step-out-11] 12 commits: Update comments on `OptKind` to reflect the code reality

1 Aug 2025

      Rodrigo Mesquita pushed to branch wip/romes/step-out-11 at Glasgow Haskell Compiler / GHC

Commits:
ee2dc248 by Simon Hengel at 2025-07-31T06:25:35-04:00
Update comments on `OptKind` to reflect the code reality

- - - - -
b029633a by Wen Kokke at 2025-07-31T06:26:21-04:00
rts: Disable --eventlog-flush-interval unless compiled with -threaded.

This commit fixes issue #26222:
Using --eventlog-flush-interval with the non-threaded RTS leads to eventlog corruption.
https://gitlab.haskell.org/ghc/ghc/-/issues/26222

This commit makes four changes when code is compiled against the non-threaded RTS:

1. It disables the --eventlog-flush-interval flag.
2. It disables the documentation for the --eventlog-flush-interval flag.
3. It disables the relevant state from RtsConfig and code from Timer.
4. It updates the entry for --eventlog-flush-interval in the users guide.

- - - - -
31159f1d by Wen Kokke at 2025-07-31T06:26:21-04:00
rts: Split T20006 into tests with and without -threaded

- - - - -
618687ef by Simon Hengel at 2025-07-31T06:27:03-04:00
docs/users_guide/win32-dlls.rst: Remove references to `readline`

- - - - -
083e40f1 by Rodrigo Mesquita at 2025-08-01T04:38:23-04:00
debugger: Uniquely identify breakpoints by internal id

Since b85b11994e0130ff2401dd4bbdf52330e0bcf776 (support inlining
breakpoints), a breakpoint has been identified at runtime by *two* pairs
of (module, index).

- The first, aka a 'BreakpointId', uniquely identifies a breakpoint in
  the source of a module by using the Tick index. A Tick index can index
  into ModBreaks.modBreaks_xxx to fetch source-level information about
  where that tick originated.

    - When a user specifies e.g. a line breakpoint using :break, we'll reverse
      engineer what the Tick index for that line is.
    - We update the `BreakArray` of that module (got from the
      LoaderState) at that tick index to `breakOn`.
    - A BCO we can stop at is headed by a BRK_FUN instruction. This
      instruction stores in an operand the `tick index` it is associated
      to. We look it up in the associated `BreakArray` (also an operand)
      and check whether it was set to `breakOn`.

- The second, aka the `ibi_info_mod` + `ibi_info_ix` of the
  `InternalBreakpointId`, uniquely indexes into the `imodBreaks_breakInfo`
  -- the information we gathered during code generation about the
  existing breakpoint *occurrences*.

  - Note that with optimisation there may be many occurrences of the
    same source-tick-breakpoint across different modules. The
    `ibi_info_ix` is unique per occurrence, but the `bi_tick_ix` may be
    shared. See Note [Breakpoint identifiers] about this.

  - Note that besides the tick ids, info ids are also stored in
    `BRK_FUN` so the break handler can refer to the associated
    `CgBreakInfo`.

In light of that, the driving changes come from the desire to have the
info_id uniquely identify the breakpoint at runtime, and the source tick
id being derived from it:

- An InternalBreakpointId should uniquely identify a breakpoint just
  from the code-generation identifiers of `ibi_info_ix` and `ibi_info_mod`.
  So we drop `ibi_tick_mod` and `ibi_tick_ix`.
- A BRK_FUN instruction need only record the "internal breakpoint id",
  not the tick-level id.
  So we drop the tick mod and tick index operands.
- A BreakArray should be indexed by InternalBreakpointId rather than
  BreakpointId

That means we need to do some more work when setting a breakpoint.
Specifically, we need to figure out the internal ids (occurrences of a
breakpoint) from the source-level BreakpointId we want to set the
breakpoint at (recall :break refers to breaks at the source level).
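
The lookup described above can be sketched as follows. This is only an
illustration under stated assumptions: the record types are simplified
stand-ins (module names as strings), and `Occurrences`, `bi_tick_mod`,
`sourceBreakpoint` and `internalBreakpoints` are hypothetical names, not
the functions added by this commit.

```haskell
-- Simplified, hypothetical stand-ins for the identifiers discussed above.
type ModuleName = String

-- Source-level identifier: a tick index within a module.
data BreakpointId = BreakpointId
  { bi_tick_mod :: ModuleName
  , bi_tick_ix  :: Int
  } deriving (Eq, Show)

-- Code-generation identifier: one per breakpoint occurrence.
data InternalBreakpointId = InternalBreakpointId
  { ibi_info_mod :: ModuleName
  , ibi_info_ix  :: Int
  } deriving (Eq, Show)

-- Hypothetical table pairing each occurrence with the source tick it
-- originated from.
type Occurrences = [(InternalBreakpointId, BreakpointId)]

-- The source tick is derived from the internal id.
sourceBreakpoint :: Occurrences -> InternalBreakpointId -> Maybe BreakpointId
sourceBreakpoint occs ibi = lookup ibi occs

-- Setting a source-level breakpoint means finding every occurrence of it.
internalBreakpoints :: Occurrences -> BreakpointId -> [InternalBreakpointId]
internalBreakpoints occs bid = [ ibi | (ibi, src) <- occs, src == bid ]
```

In this model a single source tick may map to several internal ids, possibly
in different modules, while each internal id maps back to exactly one tick.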

Besides this change being an improvement to the handling of breakpoints
(it's clearer to have a single unique identifier than two competing
ones), it unlocks the possibility of generating "internal" breakpoints
during Cg (needed for #26042).
It should also be easier to introduce multi-threaded-aware `BreakArrays`
following this change (needed for #26064).

See also the new Note [ModBreaks vs InternalModBreaks]

On i386-linux:

-------------------------
Metric Decrease:
    interpreter_steplocal
-------------------------

- - - - -
bf03bbaa by Simon Hengel at 2025-08-01T04:39:05-04:00
Don't use MCDiagnostic for `ghcExit`

This changes the error message of `ghcExit` from

```
<no location info>: error:
Compilation had errors

```
to
```

Compilation had errors

```

- - - - -
a889ec75 by Simon Hengel at 2025-08-01T04:39:05-04:00
Respect `-fdiagnostics-as-json` for driver diagnostics (see #24113)

- - - - -
32d8c808 by Rodrigo Mesquita at 2025-08-01T10:45:19+01:00
cleanup: Move dehydrateCgBreakInfo to Stg2Bc

This no longer has anything to do with Core.

- - - - -
0189ad2b by Rodrigo Mesquita at 2025-08-01T10:45:19+01:00
rts/Disassembler: Fix spacing of BRK_FUN

- - - - -
1ef4fa39 by Rodrigo Mesquita at 2025-08-01T10:45:19+01:00
debugger: Fix bciPtr in Step-out

We need to use `BCO_NEXT` to move bciPtr to ix=1, because ix=0 points to
the instruction itself!

I do not understand how this didn't crash before.

- - - - -
3a667156 by Rodrigo Mesquita at 2025-08-01T10:45:19+01:00
debugger: Allow BRK_FUNs to head case continuation BCOs

When we start executing a BCO, we may want to yield to the scheduler:
this may be triggered by a heap/stack check, context switch, or a
breakpoint. To yield, we need to put the stack in a state such that
when execution is resumed we are back to where we yielded from.

Previously, a BRK_FUN could only head a function BCO because we only
knew how to construct a valid stack for yielding from one -- simply add
`apply_interp_info` + the BCO to resume executing. This is valid because
the stack at the start of run_BCO is headed by that BCO's arguments.

However, in case continuation BCOs (as per Note [Case continuation BCOs]),
we couldn't easily reconstruct a valid stack that could be resumed
because we dropped the stack frames for the value returned (stg_ret)
and received (stg_ctoi) by that continuation too soon.
This is especially tricky because of the variable type and size return
frames (e.g. pointer ret_p/ctoi_R1p vs a tuple ret_t/ctoi_t2).

The trick to being able to yield from a BRK_FUN at the start of a case
cont BCO is to stop removing the ret frame headers eagerly and instead
keep them until the BCO starts executing. The new layout at the start of
a case cont. BCO is described by the new Note [Stack layout when entering run_BCO].

Now, we keep the ret_* and ctoi_* frames when entering run_BCO.
A BRK_FUN is then executed if found, and the stack is yielded as-is with
the preserved ret and ctoi frames.
Then, a case cont BCO's instructions always SLIDE off the headers of the
ret and ctoi frames, in StgToByteCode.doCase, turning a stack like

   |     ....      |
   +---------------+
   |     fv2       |
   +---------------+
   |     fv1       |
   +---------------+
   |     BCO       |
   +---------------+
   | stg_ctoi_ret_ |
   +---------------+
   |    retval     |
   +---------------+
   | stg_ret_..... |
   +---------------+

into

   |     ....      |
   +---------------+
   |     fv2       |
   +---------------+
   |     fv1       |
   +---------------+
   |    retval     |
   +---------------+

for the remainder of the BCO.
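
The effect of the two diagrams can be modelled with a toy stack. This is a
sketch, not the instruction sequence that StgToByteCode.doCase emits: the
stack is a list of labelled words with the top of the stack first, the
return value is assumed to occupy one word, and `slide` and
`dropRetHeaders` are hypothetical helpers.

```haskell
-- Toy model: a stack is a list of labelled words, top of stack first.
type Stack = [String]

-- Keep the top n words and drop the 'by' words directly underneath them.
slide :: Int -> Int -> Stack -> Stack
slide n by stack = take n stack ++ drop (n + by) stack

-- Remove the ret header, then slide the one-word return value over the
-- ctoi header and the BCO pointer.
dropRetHeaders :: Stack -> Stack
dropRetHeaders = slide 1 2 . drop 1

-- The stack at the start of a case continuation BCO, as drawn above.
entryStack :: Stack
entryStack = ["stg_ret_.....", "retval", "stg_ctoi_ret_", "BCO", "fv1", "fv2"]
```

Under these assumptions `dropRetHeaders entryStack` leaves `retval` on top
of `fv1` and `fv2`, matching the second diagram.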

Moreover, this more uniform approach of keeping the ret and ctoi frames
means we need less ad-hoc logic concerning the variable size of
ret_tuple vs ret_p/np frames in the code generator and interpreter:
Always keep the return to cont. stack intact at the start of run_BCO,
and the statically generated instructions will take care of adjusting
it.

Unlocks BRK_FUNs at the start of case cont. BCOs which will enable a
better user-facing step-out (#26042) which is free of the bugs the
current BRK_ALTS implementation suffers from (namely, using BRK_FUN
rather than BRK_ALTS in a case cont. means we'll never accidentally end
up in a breakpoint "deeper" than the continuation, because we stop at
the case cont itself rather than on the first breakpoint we evaluate
after it).

- - - - -
75599f78 by Rodrigo Mesquita at 2025-08-01T10:45:19+01:00
BRK_FUN with InternalBreakLocs for code-generation time breakpoints

At the start of a case continuation BCO, place a BRK_FUN.
This BRK_FUN uses the new "internal breakpoint location" -- allowing us
to come up with a valid source location for this breakpoint that is not associated with a source-level tick.

For case continuation BCOs, we use the last tick seen before it as the
source location. The reasoning is described in Note [Debugger: Stepout internal break locs].

Note how T26042c, which was broken because it displayed the incorrect
behavior of the previous step out when we'd end up at a deeper level
than the one from which we initiated step-out, is now fixed.

As of this commit, BRK_ALTS is now dead code and is thus dropped.

Note [Debugger: Stepout internal break locs]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Step-out tells the interpreter to run until the current function
returns to where it was called from, and stop there.

This is achieved by enabling the BRK_FUN found on the first RET_BCO
frame on the stack (See Note [Debugger: Step-out]).

Case continuation BCOs (which select an alternative branch) must
therefore be headed by a BRK_FUN. An example:

    f x = case g x of <--- end up here
        1 -> ...
        2 -> ...

    g y = ... <--- step out from here

- `g` will return a value to the case continuation BCO in `f`
- The case continuation BCO will receive the value returned from g
- Match on it and push the alternative continuation for that branch
- And then enter that alternative.

If we step-out of `g`, the first RET_BCO on the stack is the case
continuation of `f` -- execution should stop at its start, before
selecting an alternative. (One might ask, "why not enable the breakpoint
in the alternative instead?", because the alternative continuation is
only pushed to the stack *after* it is selected by the case cont. BCO)

However, the case cont. BCO is not associated with any source-level
tick, it is merely the glue code which selects alternatives which do
have source level ticks. Therefore, we have to come up at code
generation time with a breakpoint location ('InternalBreakLoc') to
display to the user when it is stopped there.

Our solution is to use the last tick seen just before reaching the case
continuation. This is robust because a case continuation will thus
always have a relevant breakpoint location:

    - The source location will be the last source-relevant expression
      executed before the continuation is pushed

    - So the source location will point to the thing you've just stepped
      out of

    - Doing :step-local from there will put you on the selected
      alternative (which at the source level may also be the e.g. next
      line in a do-block)
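
A minimal sketch of the idea, assuming a toy expression type rather than the
real STG or bytecode generator state: `Expr`, `Span` and `caseContLocs` are
hypothetical, and "last tick seen" is approximated by the innermost tick
enclosing the case.

```haskell
-- Toy stand-ins; spans are plain strings here.
type Span = String

data Expr
  = Tick Span Expr     -- a source-level breakpoint tick around an expression
  | Case Expr [Expr]   -- scrutinee and alternatives
  | Leaf String        -- anything else

-- For every case expression, in traversal order, the location its
-- continuation would report: the last tick seen on the way down to it.
caseContLocs :: Expr -> [Maybe Span]
caseContLocs = go Nothing
  where
    go _     (Tick s e)        = go (Just s) e
    go lastT (Case scrut alts) = lastT : concatMap (go lastT) (scrut : alts)
    go _     (Leaf _)          = []
```

A case with no tick above it gets `Nothing` in this model, whereas the text
above argues that in practice a relevant tick has always been seen.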

Examples, using angle brackets (<<...>>) to denote the breakpoint span:

    f x = case <<g x>> {- step in here -} of
        1 -> ...
        2 -> ...

    g y = <<...>> <--- step out from here

    ...

    f x = <<case g x of
        1 -> ...
        2 -> ...>>

    doing :step-local ...

    f x = case g x of
        1 -> <<...>> <--- stop in the alternative
        2 -> ...

A second example based on T26042d2, where the source is a do-block IO
action, optimised to a chain of `case` expressions.

    main = do
      putStrLn "hello1"
      <<f>> <--- step-in here
      putStrLn "hello3"
      putStrLn "hello4"

    f = do
      <<...>> <--- step-out from here
      putStrLn "hello2.2"

    ...

    main = do
      putStrLn "hello1"
      <<f>> <--- end up here again, the previously executed expression
      putStrLn "hello3"
      putStrLn "hello4"

    doing step/step-local ...

    main = do
      putStrLn "hello1"
      f
      <<putStrLn "hello3">> <--- straight to the next line
      putStrLn "hello4"

Finishes #26042

- - - - -

45 changed files:

- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/ByteCode/Breakpoints.hs
- compiler/GHC/ByteCode/Instr.hs
- compiler/GHC/ByteCode/Linker.hs
- compiler/GHC/ByteCode/Types.hs
- compiler/GHC/CoreToIface.hs
- compiler/GHC/Driver/CmdLine.hs
- compiler/GHC/Driver/Make.hs
- compiler/GHC/HsToCore/Breakpoints.hs
- compiler/GHC/Linker/Loader.hs
- compiler/GHC/Runtime/Debugger/Breakpoints.hs
- compiler/GHC/Runtime/Eval.hs
- compiler/GHC/Runtime/Interpreter.hs
- compiler/GHC/StgToByteCode.hs
- compiler/GHC/SysTools/Tasks.hs
- compiler/GHC/Utils/Error.hs
- docs/users_guide/runtime_control.rst
- docs/users_guide/win32-dlls.rst
- ghc/GHCi/UI.hs
- ghc/GHCi/UI/Monad.hs
- libraries/ghci/GHCi/Debugger.hs
- libraries/ghci/GHCi/Message.hs
- libraries/ghci/GHCi/Run.hs
- rts/Disassembler.c
- rts/Exception.cmm
- rts/Interpreter.c
- rts/Profiling.c
- rts/RtsFlags.c
- rts/Timer.c
- rts/include/rts/Bytecodes.h
- rts/include/rts/Flags.h
- testsuite/tests/corelint/T21115b.stderr
- testsuite/tests/count-deps/CountDepsAst.stdout
- testsuite/tests/count-deps/CountDepsParser.stdout
- testsuite/tests/ghci.debugger/scripts/T26042b.stdout
- testsuite/tests/ghci.debugger/scripts/T26042c.script
- testsuite/tests/ghci.debugger/scripts/T26042c.stdout
- + testsuite/tests/ghci.debugger/scripts/T26042d2.hs
- + testsuite/tests/ghci.debugger/scripts/T26042d2.script
- + testsuite/tests/ghci.debugger/scripts/T26042d2.stdout
- testsuite/tests/ghci.debugger/scripts/T26042e.stdout
- testsuite/tests/ghci.debugger/scripts/T26042f2.stdout
- testsuite/tests/ghci.debugger/scripts/T26042g.stdout
- testsuite/tests/ghci.debugger/scripts/all.T
- testsuite/tests/rts/flags/all.T

The diff was not included because it is too large.

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/21f1d8c3bfe075fdc1d110b0c94106b...


Rodrigo Mesquita (＠alt-romes)
