-
8ddc7207
by Rodrigo Mesquita at 2025-07-07T16:53:50+00:00
debugger/rts: Allow toggling step-in per thread
The RTS global flag `rts_stop_next_breakpoint` globally sets the
interpreter to stop at the immediate next breakpoint.
With this commit, single step mode can additionally be set per thread in
the TSO flag (TSO_STOP_NEXT_BREAKPOINT).
Being able to toggle "stop at next breakpoint" per thread is an
important requirement for implementing "stepping out" of a function in a
multi-threaded context.
And, more generally, having a per-thread flag for single-stepping paves the
way for multi-threaded debugging.
That said, when we want to enable "single step" mode for the whole
interpreted program we still want to stop at the immediate next
breakpoint, whichever thread it belongs to.
That's why we also keep the global `rts_stop_next_breakpoint` flag, with
`rts_enableStopNextBreakpointAll` and `rts_disableStopNextBreakpointAll` helpers.
Preparation for #26042
-
5df6757f
by Rodrigo Mesquita at 2025-07-07T16:53:53+00:00
docs: Case continuation BCOs
This commit documents a subtle interaction between frames for case BCOs
and their parents frames. Namely, case continuation BCOs may refer to
(non-local) variables that are part of the parent's frame.
The note expanding a bit on these details is called [Case continuation BCOs]
-
8289de1d
by Rodrigo Mesquita at 2025-07-07T16:58:37+00:00
debugger: Implement step-out feature
Implements support for stepping-out of a function (aka breaking right after
returning from a function) in the interactive debugger.
It also introduces a GHCi command :stepout to step-out of a function
being debugged in the interpreter. The feature is described as:
Stop at the first breakpoint immediately after returning from the current
function scope.
Known limitations: because a function tail-call does not push a stack
frame, if step-out is used inside of a function that was tail-called,
execution will not be returned to its caller, but rather its caller's
first non-tail caller. On the other hand, it means the debugger
follows the more realistic execution of the program.
In the following example:
.. code-block:: none
f = do
a
b <--- (1) set breakpoint then step in here
c
b = do
...
d <--- (2) step-into this tail call
d = do
...
something <--- (3) step-out here
...
Stepping-out will stop execution at the `c` invokation in `f`, rather than
stopping at `b`.
The key idea is simple: When step-out is enabled, traverse the runtime
stack until a continuation BCO is found -- and enable the breakpoint
heading that BCO explicitly using its tick-index.
The details are specified in `Note [Debugger: Step-out]` in `rts/Interpreter.c`.
Since PUSH_ALTS BCOs (representing case continuations) were never headed
by a breakpoint (unlike the case alternatives they push), we introduced
the BRK_ALTS instruction to allow the debugger to set a case
continuation to stop at the breakpoint heading the alternative that is
taken. This is further described in `Note [Debugger: BRK_ALTS]`.
Fixes #26042
-
c7313450
by Rodrigo Mesquita at 2025-07-07T16:58:41+00:00
debugger: Filter step-out stops by SrcSpan
To implement step-out, the RTS looks for the first continuation frame on
the stack and explicitly enables its entry breakpoint. However, some
continuations will be contained in the function from which step-out was
initiated (trivial example is a case expression).
Similarly to steplocal, we will filter the breakpoints at which the RTS
yields to the debugger based on the SrcSpan. When doing step-out, only
stop if the breakpoint is /not/ contained in the function from which we
initiated it.
This is especially relevant in monadic statements such as IO which is
compiled to a long chain of case expressions.
See Note [Debugger: Filtering step-out stops]
-
37448c97
by Cheng Shao at 2025-07-07T18:35:04+01:00
compiler: make ModBreaks serializable
-
1973995a
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
refactor: "Inspecting the session" moved from GHC
Moved utilities for inspecting the session from the GHC module to
GHC.Driver.Session.Inspect
Purely a clean up
-
9a8e4245
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
cleanup: Pass the HUG to readModBreaks, not HscEnv
A minor cleanup. The associated history and setupBreakpoint functions
are changed accordingly.
-
83ba1657
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
cleanup: Move readModBreaks to GHC.Runtime.Interpreter
With some small docs changes
-
69085e85
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
cleanup: Move interpreterProfiled to Interp.Types
Moves interpreterProfiled and interpreterDynamic to
GHC.Runtime.Interpreter.Types from GHC.Runtime.Interpreter.
-
5ed6bdcb
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
cleanup: Don't import GHC in Debugger.Breakpoints
Remove the top-level
import GHC
from GHC.Runtime.Debugger.Breakpoints
This makes the module dependencies more granular and cleans up the
qualified imports from the code.
-
c5037468
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
refactor: Use BreakpointId in Core and Ifaces
-
3065c00d
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
stg2bc: Derive BcM via ReaderT StateT
A small refactor that simplifies GHC.StgToByteCode by deriving-via the
Monad instances for BcM. This is done along the lines of previous
similar refactors like 72b54c0760bbf85be1f73c1a364d4701e5720465.
-
1e1ad317
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
refact: Split InternalModBreaks out of ModBreaks
There are currently two competing ways of referring to a Breakpoint:
1. Using the Tick module + Tick index
2. Using the Info module + Info index
1. The Tick index is allocated during desugaring in `mkModBreaks`. It is
used to refer to a breakpoint associated to a Core Tick. For a given
Tick module, there are N Ticks indexed by Tick index.
2. The Info index is allocated during code generation (in StgToByteCode)
and uniquely identifies the breakpoints at runtime (and is indeed used
to determine which breakpoint was hit at runtime).
Why we need both is described by Note [Breakpoint identifiers].
For every info index we used to keep a `CgBreakInfo`, a datatype containing
information relevant to ByteCode Generation, in `ModBreaks`.
This commit splits out the `IntMap CgBreakInfo` out of `ModBreaks` into
a new datatype `InternalModBreaks`.
- The purpose is to separate the `ModBreaks` datatype, which stores
data associated from tick-level information which is fixed after
desugaring, from the unrelated `IntMap CgBreakInfo` information
accumulated during bytecode generation.
- We move `ModBreaks` to GHC.HsToCore.Breakpoints
The new `InternalModBreaks` simply combines the `IntMap CgBreakInfo`
with `ModBreaks`. After code generation we construct an
`InternalModBreaks` with the `CgBreakInfo`s we accumulated and the
existing `ModBreaks` and store that in the compiled BCO in `bc_breaks`.
- Note that we previously only updated the `modBreaks_breakInfo`
field of `ModBreaks` at this exact location, and then stored the
updated `ModBreaks` in the same `bc_breaks`.
- We put this new datatype in GHC.ByteCode.Breakpoints
The rest of the pipeline for which CgBreakInfo is relevant is
accordingly updated to also use `InternalModBreaks`
-
34bc244f
by Rodrigo Mesquita at 2025-07-07T18:35:05+01:00
cleanup: Use BreakpointIds in bytecode gen
Small clean up to use BreakpointId and InternalBreakpointId more
uniformly in bytecode generation rather than using Module + Ix pairs
-
c568fbd2
by Rodrigo Mesquita at 2025-07-07T18:35:06+01:00
ghci: Allocate BreakArrays at link time only
Previously, a BreakArray would be allocated with a slot for every tick
in a module at `mkModBreaks`, in HsToCore. However, this approach has
a few downsides:
- It interleaves interpreter behaviour (allocating arrays for
breakpoints) within the desugarer
- It is inflexible in the sense it is impossible for the bytecode
generator to add "internal" breakpoints that can be triggered at
runtime, because those wouldn't have a source tick. (This is relevant
for our intended implementation plan of step-out in #26042)
- It ties the BreakArray indices to the *tick* indexes, while at runtime
we would rather just have the *info* indexes (currently we have both
because BreakArrays are indexed by the *tick* one).
Paving the way for #26042 and #26064, this commit moves the allocation
of BreakArrays to bytecode-loading time -- akin to what is done for CCS
arrays.
Since a BreakArray is allocated only when bytecode is linked, if a
breakpoint is set (e.g. `:break 10`) before the bytecode is linked,
there will exist no BreakArray to trigger the breakpoint in.
Therefore, the function to allocate break arrays (`allocateBreakArrays`)
is exposed and also used in GHC.Runtime.Eval to allocate a break array
when a breakpoint is set, if it doesn't exist yet (in the linker env).
-
89c5a088
by Rodrigo Mesquita at 2025-07-07T18:35:47+01:00
debugger: Uniquely identify breakpoints by internal id
Since b85b11994e0130ff2401dd4bbdf52330e0bcf776 (support inlining
breakpoints), a breakpoint has been identified at runtime by *two* pairs
of <module,index>.
- The first, aka a 'BreakpointId', uniquely identifies a breakpoint in
the source of a module by using the Tick index. A Tick index can index
into ModBreaks.modBreaks_xxx to fetch source-level information about
where that tick originated.
- When a user specifies e.g. a line breakpoint using :break, we'll reverse
engineer what a Tick index for that line
- We update the `BreakArray` of that module (got from the
LoaderState) at that tick index to `breakOn`.
- A BCO we can stop at is headed by a BRK_FUN instruction. This
instruction stores in an operand the `tick index` it is associated
to. We look it up in the associated `BreakArray` (also an operand)
and check wheter it was set to `breakOn`.
- The second, aka the `ibi_info_mod` + `ibi_info_ix` of the
`InternalBreakpointId`, uniquely index into the `imodBreaks_breakInfo`
-- the information we gathered during code generation about the
existing breakpoint *ocurrences*.
- Note that with optimisation there may be many occurrences of the
same source-tick-breakpoint across different modules. The
`ibi_info_ix` is unique per occurrence, but the `bi_tick_ix` may be
shared. See Note [Breakpoint identifiers] about this.
- Note that besides the tick ids, info ids are also stored in
`BRK_FUN` so the break handler can refer to the associated
`CgBreakInfo`.
In light of that, the driving changes come from the desire to have the
info_id uniquely identify the breakpoint at runtime, and the source tick
id being derived from it:
- An InternalBreakpointId should uniquely identify a breakpoint just
from the code-generation identifiers of `ibi_info_ix` and `ibi_info_mod`.
So we drop `ibi_tick_mod` and `ibi_tick_ix`.
- A BRK_FUN instruction need only record the "internal breakpoint id",
not the tick-level id.
So we drop the tick mod and tick index operands.
- A BreakArray should be indexed by InternalBreakpointId rather than
BreakpointId
That means we need to do some more work when setting a breakpoint.
Specifically, we need to figure out the internal ids (occurrences of a
breakpoint) from the source-level BreakpointId we want to set the
breakpoint at (recall :break refers to breaks at the source level).
Besides this change being an improvement to the handling of breakpoints
(it's clearer to have a single unique identifier than two competing
ones), it unlocks the possibility of generating "internal" breakpoints
during Cg (needed for #26042).
It should also be easier to introduce multi-threaded-aware `BreakArrays`
following this change (needed for #26064).
Se also the new Note [ModBreaks vs InternalModBreaks]