[Git][ghc/ghc][wip/spj-try-opt-coercion] Wibbles
Simon Peyton Jones pushed to branch wip/spj-try-opt-coercion at Glasgow Haskell Compiler / GHC
Commits:
b8ca7215 by Simon Peyton Jones at 2026-01-01T23:30:49+00:00
Wibbles
Draft commit message
Drastically reduce the use of the coercion optimiser
This MR addresess #26679 by calling the coercion optimiser much less often;
the coercion optimiser is expensive, and often ineffective, especially when
it is called repeatedly.
See Note [Coercion optimisation] in GHC.Core.Corecion.Opt.
Specifically
* Make coercion optimisation into its own pass, controlled by its own flag
`-fopt-coercion` like any other pass. The pass runs early in the pipeline.
* The Simplifier now contents itself with merely applying the current substitution
to a coercion, and checking for reflexivity.
* Kill off OptCoercionOpts, which is now unused.
Another important, but only loosely related, improvement
* In `GHC.Core.Subst.substIdBndr`, `GHC.Core.TyCo.Subst.substTyVarBndrUsing`,
and `GHC.Core.TyCo.Subst.substTyVarBndrUsing`, check for an empty substitution.
This keeps substitutions empty in the no-op case.
See Note [Keeping the substitution empty] in GHC.Core.TyCo.Subst
Compile time improves. Here are the compile-time allocation change over 1%:
Baseline
Test value Change
------------------------------------------------------------
CoOpt_Singletons(normal) 721,677,974 -7.9% GOOD
LargeRecord(normal) 1,268,094,410 -20.4% GOOD
T12545(normal) 772,150,124 -9.2%
T12707(normal) 777,720,918 -1.5% GOOD
T14766(normal) 918,228,179 -50.2% GOOD
T15703(normal) 318,141,541 +1.0% BAD
T18223(normal) 371,496,473 -7.6% GOOD
T1969(normal) 670,830,326 +2.2% BAD
T20261(normal) 573,444,953 -2.2%
T24984(normal) 87,836,660 -1.5%
T3064(normal) 171,636,592 -6.2% GOOD
T5030(normal) 148,301,062 -16.4% GOOD
T5321Fun(normal) 263,120,022 -3.3% GOOD
T8095(normal) 777,033,006 -71.3% GOOD
T9020(optasm) 222,159,574 -2.5% GOOD
T9630(normal) 873,422,584 -15.4% GOOD
T9872b(normal) 1,906,890,318 -2.9% GOOD
T9872b_defer(normal) 2,878,170,737 -2.1% GOOD
T9872d(normal) 356,978,798 -7.7% GOOD
TcPlugin_RewritePerf(normal) 2,132,138,060 -2.6% GOOD
geo. mean -2.7%
minimum -71.3%
maximum +2.2%
Metric Decrease:
CoOpt_Singletons
LargeRecord
T12707
T14766
T18223
T3064
T5030
T5321Fun
T8095
T9020
T9630
T9872b
T9872b_defer
T9872d
TcPlugin_RewritePerf
Metric Increase:
T15703
T1969
- - - - -
13 changed files:
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Opt.hs
- compiler/GHC/Core/Opt/Pipeline.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Rules.hs
- compiler/GHC/Driver/Config.hs
- compiler/GHC/Driver/Config/Core/Opt/Simplify.hs
- compiler/GHC/Driver/DynFlags.hs
- compiler/GHC/Driver/Session.hs
- testsuite/tests/count-deps/CountDepsAst.stdout
- testsuite/tests/count-deps/CountDepsParser.stdout
- testsuite/tests/simplCore/should_compile/T8331.stderr
Changes:
=====================================
compiler/GHC/Core/Coercion.hs
=====================================
@@ -2041,7 +2041,7 @@ This follows the lifting context extension definition in the
-- ----------------------------------------------------
data LiftingContext = LC Subst LiftCoEnv
- -- in optCoercion, we need to lift when optimizing InstCo.
+ -- In optCoercion, we need to lift when optimizing InstCo.
-- See Note [Optimising InstCo] in GHC.Core.Coercion.Opt
-- We thus propagate the substitution from GHC.Core.Coercion.Opt here.
=====================================
compiler/GHC/Core/Coercion/Opt.hs
=====================================
@@ -1,10 +1,7 @@
-- (c) The University of Glasgow 2006
{-# LANGUAGE CPP #-}
-module GHC.Core.Coercion.Opt
- ( optCoProgram, optCoercion
- , OptCoercionOpts (..)
- )
+module GHC.Core.Coercion.Opt( optCoProgram )
where
import GHC.Prelude
@@ -44,19 +41,25 @@ import Control.Monad ( zipWithM )
Note [Coercion optimisation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This module does coercion optimisation. See the paper
+This module does coercion optimisation. The purpose is to reduce the size
+of the coercions that the compiler carries around -- they are just proofs,
+and serve no runtime need. So the purpose of coercion optimisation is simply
+to shrink coercions and thereby reduce compile time.
+
+See the paper
Evidence normalization in Systtem FV (RTA'13)
https://simon.peytonjones.org/evidence-normalization/
The paper is also in the GHC repo, in docs/opt-coercion.
-However, although powerful and occasionally very effective, coercion optimisation
-can be very expensive (#26679). So we apply it sparingly:
+However, although powerful and occasionally very effective, coercion
+optimisation can itself be very expensive (#26679). So we apply it sparingly:
* In the Simplifier, function `rebuild_go`, we use `isReflexiveCo` (which
computes the type of the coercion) to eliminate reflexive coercion, just
before we build a cast (e |> co).
- (More precisely, we use `isReflexiveCoIgnoringMultiplicity.)
+ (More precisely, we use `isReflexiveCoIgnoringMultiplicity;
+ c.f. GHC.Core.Coercion.Opt.opt_univ.)
* We have a whole pass, `optCoProgram` that runs the coercion optimiser on all
the coercions in the program.
@@ -66,6 +69,7 @@ can be very expensive (#26679). So we apply it sparingly:
- We run it early in the optimisation pipeline
(see GHC.Core.Opt.Pipeline.getCoreToDo).
+ Controlled by a flag `-fopt-coercion`, on by default
Note [Optimising coercion optimisation]
@@ -218,7 +222,7 @@ optCoExpr is (Case e b ty alts) = Case (optCoExpr is e) b ty
(map (optCoAlt (is `extendInScopeSet` b)) alts)
optCo :: InScopeSet -> Coercion -> Coercion
-optCo is co = optCoercion' (mkEmptySubst is) co
+optCo is co = optCoercion (mkEmptySubst is) co
optCoAlt :: InScopeSet -> CoreAlt -> CoreAlt
optCoAlt is (Alt k bs e)
@@ -231,35 +235,13 @@ optCoAlt is (Alt k bs e)
%* *
%********************************************************************* -}
--- | Coercion optimisation options
-newtype OptCoercionOpts = OptCoercionOpts
- { optCoercionEnabled :: Bool -- ^ Enable coercion optimisation (reduce its size)
- }
-
-optCoercion :: OptCoercionOpts -> Subst -> Coercion -> NormalCo
+optCoercion :: Subst -> Coercion -> NormalCo
-- ^ optCoercion applies a substitution to a coercion,
-- *and* optimises it to reduce its size
-optCoercion opts env co
--- Experiment with no optCoercion at all
- | optCoercionEnabled opts
- = if False then optCoercion' env co
- else substCo env co
-
-{-
- = pprTrace "optCoercion {" (text "Co:" <> ppr co) $
- let result = optCoercion' env co in
- pprTrace "optCoercion }"
- (vcat [ text "Co:" <+> ppr (coercionSize co)
- , text "Optco:" <+> ppWhen (isReflCo result) (text "(refl)")
- <+> ppr result ]) $
- result
--}
-
- | otherwise
- = substCo env co
-
-optCoercion' :: Subst -> Coercion -> NormalCo
-optCoercion' env co
+-- The substitution is a vestige of an earlier era, when the coercion optimiser
+-- was called by the Simplifier; now it is always empty
+-- But I have not removed it in case we ever want it back.
+optCoercion env co
| debugIsOn
= let out_co = opt_co1 lc NotSwapped co
(Pair in_ty1 in_ty2, in_role) = coercionKindRole co
=====================================
compiler/GHC/Core/Opt/Pipeline.hs
=====================================
@@ -224,10 +224,12 @@ getCoreToDo dflags hpt_rule_base extra_vars
CoreDoPasses [ simpl_gently
, runWhen do_specialise CoreDoSpecialising ],
- -- Optimise coercions
+ -- Optimise coercions: see Note [Coercion optimisation]
+ -- in GHC.Core.Coercion.Opt
-- With -O do this after one run of the Simplifier.
-- Without -O, just take what the desugarer produced
- -- See Note [Coercion optimisation] in GHC.Core.Coercion.Opt
+ -- I tried running it right after desugaring, regardless of -O,
+ -- but that was worse (longer compile times).
runWhen do_co_opt CoreOptCoercion,
if full_laziness then
=====================================
compiler/GHC/Core/Opt/Simplify/Env.hs
=====================================
@@ -15,7 +15,7 @@ module GHC.Core.Opt.Simplify.Env (
SimplPhase(..), isActive,
seArityOpts, seCaseCase, seCaseFolding, seCaseMerge, seCastSwizzle,
seDoEtaReduction, seEtaExpand, seFloatEnable, seInline, seNames,
- seOptCoercionOpts, sePhase, sePlatform, sePreInline,
+ sePhase, sePlatform, sePreInline,
seRuleOpts, seRules, seUnfoldingOpts,
mkSimplEnv, extendIdSubst, extendCvIdSubst,
extendTvSubst, extendCvSubst,
@@ -54,7 +54,6 @@ module GHC.Core.Opt.Simplify.Env (
import GHC.Prelude
-import GHC.Core.Coercion.Opt ( OptCoercionOpts )
import GHC.Core.FamInstEnv ( FamInstEnv )
import GHC.Core.Opt.Arity ( ArityOpts(..) )
import GHC.Core.Opt.Simplify.Monad
@@ -250,9 +249,6 @@ seInline env = sm_inline (seMode env)
seNames :: SimplEnv -> [String]
seNames env = sm_names (seMode env)
-seOptCoercionOpts :: SimplEnv -> OptCoercionOpts
-seOptCoercionOpts env = sm_co_opt_opts (seMode env)
-
sePhase :: SimplEnv -> SimplPhase
sePhase env = sm_phase (seMode env)
@@ -288,7 +284,6 @@ data SimplMode = SimplMode -- See comments in GHC.Core.Opt.Simplify.Monad
, sm_rule_opts :: !RuleOpts
, sm_case_folding :: !Bool
, sm_case_merge :: !Bool
- , sm_co_opt_opts :: !OptCoercionOpts -- ^ Coercion optimiser options
}
-- | See Note [SimplPhase]
=====================================
compiler/GHC/Core/Opt/Simplify/Iteration.hs
=====================================
@@ -1531,8 +1531,10 @@ rebuild_go env expr cont
TickIt t cont -> rebuild_go env (mkTick t expr) cont
CastIt { sc_co = co, sc_cont = cont }
| isReflexiveCoIgnoringMultiplicity co
- -- ignoring multiplicity: c.f. GHC.Core.Coercion.Opt.opt_univ
+ -- isReflexiveCo: see Note [Coercion optimisation]
+ -- in GHc.Core.Coercion.Opt
-> rebuild_go env expr cont
+
| otherwise
-> rebuild_go env (mkCast expr co) cont
-- NB: mkCast implements the (Coercion co |> g) optimisation
=====================================
compiler/GHC/Core/Rules.hs
=====================================
@@ -852,17 +852,17 @@ bound on the LHS:
Now, if that binding is inlined, so that a=b=Int, we'd get
RULE forall (c :: Int~Int). f (x |> c) = e
and now when we simplify the LHS (GHC.Core.Opt.Simplify.Iteration.simplRules),
- optCoercion (look at the CoVarCo case) will turn that 'c' into Refl:
+ the Simplifier will turn that 'c' into Refl:
RULE forall (c :: Int~Int). f (x |> <Int>) = e
and then perhaps drop it altogether. Now 'c' is unbound.
+ (See the use of isReflexiveCo in GHC.Core.Opt.Simplify.Iteration.)
- It's tricky to be sure this never happens, so instead I
- say it's OK to have an unbound coercion binder in a RULE
- provided its type is (c :: t~t). Then, when the RULE
- fires we can substitute <t> for c.
+ It's tricky to be sure this never happens, so instead I say it's OK
+ to have an unbound coercion binder in a RULE provided its type is (c
+ :: t~t). Then, when the RULE fires we can substitute <t> for c.
- This actually happened (in a RULE for a local function)
- in #13410, and also in test T10602.
+ This actually happened (in a RULE for a local function) in #13410,
+ and also in test T10602.
Note [Cloning the template binders]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
=====================================
compiler/GHC/Driver/Config.hs
=====================================
@@ -1,7 +1,6 @@
-- | Subsystem configuration
module GHC.Driver.Config
- ( initOptCoercionOpts
- , initSimpleOpts
+ ( initSimpleOpts
, initEvalOpts
, EvalStep(..)
)
@@ -11,15 +10,8 @@ import GHC.Prelude
import GHC.Driver.DynFlags
import GHC.Core.SimpleOpt
-import GHC.Core.Coercion.Opt
import GHCi.Message (EvalOpts(..))
--- | Initialise coercion optimiser configuration from DynFlags
-initOptCoercionOpts :: DynFlags -> OptCoercionOpts
-initOptCoercionOpts dflags = OptCoercionOpts
- { optCoercionEnabled = gopt Opt_OptCoercion dflags
- }
-
-- | Initialise Simple optimiser configuration from DynFlags
initSimpleOpts :: DynFlags -> SimpleOpts
initSimpleOpts dflags = SimpleOpts
=====================================
compiler/GHC/Driver/Config/Core/Opt/Simplify.hs
=====================================
@@ -13,7 +13,6 @@ import GHC.Core.Opt.Simplify ( SimplifyExprOpts(..), SimplifyOpts(..) )
import GHC.Core.Opt.Simplify.Env ( FloatEnable(..), SimplMode(..), SimplPhase(..) )
import GHC.Core.Opt.Simplify.Monad ( TopEnvConfig(..) )
-import GHC.Driver.Config ( initOptCoercionOpts )
import GHC.Driver.Config.Core.Lint ( initLintPassResultConfig )
import GHC.Driver.Config.Core.Rules ( initRuleOpts )
import GHC.Driver.Config.Core.Opt.Arity ( initArityOpts )
@@ -73,7 +72,6 @@ initSimplMode dflags phase name = SimplMode
, sm_rule_opts = initRuleOpts dflags
, sm_case_folding = gopt Opt_CaseFolding dflags
, sm_case_merge = gopt Opt_CaseMerge dflags
- , sm_co_opt_opts = initOptCoercionOpts dflags
}
initGentleSimplMode :: DynFlags -> SimplMode
=====================================
compiler/GHC/Driver/DynFlags.hs
=====================================
@@ -6,7 +6,7 @@ module GHC.Driver.DynFlags (
Language(..),
FatalMessager, FlushOut(..),
ProfAuto(..),
- hasPprDebug, hasNoDebugOutput, hasNoStateHack, hasNoOptCoercion,
+ hasPprDebug, hasNoDebugOutput, hasNoStateHack,
dopt, dopt_set, dopt_unset,
gopt, gopt_set, gopt_unset,
wopt, wopt_set, wopt_unset,
@@ -994,9 +994,6 @@ hasNoDebugOutput = dopt Opt_D_no_debug_output
hasNoStateHack :: DynFlags -> Bool
hasNoStateHack = gopt Opt_G_NoStateHack
-hasNoOptCoercion :: DynFlags -> Bool
-hasNoOptCoercion flags = not (gopt Opt_OptCoercion flags)
-
-- | Test whether a 'DumpFlag' is set
dopt :: DumpFlag -> DynFlags -> Bool
dopt = getDumpFlagFrom verbosity dumpFlags
=====================================
compiler/GHC/Driver/Session.hs
=====================================
@@ -22,7 +22,7 @@ module GHC.Driver.Session (
FatalMessager, FlushOut(..),
ProfAuto(..),
glasgowExtsFlags,
- hasPprDebug, hasNoDebugOutput, hasNoStateHack, hasNoOptCoercion,
+ hasPprDebug, hasNoDebugOutput, hasNoStateHack,
dopt, dopt_set, dopt_unset,
gopt, gopt_set, gopt_unset, setGeneralFlag', unSetGeneralFlag',
wopt, wopt_set, wopt_unset,
=====================================
testsuite/tests/count-deps/CountDepsAst.stdout
=====================================
@@ -14,7 +14,6 @@ GHC.Core
GHC.Core.Class
GHC.Core.Coercion
GHC.Core.Coercion.Axiom
-GHC.Core.Coercion.Opt
GHC.Core.ConLike
GHC.Core.DataCon
GHC.Core.FVs
=====================================
testsuite/tests/count-deps/CountDepsParser.stdout
=====================================
@@ -14,7 +14,6 @@ GHC.Core
GHC.Core.Class
GHC.Core.Coercion
GHC.Core.Coercion.Axiom
-GHC.Core.Coercion.Opt
GHC.Core.ConLike
GHC.Core.DataCon
GHC.Core.FVs
=====================================
testsuite/tests/simplCore/should_compile/T8331.stderr
=====================================
@@ -20,9 +20,9 @@
= ($fApplicativeReaderT6 @s @r)
`cast` (forall (a ::~ <*>_N) (b ::~ <*>_N).
<a>_R
- ->_R
participants (1)
-
Simon Peyton Jones (@simonpj)