Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC
Commits:
98fa0d36 by Simon Hengel at 2025-11-27T17:54:57-05:00
Fix typo in docs/users_guide/exts/type_families.rst
- - - - -
5b97e5ce by Simon Hengel at 2025-11-27T17:55:37-05:00
Fix broken RankNTypes example in user's guide
- - - - -
fa2aaa00 by Simon Peyton Jones at 2025-11-27T17:56:18-05:00
Switch off specialisation in ExactPrint
In !15057 (where we re-introduced -fpolymoprhic-specialisation) we found
that ExactPrint's compile time blew up by a factor of 5. It turned out
to be caused by bazillions of specialisations of `markAnnotated`.
Since ExactPrint isn't perf-critical, it does not seem worth taking
the performance hit, so this patch switches off specialisation in
this one module.
- - - - -
1fd25987 by Simon Peyton Jones at 2025-11-27T17:56:18-05:00
Switch -fpolymorphic-specialisation on by default
This patch addresses #23559.
Now that !10479 has landed and #26329 is fixed, we can switch on
polymorphic specialisation by default, addressing a bunch of other
tickets listed in #23559.
Metric changes:
* CoOpt_Singleton: +4% compiler allocations: we just get more
specialisations
* info_table_map_perf: -20% decrease in compiler allocations.
This is caused by using -fno-specialise in ExactPrint.hs
Without that change we get a 4x blow-up in compile time;
see !15058 for details
Metric Decrease:
info_table_map_perf
Metric Increase:
CoOpt_Singletons
- - - - -
b7fe7445 by Matthew Pickering at 2025-11-27T17:56:59-05:00
rts: Fix a deadlock with eventlog flush interval and RTS shutdown
The ghc_ticker thread attempts to flush at the eventlog tick interval, this requires
waiting to take all capabilities.
At the same time, the main thread is shutting down, the schedule is
stopped and then we wait for the ticker thread to finish.
Therefore we are deadlocked.
The solution is to use `newBoundTask/exitMyTask`, so that flushing can
cooperate with the scheduler shutdown.
Fixes #26573
- - - - -
1d4a1229 by sheaf at 2025-11-27T17:58:02-05:00
SimpleOpt: don't subst in pushCoercionIntoLambda
It was noticed in #26589 that the change in 15b311be was incorrect:
the simple optimiser carries two different substitution-like pieces of
information: 'soe_subst' (from InVar to OutExpr) and 'soe_inl'
(from InId to InExpr). It is thus incorrect to have 'pushCoercionIntoLambda'
apply the substitution from 'soe_subst' while discarding 'soe_inl'
entirely, which is what was done in 15b311be.
Instead, we change back pushCoercionIntoLambda to take an InScopeSet,
and optimise the lambda before calling 'pushCoercionIntoLambda' to avoid
mixing InExpr with OutExpr, or mixing two InExpr with different
environments. We can then call 'soeZapSubst' without problems.
Fixes #26588 #26589
- - - - -
a987c4d9 by Sylvain Henry at 2025-11-28T12:04:55-05:00
Fix PIC jump tables on Windows (#24016)
Avoid overflows in jump tables by using a base label closer to the jump
targets. See added Note [Jump tables]
- - - - -
f3ccdc29 by Zubin Duggal at 2025-11-28T12:04:58-05:00
rts/linker/PEi386: Copy strings before they are inserted into LoadedDllCache. The original strings are temporary and might be freed at an arbitrary point.
Fixes #26613
- - - - -
23 changed files:
- compiler/GHC/CmmToAsm/X86/CodeGen.hs
- compiler/GHC/CmmToAsm/X86/Instr.hs
- compiler/GHC/CmmToAsm/X86/Ppr.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/SimpleOpt.hs
- compiler/GHC/Driver/DynFlags.hs
- compiler/GHC/Driver/Flags.hs
- compiler/GHC/Tc/Gen/App.hs
- docs/users_guide/exts/rank_polymorphism.rst
- docs/users_guide/exts/type_families.rst
- docs/users_guide/phases.rst
- docs/users_guide/using-optimisation.rst
- rts/eventlog/EventLog.c
- rts/linker/PEi386.c
- + testsuite/tests/codeGen/should_run/T24016.hs
- + testsuite/tests/codeGen/should_run/T24016.stdout
- testsuite/tests/codeGen/should_run/all.T
- testsuite/tests/rts/all.T
- + testsuite/tests/simplCore/should_compile/T26588.hs
- + testsuite/tests/simplCore/should_compile/T26589.hs
- testsuite/tests/simplCore/should_compile/T8331.stderr
- testsuite/tests/simplCore/should_compile/all.T
- utils/check-exact/ExactPrint.hs
Changes:
=====================================
compiler/GHC/CmmToAsm/X86/CodeGen.hs
=====================================
@@ -376,7 +376,7 @@ stmtToInstrs bid stmt = do
--We try to arrange blocks such that the likely branch is the fallthrough
--in GHC.Cmm.ContFlowOpt. So we can assume the condition is likely false here.
CmmCondBranch arg true false _ -> genCondBranch bid true false arg
- CmmSwitch arg ids -> genSwitch arg ids
+ CmmSwitch arg ids -> genSwitch arg ids bid
CmmCall { cml_target = arg
, cml_args_regs = gregs } -> genJump arg (jumpRegs platform gregs)
_ ->
@@ -489,13 +489,6 @@ is32BitInteger i = i64 <= 0x7fffffff && i64 >= -0x80000000
where i64 = fromIntegral i :: Int64
--- | Convert a BlockId to some CmmStatic data
-jumpTableEntry :: NCGConfig -> Maybe BlockId -> CmmStatic
-jumpTableEntry config Nothing = CmmStaticLit (CmmInt 0 (ncgWordWidth config))
-jumpTableEntry _ (Just blockid) = CmmStaticLit (CmmLabel blockLabel)
- where blockLabel = blockLbl blockid
-
-
-- -----------------------------------------------------------------------------
-- General things for putting together code sequences
@@ -5375,11 +5368,52 @@ index (1),
indexExpr = UU_Conv(indexOffset); // == 1::I64
See #21186.
--}
-genSwitch :: CmmExpr -> SwitchTargets -> NatM InstrBlock
+Note [Jump tables]
+~~~~~~~~~~~~~~~~~~
+The x86 backend has a virtual JMP_TBL instruction which payload can be used to
+generate both the jump instruction and the jump table contents. `genSwitch` is
+responsible for generating these JMP_TBL instructions.
+
+Depending on `-fPIC` flag and on the architecture, we generate the following
+jump table variants:
+
+ | Variant | Arch | Table's contents | Reference to the table |
+ |---------|--------|----------------------------------------|------------------------|
+ | PIC | Both | Relative offset: target_lbl - base_lbl | PIC |
+ | Non-PIC | 64-bit | Absolute: target_lbl | Non-PIC (rip-relative) |
+ | Non-PIC | 32-bit | Absolute: target_lbl | Non-PIC (absolute) |
+
+For the PIC variant, we store relative entries (`target_lbl - base_lbl`) in the
+jump table. Using absolute entries with PIC would require target_lbl symbols to
+be resolved at link time, hence to be global labels (currently they are local
+labels).
+
+We use the block_id of the code containing the jump as `base_lbl`. It ensures
+that target_lbl and base_lbl are close enough to each others, avoiding
+overflows.
+
+Historical note: in the past we used the table label `table_lbl` as base_lbl. It
+allowed the jumping code to only compute one global address (table_lbl) both to
+read the table and to compute the target address. However:
-genSwitch expr targets = do
+ * the table could be too far from the jump and on Windows which only
+ has 32-bit relative relocations (IMAGE_REL_AMD64_REL64 doesn't exist),
+ `dest_lbl - table_lbl` overflowed (see #24016)
+
+ * Mac OS X/x86-64 linker was unable to handle `.quad L1 - L0`
+ relocations if L0 wasn't preceded by a non-anonymous label in its
+ section (which was the case with table_lbl). Hence we used to put the
+ jump table in the .text section in this case.
+
+
+-}
+
+-- | Generate a JMP_TBL instruction
+--
+-- See Note [Jump tables]
+genSwitch :: CmmExpr -> SwitchTargets -> BlockId -> NatM InstrBlock
+genSwitch expr targets bid = do
config <- getConfig
let platform = ncgPlatform config
expr_w = cmmExprWidth platform expr
@@ -5390,79 +5424,76 @@ genSwitch expr targets = do
indexExpr = CmmMachOp
(MO_UU_Conv expr_w (platformWordWidth platform))
[indexExpr0]
- if ncgPIC config
- then do
- (reg,e_code) <- getNonClobberedReg indexExpr
- -- getNonClobberedReg because it needs to survive across t_code
- lbl <- getNewLabelNat
- let is32bit = target32Bit platform
- os = platformOS platform
- -- Might want to use .rodata. instead, but as
- -- long as it's something unique it'll work out since the
- -- references to the jump table are in the appropriate section.
- rosection = case os of
- -- on Mac OS X/x86_64, put the jump table in the text section to
- -- work around a limitation of the linker.
- -- ld64 is unable to handle the relocations for
- -- .quad L1 - L0
- -- if L0 is not preceded by a non-anonymous label in its section.
- OSDarwin | not is32bit -> Section Text lbl
- _ -> Section ReadOnlyData lbl
- dynRef <- cmmMakeDynamicReference config DataReference lbl
- (tableReg,t_code) <- getSomeReg $ dynRef
- let op = OpAddr (AddrBaseIndex (EABaseReg tableReg)
- (EAIndex reg (platformWordSizeInBytes platform)) (ImmInt 0))
-
- return $ e_code `appOL` t_code `appOL` toOL [
- ADD (intFormat (platformWordWidth platform)) op (OpReg tableReg),
- JMP_TBL (OpReg tableReg) ids rosection lbl
- ]
- else do
- (reg,e_code) <- getSomeReg indexExpr
- lbl <- getNewLabelNat
- let is32bit = target32Bit platform
- if is32bit
- then let op = OpAddr (AddrBaseIndex EABaseNone (EAIndex reg (platformWordSizeInBytes platform)) (ImmCLbl lbl))
- jmp_code = JMP_TBL op ids (Section ReadOnlyData lbl) lbl
- in return $ e_code `appOL` unitOL jmp_code
- else do
+
+ (offset, blockIds) = switchTargetsToTable targets
+ ids = map (fmap DestBlockId) blockIds
+
+ is32bit = target32Bit platform
+ fmt = archWordFormat is32bit
+
+ table_lbl <- getNewLabelNat
+ let bid_lbl = blockLbl bid
+ let table_section = Section ReadOnlyData table_lbl
+
+ -- see Note [Jump tables] for a description of the following 3 variants.
+ if
+ | ncgPIC config -> do
+ -- PIC support: store relative offsets in the jump table to allow the code
+ -- to be relocated without updating the table. The table itself and the
+ -- block label used to make the relative labels absolute are read in a PIC
+ -- way (via cmmMakeDynamicReference).
+ (reg,e_code) <- getNonClobberedReg indexExpr -- getNonClobberedReg because it needs to survive across t_code and j_code
+ (tableReg,t_code) <- getNonClobberedReg =<< cmmMakeDynamicReference config DataReference table_lbl
+ (targetReg,j_code) <- getSomeReg =<< cmmMakeDynamicReference config DataReference bid_lbl
+ pure $ e_code `appOL` t_code `appOL` j_code `appOL` toOL
+ [ ADD fmt (OpAddr (AddrBaseIndex (EABaseReg tableReg) (EAIndex reg (platformWordSizeInBytes platform)) (ImmInt 0)))
+ (OpReg targetReg)
+ , JMP_TBL (OpReg targetReg) ids table_section table_lbl (Just bid_lbl)
+ ]
+
+ | not is32bit -> do
+ -- 64-bit non-PIC code
+ (reg,e_code) <- getSomeReg indexExpr
+ tableReg <- getNewRegNat (intFormat (platformWordWidth platform))
+ targetReg <- getNewRegNat (intFormat (platformWordWidth platform))
+ pure $ e_code `appOL` toOL
-- See Note [%rip-relative addressing on x86-64].
- tableReg <- getNewRegNat (intFormat (platformWordWidth platform))
- targetReg <- getNewRegNat (intFormat (platformWordWidth platform))
- let op = OpAddr (AddrBaseIndex (EABaseReg tableReg) (EAIndex reg (platformWordSizeInBytes platform)) (ImmInt 0))
- fmt = archWordFormat is32bit
- code = e_code `appOL` toOL
- [ LEA fmt (OpAddr (AddrBaseIndex EABaseRip EAIndexNone (ImmCLbl lbl))) (OpReg tableReg)
- , MOV fmt op (OpReg targetReg)
- , JMP_TBL (OpReg targetReg) ids (Section ReadOnlyData lbl) lbl
- ]
- return code
- where
- (offset, blockIds) = switchTargetsToTable targets
- ids = map (fmap DestBlockId) blockIds
+ [ LEA fmt (OpAddr (AddrBaseIndex EABaseRip EAIndexNone (ImmCLbl table_lbl))) (OpReg tableReg)
+ , MOV fmt (OpAddr (AddrBaseIndex (EABaseReg tableReg) (EAIndex reg (platformWordSizeInBytes platform)) (ImmInt 0)))
+ (OpReg targetReg)
+ , JMP_TBL (OpReg targetReg) ids table_section table_lbl Nothing
+ ]
+
+ | otherwise -> do
+ -- 32-bit non-PIC code is a straightforward jump to &table[entry].
+ (reg,e_code) <- getSomeReg indexExpr
+ pure $ e_code `appOL` unitOL
+ ( JMP_TBL (OpAddr (AddrBaseIndex EABaseNone (EAIndex reg (platformWordSizeInBytes platform)) (ImmCLbl table_lbl)))
+ ids table_section table_lbl Nothing
+ )
generateJumpTableForInstr :: NCGConfig -> Instr -> Maybe (NatCmmDecl (Alignment, RawCmmStatics) Instr)
-generateJumpTableForInstr config (JMP_TBL _ ids section lbl)
- = let getBlockId (DestBlockId id) = id
- getBlockId _ = panic "Non-Label target in Jump Table"
- blockIds = map (fmap getBlockId) ids
- in Just (createJumpTable config blockIds section lbl)
-generateJumpTableForInstr _ _ = Nothing
-
-createJumpTable :: NCGConfig -> [Maybe BlockId] -> Section -> CLabel
- -> GenCmmDecl (Alignment, RawCmmStatics) h g
-createJumpTable config ids section lbl
- = let jumpTable
- | ncgPIC config =
- let ww = ncgWordWidth config
- jumpTableEntryRel Nothing
- = CmmStaticLit (CmmInt 0 ww)
- jumpTableEntryRel (Just blockid)
- = CmmStaticLit (CmmLabelDiffOff blockLabel lbl 0 ww)
- where blockLabel = blockLbl blockid
- in map jumpTableEntryRel ids
- | otherwise = map (jumpTableEntry config) ids
- in CmmData section (mkAlignment 1, CmmStaticsRaw lbl jumpTable)
+generateJumpTableForInstr config = \case
+ JMP_TBL _ ids section table_lbl mrel_lbl ->
+ let getBlockId (DestBlockId id) = id
+ getBlockId _ = panic "Non-Label target in Jump Table"
+ block_ids = map (fmap getBlockId) ids
+
+ jumpTable = case mrel_lbl of
+ Nothing -> map mk_absolute block_ids -- absolute entries
+ Just rel_lbl -> map (mk_relative rel_lbl) block_ids -- offsets relative to rel_lbl
+
+ mk_absolute = \case
+ Nothing -> CmmStaticLit (CmmInt 0 (ncgWordWidth config))
+ Just blockid -> CmmStaticLit (CmmLabel (blockLbl blockid))
+
+ mk_relative rel_lbl = \case
+ Nothing -> CmmStaticLit (CmmInt 0 (ncgWordWidth config))
+ Just blockid -> CmmStaticLit (CmmLabelDiffOff (blockLbl blockid) rel_lbl 0 (ncgWordWidth config))
+
+ in Just (CmmData section (mkAlignment 1, CmmStaticsRaw table_lbl jumpTable))
+
+ _ -> Nothing
extractUnwindPoints :: [Instr] -> [UnwindPoint]
extractUnwindPoints instrs =
=====================================
compiler/GHC/CmmToAsm/X86/Instr.hs
=====================================
@@ -252,6 +252,7 @@ data Instr
[Maybe JumpDest] -- Targets of the jump table
Section -- Data section jump table should be put in
CLabel -- Label of jump table
+ !(Maybe CLabel) -- Label used to compute relative offsets. Otherwise we store absolute addresses.
-- | X86 call instruction
| CALL (Either Imm Reg) -- ^ Jump target
[RegWithFormat] -- ^ Arguments (required for register allocation)
@@ -486,7 +487,7 @@ regUsageOfInstr platform instr
JXX _ _ -> mkRU [] []
JXX_GBL _ _ -> mkRU [] []
JMP op regs -> mkRU (use_R addrFmt op regs) []
- JMP_TBL op _ _ _ -> mkRU (use_R addrFmt op []) []
+ JMP_TBL op _ _ _ _ -> mkRU (use_R addrFmt op []) []
CALL (Left _) params -> mkRU params (map mkFmt $ callClobberedRegs platform)
CALL (Right reg) params -> mkRU (mk addrFmt reg:params) (map mkFmt $ callClobberedRegs platform)
CLTD fmt -> mkRU [mk fmt eax] [mk fmt edx]
@@ -812,7 +813,7 @@ patchRegsOfInstr platform instr env
POP fmt op -> patch1 (POP fmt) op
SETCC cond op -> patch1 (SETCC cond) op
JMP op regs -> JMP (patchOp op) regs
- JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl
+ JMP_TBL op ids s tl jl -> JMP_TBL (patchOp op) ids s tl jl
FMA3 fmt perm var x1 x2 x3 -> patch3 (FMA3 fmt perm var) x1 x2 x3
@@ -1016,9 +1017,9 @@ isJumpishInstr instr
canFallthroughTo :: Instr -> BlockId -> Bool
canFallthroughTo insn bid
= case insn of
- JXX _ target -> bid == target
- JMP_TBL _ targets _ _ -> all isTargetBid targets
- _ -> False
+ JXX _ target -> bid == target
+ JMP_TBL _ targets _ _ _ -> all isTargetBid targets
+ _ -> False
where
isTargetBid target = case target of
Nothing -> True
@@ -1031,9 +1032,9 @@ jumpDestsOfInstr
jumpDestsOfInstr insn
= case insn of
- JXX _ id -> [id]
- JMP_TBL _ ids _ _ -> [id | Just (DestBlockId id) <- ids]
- _ -> []
+ JXX _ id -> [id]
+ JMP_TBL _ ids _ _ _ -> [id | Just (DestBlockId id) <- ids]
+ _ -> []
patchJumpInstr
@@ -1042,8 +1043,8 @@ patchJumpInstr
patchJumpInstr insn patchF
= case insn of
JXX cc id -> JXX cc (patchF id)
- JMP_TBL op ids section lbl
- -> JMP_TBL op (map (fmap (patchJumpDest patchF)) ids) section lbl
+ JMP_TBL op ids section table_lbl rel_lbl
+ -> JMP_TBL op (map (fmap (patchJumpDest patchF)) ids) section table_lbl rel_lbl
_ -> insn
where
patchJumpDest f (DestBlockId id) = DestBlockId (f id)
@@ -1504,14 +1505,14 @@ shortcutJump fn insn = shortcutJump' fn (setEmpty :: LabelSet) insn
Just (DestBlockId id') -> shortcutJump' fn seen' (JXX cc id')
Just (DestImm imm) -> shortcutJump' fn seen' (JXX_GBL cc imm)
where seen' = setInsert id seen
- shortcutJump' fn _ (JMP_TBL addr blocks section tblId) =
+ shortcutJump' fn _ (JMP_TBL addr blocks section table_lbl rel_lbl) =
let updateBlock (Just (DestBlockId bid)) =
case fn bid of
Nothing -> Just (DestBlockId bid )
Just dest -> Just dest
updateBlock dest = dest
blocks' = map updateBlock blocks
- in JMP_TBL addr blocks' section tblId
+ in JMP_TBL addr blocks' section table_lbl rel_lbl
shortcutJump' _ _ other = other
-- Here because it knows about JumpDest
=====================================
compiler/GHC/CmmToAsm/X86/Ppr.hs
=====================================
@@ -895,7 +895,7 @@ pprInstr platform i = case i of
JMP op _
-> line $ text "\tjmp *" <> pprOperand platform (archWordFormat (target32Bit platform)) op
- JMP_TBL op _ _ _
+ JMP_TBL op _ _ _ _
-> pprInstr platform (JMP op [])
CALL (Left imm) _
=====================================
compiler/GHC/Core/Opt/Arity.hs
=====================================
@@ -2993,12 +2993,12 @@ pushCoValArg co
Pair tyL tyR = coercionKind co
pushCoercionIntoLambda
- :: HasDebugCallStack => Subst -> InVar -> InExpr -> OutCoercionR -> Maybe (OutVar, OutExpr)
+ :: HasDebugCallStack => InScopeSet -> Var -> CoreExpr -> CoercionR -> Maybe (Var, CoreExpr)
-- This implements the Push rule from the paper on coercions
-- (\x. e) |> co
-- ===>
-- (\x'. e |> co')
-pushCoercionIntoLambda subst x e co
+pushCoercionIntoLambda in_scope x e co
| assert (not (isTyVar x) && not (isCoVar x)) True
, Pair s1s2 t1t2 <- coercionKind co
, Just {} <- splitFunTy_maybe s1s2
@@ -3011,9 +3011,9 @@ pushCoercionIntoLambda subst x e co
-- Should we optimize the coercions here?
-- Otherwise they might not match too well
x' = x `setIdType` t1 `setIdMult` w1
- in_scope' = substInScopeSet subst `extendInScopeSet` x'
+ in_scope' = in_scope `extendInScopeSet` x'
subst' =
- extendIdSubst (setInScope subst in_scope')
+ extendIdSubst (setInScope emptySubst in_scope')
x
(mkCast (Var x') (mkSymCo co1))
-- We substitute x' for x, except we need to preserve types.
=====================================
compiler/GHC/Core/SimpleOpt.hs
=====================================
@@ -393,12 +393,19 @@ simple_app env e0@(Lam {}) as0@(_:_)
= wrapLet mb_pr $ do_beta env'' body as
where (env', b') = subst_opt_bndr env b
- do_beta env e@(Lam b body) as@(CastIt co:rest)
- -- See Note [Desugaring unlifted newtypes]
+ -- See Note [Eliminate casts in function position]
+ do_beta env e@(Lam b _) as@(CastIt out_co:rest)
| isNonCoVarId b
- , Just (b', body') <- pushCoercionIntoLambda (soe_subst env) b body co
+ -- Optimise the inner lambda to make it an 'OutExpr', which makes it
+ -- possible to call 'pushCoercionIntoLambda' with the 'OutCoercion' 'co'.
+ -- This is kind of horrible, as for nested casted lambdas with a big body,
+ -- we will repeatedly optimise the body (once for each binder). However,
+ -- we need to do this to avoid mixing 'InExpr' and 'OutExpr', or two
+ -- 'InExpr' with different environments (getting this wrong caused #26588 & #26589.)
+ , Lam out_b out_body <- simple_app env e []
+ , Just (b', body') <- pushCoercionIntoLambda (soeInScope env) out_b out_body out_co
= do_beta (soeZapSubst env) (Lam b' body') rest
- -- soeZapSubst: pushCoercionIntoLambda applies the substitution
+ -- soeZapSubst: we've already optimised everything (the lambda and 'rest') by now.
| otherwise
= rebuild_app env (simple_opt_expr env e) as
@@ -511,7 +518,31 @@ TL;DR: To avoid the rest of the compiler pipeline seeing these bad lambas, we
rely on the simple optimiser to both inline the newtype unfolding and
subsequently deal with the resulting lambdas (either beta-reducing them
altogether or pushing coercions into them so that they satisfy the
-representation-polymorphism invariants).
+representation-polymorphism invariants). See Note [Eliminate casts in function position].
+
+[Alternative approach] (GHC ticket #26608)
+
+ We could instead, in the typechecker, emit a special form (a new constructor
+ of XXExprGhcTc) for instantiations of representation-polymorphic unlifted
+ newtypes (whether applied to a value argument or not):
+
+ UnliftedNT :: DataCon -> [Type] -> Coercion -> XXExprGhcTc
+
+ where "UnliftedNT nt_con [ty1, ...] co" represents the expression:
+
+ ( nt_con @ty1 ... ) |> co
+
+ The desugarer would then turn these AST nodes into appropriate Core, doing
+ what the simple optimiser does today:
+ - inline the compulsory unfolding of the newtype constructor
+ - apply it to its type arguments and beta reduce
+ - push the coercion into the resulting lambda
+
+ This would have several advantages:
+ - the desugarer would never produce "invalid" Core that needs to be
+ tidied up by the simple optimiser,
+ - the ugly and inefficient implementation described in
+ Note [Eliminate casts in function position] could be removed.
Wrinkle [Unlifted newtypes with wrappers]
@@ -717,50 +748,49 @@ rhss here.
Note [Eliminate casts in function position]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider the following program:
+Due to the current implementation strategy for representation-polymorphic
+unlifted newtypes, as described in Note [Desugaring unlifted newtypes], we rely
+on the simple optimiser to push coercions into lambdas, such as in the following
+example:
type R :: Type -> RuntimeRep
- type family R a where { R Float = FloatRep; R Double = DoubleRep }
- type F :: forall (a :: Type) -> TYPE (R a)
- type family F a where { F Float = Float# ; F Double = Double# }
+ type family R a where { R Int = IntRep }
+ type F :: forall a -> TYPE (R a)
+ type family F a where { F Int = Int# }
- type N :: forall (a :: Type) -> TYPE (R a)
newtype N a = MkN (F a)
-As MkN is a newtype, its unfolding is a lambda which wraps its argument
-in a cast:
-
- MkN :: forall (a :: Type). F a -> N a
- MkN = /\a \(x::F a). x |> co_ax
- -- recall that F a :: TYPE (R a)
-
-This is a representation-polymorphic lambda, in which the binder has an unknown
-representation (R a). We can't compile such a lambda on its own, but we can
-compile instantiations, such as `MkN @Float` or `MkN @Double`.
+Now, an instantiated occurrence of 'MkN', such as 'MkN @Int' (whether applied
+to a value argument or not) will lead, after inlining the compulsory unfolding
+of 'MkN', to a lambda fo the form:
-Our strategy to avoid running afoul of the representation-polymorphism
-invariants of Note [Representation polymorphism invariants] in GHC.Core is thus:
+ ( \ ( x :: F Int ) -> body ) |> co
- 1. Give the newtype a compulsory unfolding (it has no binding, as we can't
- define lambdas with representation-polymorphic value binders in source Haskell).
- 2. Rely on the optimiser to beta-reduce away any representation-polymorphic
- value binders.
+ where
+ co :: ( F Int -> res ) ~# ( Int# -> res )
-For example, consider the application
+The problem is that we now have a lambda abstraction whose binder does not have a
+fixed RuntimeRep in the sense of Note [Fixed RuntimeRep] in GHC.Tc.Utils.Concrete.
- MkN @Float 34.0#
+However, if we use 'pushCoercionIntoLambda', we end up with:
-After inlining MkN we'll get
+ ( \ ( x' :: Int# ) -> body' )
- ((/\a \(x:F a). x |> co_ax) @Float) |> co 34#
+which satisfies the representation-polymorphism invariants of
+Note [Representation polymorphism invariants] in GHC.Core.
-where co :: (F Float -> N Float) ~ (Float# ~ N Float)
+In conclusion:
-But to actually beta-reduce that lambda, we need to push the 'co'
-inside the `\x` with pushCoercionIntoLambda. Hence the extra
-equation for Cast-of-Lam in simple_app.
+ 1. The simple optimiser must push casts into lambdas.
+ 2. It must also deal with a situation such as (MkN @Int) |> co, where we first
+ inline the compulsory unfolding of N. This means the simple optimiser must
+ "peel off" the casts and optimise the inner expression first, to determine
+ whether it is a lambda abstraction or not.
-This is regrettably delicate.
+This is regrettably delicate. If we could make sure the typechecker/desugarer
+did not produce these bad lambdas in the first place (as described in
+[Alternative approach] in Note [Desugaring unlifted newtypes]), we could
+get rid of this ugly logic.
Note [Preserve join-binding arity]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1673,7 +1703,7 @@ exprIsLambda_maybe ise@(ISE in_scope_set _) (Cast casted_e co)
-- this implies that x is not in scope in gamma (makes this code simpler)
, not (isTyVar x) && not (isCoVar x)
, assert (not $ x `elemVarSet` tyCoVarsOfCo co) True
- , Just (x',e') <- pushCoercionIntoLambda (mkEmptySubst in_scope_set) x e co
+ , Just (x',e') <- pushCoercionIntoLambda in_scope_set x e co
, let res = Just (x',e',ts)
= --pprTrace "exprIsLambda_maybe:Cast" (vcat [ppr casted_e,ppr co,ppr res)])
res
=====================================
compiler/GHC/Driver/DynFlags.hs
=====================================
@@ -1268,6 +1268,7 @@ optLevelFlags -- see Note [Documenting optimisation flags]
, ([1,2], Opt_CfgBlocklayout) -- Experimental
, ([1,2], Opt_Specialise)
+ , ([1,2], Opt_PolymorphicSpecialisation) -- Now on by default (#23559)
, ([1,2], Opt_CrossModuleSpecialise)
, ([1,2], Opt_InlineGenerics)
, ([1,2], Opt_Strictness)
=====================================
compiler/GHC/Driver/Flags.hs
=====================================
@@ -909,6 +909,7 @@ optimisationFlags = EnumSet.fromList
, Opt_SpecialiseAggressively
, Opt_CrossModuleSpecialise
, Opt_StaticArgumentTransformation
+ , Opt_PolymorphicSpecialisation
, Opt_CSE
, Opt_StgCSE
, Opt_StgLiftLams
=====================================
compiler/GHC/Tc/Gen/App.hs
=====================================
@@ -749,13 +749,13 @@ tcInstFun do_ql inst_final (tc_fun, fun_ctxt) fun_sigma rn_args
go1 _pos acc fun_ty []
| XExpr (ConLikeTc (RealDataCon dc)) <- tc_fun
, isNewDataCon dc
- , [Scaled _ arg_ty] <- dataConOrigArgTys dc
+ , [Scaled _ orig_arg_ty] <- dataConOrigArgTys dc
, n_val_args == 0
-- If we're dealing with an unsaturated representation-polymorphic
-- UnliftedNewype, then perform a representation-polymorphism check.
-- See Note [Representation-polymorphism checks for unsaturated unlifted newtypes]
-- in GHC.Tc.Utils.Concrete.
- , not $ typeHasFixedRuntimeRep arg_ty
+ , not $ typeHasFixedRuntimeRep orig_arg_ty
= do { (wrap_co, arg_ty, res_ty) <-
matchActualFunTy (FRRRepPolyUnliftedNewtype dc)
(Just $ HsExprTcThing tc_fun)
=====================================
docs/users_guide/exts/rank_polymorphism.rst
=====================================
@@ -195,7 +195,7 @@ For example: ::
g3c :: Int -> forall x y. y -> x -> x
f4 :: (Int -> forall a. (Eq a, Show a) => a -> a) -> Bool
- g4 :: Int -> forall x. (Show x, Eq x) => x -> x) -> Bool
+ g4 :: Int -> forall x. (Show x, Eq x) => x -> x
Then the application ``f3 g3a`` is well-typed, because ``g3a`` has a type that matches the type
expected by ``f3``. But ``f3 g3b`` is not well typed, because the foralls are in different places.
=====================================
docs/users_guide/exts/type_families.rst
=====================================
@@ -680,7 +680,7 @@ thus: ::
When doing so, we (optionally) may drop the "``family``" keyword.
The type parameters must all be type variables, of course, and some (but
-not necessarily all) of then can be the class parameters. Each class
+not necessarily all) of them can be the class parameters. Each class
parameter may only be used at most once per associated type, but some
may be omitted and they may be in an order other than in the class head.
Hence, the following contrived example is admissible: ::
=====================================
docs/users_guide/phases.rst
=====================================
@@ -770,10 +770,9 @@ Options affecting code generation
:type: dynamic
:category: codegen
- Generate position-independent code (code that can be put into shared
- libraries). This currently works on Linux x86 and x86-64. On
- Windows, position-independent code is never used so the flag is a
- no-op on that platform.
+ Generate position-independent code (PIC). This code can be put into shared
+ libraries and is sometimes required by operating systems, e.g. systems using
+ Address Space Layout Randomization (ASLR).
.. ghc-flag:: -fexternal-dynamic-refs
:shortdesc: Generate code for linking against dynamic libraries
@@ -790,9 +789,7 @@ Options affecting code generation
:category: codegen
Generate code in such a way to be linkable into a position-independent
- executable This currently works on Linux x86 and x86-64. On Windows,
- position-independent code is never used so the flag is a no-op on that
- platform. To link the final executable use :ghc-flag:`-pie`.
+ executable. To link the final executable use :ghc-flag:`-pie`.
.. ghc-flag:: -dynamic
:shortdesc: Build dynamically-linked object files and executables
=====================================
docs/users_guide/using-optimisation.rst
=====================================
@@ -1325,10 +1325,7 @@ as such you shouldn't need to set any of them explicitly. A flag
:reverse: -fno-polymorphic-specialisation
:category:
- :default: off
-
- Warning, this feature is highly experimental and may lead to incorrect runtime
- results. Use at your own risk (:ghc-ticket:`23469`, :ghc-ticket:`23109`, :ghc-ticket:`21229`, :ghc-ticket:`23445`).
+ :default: on
Enable specialisation of function calls to known dictionaries with free type variables.
The created specialisation will abstract over the type variables free in the dictionary.
=====================================
rts/eventlog/EventLog.c
=====================================
@@ -491,13 +491,7 @@ endEventLogging(void)
eventlog_enabled = false;
- // Flush all events remaining in the buffers.
- //
- // N.B. Don't flush if shutting down: this was done in
- // finishCapEventLogging and the capabilities have already been freed.
- if (getSchedState() != SCHED_SHUTTING_DOWN) {
- flushEventLog(NULL);
- }
+ flushEventLog(NULL);
ACQUIRE_LOCK(&eventBufMutex);
@@ -1626,15 +1620,24 @@ void flushEventLog(Capability **cap USED_IF_THREADS)
return;
}
+ // N.B. Don't flush if shutting down: this was done in
+ // finishCapEventLogging and the capabilities have already been freed.
+ // This can also race against the shutdown if the flush is triggered by the
+ // ticker thread. (#26573)
+ if (getSchedState() == SCHED_SHUTTING_DOWN) {
+ return;
+ }
+
ACQUIRE_LOCK(&eventBufMutex);
printAndClearEventBuf(&eventBuf);
RELEASE_LOCK(&eventBufMutex);
#if defined(THREADED_RTS)
- Task *task = getMyTask();
+ Task *task = newBoundTask();
stopAllCapabilitiesWith(cap, task, SYNC_FLUSH_EVENT_LOG);
flushAllCapsEventsBufs();
releaseAllCapabilities(getNumCapabilities(), cap ? *cap : NULL, task);
+ exitMyTask();
#else
flushLocalEventsBuf(getCapability(0));
#endif
=====================================
rts/linker/PEi386.c
=====================================
@@ -552,7 +552,12 @@ static int compare_path(StgWord key1, StgWord key2)
static void addLoadedDll(LoadedDllCache *cache, const pathchar *dll_name, HINSTANCE instance)
{
- insertHashTable_(cache->hash, (StgWord) dll_name, instance, hash_path);
+ // dll_name might be deallocated, we need to copy it to have a stable reference to the contents
+ // See #26613
+ size_t size = wcslen(dll_name) + 1;
+ pathchar* dll_name_copy = stgMallocBytes(size * sizeof(pathchar), "addLoadedDll");
+ wcsncpy(dll_name_copy, dll_name, size);
+ insertHashTable_(cache->hash, (StgWord) dll_name_copy, instance, hash_path);
}
static HINSTANCE isDllLoaded(const LoadedDllCache *cache, const pathchar *dll_name)
=====================================
testsuite/tests/codeGen/should_run/T24016.hs
=====================================
@@ -0,0 +1,24 @@
+module Main (main) where
+
+data Command
+ = Command1
+ | Command2
+ | Command3
+ | Command4
+ | Command5
+ | Command6 -- Commenting this line works with -fPIC, uncommenting leads to a crash.
+
+main :: IO ()
+main = do
+ let x = case cmd of
+ Command1 -> 1 :: Int
+ Command2 -> 2
+ Command3 -> 3
+ Command4 -> 4
+ Command5 -> 5
+ Command6 -> 6
+ putStrLn (show x)
+
+{-# NOINLINE cmd #-}
+cmd :: Command
+cmd = Command6
=====================================
testsuite/tests/codeGen/should_run/T24016.stdout
=====================================
@@ -0,0 +1 @@
+6
=====================================
testsuite/tests/codeGen/should_run/all.T
=====================================
@@ -257,3 +257,4 @@ test('CCallConv', [req_c], compile_and_run, ['CCallConv_c.c'])
test('T25364', normal, compile_and_run, [''])
test('T26061', normal, compile_and_run, [''])
test('T26537', normal, compile_and_run, ['-O2 -fregs-graph'])
+test('T24016', normal, compile_and_run, ['-O1 -fPIC'])
=====================================
testsuite/tests/rts/all.T
=====================================
@@ -2,6 +2,11 @@ test('testblockalloc',
[c_src, only_ways(['normal','threaded1']), extra_run_opts('+RTS -I0')],
compile_and_run, [''])
+test('numeric_version_eventlog_flush',
+ [ignore_stdout, req_ghc_with_threaded_rts],
+ run_command,
+ ['{compiler} --numeric-version +RTS -l --eventlog-flush-interval=1 -RTS'])
+
test('testmblockalloc',
[c_src, only_ways(['normal','threaded1']), extra_run_opts('+RTS -I0 -xr0.125T'),
when(arch('wasm32'), skip)], # MBlocks can't be freed on wasm32, see Note [Megablock allocator on wasm] in rts
=====================================
testsuite/tests/simplCore/should_compile/T26588.hs
=====================================
@@ -0,0 +1,32 @@
+module T26588 ( getOptionSettingFromText ) where
+
+import Control.Applicative ( Const(..) )
+import Data.Map (Map)
+import qualified Data.Map.Strict as Map
+
+------------------------------------------------------------------------
+-- ConfigState
+
+data ConfigLeaf
+data ConfigTrie = ConfigTrie !(Maybe ConfigLeaf) !ConfigMap
+
+type ConfigMap = Map Int ConfigTrie
+
+freshLeaf :: [Int] -> ConfigLeaf -> ConfigTrie
+freshLeaf [] l = ConfigTrie (Just l) mempty
+freshLeaf (a:as) l = ConfigTrie Nothing (Map.singleton a (freshLeaf as l))
+
+adjustConfigTrie :: Functor t => [Int] -> (Maybe ConfigLeaf -> t (Maybe ConfigLeaf)) -> Maybe (ConfigTrie) -> t (Maybe ConfigTrie)
+adjustConfigTrie as f Nothing = fmap (freshLeaf as) <$> f Nothing
+adjustConfigTrie (a:as) f (Just (ConfigTrie x m)) = Just . ConfigTrie x <$> adjustConfigMap a as f m
+adjustConfigTrie [] f (Just (ConfigTrie x m)) = g <$> f x
+ where g Nothing | Map.null m = Nothing
+ g x' = Just (ConfigTrie x' m)
+
+adjustConfigMap :: Functor t => Int -> [Int] -> (Maybe ConfigLeaf -> t (Maybe ConfigLeaf)) -> ConfigMap -> t ConfigMap
+adjustConfigMap a as f = Map.alterF (adjustConfigTrie as f) a
+
+getOptionSettingFromText :: Int -> [Int] -> ConfigMap -> IO ()
+getOptionSettingFromText p ps = getConst . adjustConfigMap p ps f
+ where
+ f _ = Const (return ())
=====================================
testsuite/tests/simplCore/should_compile/T26589.hs
=====================================
@@ -0,0 +1,44 @@
+module T26589 ( executeTest ) where
+
+-- base
+import Data.Coerce ( coerce )
+import Data.Foldable ( foldMap )
+
+--------------------------------------------------------------------------------
+
+newtype Traversal f = Traversal { getTraversal :: f () }
+
+instance Applicative f => Semigroup (Traversal f) where
+ Traversal f1 <> Traversal f2 = Traversal $ f1 *> f2
+instance Applicative f => Monoid (Traversal f) where
+ mempty = Traversal $ pure ()
+
+newtype Seq a = Seq (FingerTree (Elem a))
+newtype Elem a = Elem { getElem :: a }
+
+data FingerTree a
+ = EmptyT
+ | Deep !a (FingerTree a) !a
+
+executeTest :: Seq () -> IO ()
+executeTest fins = destroyResources
+ where
+ destroyResources :: IO ()
+ destroyResources =
+ getTraversal $
+ flip foldMap1 fins $ \ _ ->
+ Traversal $ return ()
+
+foldMap1 :: forall m a. Monoid m => (a -> m) -> Seq a -> m
+foldMap1 = coerce (foldMap2 :: (Elem a -> m) -> FingerTree (Elem a) -> m)
+
+foldMap2 :: Monoid m => (Elem a -> m) -> FingerTree (Elem a) -> m
+foldMap2 _ EmptyT = mempty
+foldMap2 f' (Deep pr' m' sf') = f' pr' <> foldMapTree f' m' <> f' sf'
+ where
+ foldMapTree :: Monoid m => (a -> m) -> FingerTree a -> m
+ foldMapTree _ EmptyT = mempty
+ foldMapTree f (Deep pr m sf) =
+ f pr <>
+ foldMapTree f m <>
+ f sf
=====================================
testsuite/tests/simplCore/should_compile/T8331.stderr
=====================================
@@ -1,5 +1,148 @@
==================== Tidy Core rules ====================
+"SPEC $c*> @(ST s) @_"
+ forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+ $fApplicativeReaderT_$c*> @(ST s) @r $dApplicative
+ = ($fApplicativeReaderT2 @s @r)
+ `cast` (forall (a ::~ <*>_N) (b ::~ <*>_N).
+ _R
+ ->_R _R
+ ->_R <r>_R ->_R Sym (N:ST <s>_N <b>_R)
+ ; Sym (N:ReaderT <*>_N <r>_R <ST s>_R <b>_N)
+ :: Coercible
+ (forall a b.
+ ReaderT r (ST s) a -> ReaderT r (ST s) b -> r -> STRep s b)
+ (forall a b.
+ ReaderT r (ST s) a -> ReaderT r (ST s) b -> ReaderT r (ST s) b))
+"SPEC $c<$ @(ST s) @_"
+ forall (@s) (@r) ($dFunctor :: Functor (ST s)).
+ $fFunctorReaderT_$c<$ @(ST s) @r $dFunctor
+ = ($fApplicativeReaderT6 @s @r)
+ `cast` (forall (a ::~ <*>_N) (b ::~ <*>_N).
+ <a>_R
+ ->_R _R
+ ->_R <r>_R ->_R Sym (N:ST <s>_N <a>_R)
+ ; Sym (N:ReaderT <*>_N <r>_R <ST s>_R <a>_N)
+ :: Coercible
+ (forall a b. a -> ReaderT r (ST s) b -> r -> STRep s a)
+ (forall a b. a -> ReaderT r (ST s) b -> ReaderT r (ST s) a))
+"SPEC $c<* @(ST s) @_"
+ forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+ $fApplicativeReaderT_$c<* @(ST s) @r $dApplicative
+ = ($fApplicativeReaderT1 @s @r)
+ `cast` (forall (a ::~ <*>_N) (b ::~ <*>_N).
+ _R
+ ->_R _R
+ ->_R <r>_R ->_R Sym (N:ST <s>_N <a>_R)
+ ; Sym (N:ReaderT <*>_N <r>_R <ST s>_R <a>_N)
+ :: Coercible
+ (forall a b.
+ ReaderT r (ST s) a -> ReaderT r (ST s) b -> r -> STRep s a)
+ (forall a b.
+ ReaderT r (ST s) a -> ReaderT r (ST s) b -> ReaderT r (ST s) a))
+"SPEC $c<*> @(ST s) @_"
+ forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+ $fApplicativeReaderT9 @(ST s) @r $dApplicative
+ = ($fApplicativeReaderT4 @s @r)
+ `cast` (forall (a ::~ <*>_N) (b ::~ <*>_N).
+ b)>_R
+ ->_R _R
+ ->_R <r>_R
+ ->_R Sym (N:ST <s>_N <b>_R)
+ :: Coercible
+ (forall a b.
+ ReaderT r (ST s) (a -> b) -> ReaderT r (ST s) a -> r -> STRep s b)
+ (forall a b.
+ ReaderT r (ST s) (a -> b) -> ReaderT r (ST s) a -> r -> ST s b))
+"SPEC $c>> @(ST s) @_"
+ forall (@s) (@r) ($dMonad :: Monad (ST s)).
+ $fMonadReaderT1 @(ST s) @r $dMonad
+ = $fMonadAbstractIOSTReaderT_$s$c>> @s @r
+"SPEC $c>>= @(ST s) @_"
+ forall (@s) (@r) ($dMonad :: Monad (ST s)).
+ $fMonadReaderT2 @(ST s) @r $dMonad
+ = ($fMonadAbstractIOSTReaderT2 @s @r)
+ `cast` (forall (a ::~ <*>_N) (b ::~ <*>_N).
+ _R
+ ->_R ReaderT r (ST s) b>_R
+ ->_R <r>_R
+ ->_R Sym (N:ST <s>_N <b>_R)
+ :: Coercible
+ (forall a b.
+ ReaderT r (ST s) a -> (a -> ReaderT r (ST s) b) -> r -> STRep s b)
+ (forall a b.
+ ReaderT r (ST s) a -> (a -> ReaderT r (ST s) b) -> r -> ST s b))
+"SPEC $cfmap @(ST s) @_"
+ forall (@s) (@r) ($dFunctor :: Functor (ST s)).
+ $fFunctorReaderT_$cfmap @(ST s) @r $dFunctor
+ = ($fApplicativeReaderT7 @s @r)
+ `cast` (forall (a ::~ <*>_N) (b ::~ <*>_N).
+ b>_R
+ ->_R _R
+ ->_R <r>_R ->_R Sym (N:ST <s>_N <b>_R)
+ ; Sym (N:ReaderT <*>_N <r>_R <ST s>_R <b>_N)
+ :: Coercible
+ (forall a b. (a -> b) -> ReaderT r (ST s) a -> r -> STRep s b)
+ (forall a b. (a -> b) -> ReaderT r (ST s) a -> ReaderT r (ST s) b))
+"SPEC $cliftA2 @(ST s) @_"
+ forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+ $fApplicativeReaderT_$cliftA2 @(ST s) @r $dApplicative
+ = ($fApplicativeReaderT3 @s @r)
+ `cast` (forall (a ::~ <*>_N) (b ::~ <*>_N) (c ::~ <*>_N).
+ b -> c>_R
+ ->_R _R
+ ->_R _R
+ ->_R <r>_R ->_R Sym (N:ST <s>_N <c>_R)
+ ; Sym (N:ReaderT <*>_N <r>_R <ST s>_R <c>_N)
+ :: Coercible
+ (forall a b c.
+ (a -> b -> c)
+ -> ReaderT r (ST s) a -> ReaderT r (ST s) b -> r -> STRep s c)
+ (forall a b c.
+ (a -> b -> c)
+ -> ReaderT r (ST s) a -> ReaderT r (ST s) b -> ReaderT r (ST s) c))
+"SPEC $cp1Applicative @(ST s) @_"
+ forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+ $fApplicativeReaderT_$cp1Applicative @(ST s) @r $dApplicative
+ = $fApplicativeReaderT_$s$fFunctorReaderT @s @r
+"SPEC $cp1Monad @(ST s) @_"
+ forall (@s) (@r) ($dMonad :: Monad (ST s)).
+ $fMonadReaderT_$cp1Monad @(ST s) @r $dMonad
+ = $fApplicativeReaderT_$s$fApplicativeReaderT @s @r
+"SPEC $cpure @(ST s) @_"
+ forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+ $fApplicativeReaderT_$cpure @(ST s) @r $dApplicative
+ = ($fApplicativeReaderT5 @s @r)
+ `cast` (forall (a ::~ <*>_N).
+ <a>_R
+ ->_R <r>_R ->_R Sym (N:ST <s>_N <a>_R)
+ ; Sym (N:ReaderT <*>_N <r>_R <ST s>_R <a>_N)
+ :: Coercible
+ (forall a. a -> r -> STRep s a)
+ (forall a. a -> ReaderT r (ST s) a))
+"SPEC $creturn @(ST s) @_"
+ forall (@s) (@r) ($dMonad :: Monad (ST s)).
+ $fMonadReaderT_$creturn @(ST s) @r $dMonad
+ = ($fApplicativeReaderT5 @s @r)
+ `cast` (forall (a ::~ <*>_N).
+ <a>_R
+ ->_R <r>_R ->_R Sym (N:ST <s>_N <a>_R)
+ ; Sym (N:ReaderT <*>_N <r>_R <ST s>_R <a>_N)
+ :: Coercible
+ (forall a. a -> r -> STRep s a)
+ (forall a. a -> ReaderT r (ST s) a))
+"SPEC $fApplicativeReaderT @(ST s) @_"
+ forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+ $fApplicativeReaderT @(ST s) @r $dApplicative
+ = $fApplicativeReaderT_$s$fApplicativeReaderT @s @r
+"SPEC $fFunctorReaderT @(ST s) @_"
+ forall (@s) (@r) ($dFunctor :: Functor (ST s)).
+ $fFunctorReaderT @(ST s) @r $dFunctor
+ = $fApplicativeReaderT_$s$fFunctorReaderT @s @r
+"SPEC $fMonadReaderT @(ST s) @_"
+ forall (@s) (@r) ($dMonad :: Monad (ST s)).
+ $fMonadReaderT @(ST s) @r $dMonad
+ = $fMonadAbstractIOSTReaderT_$s$fMonadReaderT @s @r
"USPEC useAbstractMonad @(ReaderT Int (ST s))"
forall (@s)
($dMonadAbstractIOST :: MonadAbstractIOST (ReaderT Int (ST s))).
=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -544,6 +544,9 @@ test('T25883b', normal, compile_grep_core, [''])
test('T25883c', normal, compile_grep_core, [''])
test('T25883d', [extra_files(['T25883d_import.hs'])], multimod_compile_filter, ['T25883d', '-O -ddump-simpl -dno-typeable-binds -dsuppress-all -dsuppress-uniques', r'grep -e "y ="'])
+test('T26588', normal, compile, ['-package containers -O'])
+test('T26589', normal, compile, ['-O'])
+
test('T25976', [grep_errmsg('Dead Code')], compile, ['-O -ddump-simpl -dsuppress-uniques -dno-typeable-binds'])
test('T25965', normal, compile, ['-O'])
=====================================
utils/check-exact/ExactPrint.hs
=====================================
@@ -19,6 +19,13 @@
{-# LANGUAGE UndecidableInstances #-} -- For the (StmtLR GhcPs GhcPs (LocatedA (body GhcPs))) ExactPrint instance
{-# OPTIONS_GHC -Wno-incomplete-uni-patterns -Wno-incomplete-record-updates #-}
+-- We switch off specialisation in this module. Otherwise we get lots of functions
+-- specialised on lots of (GHC syntax tree) data types. Compilation time allocation
+-- (at least with -fpolymorphic-specialisation; see !15058) blows up from 17G to 108G.
+-- Bad! ExactPrint is not a performance-critical module so it's not worth taking the
+-- largely-fruitless hit in compile time.
+{-# OPTIONS_GHC -fno-specialise #-}
+
module ExactPrint
(
ExactPrint(..)
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/8b566aa62c0828795cd9763fe9822a64fcabde01...f3ccdc29eb988b3be39937051d12dce6a1e4f496
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/8b566aa62c0828795cd9763fe9822a64fcabde01...f3ccdc29eb988b3be39937051d12dce6a1e4f496
You're receiving this email because of your account on gitlab.haskell.org.