[Git][ghc/ghc][wip/romes/25636] Where we're at:
Rodrigo Mesquita pushed to branch wip/romes/25636 at Glasgow Haskell Compiler / GHC
Commits:
9fa2c2c2 by Rodrigo Mesquita at 2025-12-19T20:14:41+00:00
Where we're at:
We're almost there, but just bumped into a sort of wall.
For bytecode objects, we allocate string literals which are never freed,
and when compiling the ByteCode which references theses string literals,
we PUSH_ADDR a NON-POINTER reference to the Addr# in the string literals
environment kept by the ghci linker.
I believe that these Addr are put in the non-pointers list of data of a
BCO (Rather than on the ptrs list of data) because we don't want them to
be GC-d. They are allocated by the linker and should not be touched by
the RTS.
The problem is that when constructing a static data con application, we
will find addresses and put them as part of the POINTERS fields of the
data application. Now, I expect that the info table NPTRS/PTRS
directives will expect the Addr# to be in the PTRS because it may be
collected by the GC as far as the data con is concerned.
this didn't come up before because we only ever constructed BCOs in
bytecode, never the constructors the directly. Soooooooo
I don't think I should be changing the order of Ids vs Lits after they
come in the proto static con ids. But perhaps it's fine? Perhaps I'm
writing them as Ids but really they should be moved to the literals once
I find them?
Yes, that could make sense.... So, check the actual info table to figure
out where things go
And try an example with lits and ptrs and see what comes first.
Yup. Turns out it's all fine. The Addr#s are non-pointers everywhere.
.data
.balign 8
.globl _X.x_closure
_X.x_closure:
.quad _X.K_con_info
.quad _stg_INTLIKE_closure+289
.quad _stg_CHARLIKE_closure+1953
.quad _stg_INTLIKE_closure+321
.quad 1
.quad _Lx2_ryD_bytes
.quad 3
That makes sense. They're top-level DATA. The collecting is only an
issue when we allocate them dynamically (GHCi). But we're already doing
that anyway.
- - - - -
7 changed files:
- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/ByteCode/Instr.hs
- compiler/GHC/ByteCode/Linker.hs
- compiler/GHC/ByteCode/Serialize.hs
- compiler/GHC/ByteCode/Types.hs
- compiler/GHC/Linker/Loader.hs
- compiler/GHC/StgToByteCode.hs
Changes:
=====================================
compiler/GHC/ByteCode/Asm.hs
=====================================
@@ -95,13 +95,15 @@ bcoFreeNames bco
mkUniqDSet [ n | BCONPtrItbl n <- elemsFlatBag unlinkedBCOLits ] :
map bco_refs [ bco | BCOPtrBCO bco <- elemsFlatBag unlinkedBCOPtrs ]
) `uniqDSetMinusUniqSet` mkNameSet [unlinkedBCOName]
- bco_refs UnlinkedStaticCon{unlinkedStaticConName, unlinkedStaticConLits, unlinkedStaticConPtrs}
+ bco_refs UnlinkedStaticCon{ unlinkedStaticConName, unlinkedStaticConDataConName
+ , unlinkedStaticConLits, unlinkedStaticConPtrs }
= unionManyUniqDSets (
+ mkUniqDSet [ unlinkedStaticConDataConName ] :
mkUniqDSet [ n | BCOPtrName n <- elemsFlatBag unlinkedStaticConPtrs ] :
mkUniqDSet [ n | BCONPtrItbl n <- elemsFlatBag unlinkedStaticConLits ] :
map bco_refs [ bco | BCOPtrBCO bco <- elemsFlatBag unlinkedStaticConPtrs ]
)
- `uniqDSetMinusUniqSet` mkNameSet [unlinkedStaticConName]
+ `uniqDSetMinusUniqSet` mkNameSet [ unlinkedStaticConName ]
-- -----------------------------------------------------------------------------
-- The bytecode assembler
@@ -207,14 +209,17 @@ assembleInspectAsm p i = assembleI @InspectAsm p i
assembleBCO :: Platform -> ProtoBCO -> IO UnlinkedBCO
assembleBCO platform
- (ProtoStaticCon { protoStaticCon = dc
+ (ProtoStaticCon { protoStaticConName
+ , protoStaticCon = dc
, protoStaticConLits = lits
, protoStaticConIds = ids
}) = do
+ pprTraceM "assembleBCO: static con" (ppr dc <+> ppr lits <+> ppr ids)
let ptrs = foldr mappendFlatBag emptyFlatBag (map idBCOArg ids)
let nonptrs = foldr mappendFlatBag emptyFlatBag (map litBCOArg lits)
pure UnlinkedStaticCon
- { unlinkedStaticConName = dataConName dc
+ { unlinkedStaticConName = protoStaticConName
+ , unlinkedStaticConDataConName = dataConName dc
, unlinkedStaticConLits = nonptrs
, unlinkedStaticConPtrs = ptrs
}
=====================================
compiler/GHC/ByteCode/Instr.hs
=====================================
@@ -52,6 +52,9 @@ data ProtoBCO
-- | A top-level static constructor application object
-- See Note [Static constructors in Bytecode]
| ProtoStaticCon {
+ protoStaticConName :: Name,
+ -- ^ The name to which this static constructor is bound,
+ -- not to be confused with the DataCon itself.
protoStaticCon :: DataCon,
protoStaticConLits :: [Literal],
protoStaticConIds :: [Id],
@@ -290,8 +293,8 @@ data BCInstr
-- Printing bytecode instructions
instance Outputable ProtoBCO where
- ppr (ProtoStaticCon con lits ids origin)
- = text "ProtoStaticCon" <+> ppr con <> colon
+ ppr (ProtoStaticCon nm con lits ids origin)
+ = text "ProtoStaticCon" <+> ppr nm <+> text "for constructor" <+> ppr con <> colon
$$ nest 3 (pprStgRhsShort shortStgPprOpts origin)
$$ nest 3 (text "lits: " <+> ppr lits)
$$ nest 3 (text "ids: " <+> ppr ids)
@@ -486,7 +489,8 @@ instance Outputable BCInstr where
-- stack high water mark, but it doesn't seem worth the hassle.
protoBCOStackUse :: ProtoBCO -> Word
-protoBCOStackUse bco = sum (map bciStackUse (protoBCOInstrs bco))
+protoBCOStackUse ProtoBCO{protoBCOInstrs} = sum (map bciStackUse protoBCOInstrs)
+protoBCOStackUse ProtoStaticCon{} = 0
bciStackUse :: BCInstr -> Word
bciStackUse STKCHECK{} = 0
=====================================
compiler/GHC/ByteCode/Linker.hs
=====================================
@@ -80,9 +80,13 @@ linkBCO interp pkgs_loaded bytecode_state bco_ix unl_bco = do
{ unlinkedStaticConLits = lits0
, unlinkedStaticConPtrs = ptrs0
} -> do
- Ptr itbl_ptr# <- lookupIE interp pkgs_loaded bytecode_state (unlinkedStaticConName unl_bco)
+ pprTraceM "linkBCO: linking static constructor" (ppr unl_bco)
+ Ptr itbl_ptr# <- lookupIE interp pkgs_loaded bytecode_state (unlinkedStaticConDataConName unl_bco)
+ pprTraceM "linkBCO: itbl_ptr#" (ppr (unlinkedStaticConDataConName unl_bco) <+> text (show (Ptr itbl_ptr#)))
lits <- doLits lits0
+ pprTraceM "linkBCO: lits done" (empty)
ptrs <- doPtrs ptrs0
+ pprTraceM "linkBCO: ptrs done" (empty)
return ResolvedStaticCon
{ resolvedBCOIsLE = isLittleEndian
, resolvedStaticConInfoPtr = W# (int2Word# (addr2Int# itbl_ptr#))
@@ -99,7 +103,7 @@ linkBCO interp pkgs_loaded bytecode_state bco_ix unl_bco = do
mapM (resolvePtr interp pkgs_loaded bytecode_state bco_ix) (elemsFlatBag ptrs0)
lookupLiteral :: Interp -> PkgsLoaded -> BytecodeLoaderState -> BCONPtr -> IO Word
-lookupLiteral interp pkgs_loaded bytecode_state ptr = case ptr of
+lookupLiteral interp pkgs_loaded bytecode_state ptr = pprTrace "lookupLiteral" (ppr ptr) $ case ptr of
BCONPtrWord lit -> return lit
BCONPtrLbl sym -> do
Ptr a# <- lookupStaticPtr interp sym
@@ -183,20 +187,24 @@ resolvePtr
-> NameEnv (Int, Bool)
-> BCOPtr
-> IO ResolvedBCOPtr
-resolvePtr interp pkgs_loaded bco_loader_state bco_ix ptr = case ptr of
+resolvePtr interp pkgs_loaded bco_loader_state bco_ix ptr = pprTrace "resolvePtr" (ppr ptr) $ case ptr of
BCOPtrName nm
| Just (ix, b) <- lookupNameEnv bco_ix nm
- -> if b then
+ -> if b then do
+ pprTraceM "resolvePtr: ResolvedBCORef" (ppr nm <+> ppr ix)
return (ResolvedBCORef ix) -- ref to another BCO in this group
- else
+ else do
+ pprTraceM "resolvePtr: StaticConRef" (ppr nm <+> ppr ix)
return (ResolvedStaticConRef ix) -- ref to another StaticCon in this group
| Just (_, rhv) <- lookupNameBytecodeState bco_loader_state nm
- -> return (ResolvedBCOPtr (unsafeForeignRefToRemoteRef rhv))
+ -> do
+ pprTraceM "resolvePtr: BCOPtr" (ppr nm)
+ return (ResolvedBCOPtr (unsafeForeignRefToRemoteRef rhv))
| otherwise
-> assertPpr (isExternalName nm) (ppr nm) $
- do
+ pprTrace "resolvePtr: OTHERWISE? ASSERTION FIALURE?" (ppr nm) $ do
let sym_to_find = IClosureSymbol nm
m <- lookupHsSymbol interp pkgs_loaded sym_to_find
case m of
=====================================
compiler/GHC/ByteCode/Serialize.hs
=====================================
@@ -327,6 +327,7 @@ instance Binary UnlinkedBCO where
<*> get bh
1 -> UnlinkedStaticCon
<$> getViaBinName bh
+ <*> getViaBinName bh
<*> get bh
<*> get bh
_ -> panic "Binary UnlinkedBCO: invalid byte"
@@ -342,6 +343,7 @@ instance Binary UnlinkedBCO where
put_ bh UnlinkedStaticCon {..} = do
putByte bh 1
putViaBinName bh unlinkedStaticConName
+ putViaBinName bh unlinkedStaticConDataConName
put_ bh unlinkedStaticConLits
put_ bh unlinkedStaticConPtrs
=====================================
compiler/GHC/ByteCode/Types.hs
=====================================
@@ -31,6 +31,7 @@ module GHC.ByteCode.Types
) where
import GHC.Prelude
+import qualified Data.ByteString.Char8 as BS8
import GHC.Data.FastString
import GHC.Data.FlatBag
@@ -259,6 +260,10 @@ data UnlinkedBCO
-- See Note [Static constructors in Bytecode]
| UnlinkedStaticCon {
unlinkedStaticConName :: !Name,
+ -- ^ The name to which this static constructor is bound, not to be
+ -- confused with the name of the static constructor itself
+ -- ('unlinkedStaticConDataConName')
+ unlinkedStaticConDataConName :: !Name,
unlinkedStaticConLits :: !(FlatBag BCONPtr), -- non-ptrs
unlinkedStaticConPtrs :: !(FlatBag BCOPtr) -- ptrs
}
@@ -282,6 +287,12 @@ instance NFData BCOPtr where
rnf (BCOPtrBCO bco) = rnf bco
rnf x = x `seq` ()
+instance Outputable BCOPtr where
+ ppr (BCOPtrName nm) = text "BCOPtrName" <+> ppr nm
+ ppr (BCOPtrPrimOp op) = text "BCOPtrPrimOp" <+> ppr op
+ ppr (BCOPtrBCO bco) = text "BCOPtrBCO" <+> ppr bco
+ ppr (BCOPtrBreakArray mod) = text "
participants (1)
-
Rodrigo Mesquita (@alt-romes)