[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 4 commits: Remove the `profile_id` parameter from various RTS functions.
Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: e8acd2e7 by Wen Kokke at 2025-07-16T08:37:04-04:00 Remove the `profile_id` parameter from various RTS functions. Various RTS functions took a `profile_id` parameter, intended to be used to distinguish parallel heap profile breakdowns (e.g., `-hT` and `-hi`). However, this feature was never implemented and the `profile_id` parameter was set to 0 throughout the RTS. This commit removes the parameter but leaves the hardcoded profile ID in the functions that emit the encoded eventlog events as to not change the protocol. The affected functions are `traceHeapProfBegin`, `postHeapProfBegin`, `traceHeapProfSampleString`, `postHeapProfSampleString`, `traceHeapProfSampleCostCentre`, and `postHeapProfSampleCostCentre`. - - - - - 76d392a2 by Wen Kokke at 2025-07-16T08:37:04-04:00 Make `traceHeapProfBegin` an init event. - - - - - e2196818 by Peng Fan at 2025-07-16T09:09:57-04:00 NCG/LA64: Support finer-grained DBAR hints For LA664 and newer uarchs, they have made finer granularity hints available: Bit4: ordering or completion (0: completion, 1: ordering) Bit3: barrier for previous read (0: true, 1: false) Bit2: barrier for previous write (0: true, 1: false) Bit1: barrier for succeeding read (0: true, 1: false) Bit0: barrier for succeeding write (0: true, 1: false) And not affect the existing models because other hints are treated as 'dbar 0' there. - - - - - 847209fa by Andreas Klebinger at 2025-07-16T09:09:58-04:00 Disable -fprof-late-overloaded-calls for join points. Currently GHC considers cost centres as destructive to join contexts. Or in other words this is not considered valid: join f x = ... in ... -> scc<tick> jmp This makes the functionality of `-fprof-late-overloaded-calls` not feasible for join points in general. We used to try to work around this by putting the ticks on the rhs of the join point rather than around the jump. However beyond the loss of accuracy this was broken for recursive join points as we ended up with something like: rec-join f x = scc<tick> ... jmp f x Which similarly is not valid as the tick once again destroys the tail call. One might think we could limit ourselves to non-recursive tail calls and do something clever like: join f x = scc<tick> ... in ... jmp f x And sometimes this works! But sometimes the full rhs would look something like: join g x = .... join f x = scc<tick> ... -> jmp g x Which, would again no longer be valid. I believe in the long run we can make cost centre ticks non-destructive to join points. Or we could keep track of where we are/are not allowed to insert a cost centre. But in the short term I will simply disable the annotation of join calls under this flag. - - - - - 11 changed files: - compiler/GHC/CmmToAsm/LA64/CodeGen.hs - compiler/GHC/CmmToAsm/LA64/Instr.hs - compiler/GHC/CmmToAsm/LA64/Ppr.hs - compiler/GHC/Core/LateCC/OverloadedCalls.hs - docs/users_guide/profiling.rst - rts/ProfHeap.c - rts/RetainerSet.c - rts/Trace.c - rts/Trace.h - rts/eventlog/EventLog.c - rts/eventlog/EventLog.h Changes: ===================================== compiler/GHC/CmmToAsm/LA64/CodeGen.hs ===================================== @@ -1910,13 +1910,12 @@ genCCall target dest_regs arg_regs = do MO_W64X2_Max -> unsupported mop -- Memory Ordering - -- A hint value of 0 is mandatory by default, and it indicates a fully functional synchronization barrier. - -- Only after all previous load/store access operations are completely executed, the DBAR 0 instruction can be executed; - -- and only after the execution of DBAR 0 is completed, all subsequent load/store access operations can be executed. - - MO_AcquireFence -> pure (unitOL (DBAR Hint0)) - MO_ReleaseFence -> pure (unitOL (DBAR Hint0)) - MO_SeqCstFence -> pure (unitOL (DBAR Hint0)) + -- Support finer-grained DBAR hints for LA664 and newer uarchs. + -- These are treated as DBAR 0 on older uarchs, so we can start + -- to unconditionally emit the new hints right away. + MO_AcquireFence -> pure (unitOL (DBAR HintAcquire)) + MO_ReleaseFence -> pure (unitOL (DBAR HintRelease)) + MO_SeqCstFence -> pure (unitOL (DBAR HintSeqcst)) MO_Touch -> pure nilOL -- Keep variables live (when using interior pointers) -- Prefetch @@ -1954,12 +1953,11 @@ genCCall target dest_regs arg_regs = do MemOrderAcquire -> toOL [ ann moDescr (LD (intFormat w) (OpReg w dst) (OpAddr $ AddrReg p)), - DBAR Hint0 + DBAR HintAcquire ] - MemOrderSeqCst -> toOL [ - ann moDescr (DBAR Hint0), - LD (intFormat w) (OpReg w dst) (OpAddr $ AddrReg p), - DBAR Hint0 + MemOrderSeqCst -> toOL [ + ann moDescr (LD (intFormat w) (OpReg w dst) (OpAddr $ AddrReg p)), + DBAR HintSeqcst ] _ -> panic $ "Unexpected MemOrderRelease on an AtomicRead: " ++ show mo dst = getRegisterReg platform (CmmLocal dst_reg) @@ -1974,15 +1972,9 @@ genCCall target dest_regs arg_regs = do (val, fmt_val, code_val) <- getSomeReg val_reg let instrs = case ord of MemOrderRelaxed -> unitOL $ ann moDescr (ST fmt_val (OpReg w val) (OpAddr $ AddrReg p)) - MemOrderRelease -> toOL [ - ann moDescr (DBAR Hint0), - ST fmt_val (OpReg w val) (OpAddr $ AddrReg p) - ] - MemOrderSeqCst -> toOL [ - ann moDescr (DBAR Hint0), - ST fmt_val (OpReg w val) (OpAddr $ AddrReg p), - DBAR Hint0 - ] + -- implement with AMSWAPDB + MemOrderRelease -> unitOL $ ann moDescr (AMSWAPDB fmt_val (OpReg w zeroReg) (OpReg w val) (OpReg w p)) + MemOrderSeqCst -> unitOL $ ann moDescr (AMSWAPDB fmt_val (OpReg w zeroReg) (OpReg w val) (OpReg w p)) _ -> panic $ "Unexpected MemOrderAcquire on an AtomicWrite" ++ show mo moDescr = (text . show) mo code = ===================================== compiler/GHC/CmmToAsm/LA64/Instr.hs ===================================== @@ -169,6 +169,7 @@ regUsageOfInstr platform instr = case instr of -- LDCOND dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst) -- STCOND dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst) -- 7. Atomic Memory Access Instructions -------------------------------------- + AMSWAPDB _ dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst) -- 8. Barrier Instructions --------------------------------------------------- DBAR _hint -> usage ([], []) IBAR _hint -> usage ([], []) @@ -343,13 +344,13 @@ patchRegsOfInstr instr env = case instr of STX f o1 o2 -> STX f (patchOp o1) (patchOp o2) LDPTR f o1 o2 -> LDPTR f (patchOp o1) (patchOp o2) STPTR f o1 o2 -> STPTR f (patchOp o1) (patchOp o2) - PRELD o1 o2 -> PRELD (patchOp o1) (patchOp o2) + PRELD o1 o2 -> PRELD (patchOp o1) (patchOp o2) -- 6. Bound Check Memory Access Instructions --------------------------------- -- LDCOND o1 o2 o3 -> LDCOND (patchOp o1) (patchOp o2) (patchOp o3) -- STCOND o1 o2 o3 -> STCOND (patchOp o1) (patchOp o2) (patchOp o3) -- 7. Atomic Memory Access Instructions -------------------------------------- + AMSWAPDB f o1 o2 o3 -> AMSWAPDB f (patchOp o1) (patchOp o2) (patchOp o3) -- 8. Barrier Instructions --------------------------------------------------- - -- TODO: need fix DBAR o1 -> DBAR o1 IBAR o1 -> IBAR o1 -- 11. Floating Point Instructions ------------------------------------------- @@ -734,6 +735,7 @@ data Instr | PRELD Operand Operand -- 6. Bound Check Memory Access Instructions --------------------------------- -- 7. Atomic Memory Access Instructions -------------------------------------- + | AMSWAPDB Format Operand Operand Operand -- 8. Barrier Instructions --------------------------------------------------- | DBAR BarrierType | IBAR BarrierType @@ -755,8 +757,13 @@ data Instr -- fnmadd: d = - r1 * r2 - r3 | FMA FMASign Operand Operand Operand Operand --- TODO: Not complete. -data BarrierType = Hint0 +data BarrierType + = Hint0 + | Hint700 + | HintAcquire + | HintRelease + | HintSeqcst + deriving (Eq, Show) instrCon :: Instr -> String instrCon i = @@ -847,6 +854,7 @@ instrCon i = LDPTR{} -> "LDPTR" STPTR{} -> "STPTR" PRELD{} -> "PRELD" + AMSWAPDB{} -> "AMSWAPDB" DBAR{} -> "DBAR" IBAR{} -> "IBAR" FCVT{} -> "FCVT" ===================================== compiler/GHC/CmmToAsm/LA64/Ppr.hs ===================================== @@ -1015,6 +1015,10 @@ pprInstr platform instr = case instr of -- LD{GT/LE}.{B/H/W/D}, ST{GT/LE}.{B/H/W/D} -- 7. Atomic Memory Access Instructions -------------------------------------- -- AM{SWAP/ADD/AND/OR/XOR/MAX/MIN}[DB].{W/D}, AM{MAX/MIN}[_DB].{WU/DU} + AMSWAPDB II8 o1 o2 o3 -> op3 (text "\tamswap_db.b") o1 o2 o3 + AMSWAPDB II16 o1 o2 o3 -> op3 (text "\tamswap_db.h") o1 o2 o3 + AMSWAPDB II32 o1 o2 o3 -> op3 (text "\tamswap_db.w") o1 o2 o3 + AMSWAPDB II64 o1 o2 o3 -> op3 (text "\tamswap_db.d") o1 o2 o3 -- AM.{SWAP/ADD}[_DB].{B/H} -- AMCAS[_DB].{B/H/W/D} -- LL.{W/D}, SC.{W/D} @@ -1112,19 +1116,28 @@ pprInstr platform instr = case instr of op3 op o1 o2 o3 = line $ op <+> pprOp platform o1 <> comma <+> pprOp platform o2 <> comma <+> pprOp platform o3 op4 op o1 o2 o3 o4 = line $ op <+> pprOp platform o1 <> comma <+> pprOp platform o2 <> comma <+> pprOp platform o3 <> comma <+> pprOp platform o4 {- - -- TODO: Support dbar with different hints. + Support dbar with different hints. On LoongArch uses "dbar 0" (full completion barrier) for everything. But the full completion barrier has no performance to tell, so Loongson-3A6000 and newer processors have made finer granularity hints available: + Hint 0x700: barrier for "read after read" from the same address. Bit4: ordering or completion (0: completion, 1: ordering) Bit3: barrier for previous read (0: true, 1: false) Bit2: barrier for previous write (0: true, 1: false) Bit1: barrier for succeeding read (0: true, 1: false) Bit0: barrier for succeeding write (0: true, 1: false) + + DBAR 0b10100: acquire + DBAR 0b10010: release + DBAR 0b10000: seqcst -} pprBarrierType Hint0 = text "0x0" + pprBarrierType HintSeqcst = text "0x10" + pprBarrierType HintRelease = text "0x12" + pprBarrierType HintAcquire = text "0x14" + pprBarrierType Hint700 = text "0x700" floatPrecission o | isSingleOp o = text "s" | isDoubleOp o = text "d" | otherwise = pprPanic "Impossible floating point precission: " (pprOp platform o) ===================================== compiler/GHC/Core/LateCC/OverloadedCalls.hs ===================================== @@ -20,7 +20,6 @@ import GHC.Core.Make import GHC.Core.Predicate import GHC.Core.Type import GHC.Core.Utils -import GHC.Tc.Utils.TcType import GHC.Types.Id import GHC.Types.Name import GHC.Types.SrcLoc @@ -29,6 +28,41 @@ import GHC.Types.Var type OverloadedCallsCCState = Strict.Maybe SrcSpan +{- Note [Overloaded Calls and join points] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Currently GHC considers cost centres as destructive to +join contexts. Or in other words this is not considered valid: + + join f x = ... + in + ... -> scc<tick> jmp + +This makes the functionality of `-fprof-late-overloaded-calls` not feasible +for join points in general. We used to try to work around this by putting the +ticks on the rhs of the join point rather than around the jump. However beyond +the loss of accuracy this was broken for recursive join points as we ended up +with something like: + + rec-join f x = scc<tick> ... jmp f x + +Which similarly is not valid as the tick once again destroys the tail call. +One might think we could limit ourselves to non-recursive tail calls and do +something clever like: + + join f x = scc<tick> ... + in ... jmp f x + +And sometimes this works! But sometimes the full rhs would look something like: + + join g x = .... + join f x = scc<tick> ... -> jmp g x + +Which, would again no longer be valid. I believe in the long run we can make +cost centre ticks non-destructive to join points. Or we could keep track of +where we are/are not allowed to insert a cost centre. But in the short term I will +simply disable the annotation of join calls under this flag. +-} + -- | Insert cost centres on function applications with dictionary arguments. The -- source locations attached to the cost centres is approximated based on the -- "closest" source note encountered in the traversal. @@ -52,21 +86,10 @@ overloadedCallsCC = CoreBndr -> LateCCM OverloadedCallsCCState CoreExpr -> LateCCM OverloadedCallsCCState CoreExpr - wrap_if_join b pexpr = do + wrap_if_join _b pexpr = do + -- See Note [Overloaded Calls and join points] expr <- pexpr - if isJoinId b && isOverloadedTy (exprType expr) then do - let - cc_name :: FastString - cc_name = fsLit "join-rhs-" `appendFS` getOccFS b - - cc_srcspan <- - fmap (Strict.fromMaybe (UnhelpfulSpan UnhelpfulNoLocationInfo)) $ - lift $ gets lateCCState_extra - - insertCC cc_name cc_srcspan expr - else - return expr - + return expr processExpr :: CoreExpr -> LateCCM OverloadedCallsCCState CoreExpr processExpr expr = @@ -99,6 +122,7 @@ overloadedCallsCC = -- Avoid instrumenting join points. -- (See comment in processBind above) + -- Also see Note [Overloaded Calls and join points] && not (isJoinVarExpr f) then do -- Extract a name and source location from the function being ===================================== docs/users_guide/profiling.rst ===================================== @@ -571,9 +571,7 @@ of your profiled program will be different to that of the unprofiled one. Some overloaded calls may not be annotated, specifically in cases where the optimizer turns an overloaded function into a join point. Calls to such functions will not be wrapped in ``SCC`` annotations, since it would make - them non-tail calls, which is a requirement for join points. Instead, - ``SCC`` annotations are added around the body of overloaded join variables - and given distinct names (``join-rhs-<var>``) to avoid confusion. + them non-tail calls, which is a requirement for join points. .. ghc-flag:: -fprof-cafs :shortdesc: Auto-add ``SCC``\\ s to all CAFs ===================================== rts/ProfHeap.c ===================================== @@ -557,7 +557,7 @@ initHeapProfiling(void) restore_locale(); - traceHeapProfBegin(0); + traceInitEvent(traceHeapProfBegin); } void @@ -896,17 +896,17 @@ dumpCensus( Census *census ) // Eventlog - traceHeapProfSampleString(0, "VOID", + traceHeapProfSampleString("VOID", (census->void_total * sizeof(W_))); - traceHeapProfSampleString(0, "LAG", + traceHeapProfSampleString("LAG", ((census->not_used - census->void_total) * sizeof(W_))); - traceHeapProfSampleString(0, "USE", + traceHeapProfSampleString("USE", ((census->used - census->drag_total) * sizeof(W_))); - traceHeapProfSampleString(0, "INHERENT_USE", + traceHeapProfSampleString("INHERENT_USE", (census->prim * sizeof(W_))); - traceHeapProfSampleString(0, "DRAG", + traceHeapProfSampleString("DRAG", (census->drag_total * sizeof(W_))); traceHeapProfSampleEnd(era); @@ -941,33 +941,33 @@ dumpCensus( Census *census ) switch (RtsFlags.ProfFlags.doHeapProfile) { case HEAP_BY_CLOSURE_TYPE: fprintf(hp_file, "%s", (char *)ctr->identity); - traceHeapProfSampleString(0, (char *)ctr->identity, + traceHeapProfSampleString((char *)ctr->identity, count * sizeof(W_)); break; case HEAP_BY_INFO_TABLE: fprintf(hp_file, "%p", ctr->identity); char str[100]; sprintf(str, "%p", ctr->identity); - traceHeapProfSampleString(0, str, count * sizeof(W_)); + traceHeapProfSampleString(str, count * sizeof(W_)); break; #if defined(PROFILING) case HEAP_BY_CCS: fprint_ccs(hp_file, (CostCentreStack *)ctr->identity, RtsFlags.ProfFlags.ccsLength); - traceHeapProfSampleCostCentre(0, (CostCentreStack *)ctr->identity, + traceHeapProfSampleCostCentre((CostCentreStack *)ctr->identity, count * sizeof(W_)); break; case HEAP_BY_ERA: fprintf(hp_file, "%" FMT_Word, (StgWord)ctr->identity); char str_era[100]; sprintf(str_era, "%" FMT_Word, (StgWord)ctr->identity); - traceHeapProfSampleString(0, str_era, count * sizeof(W_)); + traceHeapProfSampleString(str_era, count * sizeof(W_)); break; case HEAP_BY_MOD: case HEAP_BY_DESCR: case HEAP_BY_TYPE: fprintf(hp_file, "%s", (char *)ctr->identity); - traceHeapProfSampleString(0, (char *)ctr->identity, + traceHeapProfSampleString((char *)ctr->identity, count * sizeof(W_)); break; case HEAP_BY_RETAINER: ===================================== rts/RetainerSet.c ===================================== @@ -238,7 +238,7 @@ printRetainerSetShort(FILE *f, RetainerSet *rs, W_ total_size, uint32_t max_leng } } fputs(tmp, f); - traceHeapProfSampleString(0, tmp, total_size); + traceHeapProfSampleString(tmp, total_size); } /* ----------------------------------------------------------------------------- ===================================== rts/Trace.c ===================================== @@ -647,10 +647,10 @@ void traceTaskDelete_ (Task *task) } } -void traceHeapProfBegin(StgWord8 profile_id) +void traceHeapProfBegin(void) { if (eventlog_enabled) { - postHeapProfBegin(profile_id); + postHeapProfBegin(); } } void traceHeapBioProfSampleBegin(StgInt era, StgWord64 time) @@ -674,11 +674,10 @@ void traceHeapProfSampleEnd(StgInt era) } } -void traceHeapProfSampleString(StgWord8 profile_id, - const char *label, StgWord residency) +void traceHeapProfSampleString(const char *label, StgWord residency) { if (eventlog_enabled) { - postHeapProfSampleString(profile_id, label, residency); + postHeapProfSampleString(label, residency); } } @@ -718,11 +717,10 @@ void traceHeapProfCostCentre(StgWord32 ccID, } // This one is for .hp samples -void traceHeapProfSampleCostCentre(StgWord8 profile_id, - CostCentreStack *stack, StgWord residency) +void traceHeapProfSampleCostCentre(CostCentreStack *stack, StgWord residency) { if (eventlog_enabled) { - postHeapProfSampleCostCentre(profile_id, stack, residency); + postHeapProfSampleCostCentre(stack, residency); } } ===================================== rts/Trace.h ===================================== @@ -303,20 +303,18 @@ void traceTaskMigrate_ (Task *task, void traceTaskDelete_ (Task *task); -void traceHeapProfBegin(StgWord8 profile_id); +void traceHeapProfBegin(void); void traceHeapProfSampleBegin(StgInt era); void traceHeapBioProfSampleBegin(StgInt era, StgWord64 time); void traceHeapProfSampleEnd(StgInt era); -void traceHeapProfSampleString(StgWord8 profile_id, - const char *label, StgWord residency); +void traceHeapProfSampleString(const char *label, StgWord residency); #if defined(PROFILING) void traceHeapProfCostCentre(StgWord32 ccID, const char *label, const char *module, const char *srcloc, StgBool is_caf); -void traceHeapProfSampleCostCentre(StgWord8 profile_id, - CostCentreStack *stack, StgWord residency); +void traceHeapProfSampleCostCentre(CostCentreStack *stack, StgWord residency); void traceProfSampleCostCentre(Capability *cap, CostCentreStack *stack, StgWord ticks); @@ -369,14 +367,14 @@ void flushTrace(void); #define traceTaskCreate_(taskID, cap) /* nothing */ #define traceTaskMigrate_(taskID, cap, new_cap) /* nothing */ #define traceTaskDelete_(taskID) /* nothing */ -#define traceHeapProfBegin(profile_id) /* nothing */ +#define traceHeapProfBegin() /* nothing */ #define traceHeapProfCostCentre(ccID, label, module, srcloc, is_caf) /* nothing */ #define traceIPE(ipe) /* nothing */ #define traceHeapProfSampleBegin(era) /* nothing */ #define traceHeapBioProfSampleBegin(era, time) /* nothing */ #define traceHeapProfSampleEnd(era) /* nothing */ -#define traceHeapProfSampleCostCentre(profile_id, stack, residency) /* nothing */ -#define traceHeapProfSampleString(profile_id, label, residency) /* nothing */ +#define traceHeapProfSampleCostCentre(stack, residency) /* nothing */ +#define traceHeapProfSampleString(label, residency) /* nothing */ #define traceConcMarkBegin() /* nothing */ #define traceConcMarkEnd(marked_obj_count) /* nothing */ ===================================== rts/eventlog/EventLog.c ===================================== @@ -95,6 +95,13 @@ bool eventlog_enabled; // protected by state_change_mutex to ensure * buffer size, EVENT_LOG_SIZE. We must ensure that no variable-length event * exceeds this limit. For this reason we impose maximum length limits on * fields which may have unbounded values. + * + * Note [Profile ID] + * ~~~~~~~~~~~~~~~~~ + * The profile ID field of eventlog entries is reserved for future use, + * with an eye towards supporting multiple parallel heap profiles. + * In the current RTS, the profile ID is hardcoded to 0. + * */ static const EventLogWriter *event_log_writer = NULL; @@ -1219,7 +1226,7 @@ static HeapProfBreakdown getHeapProfBreakdown(void) } } -void postHeapProfBegin(StgWord8 profile_id) +void postHeapProfBegin(void) { ACQUIRE_LOCK(&eventBufMutex); PROFILING_FLAGS *flags = &RtsFlags.ProfFlags; @@ -1244,7 +1251,8 @@ void postHeapProfBegin(StgWord8 profile_id) CHECK(!ensureRoomForVariableEvent(&eventBuf, len)); postEventHeader(&eventBuf, EVENT_HEAP_PROF_BEGIN); postPayloadSize(&eventBuf, len); - postWord8(&eventBuf, profile_id); + // See Note [Profile ID]. + postWord8(&eventBuf, 0); postWord64(&eventBuf, TimeToNS(flags->heapProfileInterval)); postWord32(&eventBuf, getHeapProfBreakdown()); postStringLen(&eventBuf, flags->modSelector, modSelector_len); @@ -1286,8 +1294,7 @@ void postHeapProfSampleEnd(StgInt era) RELEASE_LOCK(&eventBufMutex); } -void postHeapProfSampleString(StgWord8 profile_id, - const char *label, +void postHeapProfSampleString(const char *label, StgWord64 residency) { ACQUIRE_LOCK(&eventBufMutex); @@ -1296,7 +1303,8 @@ void postHeapProfSampleString(StgWord8 profile_id, CHECK(!ensureRoomForVariableEvent(&eventBuf, len)); postEventHeader(&eventBuf, EVENT_HEAP_PROF_SAMPLE_STRING); postPayloadSize(&eventBuf, len); - postWord8(&eventBuf, profile_id); + // See Note [Profile ID]. + postWord8(&eventBuf, 0); postWord64(&eventBuf, residency); postStringLen(&eventBuf, label, label_len); RELEASE_LOCK(&eventBufMutex); @@ -1325,8 +1333,7 @@ void postHeapProfCostCentre(StgWord32 ccID, RELEASE_LOCK(&eventBufMutex); } -void postHeapProfSampleCostCentre(StgWord8 profile_id, - CostCentreStack *stack, +void postHeapProfSampleCostCentre(CostCentreStack *stack, StgWord64 residency) { ACQUIRE_LOCK(&eventBufMutex); @@ -1340,7 +1347,8 @@ void postHeapProfSampleCostCentre(StgWord8 profile_id, CHECK(!ensureRoomForVariableEvent(&eventBuf, len)); postEventHeader(&eventBuf, EVENT_HEAP_PROF_SAMPLE_COST_CENTRE); postPayloadSize(&eventBuf, len); - postWord8(&eventBuf, profile_id); + // See Note [Profile ID]. + postWord8(&eventBuf, 0); postWord64(&eventBuf, residency); postWord8(&eventBuf, depth); for (ccs = stack; ===================================== rts/eventlog/EventLog.h ===================================== @@ -163,14 +163,13 @@ void postTaskMigrateEvent (EventTaskId taskId, void postTaskDeleteEvent (EventTaskId taskId); -void postHeapProfBegin(StgWord8 profile_id); +void postHeapProfBegin(void); void postHeapProfSampleBegin(StgInt era); void postHeapBioProfSampleBegin(StgInt era, StgWord64 time_ns); void postHeapProfSampleEnd(StgInt era); -void postHeapProfSampleString(StgWord8 profile_id, - const char *label, +void postHeapProfSampleString(const char *label, StgWord64 residency); #if defined(PROFILING) @@ -180,8 +179,7 @@ void postHeapProfCostCentre(StgWord32 ccID, const char *srcloc, StgBool is_caf); -void postHeapProfSampleCostCentre(StgWord8 profile_id, - CostCentreStack *stack, +void postHeapProfSampleCostCentre(CostCentreStack *stack, StgWord64 residency); void postProfSampleCostCentre(Capability *cap, View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/3049e37d2a10450da84aa596fb7392c... -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/3049e37d2a10450da84aa596fb7392c... You're receiving this email because of your account on gitlab.haskell.org.
participants (1)
-
Marge Bot (@marge-bot)