
On Wed, 26 Feb 2020 at 18:48, Ömer Sinan Ağacan
So the key points from this thread are:
- PAP payloads are scavenged using the function's bitmap. Because a PAPs payload will have less number of closures than the function's arity the bitmap will always have enough bits.
- A bit in a function bitmap is NOT for liveness (e.g. does not indicate whether an argument used or not), but for pointers vs. non-pointers. Function bitmaps are called "liveness bits" in the code generator which is misleading.
I think of all bitmaps as representing "liveness" (or equivalently "pointerhood") for the purposes of GC. There's no difference from the GC's perspective between a non-pointer and a pointer that it doesn't need to follow. In fact there's nothing to prevent us using the function bitmap to indicate dead arguments too - it would require zero changes in the RTS, the compiler would only need to mark unused pointer arguments as non-pointers in the bitmap. Probably wouldn't be worth very much overall, but I do recall one space leak that would have been cured by this. - In a function bitmap (small or large), 0 means pointer, 1 means
non-pointer.
This is true of bitmaps generally I think, not just function bitmaps. This is really what confused me in my last email above. For some reason I
intuitively expected 1 to mean pointer, not 0. Simon M also got this wrong
Oops :) I think there may originally have been a good reason to have it this way around: before eval/apply, we used bitmaps to describe stack frames, but we didn't need to encode a size in the bitmap because the default was for the stack contents to be pointers unless there was something to tell us otherwise. So a zero suffix of a bitmap just meant "the rest is just normal stack". This changed with eval/apply, but we kept the convention that zero meant pointer in a bitmap.
("So a 0 in the bitmap always means non-pointer.") so maybe this is confusing to others too.
- For functions with known argument patterns we don't use the function's bitmap. These function's type are greater than ARG_BCO (2), and for those we use the stg_arg_bitmaps array to get the bitmap.
For example, the bitmap for ARG_PPP (function with 3 pointer arguments) is at index 23 in this array, which is 0b11. For ARG_PNN it's 0b110000011. The least significant 6 bits are for the size (3), the remaining 0b110 means the first argument is a pointer, rest of the two are non-pointers.
Actually I think documentation on this is missing in the wiki, I guess I never got around to updating it when we implemented eval/apply. This page should really describe function info tables: https://gitlab.haskell.org/ghc/ghc/wikis/commentary/rts/storage/heap-objects... If you want to add documentation that would be a good place. Cheers Simon I still don't understand why this assertion
ASSERT(BITMAP_SIZE(bitmap) >= size);
I added to scavenge_small_bitmap in !2727 is failing though.
Ömer
Simon Peyton Jones
, 24 Şub 2020 Pzt, 13:45 tarihinde şunu yazdı: I’m not following this in detail, but do please make sure that the
results of this discussion end up in a suitable Note. Obviously it’s not transparently clear as-is, and I can see clarity emerging
Thanks!
Simon
From: ghc-devs
On Behalf Of Simon Marlow Sent: 24 February 2020 08:22 To: Ömer Sinan Ağacan Cc: ghc-devs Subject: Re: Confused about PAP object layout On Thu, 20 Feb 2020 at 09:21, Ömer Sinan Ağacan
wrote:
I'm not sure what you mean by "garbage". The bitmap merely determines
a field is a pointer,
I think the bitmap is for liveness, not for whether a field is pointer or not. Relevant code for building an info table for a function:
mk_pieces (Fun arity (ArgGen arg_bits)) srt_label = do { (liveness_lit, liveness_data) <- mkLivenessBits dflags arg_bits ; let fun_type | null liveness_data = aRG_GEN | otherwise = aRG_GEN_BIG extra_bits = [ packIntsCLit dflags fun_type arity ] ++ (if inlineSRT dflags then [] else [ srt_lit ]) ++ [ liveness_lit, slow_entry ] ; return (Nothing, Nothing, extra_bits, liveness_data) }
This uses the word "liveness" rather than "pointers".
However I just realized that the word "garbage" is still not the best way to describe what I'm trying to say. In the example
[pap_info, x, y, z]
If the function's bitmap is [1, 0, 1], then `y` may be a dead (an unused argument, or "garbage" as I describe in my previous email) OR it may be a non-pointer, but used (i.e. not a garbage).
I don't think we ever put a zero in the bitmap for a
whether pointer-but-not-used argument. We don't do liveness analysis for function arguments, as far as I'm aware. So a 0 in the bitmap always means "non-pointer".
The only reaosn the code uses the terminology "liveness" here is that
it's sharing code with the code that handles bitmaps for stack frames, which do deal with liveness.
So maybe "liveness" is also not the best way to describe this bitmap, as
not mean dead but rather "don't follow in GC".
On my quest to understand and document this code better I have one more question. When generating info tables for functions with know argument
(ArgSpec) we initialize the bitmap as 0. Relevant code:
mk_pieces (Fun arity (ArgSpec fun_type)) srt_label = do { let extra_bits = packIntsCLit dflags fun_type arity : srt_label ; return (Nothing, Nothing, extra_bits, []) }
Here the last return value is for the liveness data. I don't understand how can this be correct, because when we use this function in a PAP this will cause NOT scavenging the PAP payload. Relevant code (simplified):
STATIC_INLINE GNUC_ATTR_HOT StgPtr scavenge_PAP_payload (StgClosure *fun, StgClosure **payload, StgWord size) { const StgFunInfoTable *fun_info = get_fun_itbl(UNTAG_CONST_CLOSURE(fun));
StgPtr p = (StgPtr)payload;
StgWord bitmap; switch (fun_info->f.fun_type) { ...
default: bitmap = BITMAP_BITS(stg_arg_bitmaps[fun_info->f.fun_type]); small_bitmap: p = scavenge_small_bitmap(p, size, bitmap); break; } return p; }
Here if I have a function with three pointer args (ARG_PPP) the shown branch that will be taken, but because the bitmap is 0 (as shown in the mk_pieces code above) nothing in the PAPs payload will be scavenged.
It gets the bitmap from stg_arg_bitmaps[fun_info->f.fun_type], not from
0 does patterns the info table. Hope this helps.
Cheers
Simon
Here's an example from a debugging session:
>>> print pap $10 = (StgPAP *) 0x42001fe030
>>> print *pap $11 = { header = { info = 0x7fbdd1f06640
}, arity = 2, n_args = 1, fun = 0x7fbdd2d23ffb, payload = 0x42001fe048 } So this PAP is applied one argument, which is a boxed object (a FUN_2_0):
>>> print *get_itbl(UNTAG_CLOSURE(pap->payload[0])) $20 = { layout = { payload = { ptrs = 2, nptrs = 0 }, bitmap = 2, large_bitmap_offset = 2, __pad_large_bitmap_offset = 2, selector_offset = 2 }, type = 11, srt = 1914488, code = 0x7fbdd2b509c0 "H\215E\370L9\370r[I\203\304 M;\245X\003" }
However if I look at the function of this PAP:
>>> print *get_fun_itbl(UNTAG_CLOSURE(pap->fun)) $21 = { f = { slow_apply_offset = 16, __pad_slow_apply_offset = 3135120895, b = { bitmap = 74900193017889, bitmap_offset = 258342945, __pad_bitmap_offset = 258342945 }, fun_type = 23, arity = 3 }, i = { layout = { payload = { ptrs = 0, nptrs = 0 }, bitmap = 0, large_bitmap_offset = 0, __pad_large_bitmap_offset = 0, selector_offset = 0 }, type = 14, srt = 1916288, code = 0x7fbdd2b50260
"I\203\304(M;\245X\003" } } It has arity 3. Since the first argument is a boxed object and this
arity 3, if the argument is actually live in the function (i.e. not an unused argument), then the bitmap should have a 1 for this. But because the argument pattern is known (ARG_PPP) we initialized the bitmap as 0! Not sure how
can work.
What am I missing?
Thanks,
Ömer
Ben Gamari
, 14 Şub 2020 Cum, 20:25 tarihinde şunu yazdı: Ömer Sinan Ağacan
writes: I think that makes sense, with the invariant that n_args <=
bitmap_size. We
evacuate the arguments used by the function but not others. Thanks.
It's somewhat weird to see an object with useful stuff, then garbage, then useful stuff again in the heap, but that's not an issue by itself. For example if I have something like
[pap_info, x, y, z]
and according to the function `y` is dead, then after evacuating I get
[pap_info, x, <garbage>, z]
This "garbage" is evacuated again and again every time we evacuate
function has this this PAP.
I'm not sure what you mean by "garbage". The bitmap merely determines whether a field is a pointer, not whether it is copied during evacuation. A field's bitmap bit not being set merely means that we won't evacuate the value of that field during scavenging.
Nevertheless, this all deserves a comment in scavenge_PAP.
Cheers,
- Ben