Thanks - I already did this for alloca/malloc, I'll add the others from your patch.
We go to quite a lot of trouble to avoid locking in the common cases and fast paths - most of our data structures are CPU-local. Where in particular have you encountered locking that could be reduced?
The pinned_object_block is CPU-local, so usually no locking is required. Only when the block is full do we have to get a new block from the block allocator; that requires a lock, but it is a rare case.
extern bdescr * pinned_object_block;
bdescr *pinned_object_block;
StgPtr
allocatePinned( lnat n )
{
    StgPtr p;
    bdescr *bd = pinned_object_block;

    // If the request is for a large object, then allocate()
    // will give us a pinned object anyway.
    if (n >= LARGE_OBJECT_THRESHOLD/sizeof(W_)) {
        p = allocate(n);
        Bdescr(p)->flags |= BF_PINNED;
        return p;
    }

    ACQUIRE_SM_LOCK; // [RTVD: here we acquire the lock]

    TICK_ALLOC_HEAP_NOCTR(n);
    CCS_ALLOC(CCCS,n);

    // If we don't have a block of pinned objects yet, or the current
    // one isn't large enough to hold the new object, allocate a new one.
    if (bd == NULL || (bd->free + n) > (bd->start + BLOCK_SIZE_W)) {
        pinned_object_block = bd = allocBlock();
        dbl_link_onto(bd, &g0s0->large_objects);
        g0s0->n_large_blocks++;
        bd->gen_no = 0;
        bd->step = g0s0;
        bd->flags = BF_PINNED | BF_LARGE;
        bd->free = bd->start;
        alloc_blocks++;
    }

    p = bd->free;
    bd->free += n;
    RELEASE_SM_LOCK; // [RTVD: here we release the lock]
    return p;
}
Of course, TICK_ALLOC_HEAP_NOCTR and CCS_ALLOC may require synchronization if they use shared state (which is, again, probably unnecessary). However, when no profiling is going on and "pinned_object_block" is TSO-local, isn't it possible to remove locking from this code completely? The only case in which locking is necessary is when a fresh block has to be allocated, and that can be handled within the "allocBlock" function (or, more precisely, by using "allocBlock_lock").
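To make the proposal concrete, here is a minimal standalone sketch of the idea, not the actual RTS code. It assumes the pinned block is owned by a single CPU, so the bump allocation needs no lock, and only the shared block allocator takes one. The names Cap, Block, BLOCK_WORDS, alloc_block_locked, allocate_pinned_local and lock_count are all hypothetical, and a pthread mutex stands in for the storage manager lock. The sketch also simply drops the full block, whereas the real code links each block onto g0s0->large_objects.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_WORDS 512                /* hypothetical block size, in words */

typedef struct {
    size_t *start;                     /* first word of the block */
    size_t *free;                      /* next unallocated word */
} Block;

typedef struct {
    Block *pinned_block;               /* owned by one CPU: read and bumped without a lock */
} Cap;

/* Stand-in for the storage manager lock (assumption, not the RTS macro). */
static pthread_mutex_t sm_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long lock_count = 0;   /* slow-path entries; guarded by sm_lock */

/* Slow path: the block allocator is shared, so it takes the lock itself. */
static Block *alloc_block_locked(void)
{
    pthread_mutex_lock(&sm_lock);
    lock_count++;
    Block *bd = malloc(sizeof *bd);
    size_t *mem = bd ? malloc(BLOCK_WORDS * sizeof *mem) : NULL;
    pthread_mutex_unlock(&sm_lock);
    if (mem == NULL) abort();          /* also covers a failed Block allocation */
    bd->start = bd->free = mem;
    return bd;
}

/* Fast path: bump the CPU-local free pointer without any locking. */
static size_t *allocate_pinned_local(Cap *cap, size_t n)
{
    assert(n <= BLOCK_WORDS);          /* large objects are out of scope here */
    Block *bd = cap->pinned_block;

    /* No block yet, or not enough words left in the current one. */
    if (bd == NULL || (size_t)(bd->start + BLOCK_WORDS - bd->free) < n) {
        cap->pinned_block = bd = alloc_block_locked();
    }

    size_t *p = bd->free;
    bd->free += n;
    return p;
}
```

In this sketch the lock is taken once per fresh block rather than once per allocation, so its cost is amortised over every object that fits in the block. Whether the same holds in the RTS depends on the profiling counters and the large_objects list not needing the lock on the fast path.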
The ACQUIRE_SM_LOCK/RELEASE_SM_LOCK pair is present in other places too, but I have not yet analysed whether it is really necessary there. For example, things like newCAF and newDynCAF are wrapped in it.
With kind regards,
Denys Rtveliashvili