Performance of small allocations via prim ops

I was looking at the RTS code for allocating small objects via prim ops, e.g. newByteArray#. The code looks like:

stg_newByteArrayzh ( W_ n )
{
    MAYBE_GC_N(stg_newByteArrayzh, n);

    payload_words = ROUNDUP_BYTES_TO_WDS(n);
    words = BYTES_TO_WDS(SIZEOF_StgArrBytes) + payload_words;
    ("ptr" p) = ccall allocateMightFail(MyCapability() "ptr", words);
    ...

We are making a foreign call here (ccall). I am wondering how much overhead a ccall adds? I guess it may have to save and restore registers. Would it be better to do the fast path of allocating small objects from the nursery in Cmm code, as in stg_gc_noregs?

-harendra

That sounds like a worthy experiment!

I guess that would look like an inline, macro'd-up fast path that checks whether it can get the job done, falling back to the general code?

Last I checked, the overhead for this sort of C call was on the order of 10 nanoseconds or less, so it seems very unlikely to be a bottleneck. But do you have any natural or artificial benchmark programs that would showcase this?

For this sort of code, the extra branching for that optimization could easily have a larger performance impact than the known function call on modern hardware. (Though take my intuitions about these things with a grain of salt.)
On Tue, Apr 4, 2023 at 9:50 PM Harendra Kumar wrote:
> [...]

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On Fri, 7 Apr 2023 at 02:18, Carter Schonwald wrote:
> Last I checked, the overhead for this sort of C call was on the order of
> 10 nanoseconds or less [...] but do you have any natural or artificial
> benchmark programs that would showcase this?

I converted my example code into a loop and ran it a million times with a 1-byte array size (which becomes 8 bytes after alignment). So roughly 3 words are allocated per array, including the header and the length field. It took 5 ms with the statically-known-size optimization, which inlines the allocation completely, and 10 ms with an unknown size (from a program argument), which makes a call to newByteArray#. That works out to roughly 5 ns more per allocation. It does not sound like a big deal.

-harendra
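For reference, here is a minimal sketch of this kind of loop. This is my reconstruction, not the original benchmark; the iteration count and array size are illustrative, and the timing itself is meant to be done externally (e.g. with +RTS -s or criterion):

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Exts (Int (..), newByteArray#)
import GHC.IO (IO (..))

-- Allocate one MutableByteArray# of the given size and throw it away.
allocOnce :: Int -> IO ()
allocOnce (I# n) = IO $ \s ->
    case newByteArray# n s of
        (# s', _arr #) -> (# s', () #)

main :: IO ()
main = do
    -- One million 1-byte allocations; divide the measured run time
    -- by the loop count to get a per-allocation figure.
    mapM_ (\_ -> allocOnce 1) [1 .. 1000000 :: Int]
    putStrLn "done: 1000000 allocations"
```

Compiling this once with a literal size (as above) and once with the size read from a program argument is what distinguishes the inlined-allocation path from the out-of-line newByteArray# call.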

Great, fast experimentation!

I will admit I'm pleased that my dated intuition is still correct, but more importantly we have more current data!

Thanks for the exploration and for sharing what you found!
On Fri, Apr 7, 2023 at 7:35 AM Harendra Kumar wrote:
> [...]
GHC's operational model is designed in such a way that foreign calls are fairly cheap (e.g. we don't need to switch stacks, which can be quite costly). Judging by the assembler produced for newByteArray# in one random x86-64 tree that I have lying around, it's only a couple of data-movement instructions, an %eax clear, and a stack pop:

  36: 48 89 ce             mov    %rcx,%rsi
  39: 48 89 c7             mov    %rax,%rdi
  3c: 31 c0                xor    %eax,%eax
  3e: e8 00 00 00 00       call   43
  43: 48 83 c4 08          add    $0x8,%rsp

The data-movement operations in particular are quite cheap on most microarchitectures where GHC would run, due to register renaming. I doubt that this overhead would be noticeable in anything but a synthetic benchmark. However, it never hurts to measure.

Cheers,

- Ben
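As a side note (this example is mine, not from the thread): an `unsafe` foreign import from Haskell uses essentially the same cheap calling sequence, so a rough per-call ballpark can be had by timing a tight loop. The choice of labs and the loop count are arbitrary, and the checksum assumes a 64-bit Int:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CLong)
import System.CPUTime (getCPUTime)

-- 'unsafe' calls skip the safe-call bookkeeping, much like the RTS's
-- internal ccall to allocateMightFail.
foreign import ccall unsafe "stdlib.h labs"
    c_labs :: CLong -> CLong

-- Sum |(-1)| .. |(-n)| through the foreign function so the calls
-- cannot be discarded.
loop :: Int -> Int -> Int
loop 0 acc = acc
loop k acc = loop (k - 1) (acc + fromIntegral (c_labs (fromIntegral (negate k))))

main :: IO ()
main = do
    t0 <- getCPUTime                 -- CPU time in picoseconds
    let total = loop 1000000 0
    print total                      -- forces the loop; prints 500000500000
    t1 <- getCPUTime
    putStrLn ("roughly " ++ show ((t1 - t0) `div` (1000000 * 1000)) ++ " ns per call")
```

The reported figure includes the loop overhead itself, so treat it as an upper bound rather than a precise measurement of the call sequence alone.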

Thanks Ben and Carter.
I compiled the following to Cmm:
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.IO
import GHC.Exts

data M = M (MutableByteArray# RealWorld)

main :: IO ()
main = do
    _ <- IO (\s -> case newByteArray# 1# s of
                       (# s1, arr #) -> (# s1, M arr #))
    return ()
It produced the following Cmm:
{offset
  c1k3: // global
      Hp = Hp + 24;
      if (Hp > HpLim) (likely: False) goto c1k7; else goto c1k6;
  c1k7: // global
      HpAlloc = 24;
      R1 = Main.main1_closure;
      call (stg_gc_fun)(R1) args: 8, res: 0, upd: 8;
  c1k6: // global
      I64[Hp - 16] = stg_ARR_WORDS_info;
      I64[Hp - 8] = 1;
      R1 = GHC.Tuple.()_closure+1;
      call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
}
It seems to be as good as it gets. There is absolutely no scope for
improvement in this.
-harendra
On Fri, 7 Apr 2023 at 03:32, Ben Gamari wrote:
> [...]

Ah, some other optimization seems to be kicking in here. When I increase the size of the array to more than 128 bytes, I see a call to stg_newByteArray# being emitted:
{offset
  c1kb: // global
      if ((Sp + -8) < SpLim) (likely: False) goto c1kc; else goto c1kd;
  c1kc: // global
      R1 = Main.main1_closure;
      call (stg_gc_fun)(R1) args: 8, res: 0, upd: 8;
  c1kd: // global
      I64[Sp - 8] = c1k9;
      R1 = 129;
      Sp = Sp - 8;
      call stg_newByteArray#(R1) returns to c1k9, args: 8, res: 8, upd: 8;
}
-harendra
On Fri, 7 Apr 2023 at 10:49, Harendra Kumar wrote:
> [...]

A little bit of grepping in the code gave me this:

emitPrimOp cfg primop =
  let max_inl_alloc_size = fromIntegral (stgToCmmMaxInlAllocSize cfg)
  in case primop of
  NewByteArrayOp_Char -> \case
    [(CmmLit (CmmInt n w))]
      | asUnsigned w n <= max_inl_alloc_size   -- <-- see this line
      -> opIntoRegs  $ \ [res] -> doNewByteArrayOp res (fromInteger n)
    _ -> PrimopCmmEmit_External
We emit more efficient code when the size of the array is small, and the threshold is governed by a compiler flag:

  , make_ord_flag defGhcFlag "fmax-inline-alloc-size"
      (intSuffix (\n d -> d { maxInlineAllocSize = n }))

This means allocation of smaller arrays is extremely efficient, and we can control the threshold using `-fmax-inline-alloc-size`; the default is 128. That's a new thing I learnt today.
Given this new finding, my original question now applies only to the case
when the array size is bigger than this configurable threshold, which is a
little less motivating. And Ben says that the call is not expensive, so we
can leave it there.
-harendra
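To make the threshold concrete, here is a small sketch (mine, not from the thread) that can be compiled with -O and -ddump-cmm, with or without -fmax-inline-alloc-size=N, to see which allocation gets inlined. It assumes -O so that allocBytes inlines and the literal sizes reach the primop:

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Exts (Int (..), newByteArray#, sizeofMutableByteArray#)
import GHC.IO (IO (..))

-- Allocate n bytes and return the recorded size, just to keep the
-- array alive and observable.
allocBytes :: Int -> IO Int
allocBytes (I# n) = IO $ \s ->
    case newByteArray# n s of
        (# s', arr #) -> (# s', I# (sizeofMutableByteArray# arr) #)

main :: IO ()
main = do
    a <- allocBytes 16    -- literal <= 128: allocation inlined into the Cmm
    b <- allocBytes 1000  -- above the default threshold: out-of-line call
    print (a, b)
```

Comparing the -ddump-cmm output of the two call sites should show the Hp-bump sequence for the first and a call to stg_newByteArray# for the second, matching the Cmm dumps earlier in the thread.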
On Fri, 7 Apr 2023 at 11:08, Harendra Kumar wrote:
> [...]

> We are emitting more efficient code when the size of the array is
> smaller. And the threshold is governed by a compiler flag.

It would be good if this were documented. Perhaps in the Haddock for `newByteArray#`? Or where?

S
On Fri, 7 Apr 2023 at 07:07, Harendra Kumar wrote:
> [...]

On Fri, 7 Apr 2023 at 12:57, Simon Peyton Jones wrote:
> It would be good if this were documented. Perhaps in the Haddock for
> `newByteArray#`? Or where?

The flag is documented in the GHC user's guide, but the behaviour would be more discoverable if the Haddock for `newByteArray#` mentioned it.

-harendra

I am confused by this flag. It allows arrays of statically known size <= n to be allocated inline from the current nursery block. But looking at the code in allocateMightFail, as I read it, any array up to LARGE_OBJECT_THRESHOLD is allocated from the current nursery block anyway. So why have this option? Why not fix the threshold at LARGE_OBJECT_THRESHOLD? Maybe I am missing something.

-harendra
On Fri, 7 Apr 2023 at 15:45, Harendra Kumar wrote:
> [...]

Harendra Kumar writes:
> I am confused by this flag. [...] So why have this option? Why not fix
> this to LARGE_OBJECT_THRESHOLD?

In principle we could do so. The motivation for making this a flag isn't immediately clear from the commit implementing this optimisation (1eece45692fb5d1a5f4ec60c1537f8068237e9c1).

One complication is that currently GHC has no way to know the value of LARGE_OBJECT_THRESHOLD (which is a runtime-system macro). Typically, to handle this sort of thing, we use utils/deriveConstants to generate a Haskell binding mirroring the value of the C declaration. However, as GHC becomes runtime-retargetable we may need to revisit this design.

Cheers,

- Ben

> One complication is that currently GHC has no way to know the value of
> LARGE_OBJECT_THRESHOLD (which is a runtime system macro). Typically to
> handle this sort of thing we use utils/deriveConstants to generate a
> Haskell binding mirroring the value of the C declaration.

Since https://gitlab.haskell.org/ghc/ghc/-/commit/085983e63bfe6af23f8b85fbfcca8db4... (2021-03) we don't do this any more: we only read constants from the header file provided by the RTS unit. Adding one more constant for LARGE_OBJECT_THRESHOLD shouldn't be an issue.

Cheers,
Sylvain
participants (5)

- Ben Gamari
- Carter Schonwald
- Harendra Kumar
- Simon Peyton Jones
- Sylvain Henry