Great, fast experimentation!

I will admit I’m pleased that my dated intuition is still correct, but more importantly we have more current data!

Thanks for the exploration and sharing what you found! 

On Fri, Apr 7, 2023 at 7:35 AM Harendra Kumar <harendra.kumar@gmail.com> wrote:


On Fri, 7 Apr 2023 at 02:18, Carter Schonwald <carter.schonwald@gmail.com> wrote:
That sounds like a worthy experiment! 

I guess that would look like having an inlined, macro’d-up fast path that checks whether it can get the job done and otherwise falls back to the general code?

Last I checked, the overhead for this sort of C call was on the order of 10 nanoseconds or less, which seems very unlikely to be a bottleneck, but do you have any natural or artificial benchmark programs that would showcase this?

I converted my example code into a loop and ran it a million times with a 1-byte array size (which would be 8 bytes after alignment). So roughly 3 words would be allocated per array, including the header and length. It took 5 ms using the statically known size optimization, which inlines the allocation completely, and 10 ms using an unknown size (from a program argument), which makes a call to newByteArray#. That works out to about 5 ns more per allocation (5 ms extra over a million allocations). It does not sound like a big deal.
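
For anyone who wants to try something similar, here is a minimal sketch of that kind of loop. It is not the exact code measured above: the helper names (allocBytes, timeMs), the use of getCPUTime for timing, and the default size are assumptions made for illustration. Whether the known-size case is actually inlined depends on the GHC version and on compiling with optimization (e.g. -O2), so the numbers will vary.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import Control.Monad (replicateM_)
import GHC.Exts (Int (I#), newByteArray#)
import GHC.IO (IO (IO))
import System.CPUTime (getCPUTime)
import System.Environment (getArgs)

-- Hypothetical helper: allocate a mutable byte array of n bytes and drop it.
allocBytes :: Int -> IO ()
allocBytes (I# n) = IO (\s ->
  case newByteArray# n s of
    (# s', _ #) -> (# s', () #))
{-# INLINE allocBytes #-}

-- Hypothetical helper: run an action and return elapsed CPU time in ms.
timeMs :: IO () -> IO Double
timeMs act = do
  t0 <- getCPUTime
  act
  t1 <- getCPUTime
  pure (fromIntegral (t1 - t0) / 1e9)  -- getCPUTime is in picoseconds

main :: IO ()
main = do
  args <- getArgs
  let dynSize = case args of           -- size only known at run time
        (a : _) -> read a
        []      -> 1
      iters = 1000000
  known   <- timeMs (replicateM_ iters (allocBytes 1))        -- literal size
  unknown <- timeMs (replicateM_ iters (allocBytes dynSize))  -- size from argv
  putStrLn ("known size:   " ++ show known ++ " ms")
  putStrLn ("unknown size: " ++ show unknown ++ " ms")
```

Dividing the difference between the two timings by the iteration count gives the per-allocation overhead; with the figures above, (10 ms - 5 ms) / 1,000,000 = 5 ns.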

-harendra