
While alloca is not as cheap as, say, C's alloca, you should find that it is much quicker than C's malloc. I'm sure there's room for optimisation if it's critical for you. There may well be low-hanging fruit: take a look at the Core for alloca.
The problem with using the stack is that alloca needs to allocate non-movable memory, and in GHC thread stacks are movable.
Cheers, Simon
Thank you for reply.
I think I have had a few wrong assumptions. One of them is that stack is
non-movable. Of course, for this purpose I need a non-movable region and
a pinned array on a heap is probably the only choice.
Also, I was hoping it is possible to use the low-level stack (the one
which is being used when instructions such as "push" and "pop" are
executed), but I guess it is not possible in case of GHC-generated code.
As for the performance of "alloca", I though it would be faster than
"malloc". However, in a simple test I have just written it is actually
slower. The test allocates 16-bytes arrays and immediately de-allocates
them. This operation is repeated 1000000000 times. On my computer the C
program takes 27 seconds to complete while Haskell version takes about
41.
------------
#include