
On 14/04/10 06:02, Denys Rtveliashvili wrote:
Good morning,
Yesterday I ran a few tests to measure the performance of FFI calls and found that the calls themselves are very quick (1-2 nanoseconds). However, some FFI calls require allocating a temporary memory block (for a struct or a temporary buffer). Examples are calls to "gettimeofday" or "clock_gettime". Unfortunately, the usual way of doing this (using the "alloca" function) is quite slow (~40 nanoseconds).
I was wondering: is there any way to allocate a small chunk of data on a thread's stack? That should be cheap, as by and large it is just a shift of a pointer and, perhaps, a few other trivial operations. I understand that it is not safe to allocate large blocks on the stack. But we could possibly have a function similar to "alloca" that allocates small blocks on the stack and falls back to the regular "alloca" for big ones.
While Haskell's alloca is not as cheap as, say, C's alloca, you should find that it is much quicker than C's malloc. I'm sure there's room for optimisation if it's critical for you. There may well be low-hanging fruit: take a look at the Core for alloca.
The problem with using the stack is that alloca needs to allocate non-movable memory, and in GHC thread stacks are movable.
Cheers,
Simon
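For readers following the thread, here is a minimal sketch of the kind of call being discussed: an FFI call to "gettimeofday" that needs a temporary buffer for the result struct. It is not taken from either message. The names Timeval, c_gettimeofday and getTimeOfDay are made up for illustration, and the hard-coded layout of struct timeval (16 bytes, tv_sec at offset 0, tv_usec at offset 8) is an assumption that holds on typical 64-bit Linux and macOS systems; portable code would derive the size and offsets with a tool such as hsc2hs.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CInt (..), CSUSeconds, CTime)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (peekByteOff)

-- Opaque tag for C's struct timeval (illustrative name).
data Timeval

-- int gettimeofday(struct timeval *tv, void *tz);
foreign import ccall unsafe "gettimeofday"
  c_gettimeofday :: Ptr Timeval -> Ptr () -> IO CInt

-- Allocate a temporary buffer, let C fill it in, and read the fields back.
-- ASSUMPTION: struct timeval occupies 16 bytes, with tv_sec at offset 0
-- and tv_usec at offset 8 (typical 64-bit Linux/macOS layout).
getTimeOfDay :: IO (CTime, CSUSeconds)
getTimeOfDay =
  allocaBytes 16 $ \tv -> do
    rc <- c_gettimeofday tv nullPtr
    if rc /= 0
      then ioError (userError "gettimeofday failed")
      else do
        sec  <- peekByteOff tv 0   -- tv_sec
        usec <- peekByteOff tv 8   -- tv_usec
        return (sec, usec)

main :: IO ()
main = do
  (sec, usec) <- getTimeOfDay
  print (sec, usec)
```

The buffer passed to the callback is only valid inside the allocaBytes scope, which is why the fields are read out before returning. It is this temporary allocation, rather than the foreign call itself, whose cost the thread is about.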