
On 12/05/2011 16:04, David Mazieres expires 2011-08-10 PDT wrote:
At Thu, 12 May 2011 09:57:13 +0100, Simon Marlow wrote:
So to answer my own question from earlier, I did a bit of benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon 3060, running linux 2.6.38), I get the following costs:
   9 ns - return () :: IO ()          -- baseline (meaningless in itself)
  13 ns - unsafeUnmask $ return ()    -- with interrupts enabled
  18 ns - unsafeUnmask $ return ()    -- inside a mask_

  13 ns - ffi                         -- a null FFI call (getpid cached by libc)
  18 ns - unsafeUnmask ffi            -- with interrupts enabled
  22 ns - unsafeUnmask ffi            -- inside a mask_
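For anyone wanting to reproduce numbers of this kind, a timing loop along the following lines might do. This is only a minimal sketch, not the benchmark actually used above: the helper name timePerIter and the iteration count are made up for illustration, and with optimisation turned on GHC may simplify the baseline loop enough to make it meaningless.

```haskell
import Control.Exception (mask_)
import Control.Monad (replicateM_)
import GHC.Clock (getMonotonicTimeNSec)
import GHC.IO (unsafeUnmask)

-- | Hypothetical helper: run an action n times and return the mean
-- wall-clock cost per iteration in nanoseconds.
timePerIter :: Int -> IO () -> IO Double
timePerIter n act = do
  start <- getMonotonicTimeNSec
  replicateM_ n act
  end <- getMonotonicTimeNSec
  return (fromIntegral (end - start) / fromIntegral n)

main :: IO ()
main = do
  let n = 10000000  -- arbitrary iteration count, an assumption
  -- Baseline: the loop overhead itself.
  base <- timePerIter n (return ())
  -- unsafeUnmask while asynchronous exceptions are already unmasked.
  unmasked <- timePerIter n (unsafeUnmask (return ()))
  -- The same loop, but with the whole benchmark inside one mask_.
  inMask <- mask_ (timePerIter n (unsafeUnmask (return ())))
  putStrLn ("return ()                     : " ++ show base ++ " ns")
  putStrLn ("unsafeUnmask, unmasked context: " ++ show unmasked ++ " ns")
  putStrLn ("unsafeUnmask, inside mask_    : " ++ show inMask ++ " ns")
```

The absolute figures will depend heavily on the machine, the GHC version and the optimisation flags, so only the differences between the three lines are worth comparing.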
Those are lower than I was expecting, but look plausible. There's room for improvement too (by inlining some or all of unsafeUnmask#).
Do you mean inline unsafeUnmask, or unmaskAsyncExceptions#? I tried inlining unsafeUnmask by writing my own version and giving it the INLINE pragma, and it didn't affect performance at all.
Right, I meant inlining unmaskAsyncExceptions#, which would require compiler support.
However, the general case of unsafeUnmask E, where E is something more complex than return (), will be more expensive because a new closure for E has to be created. e.g. try "return x" instead of "return ()", and try to make sure that the closure has to be created once per unsafeUnmask, not lifted out and shared.
Okay. I'm surprised my getpid example wasn't already stressing this, but, indeed, I see a tiny difference with the following code:

  ffi >>= return . (1 +)    -- where ffi calls getpid

  13 ns - no unmasking
  20 ns - unsafeUnmask when not inside mask_
  25 ns - unsafeUnmask when the benchmark loop is inside one big mask_
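To make the "closure created once per unsafeUnmask" point concrete, here is a minimal sketch of a loop whose unmasked action captures the loop counter, so it should not be possible to lift it out and share it. The names c_getpid and sumLoop are hypothetical, and importing getpid as an unsafe ccall returning CInt is an assumption, not necessarily how the measurements above were set up.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Exception (mask_)
import Foreign.C.Types (CInt (..))
import GHC.IO (unsafeUnmask)

-- A cheap FFI call. Assumption: imported "unsafe", with pid_t treated as int.
foreign import ccall unsafe "getpid" c_getpid :: IO CInt

-- | Hypothetical loop: the function given to (>>=) mentions the counter k,
-- so the action passed to unsafeUnmask differs on every iteration and a
-- fresh closure has to be built each time round.
sumLoop :: Int -> Int -> IO Int
sumLoop 0 acc = return acc
sumLoop k acc = do
  r <- unsafeUnmask (c_getpid >>= return . (fromIntegral k +))
  sumLoop (k - 1) $! acc + fromIntegral r

main :: IO ()
main = do
  a <- sumLoop 1000 0          -- unsafeUnmask with exceptions already unmasked
  b <- mask_ (sumLoop 1000 0)  -- the whole loop inside one big mask_
  print (a == b)               -- same work either way; only the cost differs
```

Wrapping either call to sumLoop in a timer gives the two unsafeUnmask cases from the table; the version without unsafeUnmask is the same loop with that one call removed.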
So now we're talking about 28 cycles or something instead of 22. Still not a huge deal.
Ok, sounds reasonable.
There are no locks here, thanks to the message-passing implementation we use for throwTo between processors.
Okay, that sounds good. So then there is no guarantee about ordering of throwTo exceptions? That seems like a good thing since there are other mechanisms for synchronization.
What kind of ordering guarantee did you have in mind? We do guarantee that in

   throwTo t e1
   throwTo t e2

Thread t will receive e1 before e2 (obviously, because throwTo is synchronous and only returns when the exception has been raised).

Pending exceptions are processed in LIFO order (for no good reason other than performance), so there's no kind of fairness guarantee of the kind you get with MVars. One thread doing throwTo can be starved by others. I don't think that's a serious problem.

Cheers,
Simon
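The ordering guarantee for two throwTo calls made from the same thread can be sketched as below. This is only an illustration under stated assumptions: worker and waitForTwo are made-up names, ErrorCall is used purely as a convenient exception type, and the target thread is forked in the masked state so that exceptions can only arrive while it is blocked in threadDelay inside the handler's scope.

```haskell
import Control.Concurrent (forkIO, threadDelay, throwTo)
import Control.Exception (ErrorCall (..), catch, mask_)
import Control.Monad (forever)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- | Hypothetical target thread. It runs masked (inherited from the fork),
-- so asynchronous exceptions are only delivered at the interruptible
-- threadDelay, which is always inside the catch.
worker :: IORef [String] -> IO ()
worker logRef = forever $
  threadDelay 1000000 `catch` \(ErrorCall msg) ->
    atomicModifyIORef' logRef (\xs -> (msg : xs, ()))

-- | Poll until two exceptions have been logged; return them oldest first.
waitForTwo :: IORef [String] -> IO [String]
waitForTwo logRef = do
  xs <- readIORef logRef
  if length xs >= 2
    then return (reverse xs)
    else threadDelay 1000 >> waitForTwo logRef

main :: IO ()
main = do
  logRef <- newIORef []
  -- Fork while masked so the child starts with exceptions masked.
  t <- mask_ (forkIO (worker logRef))
  throwTo t (ErrorCall "e1")  -- returns only once e1 has been raised in t
  throwTo t (ErrorCall "e2")  -- so e2 cannot overtake e1
  waitForTwo logRef >>= print -- expected: ["e1","e2"]
```

Note that this says nothing about two different threads throwing to t at the same time; as described above, those pending exceptions carry no fairness guarantee.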