Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

12 May 2011

On 12/05/2011 16:04, David Mazieres expires 2011-08-10 PDT wrote:
...
At Thu, 12 May 2011 09:57:13 +0100,
Simon Marlow wrote:
...
...
So to answer my own question from earlier, I did a bit of
benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon
3060, running linux 2.6.38), I get the following costs:
       9 ns - return () :: IO ()        -- baseline (meaningless in itself)
      13 ns - unsafeUnmask $ return ()  -- with interrupts enabled
      18 ns - unsafeUnmask $ return ()  -- inside a mask_
      13 ns - ffi                       -- a null FFI call (getpid cached by libc)
      18 ns - unsafeUnmask ffi          -- with interrupts enabled
      22 ns - unsafeUnmask ffi          -- inside a mask_
Those are lower than I was expecting, but look plausible.  There's room
for improvement too (by inlining some or all of unsafeUnmask#).
Do you mean inline unsafeUnmask, or unmaskAsyncExceptions#?  I tried
inlining unsafeUnmask by writing my own version and giving it the
INLINE pragma, and it didn't affect performance at all.
Right, I meant inlining unmaskAsyncExceptions#, which would require 
compiler support.
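
For readers who want to reproduce figures of this kind, here is a minimal sketch of a timing loop. The harness actually used for the numbers quoted above is not shown in this thread, so the helper nsPerIter, the iteration count, and the use of getMonotonicTimeNSec are all assumptions made for illustration. With optimisation enabled GHC may simplify the "return ()" loops considerably, so treat the output as indicative only.

```haskell
import Control.Exception (MaskingState (..), getMaskingState, mask_)
import Control.Monad (replicateM_)
import GHC.Clock (getMonotonicTimeNSec)
import GHC.IO (unsafeUnmask)

-- Hypothetical helper: average wall-clock cost, in nanoseconds,
-- of running an action n times in a row.
nsPerIter :: Int -> IO a -> IO Double
nsPerIter n act = do
  t0 <- getMonotonicTimeNSec
  replicateM_ n act
  t1 <- getMonotonicTimeNSec
  return (fromIntegral (t1 - t0) / fromIntegral n)

main :: IO ()
main = do
  -- Sanity check: unsafeUnmask re-enables async exceptions inside mask_.
  inner <- mask_ (unsafeUnmask getMaskingState)
  putStrLn ("unsafeUnmask unmasks inside mask_: " ++ show (inner == Unmasked))

  let n = 5000000 -- arbitrary iteration count
  base <- nsPerIter n (return ())
  open <- nsPerIter n (unsafeUnmask (return ()))
  -- Here the whole benchmark loop runs inside one big mask_.
  shut <- mask_ (nsPerIter n (unsafeUnmask (return ())))
  putStrLn ("return ()                    : " ++ show base ++ " ns")
  putStrLn ("unsafeUnmask, not in a mask_ : " ++ show open ++ " ns")
  putStrLn ("unsafeUnmask, inside a mask_ : " ++ show shut ++ " ns")
```

The three timed cases correspond to the first three rows of the table; absolute numbers will differ by machine, GHC version, and flags.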
...
...
However, the general case of unsafeUnmask E, where E is something more
complex than return (), will be more expensive because a new closure for
E has to be created.  e.g. try "return x" instead of "return ()", and
try to make sure that the closure has to be created once per
unsafeUnmask, not lifted out and shared.
Okay.  I'm surprised my getpid example wasn't already stressing this,
but, indeed, I see a tiny difference with the following code:

        ffi >>= return . (1 +) -- where ffi calls getpid

        13 ns - no unmasking
        20 ns - unsafeUnmask when not inside mask_
        25 ns - unsafeUnmask when benchmark loop is inside one big mask_

So now we're talking about 28 cycles or something instead of 22.
Still not a huge deal.
Ok, sounds reasonable.
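
As an illustration of the "closure created once per unsafeUnmask" case discussed above, the following sketch makes the action passed to unsafeUnmask depend on the loop counter, so it cannot simply be lifted out and shared. It is not the code that produced the quoted timings: the names c_getpid, sumUnmasked, and timed are hypothetical, the "unsafe" ccall import is an assumption, and the example assumes a POSIX system where getpid is available. Whether a closure is really allocated on each iteration still depends on what the optimiser does.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Exception (mask_)
import GHC.Clock (getMonotonicTimeNSec)
import GHC.IO (unsafeUnmask)
import System.Posix.Types (CPid (..))

-- A cheap FFI call (assumed import style; POSIX only).
foreign import ccall unsafe "getpid" c_getpid :: IO CPid

-- Sum of (pid + i) for i in [1..n]. Each step runs under unsafeUnmask,
-- and the action mentions i, so it differs from iteration to iteration.
sumUnmasked :: Int -> IO Int
sumUnmasked n = go 1 0
  where
    go i acc
      | i > n     = return acc
      | otherwise = do
          r <- unsafeUnmask (c_getpid >>= return . (i +) . fromIntegral)
          let acc' = acc + r
          acc' `seq` go (i + 1) acc'

-- Hypothetical helper: print the total elapsed time of one run.
timed :: String -> IO Int -> IO ()
timed name act = do
  t0 <- getMonotonicTimeNSec
  _  <- act
  t1 <- getMonotonicTimeNSec
  putStrLn (name ++ ": " ++ show (t1 - t0) ++ " ns total")

main :: IO ()
main = do
  let n = 1000000 -- arbitrary iteration count
  timed "outside mask_" (sumUnmasked n)
  timed "inside mask_ " (mask_ (sumUnmasked n))
```

Summing the results keeps every call's value live, which is one simple way to stop the loop body from being discarded.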
...
...
There are no locks here, thanks to the message-passing implementation we
use for throwTo between processors.
Okay, that sounds good.  So then there is no guarantee about ordering
of throwTo exceptions?  That seems like a good thing since there are
other mechanisms for synchronization.
What kind of ordering guarantee did you have in mind?  We do guarantee 
that in

    throwTo t e1
    throwTo t e2

Thread t will receive e1 before e2 (obviously, because throwTo is 
synchronous and only returns when the exception has been raised).
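
The ordering guarantee just described can be observed with a small sketch like the one below. The exception types E1 and E2 and the helpers label and receiveOrder are made up for this illustration. The target thread is forked in a masked state and only unmasks inside a catch, so neither exception can arrive before a handler is installed.

```haskell
import Control.Concurrent
import Control.Exception
import Control.Monad (replicateM_)

-- Two hypothetical exception types, one per throwTo call.
data E1 = E1 deriving Show
data E2 = E2 deriving Show
instance Exception E1
instance Exception E2

-- Name the exception that was received.
label :: SomeException -> String
label e
  | Just E1 <- fromException e = "E1"
  | Just E2 <- fromException e = "E2"
  | otherwise                  = "other"

-- Throw E1 then E2 at a thread and report the order it saw them in.
receiveOrder :: IO [String]
receiveOrder = do
  logVar <- newMVar []
  done   <- newEmptyMVar
  t <- mask_ $ forkIOWithUnmask $ \unmask -> do
    -- Wait (unmasked) for an exception, record it, and do so twice.
    replicateM_ 2 $
      unmask (threadDelay 5000000) `catch` \e ->
        modifyMVar_ logVar (return . (label e :))
    putMVar done ()
  throwTo t E1 -- returns only once E1 has been raised in t
  throwTo t E2
  takeMVar done
  reverse <$> readMVar logVar

main :: IO ()
main = receiveOrder >>= print
```

Because the second throwTo is not even issued until the first has returned, the target should record E1 before E2. This says nothing about ordering between different throwing threads.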

Pending exceptions are processed in LIFO order (for no good reason other 
than performance), so there's no kind of fairness guarantee of the kind 
you get with MVars.  One thread doing throwTo can be starved by others.
I don't think that's a serious problem.

Cheers,
	Simon