
At Wed, 11 May 2011 13:02:21 +0100, Simon Marlow wrote:
> > However, if there's some simpler way to guarantee that >>= is the point where exceptions are thrown (which might be the case for GHC in practice), then I basically only need to update the docs. If someone with more GHC understanding could explain how asynchronous exceptions work, I'd love to hear it...
> There's no guarantee of the form that you mention - asynchronous exceptions can occur anywhere. However, there might be a way to do what you want (disclaimer: I haven't looked at the implementation of iterIO).
> Control.Exception will have a new operation in 7.2.1:
> allowInterrupt :: IO ()
> allowInterrupt = unsafeUnmask $ return ()
> which allows an asynchronous exception to be thrown inside mask (until 7.2.1 is released you can define it yourself; unsafeUnmask comes from GHC.IO).
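
If I understand the suggestion, the pattern would be something like the sketch below: keep the work running under mask_, and let an explicit allowInterrupt be the only point where a pending asynchronous exception can be delivered. (The runSteps/step names are just illustrative, and the local allowInterrupt definition is only needed until 7.2.1 is out.)

import Control.Exception (mask_)
import GHC.IO (unsafeUnmask)

allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask (return ())

-- Illustrative only: each step runs with asynchronous exceptions masked,
-- so the allowInterrupt between steps is the sole delivery point.
runSteps :: IO Bool -> IO ()
runSteps step = mask_ go
  where
    go = do
      more <- step      -- masked: no asynchronous exception fires here
      allowInterrupt    -- a pending asynchronous exception may fire here
      if more then go else return ()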
So to answer my own question from earlier, I did a bit of benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon 3060, running Linux 2.6.38), I get the following costs:

    9 ns - return () :: IO ()        -- baseline (meaningless in itself)
   13 ns - unsafeUnmask $ return ()  -- with interrupts enabled
   18 ns - unsafeUnmask $ return ()  -- inside a mask_
   13 ns - ffi                       -- a null FFI call (getpid cached by libc)
   18 ns - unsafeUnmask ffi          -- with interrupts enabled
   22 ns - unsafeUnmask ffi          -- inside a mask_
  131 ns - syscall                   -- getppid through FFI
  135 ns - unsafeUnmask syscall      -- with interrupts enabled
  140 ns - unsafeUnmask syscall      -- inside a mask_

So it seems that the cost of calling unsafeUnmask inside every liftIO would be about 22 cycles per liftIO invocation (the 9 ns difference between the baseline and the masked unsafeUnmask case, at 2.4 GHz), which seems eminently reasonable. You could then safely run your whole program inside a big mask_ and not worry about exceptions happening between >>= invocations. Though truly compute-intensive workloads could have issues, the kind of applications targeted by iterIO will spend most of their time doing I/O, so this shouldn't be an issue.

Better yet, for programs that don't use asynchronous exceptions, if you don't put your whole program inside a mask_, the cost drops roughly in half. It's hard to imagine any real application whose performance would take a significant hit because of an extra 11 cycles per liftIO.

Is there anything I'm missing? For instance, my machine has only one CPU, and the tests all ran with one thread. Does unmaskAsyncExceptions# acquire a spinlock that could lock the memory bus? Or is there some other reason unsafeUnmask could become expensive on NUMA machines, or in the presence of concurrency?

Thanks,
David
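
P.S. For concreteness, here is a minimal sketch of one way to reproduce this kind of measurement. The helper names, iteration count, and use of getCurrentTime are arbitrary choices, so expect the absolute numbers to vary:

{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Exception (mask_)
import Control.Monad (replicateM_)
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import Foreign.C.Types (CInt)
import GHC.IO (unsafeUnmask)

-- getppid is a genuine system call, not cached by libc like getpid.
foreign import ccall unsafe "getppid" c_getppid :: IO CInt

-- Run an action n times and report the average cost in nanoseconds.
timeIt :: String -> Int -> IO () -> IO ()
timeIt label n act = do
  start <- getCurrentTime
  replicateM_ n act
  end <- getCurrentTime
  let ns = realToFrac (diffUTCTime end start) * 1e9 / fromIntegral n :: Double
  putStrLn $ label ++ ": " ++ show ns ++ " ns/call"

main :: IO ()
main = do
  let n = 1000000 :: Int
  timeIt "return ()               " n (return ())
  timeIt "unsafeUnmask, unmasked  " n (unsafeUnmask (return ()))
  mask_ $ timeIt "unsafeUnmask, in mask_  " n (unsafeUnmask (return ()))
  timeIt "getppid syscall         " n (c_getppid >> return ())
  mask_ $ timeIt "unsafeUnmask getppid    " n (unsafeUnmask (c_getppid >> return ()))

Compile with -O2; without optimization the per-iteration loop overhead can easily swamp these small per-call costs.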