
This is interesting, thanks. I propose to add INLINE pragmas to withMVar and friends. Having an interface for simple locks sounds like a good idea to me. Would you like to send a patch?

This won't affect Handle I/O, unfortunately, because we need block to protect against asynchronous exceptions. I'm still not certain you won't need that in the stream library, too: check any stateful code (e.g. buffering) and imagine what happens if an exception is raised at an arbitrary point.

Cheers, Simon

Bulat Ziganshin wrote:
The main reason for the slowness of the existing Handle-based I/O in GHC is the locking around each operation. It is especially bad for simple char-at-a-time I/O, where 99% of the time is spent on locking and unlocking.
To be exact, on my CPU, hPutChar on a 100 MB file requires 150 seconds, while hGetChar on the same file takes "only" 100 seconds. It seems that the former uses 3 locking operations and the latter 2, because my own vGetChar/vPutChar implementations both require 52 seconds, of which only about one second is real work and the rest is just `withMVar` overhead.
Until now, I thought that these 0.5 µs per withMVar call (52 seconds over roughly 10^8 characters, i.e. about 1000 primitive CPU operations) were purely the time required to perform the takeMVar+putMVar operations. But yesterday I investigated this problem more deeply, and the results were amazing!
First, I just made a local copy of `withMVar` and added an INLINE pragma to it:
import Control.Concurrent.MVar
import Control.Exception as Exception

{-# INLINE inlinedWithMVar #-}
inlinedWithMVar :: MVar a -> (a -> IO b) -> IO b
inlinedWithMVar m io =
  block $ do
    a <- takeMVar m
    b <- Exception.catch (unblock (io a))
                         (\e -> do putMVar m a; throw e)
    putMVar m a
    return b
Second, I developed my own simplified version of this procedure. Here I should say that my library uses an "MVar ()" field to hold the lock and a separate immutable field that holds the data being protected:
data WithLocking h = WithLocking h !(MVar ())
This allowed me to omit the block/unblock operations and develop the following faster analog of withMVar:
lock (WithLocking h mvar) action =
  Exception.catch (do takeMVar mvar
                      result <- action h
                      putMVar mvar ()
                      return result)
                  (\e -> do tryPutMVar mvar (); throw e)
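For illustration, a value can be wrapped and used like this (a minimal sketch that reuses the WithLocking type and `lock` function above; the `withLocking` helper and the example are only illustrative names, not part of the library):

import Control.Concurrent.MVar

-- Illustrative helper (not from the library): pair a value with a fresh,
-- initially free lock.  A full "MVar ()" means "unlocked".
withLocking :: h -> IO (WithLocking h)
withLocking h = do
    mvar <- newMVar ()
    return (WithLocking h mvar)

example :: IO ()
example = do
    wl <- withLocking "shared resource"        -- h here is just a String
    lock wl (\h -> putStrLn ("using " ++ h))   -- lock is taken and released around the action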
And as a third variant, I tried an exception-unsafe version of `withMVar`:
unsafeWithMVar :: MVar a -> (a -> IO b) -> IO b
unsafeWithMVar m io = do
  a <- takeMVar m
  b <- io a
  putMVar m a
  return b
And now the results:
withMVar          52 seconds
inlinedWithMVar   38 seconds
lock              20 seconds
unsafeWithMVar    10 seconds
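(The numbers above come from the char-at-a-time I/O test on a 100 MB file. The per-call overhead of each variant can also be measured in isolation with a micro-benchmark along the following lines; this harness is only a sketch, and the names and the 10^8 iteration count are illustrative, not the original test:)

import Control.Concurrent.MVar
import Control.Monad (replicateM_)
import System.CPUTime (getCPUTime)
import Text.Printf (printf)

-- Time an IO action using CPU time (getCPUTime reports picoseconds).
timeIt :: String -> IO () -> IO ()
timeIt name act = do
    t0 <- getCPUTime
    act
    t1 <- getCPUTime
    printf "%-16s %6.1f seconds\n" name
           (fromIntegral (t1 - t0) / 1e12 :: Double)

main :: IO ()
main = do
    m <- newMVar ()
    let n = 10^8 :: Int   -- roughly the number of chars in a 100 MB file
    timeIt "withMVar"       (replicateM_ n (withMVar       m (\_ -> return ())))
    timeIt "unsafeWithMVar" (replicateM_ n (unsafeWithMVar m (\_ -> return ())))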
So,
1) `withMVar` can be made significantly faster just by attaching an INLINE pragma to it. Until GHC includes this patch, you can make a local copy of this procedure (its implementation is compiler-independent) and put an INLINE pragma on that local copy.
2) If an MVar is used only to protect some immutable data from simultaneous access, its use can be made significantly faster with the above-mentioned WithLocking type constructor together with the 'lock' function. I hope that this mechanism will go into future Haskell implementations; in particular, it will be used in my own Streams library and in the new DiffArray implementation (which is part of the ArrayRef library).
3) For simple programs that don't catch exceptions anyway, this extra protection is simply meaningless; they can use 'unsafeWithMVar' to run as fast as possible. I mean in particular shootout-like benchmarks. It is also possible to build fast and safe routines by using explicit unlocking (with 'tryPutMVar') in higher-level exception handlers, as sketched after this list.
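A rough sketch of that last idea, building on the WithLocking type above and using the same exception API as the rest of this post (the names 'unsafeLock' and 'processAll' are illustrative only): the hot path takes and releases the lock with no per-operation exception handling, and a single higher-level handler restores the lock with tryPutMVar if an exception escapes.

-- Sketch only: exception-unsafe locking for the hot path...
unsafeLock :: WithLocking h -> (h -> IO a) -> IO a
unsafeLock (WithLocking h mvar) action = do
    takeMVar mvar
    result <- action h
    putMVar mvar ()
    return result

-- ...and a single higher-level handler that releases the lock on failure,
-- so the cost of exception safety is paid once per batch, not per operation.
processAll :: WithLocking h -> [h -> IO ()] -> IO ()
processAll wl@(WithLocking _ mvar) actions =
    Exception.catch (mapM_ (unsafeLock wl) actions)
                    (\e -> do tryPutMVar mvar (); throw e)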
And a more general conclusion: this case is a good demonstration of the significant performance loss caused by the use of higher-order functions. I think that more aggressive inlining of higher-order and polymorphic functions would significantly speed up GHC-compiled programs.