
Evan Laforge wrote:
> On Mon, Dec 29, 2008 at 1:15 PM, Ryan Ingram wrote:
>> Both readTVar and writeTVar are worse than O(1); they have to look
>> up the TVar in the transaction log to see if you have made local
>> changes to it.
>>
>> Right now it looks like that operation is O(n), where n is the
>> number of TVars accessed by a transaction, so your big transaction
>> which is just accessing a ton of TVars is likely O(n^2).
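
For concreteness, the pathological shape is a single transaction that
walks a big list of TVars. A minimal sketch (illustrative only,
assuming the O(n) log lookup described above):

import Control.Concurrent.STM
import Control.Monad (forM, forM_)

-- One transaction that increments n TVars.  If every
-- readTVar/writeTVar scans the transaction log, the transaction
-- as a whole does O(n^2) work.
bumpAll :: [TVar Int] -> STM ()
bumpAll tvars = forM_ tvars $ \tv -> do
    x <- readTVar tv
    writeTVar tv (x + 1)

main :: IO ()
main = do
    tvars <- atomically (forM [1 .. 10000 :: Int] newTVar)
    atomically (bumpAll tvars)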
> So this actually brings up a tangential question I've wondered
> about for a while.
>
> The lock-free datastructure paper at
> http://research.microsoft.com/users/Cambridge/simonpj/papers/stm/lock-free-f...
> shows locks and STM with very similar performance on a single
> processor, with STM quickly winning on multiprocessors.
I have not verified this, but a possible cause is that
Control.Concurrent.QSem isn't efficient when there are many waiters:
it should use two lists for managing the waiters (a la Okasaki). But
why does it manage the waiters manually at all? MVars are fair, in
GHC at least, so this should work:

import Control.Concurrent.MVar
import Control.Monad (liftM2, when)

-- A semaphore: the available count, plus a wakeup slot used to hand
-- a unit directly to a blocked waiter.
data Sem = Sem (MVar Int) (MVar Int)

newSem :: Int -> IO Sem
newSem initial = liftM2 Sem (newMVar initial) newEmptyMVar

-- | Wait for a unit to become available.
waitSem :: Sem -> IO ()
waitSem (Sem sem wakeup) = do
    -- A negative count records the number of blocked waiters.
    avail' <- modifyMVar sem (\avail -> return (avail-1, avail-1))
    when (avail' < 0) $
        -- block until a signaller hands over the new count, then
        -- restore the count MVar on its behalf
        takeMVar wakeup >>= putMVar sem

-- | Signal that a unit of the 'Sem' is available.
signalSem :: Sem -> IO ()
signalSem (Sem sem wakeup) = do
    avail <- takeMVar sem
    if avail < 0
        then putMVar wakeup (avail+1)  -- wake the longest-waiting thread
        else putMVar sem (avail+1)

(I should turn this into a library proposal.)

Bertram
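
P.S. A quick way to exercise the interface (hypothetical usage, just
a smoke test of the intended semantics):

import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (replicateM_)

main :: IO ()
main = do
    sem <- newSem 2              -- two units available
    replicateM_ 5 $ forkIO $ do
        waitSem sem              -- at most two threads run the body at once
        threadDelay 100000       -- simulate some work
        signalSem sem
    threadDelay 1000000          -- crude: give the workers time to finish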