RE: [Haskell-cafe] Help with "shootout"

4 Jan 2006

      On 03 January 2006 12:03, Chris Kuklewicz wrote:
...
STM* is usually slower than IO/MVar.  STM has to do the transactional
record keeping and throws away work (i.e. CPU cycles and speed) when
it aborts.  The Chameneos benchmark has 4 writers working *very*
quickly, so the contention is high.  Taking the MVar acts like a
mutex to serialize access without throwing away work.
We found that once you add exception safety to your MVar code, i.e.
using withMVar and modifyMVar rather than raw takeMVar/putMVar, then
MVar code is comparable in speed to STM.  For example, Chan and TChan
perform about the same.

Of course, for a benchmark, you can forget about the exception safety,
as the Ch implementation does (plus it is tweaked in various other
ways).
...
Also, I suspect the following may be true:
If 4 threads block trying to take an MVar, and a 5th thread puts a
value in the MVar, then *exactly one* of the 4 blocked threads is
woken up and scheduled to run.
Yes, that's true.  We haven't documented this as a property of MVars,
but we don't intend to change this behaviour, so I think we probably
should document it.

Furthermore, the thread to be woken up is always the first thread to
block on the MVar.  That is, there's a kind of fairness built into MVars
which isn't present in STM and TMVars.
...
If 4 threads retry while in STM (e.g. takeTMVar), and a 5th thread
commits a change to that TMVar, then *all 4 threads* are woken up and
rescheduled.  This is what the Apache httpd server folks called the
"thundering herd" problem when many processes are blocked waiting on
socket 80.
If 4000 threads were getting woken up when only 1 was needed, then
performance would be poor.  Certainly I found 4 writer thread and STM
to be much slower for this shootout problem than the Einar's custom
MVar channel.
Could someone who knows the STM implementation comment on this?
Hope this helps...

Simon PJ and I talked about this briefly this morning, and agreed that
it might be possible to provide more primitive MVar-like objects in STM
to recover single-wakeup and some of the speed that MVars currently
have.  We haven't worked through the details though.
...
...
This is the CVS code. newTChanIO is exported but undocumented in GHC
6.4.1. I'm not sure what purpose it serves.
I get tired of writing "do tv <- atomically $ newTVar foo"
I bet this is just shorthand: "do tv <- newTVarIO foo"
Same for newTChanIO.
It's not just shorthand: you can use newTVarIO and friends inside
unsafePerformIO, which doesn't always work for the longhand atomically
version.

Cheers,
	Simon

Simon Marlow

tags

participants (1)