RE: [Haskell-cafe] Help with "shootout"

On 03 January 2006 12:03, Chris Kuklewicz wrote:
STM* is usually slower than IO/MVar. STM has to do the transactional record keeping and throws away work (i.e. CPU cycles and speed) when it aborts. The Chameneos benchmark has 4 writers working *very* quickly, so the contention is high. Taking the MVar acts like a mutex to serialize access without throwing away work.
We found that once you add exception safety to your MVar code, i.e. using withMVar and modifyMVar rather than raw takeMVar/putMVar, then MVar code is comparable in speed to STM. For example, Chan and TChan perform about the same. Of course, for a benchmark, you can forget about the exception safety, as the Ch implementation does (plus it is tweaked in various other ways).
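To make the exception-safety point concrete, here is a minimal sketch (not the shootout code). The names unsafeUpdate, safeUpdate and survivesFailure are made up for illustration; only takeMVar, putMVar, modifyMVar_, isEmptyMVar, evaluate and try are real library functions.

```haskell
import Control.Concurrent.MVar
import Control.Exception (SomeException, evaluate, try)

-- Raw version: if the update throws between takeMVar and putMVar,
-- the MVar is left empty and every later taker blocks forever.
unsafeUpdate :: MVar Int -> (Int -> Int) -> IO ()
unsafeUpdate mv f = do
  x <- takeMVar mv
  y <- evaluate (f x)   -- may throw; mv stays empty if it does
  putMVar mv y

-- Exception-safe version: modifyMVar_ puts the old contents back
-- if the update throws, at the cost of some extra bookkeeping.
safeUpdate :: MVar Int -> (Int -> Int) -> IO ()
safeUpdate mv f = modifyMVar_ mv (evaluate . f)

-- Hypothetical helper: run a failing update (division by zero) and
-- report whether the MVar still holds a value afterwards.
survivesFailure :: (MVar Int -> (Int -> Int) -> IO ()) -> IO Bool
survivesFailure update = do
  mv <- newMVar 1
  _ <- try (update mv (`div` 0)) :: IO (Either SomeException ())
  not <$> isEmptyMVar mv
```

Loaded in GHCi, survivesFailure unsafeUpdate should give False and survivesFailure safeUpdate should give True. The extra masking and handler installation in the safe version is the cost referred to above.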
Also, I suspect the following may be true:
If 4 threads block trying to take an MVar, and a 5th thread puts a value in the MVar, then *exactly one* of the 4 blocked threads is woken up and scheduled to run.
Yes, that's true. We haven't documented this as a property of MVars, but we don't intend to change this behaviour, so I think we probably should document it. Furthermore, the thread to be woken up is always the first thread to block on the MVar. That is, there's a kind of fairness built into MVars which isn't present in STM and TMVars.
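One way to observe the single-wakeup, first-blocked-first-woken behaviour is sketched below. It assumes a reasonably recent GHC, where GHC.Conc exports threadStatus; wakeOrder and waitUntilBlocked are hypothetical names used only for this illustration.

```haskell
import Control.Concurrent (ThreadId, forkIO, yield)
import Control.Concurrent.MVar
import Control.Monad (forM_, unless)
import GHC.Conc (BlockReason (BlockedOnMVar), ThreadStatus (ThreadBlocked), threadStatus)

-- Spin (yielding) until the given thread is parked on an MVar.
waitUntilBlocked :: ThreadId -> IO ()
waitUntilBlocked tid = do
  st <- threadStatus tid
  unless (st == ThreadBlocked BlockedOnMVar) (yield >> waitUntilBlocked tid)

-- Fork n takers that block on one empty MVar in a known order, then
-- put n values one at a time and record which taker reports each time.
wakeOrder :: Int -> IO [Int]
wakeOrder n = do
  shared <- newEmptyMVar :: IO (MVar ())
  woken  <- newEmptyMVar :: IO (MVar Int)
  forM_ [1 .. n] $ \i -> do
    tid <- forkIO (takeMVar shared >> putMVar woken i)
    waitUntilBlocked tid            -- taker i is queued before taker i+1 starts
  -- Each put should wake exactly one taker; wait for its report
  -- before the next put so the log reflects wake order.
  mapM (\_ -> putMVar shared () >> takeMVar woken) [1 .. n]
```

If wakeups are FIFO as described, wakeOrder 4 returns the takers in the order they blocked, i.e. [1,2,3,4].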
If 4 threads retry while in STM (e.g. takeTMVar), and a 5th thread commits a change to that TMVar, then *all 4 threads* are woken up and rescheduled. This is what the Apache httpd server folks called the "thundering herd" problem when many processes are blocked waiting on port 80.
If 4000 threads were getting woken up when only 1 was needed, then performance would be poor. Certainly I found 4 writer threads and STM to be much slower for this shootout problem than Einar's custom MVar channel.
Could someone who knows the STM implementation comment on this?
Hope this helps... Simon PJ and I talked about this briefly this morning, and agreed that it might be possible to provide more primitive MVar-like objects in STM to recover single-wakeup and some of the speed that MVars currently have. We haven't worked through the details though.
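For comparison, the wake-all behaviour of blocked STM transactions can be made visible with a rough sketch like the following. It counts how often the transaction body runs by smuggling an IORef increment into STM with unsafeIOToSTM, which is fine for a demonstration but not something to do in real code. countAttempts and waitUntilBlockedSTM are hypothetical names; the sketch assumes the stm package and a recent GHC.

```haskell
import Control.Concurrent (ThreadId, forkIO, yield)
import Control.Concurrent.MVar
import Control.Concurrent.STM
import Control.Monad (forM_, replicateM_, unless)
import Data.IORef
import GHC.Conc (BlockReason (BlockedOnSTM), ThreadStatus (ThreadBlocked), threadStatus, unsafeIOToSTM)

-- Spin (yielding) until the given thread is parked in an STM retry.
waitUntilBlockedSTM :: ThreadId -> IO ()
waitUntilBlockedSTM tid = do
  st <- threadStatus tid
  unless (st == ThreadBlocked BlockedOnSTM) (yield >> waitUntilBlockedSTM tid)

-- n threads block in takeTMVar; n values are then put one at a time.
-- The result is the total number of times the transaction body ran.
countAttempts :: Int -> IO Int
countAttempts n = do
  attempts <- newIORef (0 :: Int)
  box  <- newEmptyTMVarIO :: IO (TMVar ())
  done <- newEmptyMVar :: IO (MVar ())
  forM_ [1 .. n] $ \_ -> do
    tid <- forkIO $ do
      atomically $ do
        -- Demonstration only: side effect that survives aborted runs.
        unsafeIOToSTM (atomicModifyIORef' attempts (\k -> (k + 1, ())))
        takeTMVar box
      putMVar done ()
    waitUntilBlockedSTM tid
  -- Only one waiter can succeed per put, but every waiter on the
  -- TMVar is woken and re-runs its transaction.
  replicateM_ n (atomically (putTMVar box ()) >> takeMVar done)
  readIORef attempts
```

With single wakeup the count would be exactly 2 * n (one blocked attempt and one successful attempt per thread). With wake-all it is at least that and typically higher; the exact number depends on scheduling.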
This is the CVS code. newTChanIO is exported but undocumented in GHC 6.4.1. I'm not sure what purpose it serves.
I get tired of writing "do tv <- atomically $ newTVar foo"
I bet this is just shorthand: "do tv <- newTVarIO foo"
Same for newTChanIO.
It's not just shorthand: you can use newTVarIO and friends inside unsafePerformIO, which doesn't always work for the longhand atomically version.

Cheers,
Simon
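The usual place this matters is a top-level TVar. A minimal sketch, assuming the stm package; hitCounter and bump are made-up names:

```haskell
import Control.Concurrent.STM
import System.IO.Unsafe (unsafePerformIO)

-- NOINLINE stops GHC from duplicating the unsafePerformIO call,
-- which would otherwise risk creating more than one TVar.
{-# NOINLINE hitCounter #-}
hitCounter :: TVar Int
hitCounter = unsafePerformIO (newTVarIO 0)

-- The first call forces hitCounter from inside a transaction.
-- With `unsafePerformIO (atomically (newTVar 0))` that would mean
-- running atomically inside atomically, which GHC rejects at run time.
bump :: IO Int
bump = atomically $ do
  modifyTVar' hitCounter (+ 1)
  readTVar hitCounter
```

Calling bump twice from GHCi should return 1 and then 2.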
Simon Marlow