
On 17/08/2010 06:09, Gregory Collins wrote:
Does GHC expose any primitives for things like atomic compare-and-swap? I can't seem to find anything in the docs. I'm wondering if it's possible, for example, to implement things like the wait-free concurrent queue from [1] or a lock-free wait-free hash table like the Azul people are doing [2].
We could provide a compare-and-swap on IORefs, indeed I've been planning to try that so we could move the implementation of atomicModifyIORef into user-space, so to speak. Anecdotally it's hard to beat the performance of a good pure data structure in a mutable container. For example, the mutable list experiments we did a while back [1] were all at least a factor of 10 slower than IORef [a]. [1] http://www.haskell.org/~simonmar/bib/concurrent-data-08_abstract.html
In network servers, we often have a large number of clients fighting over shared resources -- even simple stuff like "which threads are scheduled to timeout?" and "which threads are currently active, so I can kill them if I need to?". Shared mutable collections seem to be a fact of life here.
Under load, pure data structures behind MVars don't perform well because of lock contention, and in my experience atomicModifyIORef is marginally better but still suffers from contention issues (probably related to thunk blackholing).
This is the specific case that we addressed in the GHC runtime recently, and you should find that 6.14 will perform a lot better than 6.12 with pure data structures in mutable containers, especially with atomicModifyIORef. In the meantime you'll probably get better performance using STM. Whether you should perform the update strictly inside the transaction or not is a matter for experimentation, it probably depends on the data structure.
Recently I got a ~35% speedup on a benchmark by switching one of these tables from a Map behind an IORef to a hashmap using a striped-locking scheme (see [3] if you're interested.) I think there's a need for high-concurrency data structures like this, and I don't see much on Hackage that addresses it. As much as we like to say that we have a handle on concurrency issues, from what I can see java.util.concurrent.* has us soundly thrashed in this department, especially as it relates to exposing atomic wait-free CPU primitives like the ones in java.util.concurrent.atomic.* [4]. Speaking as a partisan, this cannot be allowed to stand, can it?
Absolutely. Although the Java crowd have to grapple with a complicated memory model which makes building these data structures virtually impossible. At least in Haskell you can whip up a concurrent version of any pure data structure trivially, and have a lot of confidence that you got it right. Of course you could do that in Java, but they don't tend to build pure data structures by default, whereas we have them by the bucketload. Cheers, Simon
Cheers,
G.
------------------------------------------------------------------------------
[1] Maged M. Michael and Michael L. Scott, "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms": http://www.cs.rochester.edu/u/michael/PODC96.html
[2] http://www.stanford.edu/class/ee380/Abstracts/070221_LockFreeHash.pdf
[3] http://github.com/snapframework/snap-server/blob/master/src/Data/HashMap/Con...
[4] http://download.oracle.com/javase/6/docs/api/java/util/concurrent/atomic/Ato...