
G'day all. On Wed, Mar 08, 2006 at 01:50:06PM +0200, Einar Karttunen wrote:
Does anyone have an efficient tree implemented in STM that supports concurrent updates in an efficient fashion?
One could easily rewrite this question as: Does anyone have an efficient tree that supports concurrent updates in an efficient fashion? The short answer is: If by "tree" you mean "binary search tree", then the answer is no. If you ever come up with an efficient concurrent binary search tree, please collect your PhD. The longer answer is: If you want to know what data structures can be made efficiently concurrent, look at the data structures that database programmers use. In general, the popular data structures are those which allow for tunable granularity. There must be well-partitioned regions of the data structure which are big enough so that locking them doesn't cost too much and are small enough so that you get effective concurrency. As an example, hash tables are popular because it's easily partitioned (you can lock groups of hash chains independently), and it's highly tunable (the hash table size and the number of hash chains per lock are both good knobs). B-trees are popular for a similar reason. A node is an obvious unit of granularity, since different threads can work in different nodes without interfering. Not only is the page size tunable, there is also an obvious way to "group" nodes together such that locks can cover more than one physical node if you need that. Covering more than one node with a lock makes slightly more sense with a B-tree than with a hash table, because B-tree operations can require restructuring the tree, which requires holding more than one lock. Usually this is only a small fraction of operations (it depends on the workload, but 1% is a good rule of thumb), but grouping physical nodes together reduces the number of locks you need to hold even further. I've talked about "locks" here, but of course lock contention is the same thing as thread contention in STM. It's just that the details are different, and details are important in concurrent programming. Cheers, Andrew Bromage