
Hello Does anyone have an efficient tree implemented in STM that supports concurrent updates in an efficient fashion? This seems suprisingly hard to implement - a normal binary tree with links as TVar is very slow and does not scale very well. - Einar Karttunen

On Wed, Mar 08, 2006 at 01:50:06PM +0200, Einar Karttunen wrote:
Does anyone have an efficient tree implemented in STM that supports concurrent updates in an efficient fashion?
Interesting idea!
This seems suprisingly hard to implement - a normal binary tree with links as TVar is very slow and does not scale very well.
By "normal" you mean unbalanced? Do you think it's slow because it's not balanced, or because of STM? Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland

On 08.03 13:32, Tomasz Zielonka wrote:
This seems suprisingly hard to implement - a normal binary tree with links as TVar is very slow and does not scale very well.
By "normal" you mean unbalanced? Do you think it's slow because it's not balanced, or because of STM?
Unbalanced in this case because balancing produces even more problematic effects (TVar writes). I used random inserts in the test which kept the tree in a good balance, and tested the same tree without STM to see whether that was the problem - so I am fairly sure that balancing is not the issue. - Einar Karttunen

G'day all. On Wed, Mar 08, 2006 at 01:50:06PM +0200, Einar Karttunen wrote:
Does anyone have an efficient tree implemented in STM that supports concurrent updates in an efficient fashion?
One could easily rewrite this question as: Does anyone have an efficient tree that supports concurrent updates in an efficient fashion? The short answer is: If by "tree" you mean "binary search tree", then the answer is no. If you ever come up with an efficient concurrent binary search tree, please collect your PhD. The longer answer is: If you want to know what data structures can be made efficiently concurrent, look at the data structures that database programmers use. In general, the popular data structures are those which allow for tunable granularity. There must be well-partitioned regions of the data structure which are big enough so that locking them doesn't cost too much and are small enough so that you get effective concurrency. As an example, hash tables are popular because it's easily partitioned (you can lock groups of hash chains independently), and it's highly tunable (the hash table size and the number of hash chains per lock are both good knobs). B-trees are popular for a similar reason. A node is an obvious unit of granularity, since different threads can work in different nodes without interfering. Not only is the page size tunable, there is also an obvious way to "group" nodes together such that locks can cover more than one physical node if you need that. Covering more than one node with a lock makes slightly more sense with a B-tree than with a hash table, because B-tree operations can require restructuring the tree, which requires holding more than one lock. Usually this is only a small fraction of operations (it depends on the workload, but 1% is a good rule of thumb), but grouping physical nodes together reduces the number of locks you need to hold even further. I've talked about "locks" here, but of course lock contention is the same thing as thread contention in STM. It's just that the details are different, and details are important in concurrent programming. Cheers, Andrew Bromage

On Sat, Mar 18, 2006 at 05:46:30PM -0500, ajb@spamcop.net wrote:
B-trees are popular for a similar reason. A node is an obvious unit of granularity, since different threads can work in different nodes without interfering. Not only is the page size tunable, there is also an obvious way to "group" nodes together such that locks can cover more than one physical node if you need that.
I believe the current best choice in concurrent environments is 'skip-lists' since they require no global invarients (or locks) to be maintained as they are probabalistic. They parallelize very well. John -- John Meacham - ⑆repetae.net⑆john⑈

Sorry for the slow reply,
On 3/8/06, Einar Karttunen
Does anyone have an efficient tree implemented in STM that supports concurrent updates in an efficient fashion? This seems suprisingly hard to implement - a normal binary tree with links as TVar is very slow and does not scale very well.
I'm just wondering whether it is important for you that all the internal links are TVars. That of course depends on what kind of API you're imagining for your data structure. I would imagine though that a single TVar pointing to the root of your favourite functional tree implementation will be quite sufficient. My guess is it would also be the fastest way of implementing it. Cheers, /Josef
participants (5)
-
ajb@spamcop.net
-
Einar Karttunen
-
John Meacham
-
Josef Svenningsson
-
Tomasz Zielonka