Software Transactional Memory and LWN
Hi,

Browsing LWN, I ran across this comment:

http://lwn.net/Articles/336039/

The author makes a bunch of unsubstantiated claims about STM, namely that all implementations use locking under the hood, and that STM can live- and deadlock. I've seen a lot of similar sentiments in other places as well (comp.arch springs to mind).

Now, I'm no expert on STM, but I was pretty sure these are incorrect, and I certainly had the impression that Haskell's STM guarantees some progress - which it couldn't in a deadlock situation.

Am I wrong?

-k
--
If I haven't seen further, it is by standing in the footprints of giants
On Thu, Jun 11, 2009 at 2:30 AM, Ketil Malde wrote:
Hi,
Browsing LWN, I ran across this comment:
http://lwn.net/Articles/336039/
The author makes a bunch of unsubstantiated claims about STM, namely that all implementations use locking under the hood, and that STM can live- and deadlock. I've seen a lot of similar sentiments in other places as well (comp.arch springs to mind).
Now, I'm no expert on STM, but I was pretty sure these are incorrect, and I certainly had the impression that Haskell's STM guarantees some progress - which it couldn't in a deadlock situation.
MVars can be simulated with STM, and MVars can semantically get into a deadlock situation, so STM can also deadlock. Admittedly, if you're using STM to simulate MVars, you're doing it wrong.

Luke
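For illustration, a minimal sketch of Luke's point, under the assumption that an MVar-like cell is modelled as a TVar holding a Maybe (the names TMVar', newTMVar', takeTMVar' and putTMVar' are invented here; the stm package ships a real TMVar):

import Control.Concurrent.STM

type TMVar' a = TVar (Maybe a)

newTMVar' :: a -> IO (TMVar' a)
newTMVar' = newTVarIO . Just

-- Block (via retry) until a value is present, then empty the cell.
takeTMVar' :: TMVar' a -> STM a
takeTMVar' tv = do
  m <- readTVar tv
  case m of
    Nothing -> retry
    Just x  -> do writeTVar tv Nothing
                  return x

-- Block (via retry) until the cell is empty, then fill it.
putTMVar' :: TMVar' a -> a -> STM ()
putTMVar' tv x = do
  m <- readTVar tv
  case m of
    Just _  -> retry
    Nothing -> writeTVar tv (Just x)

Used from IO as atomically (takeTMVar' a) >> atomically (takeTMVar' b) in one thread and in the opposite order in another, the two threads can block each other forever, just as with real MVars; inside a single atomically the retry would simply roll the whole transaction back.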
Ketil Malde wrote:
Hi,
Browsing LWN, I ran across this comment:
http://lwn.net/Articles/336039/
The author makes a bunch of unsubstantiated claims about STM, namely that all implementations use locking under the hood, and that STM can live- and deadlock. I've seen a lot of similar sentiments in other places as well (comp.arch springs to mind).
Now, I'm no expert on STM, but I was pretty sure these are incorrect, and I certainly had the impression that Haskell's STM guarantees some progress - which it couldn't in a deadlock situation.
I think there needs to be some differentiation here between the implementation of STM, and the programmer's use of STM.

The implementation of STM does effectively use locks (from memory, it's this paper that explains it: http://research.microsoft.com/en-us/um/people/tharris/papers/2007-tocs.pdf). I'm not sure how optimistic algorithms work, so maybe they can usually avoid locks -- but I suspect there are cases in every STM algorithm where locks are needed as a last resort. The implementation of STM cannot deadlock. There will always be at least one process making progress, but potentially at the expense of starving others indefinitely (a form of livelock). So the implementation has locks, can't deadlock, can sort of livelock.

The use of STM does not involve locks, and one of STM's main advantages is that it hides explicit locks from the user. If you have retry in STM (as Haskell does, but not all other implementations do) then you can write deadlocking code with it, and indeed you can simulate mutexes and so on using retry, hence allowing you to use your own constructed locks with STM. So in using STM you can deadlock, and you can make some locks to use if you want, but it's not required.

Neil.
Neil Brown wrote:
I think there needs to be some differentiation here between the implementation of STM, and the programmer's use of STM.
The implementation of STM does effectively use locks (from memory, it's this paper that explains it: http://research.microsoft.com/en-us/um/people/tharris/papers/2007-tocs.pdf).
Ignoring the paper in the interest of laz...expedience, I guess the crucial part is committing the transactions - you'd either need locks or to single-thread the committing.
The use of STM does not involve locks, and one of STM's main advantages is that it hides explicit locks from the user. If you have retry in STM (as Haskell does, but not all other implementations do) then you can write deadlocking code with it, and indeed you can simulate mutexes and so on using retry, hence allowing you to use your own constructed locks with STM. So in using STM you can deadlock, and you can make some locks to use if you want, but it's not required.
So the naïve attempt at doing this would be something like:

thread = do
  -- grab "lock 1"
  t <- readTVar lock
  when t retry
  writeTVar lock True

  -- grab "lock 2"
  t2 <- readTVar lock2
  when t2 retry
  writeTVar lock2 True

  -- do something

  writeTVar lock2 False
  writeTVar lock False

and another one with the locks reversed. But that won't work of course, since the 'retry' will roll back the taking of lock 1 as well. So do I need to split this up into separate STM transactions and orchestrate the locking from the IO monad?

-k
--
If I haven't seen further, it is by standing in the footprints of giants
Ketil Malde wrote:
So the naïve attempt at doing this would be something like:
thread = do
  -- grab "lock 1"
  t <- readTVar lock
  when t retry
  writeTVar lock True

  -- grab "lock 2"
  t2 <- readTVar lock2
  when t2 retry
  writeTVar lock2 True

  -- do something

  writeTVar lock2 False
  writeTVar lock False
and another one with the locks reversed. But that won't work of course, since the 'retry' will rollback the taking of lock 1 as well. So do I need to split this up into separate STM transactions and orchestrate the locking from the IO monad?
Indeed:

type Lock = TVar Bool

claim :: Lock -> IO ()
claim tv = atomically $ do
  b <- readTVar tv
  when b retry
  writeTVar tv True

release :: Lock -> IO ()
release tv = atomically $ writeTVar tv False

Writing a lock in STM is not useful for the purpose of doing some other STM stuff in between anyway, because that goes against the point of STM (you don't need locks for STM actions -- as you point out, the rollback from the locks is not useful in your example). So it only makes sense if you are doing some other IO action in between the claim and release. For example, you might want to write to a socket from several threads, and use locks to ensure exclusivity. Often, using STM for locks is pretty silly because there is some other way to do it (e.g. have the threads write their packets to an STM queue, and have a single thread reading from the queue and sending on the sockets), but they're a simple example of how you can create deadlock in STM, e.g.

main = do
  tv1 <- newTVarIO False
  tv2 <- newTVarIO False
  forkIO $ claim tv2 >> claim tv1 >> release tv1 >> release tv2
  claim tv1 >> claim tv2 >> release tv2 >> release tv1

is a possible source of deadlock, or even the more straightforward:

main = newTVarIO True >>= claim

What is particularly cool about Haskell's STM implementation is that in the second example (and possibly in the first one too) the run-time can detect straight-out deadlock and will throw you a deadlock exception. It does this via garbage collection -- if a thread is waiting on a TVar that no other thread has a reference to, the thread is technically ripe for garbage collection and is instead woken up with an exception. At least, I think that's the principle. I don't think it can catch all cases (another thread may have the reference but may never use it), but it's still quite impressive.

Thanks,

Neil.
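A small sketch of that detection in action, assuming a reasonably recent GHC where the exception delivered is BlockedIndefinitelyOnSTM from Control.Exception (older releases used a different name); this is essentially the second example above with claim inlined, and the exception caught instead of killing the program:

import Control.Concurrent.STM
import Control.Exception
import Control.Monad (when)

-- Block on a TVar that no other thread can ever write.  The GC notices that
-- nothing else can wake the transaction, and the RTS delivers an exception
-- instead of leaving the thread stuck forever.
main :: IO ()
main = do
  tv <- newTVarIO True
  r  <- try (atomically (do b <- readTVar tv
                            when b retry))
  case r of
    Left BlockedIndefinitelyOnSTM -> putStrLn "runtime detected the deadlock"
    Right ()                      -> putStrLn "claimed the lock"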
On Thu, Jun 11, 2009 at 4:38 AM, Ketil Malde wrote:
Ignoring the paper in the interest of laz...expedience, I guess the crucial part is committing the transactions - you'd either need locks or to single-thread the committing.
It's possible to single-thread the commit without locking using lock-free mechanisms, but it's extremely tricky. The current proof-of-concept implementation uses locks under the hood, but only during commit.

One big benefit of STM, in fact, is that you can avoid using locks until commit-time; it's very difficult to write correct concurrent code even with locks around the entirety of your transaction (as is used in, for example, Java's "synchronized" methods). Being correct while delaying all locking until updates need to be made is impressive in itself.

The other benefit is that you free the programmer from having to worry about locks; they are an implementation detail. There's no technical hurdle to switching all of Haskell's STM code to a lock-free implementation; client code using STM doesn't have to change at all.

Saying "all current implementations use locks" is kind of a cop-out; it would be like saying ten years ago "all current cars use gasoline, so why bother researching better batteries?" I am confident that there will be a lock-free implementation of STM in Haskell's future.

-- ryan
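To make the "locks only during commit" idea concrete, here is a deliberately toy model (it bears no relation to GHC's actual run-time system; every name in it -- TVarI, newTVarI, commitOnce -- is invented for the sketch, and variables just hold Ints): reads happen optimistically with no locks held, and per-variable locks are taken only for the brief validate-and-publish step.

import Control.Concurrent.MVar
import Data.IORef
import Data.List (sortOn)

-- A variable holds (version, value); its lock is only ever held during commit.
data TVarI = TVarI { tvId :: Int, tvLock :: MVar (), tvCell :: IORef (Int, Int) }

newTVarI :: Int -> Int -> IO TVarI
newTVarI i v = TVarI i <$> newMVar () <*> newIORef (0, v)

-- Read the inputs without locking, compute the new values, then take the
-- per-variable locks (in a global order, so commits cannot deadlock with each
-- other), re-check the versions seen, and publish only if nothing changed
-- underneath us.  Returns False when the caller should re-run the transaction.
commitOnce :: [TVarI] -> ([Int] -> [Int]) -> IO Bool
commitOnce vars f = do
  snapshot <- mapM (readIORef . tvCell) vars        -- optimistic, unlocked reads
  let newVals = f (map snd snapshot)
      ordered = sortOn tvId vars
  mapM_ (takeMVar . tvLock) ordered                 -- locks held only from here...
  current <- mapM (readIORef . tvCell) vars
  let valid = map fst current == map fst snapshot   -- has anyone committed meanwhile?
  if valid
    then sequence_ [ writeIORef (tvCell v) (ver + 1, x)
                   | (v, (ver, _), x) <- zip3 vars current newVals ]
    else return ()
  mapM_ (\v -> putMVar (tvLock v) ()) ordered       -- ...to here
  return valid

A real implementation also has to re-run failed transactions and wake up blocked ones; the sketch only shows where the locking lives, which is Ryan's point.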
On Thursday 11 June 2009, Ketil Malde wrote:
Hi,
Browsing LWN, I ran across this comment:
http://lwn.net/Articles/336039/
The author makes a bunch of unsubstantiated claims about STM, namely that all implementations use locking under the hood, and that STM can live- and deadlock. I've seen a lot of similar sentiments in other places as well (comp.arch springs to mind).
Now, I'm no expert on STM, but I was pretty sure these are incorrect, and I certainly had the impression that Haskell's STM guarantees some progress - which it couldn't in a deadlock situation.
Am I wrong?
Hi,

While I'm no STM expert either, I'd like to point out an often overlooked detail: locks and STM are two different abstractions. While locks provide semantics for running critical sections (informally, parts of code which only one thread can execute), STM provides semantics for atomic actions (informally, actions whose intermediate state can't be observed by other threads).

Now, while an STM implementation can use locks under the hood, that doesn't change the fact that the programmer isn't exposed to them. Saying STM is bad because it uses locks under the hood is the same as saying that a garbage-collected environment is bad because it uses malloc and free under the hood.

As far as the STM-is-better-because-it-doesn't-use-locks theory is concerned, the idea is that STM doesn't associate a lock with every atomic section. This means that (in an optimistic implementation) any number of threads (and potentially CPU cores) can execute the atomic action in parallel, and if there are few rollbacks, this can lead to better performance than using a single lock[3]. And you can't forget about composability -- code written using locks is less modular than code written using STM.

While either optimistic[1] or pessimistic[2] STM can livelock, this can be addressed by some sort of exponential backoff algorithm (which does not guarantee progress, just makes a livelock less likely). As far as deadlock is concerned -- if we allow for retry, then, as has already been mentioned, there is a possibility of deadlock. But with STM you can do deadlock detection by means of cycle detection in wait-for graphs (in much the same way as it is done in a DBMS).

While STM currently usually performs worse than locks, it is an active area of research (Intel has even released an experimental STM-supporting C++ compiler[4]). It is also true that as the number of cores in CPUs grows, STM will become more attractive.

Thanks!
Marcin Kosiba

[1] http://en.wikipedia.org/wiki/Software_transactional_memory#Implementation_is...
[2] when a long-lived transaction is rolled back by small transactions
[3] you can even switch between STM and locking: "General and Efficient Locking without Blocking", Yannis Smaragdakis, Anthony Kay, Reimer Behrends, Michal Young
[4] http://software.intel.com/en-us/articles/intel-c-stm-compiler-prototype-edit...
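For illustration, a minimal sketch of that kind of wait-for-graph check (the names WaitFor and hasDeadlock are made up here; a real system would build the graph from its transaction bookkeeping): deadlock is reported exactly when the waits-for relation contains a cycle.

import qualified Data.Map as M
import qualified Data.Set as S

-- "t waits for ts": each blocked transaction maps to the transactions
-- holding something it needs.
type WaitFor t = M.Map t [t]

-- True iff some transaction can reach itself again by following
-- waits-for edges, i.e. the graph contains a cycle.
hasDeadlock :: Ord t => WaitFor t -> Bool
hasDeadlock g = any (walk S.empty) (M.keys g)
  where
    walk seen t
      | t `S.member` seen = True    -- revisited a node on this path: a cycle
      | otherwise = any (walk (S.insert t seen)) (M.findWithDefault [] t g)

For example, hasDeadlock (M.fromList [(1, [2]), (2, [1])]) is True, while hasDeadlock (M.fromList [(1, [2]), (2, [])]) is False.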
participants (5)
- Ketil Malde
- Luke Palmer
- Marcin Kosiba
- Neil Brown
- Ryan Ingram