Stronger STM primitives needed? Or am I just doing it wrong?

How can I implement the following operation efficiently in STM? Given a TVar "now", waitFor t0 = do t <- readTVar now if (t < t0) then retry else return () This naive implementation has the problem that the transaction gets restarted every time "now" gets updated, even if the new value is still less than t0. One primitive that would be strong enough is this: retryUntil :: TVar a -> (a -> Bool) -> STM () although it would still do some computation every time "now" changed. The thought I originally had was to register the "waitFor" time with some helper which kept track of the current time and fired off a notice when it was ready. But the problem with that is that "retry" undoes all the work of setting that up; the goal is still to block, but I want a stronger blocking primitive. Does anyone have any ideas? -- ryan Here's the background: I'm following along with Conal's excellent FRP paper at http://conal.net/papers/simply-reactive/ In his implementation of futures, Conal uses a race primitive which spawns two threads each computing a value that (if successful) is guaranteed to agree. I was (and still am) sure that an implementation using STM could avoid needing to race, with careful use of orElse & retry. Here's the datatype I'm using for Future:
-- semantic type for future -- t = type of Time, an instance of Bounded. a = value type type F t a = (t, a) force :: Future t a -> F t a
data Future t a = Fut { waitFor :: t -> STM () , value :: STM (t, a) }
firstFuture :: Future t a -> Future t a -> Future t a -- with semantics: -- force (firstFuture f1 f2) = (min t1 t2, if t1 <= t2 then v1 else v2) -- where -- (t1, v1) = force f1 -- (t2, v2) = force f2
Each future lives in some universe; imagine you have the following: ] type Universe t -- abstract ] P :: Bounded t => Universe t -- universe of pure values ] R :: Universe Time -- universe of the real world ] univ :: Future t a -> Universe t ] now :: Universe t -> STM t ] -- now P = return maxBound ] -- now R = current time The main thrust of this is that each universe has its own idea of what time it is; but when combining futures, we get a new universe which tracks the later of the times in the two universes. The problem is combining two futures: Lets say we have: anyKey :: Future Time () -- fires when the user first presses a key thousand :: Num t => Future t () -- fires at tick 1000 thousand = exactly 1000 () and we are evaluating force (firstFuture anyKey thousand) Now, "clock" is going to fire at tick 1000; we know this because it lives in the pure universe and its value is always available. So firstFuture can get the following: x1 <- maybeSTM (value anyKey) -- x1 = Nothing x2 <- maybeSTM (value clock) -- x2 = Just (1000, ()) ... -- synchronize P with R at tick 1000 waitFor anyKey 1000 -- if nothing has changed before tick 1000 return (1000, ()) -- converts a possibly-retry-calling STM into one that never fails maybeSTM m = fmap Just m `orElse` return Nothing So if we use the naive implementation for "waitFor" in terms of "now", the whole transaction will get re-evaluated every tick. I only want it to get re-evaluated if the user presses a key (changing some TVar evaluated in "value anyKey"), or tick 1000 passes, whichever comes first. Is there a way to do this? Is my choice of "waitFor" as the basic "universe synchronization" operation too weak?

Ryan Ingram said:
How can I implement the following operation efficiently in STM? Given a TVar "now",
waitFor t0 = do t <- readTVar now if (t < t0) then retry else return ()
This naive implementation has the problem that the transaction gets restarted every time "now" gets updated, even if the new value is still less than t0.
I'm not familiar with FRP, so this may be off the mark. Are you familiar with Control.Concurrent.STM.TVar.registerDelay? I realise your concept of time differs from that of registerDelay, but I suspect you'll need to use a similar approach.
One primitive that would be strong enough is this: retryUntil :: TVar a -> (a -> Bool) -> STM ()
although it would still do some computation every time "now" changed.
I don't think a primitive retryUntil would be able to do any better than the obvious implementation (which looks a lot like your waitFor).
The thought I originally had was to register the "waitFor" time with some helper which kept track of the current time and fired off a notice when it was ready. But the problem with that is that "retry" undoes all the work of setting that up; the goal is still to block, but I want a stronger blocking primitive.
That's essentially what registerDelay does. The key is that registering a timer must occur in the IO monad, outside the transaction that waits for it. I vaguely recall that there is at least one hackage library which is somewhat more sophisticated than registerDelay, so you might also want to look there.

On Tue, 22 Apr 2008 14:48:54 -0700, Ryan Ingram wrote:
waitFor t0 = do t <- readTVar now if (t < t0) then retry else return ()
This naive implementation has the problem that the transaction gets restarted every time "now" gets updated, even if the new value is still less than t0.
One primitive that would be strong enough is this: retryUntil :: TVar a -> (a -> Bool) -> STM ()
Hmm. This makes me suspicious. A change to a variable may change the transaction such that it never even calls your retryUntil the next time around. Is it really safe to not retry the transaction? -- Chris Smith

On 4/22/08, Chris Smith
One primitive that would be strong enough is this: retryUntil :: TVar a -> (a -> Bool) -> STM ()
Hmm. This makes me suspicious. A change to a variable may change the transaction such that it never even calls your retryUntil the next time around. Is it really safe to not retry the transaction?
Of course not; the semantics would be that the transaction log, instead of saying "I read from v" would say "I read from v and failed because v didn't satisfy this predicate". Changes to any other variable in the log would have the same effect as always: restarting the transaction. This is actually required in my desired use case; I want to block until "now" becomes >= t, or a different TVar gets filled with a non-Nothing value. -- ryan

Ryan Ingram said:
retryUntil :: TVar a -> (a -> Bool) -> STM ()
[...]
the semantics would be that the transaction log, instead of saying "I read from v" would say "I read from v and failed because v didn't satisfy this predicate".
Changes to any other variable in the log would have the same effect as always: restarting the transaction. This is actually required in my desired use case; I want to block until "now" becomes >= t, or a different TVar gets filled with a non-Nothing value.
I see now what you mean, and I can see that it might be a nice little addition. By giving the STM runtime additional information about what conditions would allow the transaction to progress, it might be able to save some false retries. Note that the only work that is saved is the work done to get to the retryUntil. However, strictly speaking, the STM runtime doesn't need this additional information. You could view a composite STM transaction as an inverted tree, with individual reads at the leaves, merging through monadic bindings towards the root, which is the final result (which includes the transaction's atomic updates). A transaction waiting on a retry is just one of these trees in suspended animation, waiting for a new value to be written to one of the leaves. When such a value is written to one of the leaves, only its downstream nodes need re-evaluation. Implementing STM that way may not be worth the effort, though, whereas retryUntil might be.

Actually, I think I came up with a solution on the way home from work today. Instead of
data Future t a = Fut { waitFor :: t -> STM () , value :: STM (t, a) }
I will use
data Future t a = Fut { waitFor :: t -> IO (STM ()) , value :: IO (STM (t, a)) }
The goal is to be able to wait on multiple things at a time, but that
doesn't mean all the setup has to happen in STM. Now I can setup some
TVars in IO and then hand-off to STM for the niceness of "orElse".
-- ryan
On Tue, Apr 22, 2008 at 6:46 PM, Matthew Brecknell
Ryan Ingram said:
retryUntil :: TVar a -> (a -> Bool) -> STM ()
[...]
the semantics would be that the transaction log, instead of saying "I read from v" would say "I read from v and failed because v didn't satisfy this predicate".
Changes to any other variable in the log would have the same effect as always: restarting the transaction. This is actually required in my desired use case; I want to block until "now" becomes >= t, or a different TVar gets filled with a non-Nothing value.
I see now what you mean, and I can see that it might be a nice little addition. By giving the STM runtime additional information about what conditions would allow the transaction to progress, it might be able to save some false retries. Note that the only work that is saved is the work done to get to the retryUntil.
However, strictly speaking, the STM runtime doesn't need this additional information. You could view a composite STM transaction as an inverted tree, with individual reads at the leaves, merging through monadic bindings towards the root, which is the final result (which includes the transaction's atomic updates). A transaction waiting on a retry is just one of these trees in suspended animation, waiting for a new value to be written to one of the leaves. When such a value is written to one of the leaves, only its downstream nodes need re-evaluation.
Implementing STM that way may not be worth the effort, though, whereas retryUntil might be.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi, I think this could be a nice addition. What do you think about a slight change: readTVarWhen :: TVar a -> (a -> bool) -> STM a This would retry until the (a->bool) function returns true (and, of course, as with a normal "retry", the implementation would need to watch all of the TVars that have been read). This would avoid replicating the condition in the control flow of the program and the parameter to retryUntil, and avoid the possibility that the two conditions might be inconsistent (or can you see any examples where it might be useful to have different conditions)? Do you have a program where the basic implementation with "retry" is performing badly? If you're OK distributing it then could we add it to the STM benchmark suite that the group at Barcelona Supercomputing Center are developing? Thanks, Tim -----Original Message----- From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of Ryan Ingram Sent: 22 April 2008 14:49 To: haskell-cafe@haskell.org Subject: [Haskell-cafe] Stronger STM primitives needed? Or am I just doing it wrong? How can I implement the following operation efficiently in STM? Given a TVar "now", waitFor t0 = do t <- readTVar now if (t < t0) then retry else return () This naive implementation has the problem that the transaction gets restarted every time "now" gets updated, even if the new value is still less than t0. One primitive that would be strong enough is this: retryUntil :: TVar a -> (a -> Bool) -> STM () although it would still do some computation every time "now" changed. The thought I originally had was to register the "waitFor" time with some helper which kept track of the current time and fired off a notice when it was ready. But the problem with that is that "retry" undoes all the work of setting that up; the goal is still to block, but I want a stronger blocking primitive. Does anyone have any ideas? -- ryan Here's the background: I'm following along with Conal's excellent FRP paper at http://conal.net/papers/simply-reactive/ In his implementation of futures, Conal uses a race primitive which spawns two threads each computing a value that (if successful) is guaranteed to agree. I was (and still am) sure that an implementation using STM could avoid needing to race, with careful use of orElse & retry. Here's the datatype I'm using for Future:
-- semantic type for future -- t = type of Time, an instance of Bounded. a = value type type F t a = (t, a) force :: Future t a -> F t a
data Future t a = Fut { waitFor :: t -> STM () , value :: STM (t, a) }
firstFuture :: Future t a -> Future t a -> Future t a -- with semantics: -- force (firstFuture f1 f2) = (min t1 t2, if t1 <= t2 then v1 else v2) -- where -- (t1, v1) = force f1 -- (t2, v2) = force f2
Each future lives in some universe; imagine you have the following: ] type Universe t -- abstract ] P :: Bounded t => Universe t -- universe of pure values ] R :: Universe Time -- universe of the real world ] univ :: Future t a -> Universe t ] now :: Universe t -> STM t ] -- now P = return maxBound ] -- now R = current time The main thrust of this is that each universe has its own idea of what time it is; but when combining futures, we get a new universe which tracks the later of the times in the two universes. The problem is combining two futures: Lets say we have: anyKey :: Future Time () -- fires when the user first presses a key thousand :: Num t => Future t () -- fires at tick 1000 thousand = exactly 1000 () and we are evaluating force (firstFuture anyKey thousand) Now, "clock" is going to fire at tick 1000; we know this because it lives in the pure universe and its value is always available. So firstFuture can get the following: x1 <- maybeSTM (value anyKey) -- x1 = Nothing x2 <- maybeSTM (value clock) -- x2 = Just (1000, ()) ... -- synchronize P with R at tick 1000 waitFor anyKey 1000 -- if nothing has changed before tick 1000 return (1000, ()) -- converts a possibly-retry-calling STM into one that never fails maybeSTM m = fmap Just m `orElse` return Nothing So if we use the naive implementation for "waitFor" in terms of "now", the whole transaction will get re-evaluated every tick. I only want it to get re-evaluated if the user presses a key (changing some TVar evaluated in "value anyKey"), or tick 1000 passes, whichever comes first. Is there a way to do this? Is my choice of "waitFor" as the basic "universe synchronization" operation too weak? _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Wed, 23 Apr 2008 15:54:15 +0100, Tim Harris (RESEARCH) wrote:
What do you think about a slight change:
readTVarWhen :: TVar a -> (a -> bool) -> STM a
This would retry until the (a->bool) function returns true (and, of course, as with a normal "retry", the implementation would need to watch all of the TVars that have been read).
This looks nice to me, anyway. I was confusing myself with my earlier reply. So basically you'd accumulate a list of reads to any given TVar and the old value that was read, and then walk through that list to determine whether to retry a transaction, where: 1. If the predicate fails on the new value, then keep waiting. 2. If the predicate succeeds on the new value: 2a. If the new value is the same as the old, look at next in list. 2b. If the new value is different from the old, retry now. Case 2a is there because the TVar may have been changed to something that fails the predicate, and then changed back to the original. Of course, retrying now is always legal, so you could simplify this by only keeping a record of the first access to the TVar and throwing away the old value; then always do 2b if the predicate succeeds. That would involve considerably less bookkeeping, at the expense of replaying some transactions unnecessarily. Then readTVarWhen would basically be special only if it's the first access to the TVar within the transaction. -- Chris Smith

On Wed, Apr 23, 2008 at 7:54 AM, Tim Harris (RESEARCH)
What do you think about a slight change:
readTVarWhen :: TVar a -> (a -> bool) -> STM a
This seems strictly less powerful than retryUntil:
readTVarWhen v p = retryUntil v p >> readTVar v
Consider the following transaction:
intV :: TVar Int boolV :: TVar Bool
interesting = atomically $ do retryUntil intV (> 50) retryUntil boolV id
Lets say that intV contains 100 and boolV contains False. Then this transaction retries. Now, if intV changes to 101, this transaction doesn't need to re-run; we can see immediately that no predicate changed. Using "readTVarWhen", this is less clear; the transaction log would hold a read on intV which would be more difficult to ignore. -- ryan

On Apr 23, 2008, at 12:13 PM, Ryan Ingram wrote:
On Wed, Apr 23, 2008 at 7:54 AM, Tim Harris (RESEARCH)
wrote: What do you think about a slight change:
readTVarWhen :: TVar a -> (a -> bool) -> STM a
This seems strictly less powerful than retryUntil:
readTVarWhen v p = retryUntil v p >> readTVar v
Consider the following transaction:
intV :: TVar Int boolV :: TVar Bool
interesting = atomically $ do retryUntil intV (> 50) retryUntil boolV id
Lets say that intV contains 100 and boolV contains False. Then this transaction retries. Now, if intV changes to 101, this transaction doesn't need to re-run; we can see immediately that no predicate changed.
How can we tell, though? In effect, I need to either say "I care when intV changes" or I need read intV again and make sure that (> 50) still holds before I can commit.
Using "readTVarWhen", this is less clear; the transaction log would hold a read on intV which would be more difficult to ignore.
In order to guarantee that your test is atomic wrt the rest of the transaction, you'll need to do the same. What you do in response to a change in intV might be different, though. I've been trying to decide whether either of these is implementable in terms of `orElse`, in such a way that we immediately check the predicate upon retry before doing anything else. I can't quite make up my mind whether this is possible or not. -Jan-Willem Maessen
-- ryan _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On 4/23/08, Jan-Willem Maessen
I've been trying to decide whether either of these is implementable in terms of `orElse`, in such a way that we immediately check the predicate upon retry before doing anything else. I can't quite make up my mind whether this is possible or not.
I do not think it is possible; consider this case: broken = atomically $ do v <- expensive_computation :: STM (TVar Int) retryUntil v (> 50) Given that you don't know which tvar to use until the end of the expensive computation, I don't see how you can lift "orElse" to the make that tvar be the first thing checked when the transaction is rerun.

On Wed, Apr 23, 2008 at 12:12:15PM -0700, Ryan Ingram wrote:
On 4/23/08, Jan-Willem Maessen
wrote: I've been trying to decide whether either of these is implementable in terms of `orElse`, in such a way that we immediately check the predicate upon retry before doing anything else. I can't quite make up my mind whether this is possible or not.
I do not think it is possible; consider this case:
broken = atomically $ do v <- expensive_computation :: STM (TVar Int) retryUntil v (> 50)
Given that you don't know which tvar to use until the end of the expensive computation, I don't see how you can lift "orElse" to the make that tvar be the first thing checked when the transaction is rerun.
I'm confused as to how your retryUntil gains you anything. If any of the TVars used in the expensive_computation change while waiting for a retry, then the expensive_computation will need to be done again. If none of them change, then we can skip the expensive_computation. How does retryUntil help us with this? i.e. how does your broken function using retryUntil differ from the following? broken = atomically $ do v <- expensive_computation :: STM (TVar Int) vv <- readTVar v unless (vv > 50) retry -- David Roundy Department of Physics Oregon State University

On 4/23/08, David Roundy
I'm confused as to how your retryUntil gains you anything. If any of the TVars used in the expensive_computation change while waiting for a retry, then the expensive_computation will need to be done again. If none of them change, then we can skip the expensive_computation.
Does the STM runtime do this?
i.e. how does your broken function using retryUntil differ from the following?
broken = atomically $ do v <- expensive_computation :: STM (TVar Int) vv <- readTVar v unless (vv > 50) retry
If the STM runtime does the optimization you suggest, it's not different at all. But consider this computation:
-- non-primitive retryUntil: retryUntil v p = do x <- readVar v unless (p x) retry
broken2 = atomically $ do (v1, v2) <- expensive_computation :: STM (TVar Int, TVar Int) retryUntil v1 (> 50) x <- expensive_computation2 :: STM Int retryUntil v2 (> x)
If v1 succeeds and v2 fails, then v1 changes to some other value > 50, I am sure that the STM runtime as it stands now will re-run expensive_computation2. -- ryan

On Wed, Apr 23, 2008 at 01:46:53PM -0700, Ryan Ingram wrote:
On 4/23/08, David Roundy
wrote: I'm confused as to how your retryUntil gains you anything. If any of the TVars used in the expensive_computation change while waiting for a retry, then the expensive_computation will need to be done again. If none of them change, then we can skip the expensive_computation.
Does the STM runtime do this?
i.e. how does your broken function using retryUntil differ from the following?
broken = atomically $ do v <- expensive_computation :: STM (TVar Int) vv <- readTVar v unless (vv > 50) retry
If the STM runtime does the optimization you suggest, it's not different at all.
I doubt it does this, but my point is that you aren't suggesting a new primitive, you're suggesting an improved runtime. Once you've got your improved runtime, this optimization is trivial, as far as I can tell. And without an improved runtime, your function is equivalent to this one.
But consider this computation:
-- non-primitive retryUntil: retryUntil v p = do x <- readVar v unless (p x) retry
broken2 = atomically $ do (v1, v2) <- expensive_computation :: STM (TVar Int, TVar Int) retryUntil v1 (> 50) x <- expensive_computation2 :: STM Int retryUntil v2 (> x)
If v1 succeeds and v2 fails, then v1 changes to some other value > 50, I am sure that the STM runtime as it stands now will re-run expensive_computation2.
I suppose it depends on how you rewrite the runtime. Rather than your very limited retryUntil + caching of results computed in the STM, why not make this caching explicit, and allow users to write their own more complicated variants of retryUntil? e.g. retryUntil2 :: TVar a -> TVar b -> (a -> b -> Bool) -> IO () A simple primitive to do this (in combination with a totally rewritten STM runtime) would be subatomically :: ReadOnlySTM a -> STM () which would run a STM computation that is guaranteed to have no side-effects (i.e. can't write to TVars) and ignore its results (and let the runtime know that the results have been ignored). Then your extra-fancy retryUntil could simply be. retryUntil v p = subatomically $ do x <- readVarRO v unless (p x) retryRO The only thing that is special about your retryUntil is that it does not allow the modification of TVars. On the other hand, I suspect the whole issue is silly. Is it ever actually wise to do an expensive calculation in the STM monad? That seems like it's guaranteed to be a performance nightmare. -- David Roundy Department of Physics Oregon State University

David Roundy wrote:
A simple primitive to do this (in combination with a totally rewritten STM runtime) would be
subatomically :: ReadOnlySTM a -> STM ()
which would run a STM computation that is guaranteed to have no side-effects (i.e. can't write to TVars) and ignore its results (and let the runtime know that the results have been ignored).
Matthew Brecknell wrote:
Nevertheless, the distinction between read-only and read-write transactions does not necessarily have to occur at the level of types. STM is fundamentally a dynamic approach to concurrency control, so I think it would make sense for transactions to *dynamically* determine whether they are read-only or read-write, as they compose with each other.
In that case, we can treat subatomic as a "hint" to the STM runtime.
subatomic :: STM a -> STM ()
Concerning this question of whether the argument to subatomic should statically or dynamically be known to be read-only, there is also the option of collapsing ReadOnlySTM a and STM a by changing the semantics of writeTVar . I mean, writeTVar can be used for two different things: 1) communicate with other threads, i.e. crossing atomically boundaries 2) communicating inside a single thread/STM action à la mutable state (IORef). We only want 1), but 2) is expendable, we can always use parameters to pass state around or wrap the whole thing into a state monad. For 1), it's enough to have a primitive scheduleWriteTVar :: TVar a -> a -> STM () that ensures to write the TVar at the very end of the atomically block. (This is similar to scheduleIO :: IO a -> STM ()). In other words, the current STM semantics can be implemented in terms of ReadOnlySTM assuming that the latter has such a primitive for scheduling. type ReadOnlySTM a = StateT TVarEnvironment STM a Regards, apfelmus

On Thu, 24 Apr 2008 12:57:56 +0200, apfelmus wrote:
there is also the option of collapsing ReadOnlySTM a and STM a by changing the semantics of writeTVar .
I mean, writeTVar can be used for two different things:
1) communicate with other threads, i.e. crossing atomically boundaries 2) communicating inside a single thread/STM action à la mutable state (IORef).
We only want 1), but 2) is expendable, we can always use parameters to pass state around or wrap the whole thing into a state monad. For 1), it's enough to have a primitive
scheduleWriteTVar :: TVar a -> a -> STM ()
that ensures to write the TVar at the very end of the atomically block.
Unfortunately, though, this breaks the very thing that makes STM attractive: namely, composability. Now in order to work with a TVar, I need to know whether anything that came before me might have modified it, and if so take the current value as a parameter instead of reading it like normal. Or am I misunderstanding something? -- Chris Smith

Chris Smith wrote:
apfelmus wrote:
For 1), it's enough to have a primitive
scheduleWriteTVar :: TVar a -> a -> STM ()
that ensures to write the TVar at the very end of the atomically block..
Unfortunately, though, this breaks the very thing that makes STM attractive: namely, composability. Now in order to work with a TVar, I need to know whether anything that came before me might have modified it, and if so take the current value as a parameter instead of reading it like normal.
Or am I misunderstanding something?
You're correct, that's what I meant. But it's nothing more and nothing less than the purely functional way of dealing with "mutable" state, isn't it? And you need a parameter anyway, namely the TVar a itself. I mean, when it's in scope like in do a <- readTVar v writeTVar v (a+1) readTVar v you don't need a parameter. But if the do-block is broken up into functions, you need a parameter foo v = do a <- readTVar v writeTVar v (a+1) bar v bar v = readTVar v and you may as well supply its value instead of the reference v . Regards, apfelmus

-- non-primitive retryUntil: retryUntil v p = do x <- readVar v unless (p x) retry
broken2 = atomically $ do (v1, v2) <- expensive_computation :: STM (TVar Int, TVar Int) retryUntil v1 (> 50) x <- expensive_computation2 :: STM Int retryUntil v2 (> x)
If v1 succeeds and v2 fails, then v1 changes to some other value > 50, I am sure that the STM runtime as it stands now will re-run expensive_computation2.
But expensive_computation2 is in STM. This means that it *should* be rerun, no? Between the first run and the retry, the result of expensive_computation2 may well have changed. If it doesn't get rerun, then we may have violated another transactional integrity constraint unknowingly. STM transactions should either happen once, or not at all. But if expensive_computation2 is not rerun, there's the possibility of "spooky action at a distance" effects, as I understand it? A more specific solution would be to build a memo- table (either with a CAF or via either TVars or MVars) structure into expensive_computation2 such that if it is rerun with the same TVar input, its work doesn't need to be recomputed. --S.

On 4/23/08, Sterling Clover
But expensive_computation2 is in STM. This means that it *should* be rerun, no? Between the first run and the retry, the result of expensive_computation2 may well have changed.
Ah, but that's not true; the main "good thing" about retry is that you don't have to keep running the computation; instead you wait until something that you accessed in your transaction log changes and -then- you rerun it. Right now the transaction log for "retry" only contains "read from variable X" (for some list of variables). But it could instead contain "read from variable X" and "read from variable X using retryUntil with predicate p which gave result True". Then we have a "smarter" log which can use the pure predicate p to give more information about whether or not the whole transaction can run or whether it is just guaranteed to fail again. If we know a transaction is guaranteed to fail, we can skip running it because we know it will not commit. Given the semantics of retryUntil, it is impossible that changing v1 affects the results of running this STM computation -unless- it or something else prior in the computation read from v1, or the result of the predicate changes. No spooky-action-at-a-distance is possible. David's more general "subatomically" primitive would achieve the same results; the proof is that (1) no side effects can be caused by the subatomic action, that is, no writes happen which could change future reads. (2) the result of the computation is ignored -except- for whether it retries or returns, that is, it can't affect the control flow of the rest of the transaction. So, if have a transaction T that is waiting inside "retry" for a variable that it read to change, and a variable that is only accessed in a "subatomic" part of T is changed, we can try running the subatomic computation first. Here are the four cases: 1) The subatomic computation succeeded before and still succeeded. Then we know the end result of the computation is unaffected, and will still retry. No need to do anything. 2) The subatomic computation succeeded before and now fails (calls 'retry' or retryRO'). Then we know that the computation will now fail at this earlier point. Mark the change to "fail" in the transaction log and leave the computation in the "waiting for retry" state. 3) The subatomic computation failed before and still fails. See (1) 4) The subatomic computation failed before and now succeeds. The result of the entire computation can change, we should now re-run the entire computation. -- ryan

Ryan Ingram said:
So, if have a transaction T that is waiting inside "retry" for a variable that it read to change, and a variable that is only accessed in a "subatomic" part of T is changed, we can try running the subatomic computation first. Here are the four cases:
1) The subatomic computation succeeded before and still succeeded. Then we know the end result of the computation is unaffected, and will still retry. No need to do anything. 2) The subatomic computation succeeded before and now fails (calls 'retry' or retryRO'). Then we know that the computation will now fail at this earlier point. Mark the change to "fail" in the transaction log and leave the computation in the "waiting for retry" state. 3) The subatomic computation failed before and still fails. See (1) 4) The subatomic computation failed before and now succeeds. The result of the entire computation can change, we should now re-run the entire computation.
I'm trying to figure out whether subatomic could be weakened to allow writes as well as reads. I don't think this change would affect cases 2 to 4 above. But in case 1, the subatomic computation might perform a different set of writes, which might affect the outcome of the outer computation, so it is not safe to continue blocking. It's case 1 which makes retryUntil (and subatomic) stronger than readTVarWhen. If it's not possible to weaken subatomic to allow writes, without affecting case 1, then I think this also means that subatomic/retryUntil is stronger than the hypothetical "continuation-logging" implementation previously hinted at by David, apfelmus and myself (that is, one which treats each individual read as a kind of checkpoint, by recording the read's continuation in the transaction log, and using that continuation to restart the blocked transaction). Nevertheless, the distinction between read-only and read-write transactions does not necessarily have to occur at the level of types. STM is fundamentally a dynamic approach to concurrency control, so I think it would make sense for transactions to *dynamically* determine whether they are read-only or read-write, as they compose with each other. In that case, we can treat subatomic as a "hint" to the STM runtime. It could have a simpler type, and the semantics of "id": subatomic :: STM a -> STM a If the subatomic transaction turns out to be read-only, then we get the benefit of all four cases Ryan describes above. If it turns out to be read-write, we only get the benefit of cases 2 to 4, while case 1 must restart. It doesn't matter if the subatomic transaction captures variables which depend on previous reads, since changes to those reads would cause a restart regardless of the outcome of the subatomic transaction. Moreover, note that the hypothetical "continuation-logging" implementation could implement (m >>= k) by implicitly wrapping every m in a call to subatomic. Of course, that would require a lot of speculative book-keeping. I think this means that subatomic is not a fundamental abstraction, but could be a useful pragmatic optimisation.

I said:
In that case, we can treat subatomic as a "hint" to the STM runtime. It could have a simpler type, and the semantics of "id":
subatomic :: STM a -> STM a
If the subatomic transaction turns out to be read-only, then we get the benefit of all four cases Ryan describes above. If it turns out to be read-write, we only get the benefit of cases 2 to 4, while case 1 must restart. It doesn't matter if the subatomic transaction captures variables which depend on previous reads, since changes to those reads would cause a restart regardless of the outcome of the subatomic transaction.
Scrap that. I realise now that it allows leakage of information read from variables. It would need to be: subatomic :: STM a -> STM () Which means it's no longer just a hint.

Ryan Ingram wrote:
No spooky-action-at-a-distance is possible. David's more general "subatomically" primitive would achieve the same results; the proof is that (1) no side effects can be caused by the subatomic action, that is, no writes happen which could change future reads. (2) the result of the computation is ignored -except- for whether it retries or returns, that is, it can't affect the control flow of the rest of the transaction.
So, if have a transaction T that is waiting inside "retry" for a variable that it read to change, and a variable that is only accessed in a "subatomic" part of T is changed, we can try running the subatomic computation first. Here are the four cases:
1) The subatomic computation succeeded before and still succeeded. Then we know the end result of the computation is unaffected, and will still retry. No need to do anything. 2) The subatomic computation succeeded before and now fails (calls 'retry' or retryRO'). Then we know that the computation will now fail at this earlier point. Mark the change to "fail" in the transaction log and leave the computation in the "waiting for retry" state. 3) The subatomic computation failed before and still fails. See (1) 4) The subatomic computation failed before and now succeeds. The result of the entire computation can change, we should now re-run the entire computation.
Sounds good. But I wonder what "obscure" optimization comes next; can we have a toy-model of STM? I mean, it should be possible to express both the "continuation-logging" and "read-only-fail" optimization in terms of type STM a = Maybe a or similar? Regards, apfelmus

On 24 Apr 2008, at 12:02, apfelmus wrote:
Sounds good. But I wonder what "obscure" optimization comes next; can we have a toy-model of STM? I mean, it should be possible to express both the "continuation-logging" and "read-only-fail" optimization in terms of
type STM a = Maybe a
or similar?
There's a pure version of STM in the latest version of the IOSpec library: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/IOSpec-0.2 I've used it to test a few non-trivial applications. I should warn you that it does use a very simple stop-the-world semantics - the model may not be fine-grained enough to try out all kinds of optimisations. Wouter

broken2 = atomically $ do (v1, v2) <- expensive_computation :: STM (TVar Int, TVar Int) retryUntil v1 (> 50) x <- expensive_computation2 :: STM Int retryUntil v2 (> x)
Ah. I think I see now. I had thought you were trying to give more power to
the primitive than you are.
But I'm still finding the example confusing, in that expensive_computation2
is in STM, and thus isn't pure. The point, I suppose, would rather be to
assert that retryUntil v2 (> x) depends on what v2 depends on, and what x
depends on, but *not* on what v1 depends on, and thus to retry only when the
appropriate portion of the dependency tree changed.
However, in this case, that still doesn't buy you anything, I think, because
from the information given, one can't disentangle what v1 and v2 depend on?
I hope I'm somewhat closer to the mark here.
--S
On 4/23/08, Ryan Ingram
On 4/23/08, Sterling Clover
wrote: But expensive_computation2 is in STM. This means that it *should* be rerun, no? Between the first run and the retry, the result of expensive_computation2 may well have changed.
Ah, but that's not true; the main "good thing" about retry is that you don't have to keep running the computation; instead you wait until something that you accessed in your transaction log changes and -then- you rerun it. Right now the transaction log for "retry" only contains "read from variable X" (for some list of variables). But it could instead contain "read from variable X" and "read from variable X using retryUntil with predicate p which gave result True". Then we have a "smarter" log which can use the pure predicate p to give more information about whether or not the whole transaction can run or whether it is just guaranteed to fail again. If we know a transaction is guaranteed to fail, we can skip running it because we know it will not commit.
Given the semantics of retryUntil, it is impossible that changing v1 affects the results of running this STM computation -unless- it or something else prior in the computation read from v1, or the result of the predicate changes.
No spooky-action-at-a-distance is possible. David's more general "subatomically" primitive would achieve the same results; the proof is that (1) no side effects can be caused by the subatomic action, that is, no writes happen which could change future reads. (2) the result of the computation is ignored -except- for whether it retries or returns, that is, it can't affect the control flow of the rest of the transaction.
So, if have a transaction T that is waiting inside "retry" for a variable that it read to change, and a variable that is only accessed in a "subatomic" part of T is changed, we can try running the subatomic computation first. Here are the four cases:
1) The subatomic computation succeeded before and still succeeded. Then we know the end result of the computation is unaffected, and will still retry. No need to do anything. 2) The subatomic computation succeeded before and now fails (calls 'retry' or retryRO'). Then we know that the computation will now fail at this earlier point. Mark the change to "fail" in the transaction log and leave the computation in the "waiting for retry" state. 3) The subatomic computation failed before and still fails. See (1) 4) The subatomic computation failed before and now succeeds. The result of the entire computation can change, we should now re-run the entire computation.
-- ryan

Ryan Ingram wrote:
Consider the following transaction:
intV :: TVar Int boolV :: TVar Bool
interesting = atomically $ do retryUntil intV (> 50) retryUntil boolV id
Lets say that intV contains 100 and boolV contains False. Then this transaction retries. Now, if intV changes to 101, this transaction doesn't need to re-run; we can see immediately that no predicate changed. Using "readTVarWhen", this is less clear; the transaction log would hold a read on intV which would be more difficult to ignore.
I don't quite understand what you want to do but I presume it's related to the following: given an expression like readTVar intV >>= (\ -> ... readTVar boolV >>= (\_ -> ... retry)) The ... indicate branches that are there have not been taken in our example run, so the STM-side-effects that have been performed are readTVar intV readTVar boolV retry The thread waits for either intV or boolV to change. Now, assume that boolV changes. Then, the idea for improving performance is to not restart the whole transaction, but only the part after the readTVar boolV . In other words, only the continuation (\_ -> ... retry) will be executed again (and possibly yield something different from retry ). I'm not sure whether this is currently implemented in Control.STM. It seems that your scheme wants even more. You want to avoid work when intV changes, because the predicate for boolV clearly indicates that no matter what intV is, we'll have to retry anyway. Unfortunately, I don't think it works: the predicate itself might depend on intV interesting = atomically $ readTVar intV >>= $ \i -> if i > 50 then retry else retryUntil boolV (even i ==) That's the general property of >>= , you always have to evaluate its right argument when the left argument changes. In essence, we have the same problem with parser combinators. Applicative functors to the rescue! Regards, apfelmus

On 4/23/08, apfelmus
I don't quite understand what you want to do but I presume it's related to the following: given an expression like
readTVar intV >>= (\ -> ... readTVar boolV >>= (\_ -> ... retry))
The ... indicate branches that are there have not been taken in our example run, so the STM-side-effects that have been performed are
readTVar intV readTVar boolV retry
The thread waits for either intV or boolV to change. Now, assume that boolV changes. Then, the idea for improving performance is to not restart the whole transaction, but only the part after the readTVar boolV . In other words, only the continuation (\_ -> ... retry) will be executed again (and possibly yield something different from retry ). I'm not sure whether this is currently implemented in Control.STM.
This isn't exactly correct, no. The idea was to attach an additional predicate to readTVar in the STM log, so that we know the result cannot affect the computation as long as the predicate is unchanged.
It seems that your scheme wants even more. You want to avoid work when intV changes, because the predicate for boolV clearly indicates that no matter what intV is, we'll have to retry anyway. Unfortunately, I don't think it works: the predicate itself might depend on intV
interesting = atomically $ readTVar intV >>= $ \i -> if i > 50 then retry else retryUntil boolV (even i ==)
That's the general property of >>= , you always have to evaluate its right argument when the left argument changes. In essence, we have the same problem with parser combinators. Applicative functors to the rescue!
Ah, but that's the exact thing that retryUntil prevents; retryUntil doesn't return the value of the variable read; it has exactly two options: retryUntil v p ~= do x <- readTVar v if (p x) then return () else retry Now, a simple implementation would re-run the computation after each change to v. But we can take advantage of the knowledge that retryUntil imparts no knowledge to the rest of the computation besides "this predicate succeeded on the contents of this tvar", to make the transaction log smarter. So, in the case of "interesting", the transaction log would look something like this: retryUntil intV (> 50) => True retryUntil boolV id => False retry Now, lets say intV changes from 100 to 101; we can look at this transaction log, re-test the predicate (> 50), notice that the log itself remains unchanged, and leave the transaction suspended without worrying that the remainder of the computation was affected. Using Tim Harris' proposed "readTVarWhen" combinator, this guarantee is weakened; although the rule does hold for reads that fail the predicate: interesting2 = atomically $ do x <- readTVarWhen intV (>50) readTVarWhen intV2 (x ==) readTVarWhen intV (>50) => True readTVarWhen intV2 (x ==) => False retry If intV2 changes but the predicate stays false, we don't have to re-run the computation, but if intV changes we absolutely do. This is made clear by my definition of readTVarWhen: readTVarWhen v p = retryUntil v p >> readTVar v In this case the transaction log would look like this: retryUntil intV (>50) => True readTVar intV retryUntil intV2 (== x) => False retry Now if intV changes it's clear in the transaction log that the transaction needs to be re-run. -- ryan
participants (9)
-
apfelmus
-
Chris Smith
-
David Roundy
-
Jan-Willem Maessen
-
Matthew Brecknell
-
Ryan Ingram
-
Sterling Clover
-
Tim Harris (RESEARCH)
-
Wouter Swierstra