
It's an interesting problem to tackle, but I think we should bear in mind that STM already solves this, so you should weigh any solution against whether it would be better to just use STM. (or indeed, whether it's better to spend effort on improving the performance of STM, where there's plenty of low-hanging fruit). Don't let me discourage you when you seem to be having fun though :) I'll be happy to review your design when it settles down a bit more. Cheers, Simon On 15/02/13 09:53, Alexander Kjeldaas wrote:
Ok, another round of fixes and insights:
I've renamed the "tso" field in MVarGroup to "blocked_tso" and strenghtened invariants so that it is only defined in the state where the TSO is actually blocking. I also used 'TAKEN' instead of 'PUT' in the state names. That must have been confusing, sorry about that. I've renamed the states to be clearer.
It is important to understand that this MVarGroup is a "one-shot" animal. It can be created, MVars can be registered, and then it is possible to block on it. After that, no operation on it make any sense. Another blocking operation will have to create a new MVarGroup.
There are good reasons for this, and I'll show why this is an almost necessary price to pay for not locking all MVars atomically. A higher level API should be able to paper over this by keeping the MVar set that the user sees away from the RTS, and creating the MVarGroup inside the 'takeMVarGroup' function.
So we assume we cannot atomically remove the MVarGroup from the TSOQueue of all of the MVars once one of the MVars have been 'put'. Instead we ask the MVars to ignore an INVALID MVarGroup and just remove it from the TSOQueue wait list.
If we ever were to change the state of an MVarGroup from INVALID back to one of the active states, then a random subset of the registered MVars will have the MVarGroup in their TSOQueue which wouldn't make sense.
*Here is me trying to do a multi-shot (block multiple times) design:* Another option would be to ask the MVars to *ignore, but not remove* an MVarGroup that is in a LATENT state. This state would be "we're still registering MVars, and haven't started blocking yet. Nothing to see here, please ignore". From the MVar's point of view, this TSOQueue object is not on the wait list, but it is not removed either. This means that we indeed can change the state of the MVarGroup and it will suddenly be active in all MVars.
However there's a problem. When calling putMVar and the only TSOQueue element is a LATENT one, it should be ignored, right? So the putMVar's TSO wants to block. This requires the TSO that calls 'takeMVarGroup', and that wants to change out from the LATENT state to go through all associated MVars to wake up a blocked one. This seems to introduce a certain level of complexity, for example keeping track of all MVars associated with an MVarGroup.
All of these pointers firmly tie the lifetime of the MVarGroup to the set of MVars that refer to it. The multi-shot MVarGroup requires manual resource management to disentangle it from MVars, or it will outlive the last MVar it has been associated with, even though it is not used.
Because of the lifetime problem, it will be "necessary" to support a 'unregisterMVar' function that makes it possible to discard an MVarGroup, and possibly more set operations.
Unfortunately, unregisterMVar and being able to discard MVarGroups introduces empty MVarGroups.
Empty MVarGroups are invalid to block on. Granted single-shot MVarGroups can only be blocked on once, so both APIs are fragile, but single-shot MVarGroups can naturally be wrapped in a safe higher-level construct that doesn't see the RTS' MVarGroup.
Thus all in all, this adds somewhat more record-keeping and a slightly more fragile API because functions that couldn't fail now can fail, and care must be taken to not make GC of MVarGroups hard. *// end of multi-shot discussion *
Another insight: unionMVarGroup is not well defined in all cases, and is thus probably not a good idea. Both MVarGroups can have an MVar put, and thus the natural result would be to return both elements when blocking. That's not a good primitive.
Another insight: MVars and MVarGroups are semantically equivalent enough that I think takeMVar can be implemented using an MVarGroup. This means that they could conceptually be collapsed into one concept. I think this is possible by conceptually splitting the 'put' and 'take' sides of the object. Today we have:
data MVar a = MVar (MVar# RealWorld a)
we could instead have something like
data MVar a = MVar (MVar# RealWorld a, [MVar# RealWorld a])
where the first tuple element is used for 'put', and the set consisting of the first and second tuple elements are used for 'take', with one-shot MVarGroups used in the background.
Here's a new version of the comment with better invariants and naming for one-shot MVarGroup:
/* An MVarGroup represents a group of MVars that a TSO waits on. An MVarGroup must be locked to be investigated by the MVar code paths.
Locking: Since an MVar also must be locked, the lock order is this: First MVar, then MVarGroup. Note that we never hold two MVars at the same time.
* Invariants:* * - At least one MVarTSOQueue element must point to an MVarGroup.* * - "blocked_tso" is only defined in state NOT_PUT_WITH_TSO* * - "deferred_result" is only defined in state PUT_NO_TSO*
* In the higher-level API we further strenghten the first invarant:* * - MVars can only be added to an MVarGroup, not removed.*
The MVarGroup acts as a synchronization point that decides which MVar "won" the 'put', and as a storage area for the result *if no* * TSO has blocked on the group yet*.
The MVarGroup has two boolean states that starts off as false, and can only transition to true: 1) Whether an MVar in the MVarGroup is 'put', and *2) whether a TSO is blocking*.
The MVarGroup goes through the following states.
*NOT_PUT_NO_TSO* (false, false): No MVars have been 'put'. No TSO associated. This is the state of the group in the normal case while MVars are being added to the group. Next: PUT_NO_TSO or NOT_PUT_WITH_TSO
*PUT_NO_TSO* (true, false): One MVar has been 'put' without an associated TSO. The value that is 'put' is stored in the "deferred_result". When a TSO later tries to block on the group, this value will be immediately returned. Note that this is the only state where "deferred_result" is defined. Next: INVALID
*NOT_PUT_WITH_TSO* (false, true): A TSO is blocked on the MVarGroup. Getting to this state means that the TSO has built the group of MVars without any MVar having been 'put' while the TSO was doing this. Note that this is the only state where "blocked_tso" is defined. Next: INVALID
INVALID (true, true): One of the associated MVars have been 'put', and the TSO has blocked. This really means that everything is over, and the TSO is unblocked with a result. The individual MVars that are not the first to 'put' a result will see this in their TSO wait queues, and just ignore them.
*/
typedef struct StgMVarGroup_ { StgHeader header; StgWord lock; // Protects everything below /* The state of the MVarGroup */ StgWord state GUARDED_BY(lock); * // When the TSO is "woken up" by a 'put' on an MVar. Only defined* * // in state 'PUT_NO_TSO'* * StgClosure *deferred_result GUARDED_BY(lock);* * // The blocked TSO associated with the MVar group. Only defined in* * // state 'NOT_PUT_WITH_TSO'* * struct StgTSO_ *blocked_tso GUARDED_BY(lock);* }
Alexander
On Fri, Feb 15, 2013 at 9:09 AM, Alexander Kjeldaas
mailto:alexander.kjeldaas@gmail.com> wrote: On Fri, Feb 15, 2013 at 8:54 AM, Edward Z. Yang
mailto:ezyang@mit.edu> wrote: Apologies for the kibitzing.
If anyone, it's me proposing weird stuff :-)
> -- This will fail if the MVarGroup is not associated with the current > process (if the MVarGroup's TSO is not the current TSO)!
I'm not super keen about this requirement. What if I want to create an MVarGroup but pass it to another thread?
Our emails crossed each other :-). As you can see, I think this is solvable.
Alexander
Edward