On Fri, Feb 15, 2013 at 5:38 AM, Alexander Kjeldaas <alexander.kjeldaas@gmail.com> wrote:



On Thu, Feb 14, 2013 at 10:37 PM, Simon Marlow <marlowsd@gmail.com> wrote:
On 14/02/13 20:54, Alexander Kjeldaas wrote:

I ended up staring at the PrimOps.cmm file today, and porting tryPutMVar
to C so it can be used from the RTS.

During my staring, it occurred to me that it should be possible to
implement a wait-on-multiple MVars with mostly no overhead.  I am
guessing that this would be desirable, but I am not sure.

My rough idea is like this:

-- wait for all MVars, return the value of one of them, and the
corresponding index
takeOneMVar :: [MVar a] -> IO (a, Int)

This is implemented by this change:
[snip]

I've occasionally wondered whether we could do this.  So I think you'll have some difficulty implementing the code that blocks, because you have to atomically add your queue element to multiple MVars.  You could lock all the MVars, but you'll need to sort them to avoid lock-order problems, or do an STM-like two-phase locking thing.  It could get pretty hairy.


That's true.  First I didn't think of this.  Then I thought I'd have to sort the MVars, but now after having slept on it, I have this modified design.  I think this is much simpler.

Instead of the group_head and group_link, we replace the 'tso' link with a MVarGroup link in the multi-case.

typedef struct StgMVarTSOQueue_ {
    StgHeader                header;
    struct StgMVarTSOQueue_ *link;
    /* struct StgTSO_          *tso; */
    StgClosure              *tso_or_group;
} StgMVarTSOQueue;

Then actually the MVarGroup doesn't know about the number of MVars, and MVars can be added one by one, but it has a simple state-machine.

What this design really does is decouple MVar registration from blocking.

The invariant that is relaxed in this scheme is this:  An MVarGroup can be in an MVar queue without the TSO blocking.  In this case, the "wakeup" and the MVar result is deferred and stored until the TSO actually does block on the MVarGroup.

/* An MVarGroup represents a group of MVars that a TSO waits on.  An
   MVarGroup must be locked to be investigated by the MVar code paths.

   Locking:
   Since an MVar also must be locked, the lock order is this: First
   MVar, then MVarGroup.
   Note that we never hold two MVars at the same time.

   MVars can only be added to an MVarGroup, not removed from the
   group.  The MVarGroup does not know about the MVars, and the MVars
   do not know about each others, so in theory this could possibly
   work.  The MVarGroup acts as a synchronization point that decides
   which MVar "won" the 'put', and as a storage area for the result if
   the TSO hasn't retrieved the result yet.

   The MVarGroup has two boolean states that starts off as false, and
   can only transition to true: 1) Whether the MVarGroup is taken, and
   2) whether the TSO is blocking.

   The MVarGroup goes through the following states.

   NOT_TAKEN_NOT_BLOCKED (false, false):
        None of the associated MVars have been 'put'.  The TSO is *not
        sleeping/blocking on the MVarGroup*.  This is the state of the
        group in the normal case while MVars are being added to the
        group.  Next: TAKEN_NOT_BLOCKED or NOT_TAKEN_BLOCKED

   TAKEN_NOT_BLOCKED (true, false):
        One of the associated MVars have been 'put'. The value that is
        'put' is stored in the "deferred_result".  This is the state
        of the group when an MVar is 'put' while the TSO is adding
        MVars to the group.  When the TSO tries to block on the group,
        this value will be immediately returned.  Note that this is
        the only state where "deferred_result" is used.  Next: INVALID

   NOT_TAKEN_BLOCKED (false, true):
        The TSO is blocked on the MVarGroup.  Getting to this state
        means that the TSO has built the group of MVars without any
        MVar having been 'put' while the TSO was doing this.  Next:
        INVALID

   INVALID (true, true): One of the associated MVars have been
        'put', and the TSO has blocked.  This really means that
        everything is over, and the TSO is unblocked with a result.
        The individual MVars that are not the first to 'put' a result
        will see this in their TSO wait queues, and just ignore them.

*/

typedef struct StgMVarGroup_ {
    StgHeader        header;
    // The TSO associated with the MVar group.  The TSO might or might
    // not currently be blocked, but eventually it will be.  Immutable.
    struct StgTSO_  *tso;
    StgWord          lock;  // Protects everything below
    /* The state of the MVarGroup */
    StgWord          state GUARDED_BY(mutex);
    // When the TSO is "woken up" by a 'put' on an MVar
    StgClosure      *deferred_result GUARDED_BY(mutex);
}

 
Also, what does the primop look like?  Lists aren't a primitive type, and you can't put MVar# into an Array# (kind mismatch).


I didn't know that, but reading this is what forced the above design :-)

A very flexible API would be the following:

-- Create MVarGroup
newMVarGroup :: IO (MVarGroup a)


Now I understand why it must be illegal to remove an MVar from an MVarGroup.
An invariant that must hold for a MVarGroup is this:  The group can never be empty when a TSO is blocked on it.

One way of doing this is to have a counter in the MVarGroup structure, and support the full set of set operations, but that creates an API where 'takeMVarGroup' can sometimes fail.  That leads to fragile software.

A better way is to make it impossible to create empty MVarGroups by changing newMVarGroup like this:

-- Create MVarGroup with a single element.
newMVarGroup :: MVar a -> IO (MVarGroup a)

 
-- Add an MVar to a MVarGroup. 
-- This will fail if the MVarGroup is not associated with the current process (if the MVarGroup's TSO is not the current TSO)!
registerMVar :: MVarGroup a -> MVar a -> IO (MVarGroup a)


The following is probably clearer about the mutation going on:
-- Add an MVar to a MVarGroup
registerMVar :: MVarGroup a -> MVar a -> IO ()

Now I think that the "same-TSO" restriction on registerMVar can be removed as well.
There is no code-path where the "tso" field of the MVarGroup need to be accessed until after the TSO is blocked on 'takeMVarGroup'.

That leaves a pretty flexible API where MVarGroup can be passed freely between processes, but more importantly 'registerMVar' should never fail.

 
-- Block on the MVarGroup
takeMVarGroup :: MVarGroup a -> IO a


A fourth operation that is consistent with these invariants is:

-- Makes both MVarGroups aliases for the same set of MVars
unionMVarGroup :: MVarGroup a -> MVarGroup a -> IO ()

This can be supported by either overloading the "tso" field of MVarGroup, or adding a "parent" pointer that points to another MVarGroup.

This adds a lock order invariant:  Child MVarGroups must be locked before parent MVarGroups.

Alexander