
Both readTVar and writeTVar are worse than O(1); they have to look up the TVar in the transaction log to see whether you have made local changes to it. Right now that lookup looks like it is O(n), where n is the number of TVars the transaction has accessed so far, so your big transaction, which is just accessing a ton of TVars, is likely doing O(n^2) work overall.
From ghc/HEAD/rts/STM.c:
static TRecEntry *get_entry_for(StgTRecHeader *trec, StgTVar *tvar,
                                StgTRecHeader **in) {
  TRecEntry *result = NULL;

  TRACE("%p : get_entry_for TVar %p", trec, tvar);
  ASSERT(trec != NO_TREC);

  do {
    /* Linear scan over every entry in this transaction record */
    FOR_EACH_ENTRY(trec, e, {
      if (e -> tvar == tvar) {
        result = e;
        if (in != NULL) {
          *in = trec;
        }
        BREAK_FOR_EACH;
      }
    });
    /* Not found here: fall back to the enclosing (nested) transaction's log */
    trec = trec -> enclosing_trec;
  } while (result == NULL && trec != NO_TREC);

  return result;
}
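To get a feel for that cost from the Haskell side, here is a minimal sketch (a toy example of mine, not code from this thread) of a single transaction that writes n distinct TVars; each writeTVar rescans the log built up by the earlier writes, so the whole atomically block does roughly O(n^2) work:

import Control.Concurrent.STM
import Control.Monad (forM, forM_)

-- One big transaction touching n TVars: every writeTVar triggers a
-- get_entry_for-style scan over the entries added by the earlier writes.
bigTransaction :: Int -> IO ()
bigTransaction n = do
  tvs <- atomically (forM [1 .. n] newTVar)        -- n TVars, each holding an Int
  atomically (forM_ tvs (\tv -> writeTVar tv 0))   -- quadratic in n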
STM performance is not really geared towards "big" transactions right now, in large part because big transactions are likely to starve under any real workload anyway. If you have a single-threaded startup phase that populates your data, followed by concurrent small mutations, you can run the startup in IO, using many small transactions to populate the data.
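A hedged sketch of that workaround (insertOne is hypothetical, standing in for whatever per-item STM insert the application defines): run one small atomically per item from IO, so no single transaction log ever grows large.

import Control.Concurrent.STM (STM, atomically)
import Control.Monad (forM_)

-- One small transaction per item instead of one giant transaction over
-- the whole list; each log stays short, so each commit is cheap.
populateFromIO :: (a -> STM ()) -> [a] -> IO ()
populateFromIO insertOne items = forM_ items (atomically . insertOne)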
-- ryan
On Sun, Dec 28, 2008 at 8:02 PM, Jim Snow wrote:
I decided to try to implement a graph algorithm using STM. Each node in the graph has a set of TVar-protected lists of the nodes it links to and the nodes that link to it. Also, there is a global TVar-protected Data.Map that contains all the nodes in the graph, indexed by name (which is polymorphic):
data Node k r = Node
  { fwdPos  :: TVar [Node k r]  -- forward links (nodes we like)
  , fwdNeg  :: TVar [Node k r]  -- we allow "negative" links, too (nodes we don't like)
  , revPos  :: TVar [Node k r]  -- backlinks (nodes that like us)
  , revNeg  :: TVar [Node k r]  -- negative backlinks (nodes that don't like us)
  , currRep :: r                -- extra user-defined data
  , name    :: k                -- node's unique identifier
  } deriving Show
data Network k r = Network
  { node    :: TVar (M.Map k (Node k r))  -- map of nodes by name
  , trusted :: TVar [Node k r]            -- a list of nodes we need to iterate over occasionally
  } deriving Show
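Even a single mutation touches several of these TVars at once. For example, adding a positive link from node a to node b might look like this (a sketch of the kind of operation I'm doing, using the stmcons helper shown further down), which reads and writes two TVars:

-- Record that node a likes node b: cons b onto a's forward list and
-- a onto b's backlink list.
addPosLink :: Node k r -> Node k r -> STM ()
addPosLink a b = do
  stmcons b (fwdPos a)
  stmcons a (revPos b)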
I tried loading a datafile of about 20,000 nodes into the graph in one big transaction, and found that it takes about 4 minutes. This seemed rather slow, so I replaced all the TVars with IORefs (and substituted STM with IO in the type signatures), and the same operation with the new version took about 6 seconds!
This is all with one thread, so there should be no contention for the TVars. Is there something about STM that makes it scale worse than linearly wrt the number of mutations in a transaction?
The above performance numbers are for ghc-6.10.1; with ghc-6.8.3, the STM version takes more than 9 minutes.
According to profiling, one of my trouble spots is this function, which just adds an entry onto a TVar [a]:
stmcons :: k -> TVar [k] -> STM ()
stmcons x tv = do
  xs <- readTVar tv
  writeTVar tv (x:xs)
This seems like it ought to be pretty innocuous, unless the whole list is getting evaluated each time I cons a new entry, or readTVar and writeTVar are much more expensive than they appear.
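One sanity check I could run on the laziness theory: a strict variant of stmcons (a sketch using seq) that forces each new element before writing it, to rule out a build-up of unevaluated thunks. Consing never walks the existing list, so if this changes nothing, the cost must be in readTVar and writeTVar themselves:

-- Like stmcons, but forces the new element to WHNF before writing,
-- so no thunk for x accumulates inside the TVar.
stmcons' :: k -> TVar [k] -> STM ()
stmcons' x tv = do
  xs <- readTVar tv
  x `seq` writeTVar tv (x : xs)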
-jim