
On Mon, Dec 29, 2008 at 1:15 PM, Ryan Ingram wrote:
> Both readTVar and writeTVar are worse than O(1); they have to look up the
> TVar in the transaction log to see if you have made local changes to it.
> Right now it looks like that operation is O(n), where n is the number of
> TVars accessed by a transaction, so your big transaction which is just
> accessing a ton of TVars is likely O(n^2).
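For intuition, here is a toy model of that cost (this is not GHC's actual STM internals, just an illustrative sketch): if the transaction log were a flat list of the TVars touched so far, and every access had to scan it for a prior local entry, then n first-time accesses would cost roughly n^2/2 scans in total.

```haskell
import Data.List (foldl')

-- Toy model: the "log" is an association list of touched vars, and each
-- fresh access scans the whole log (a miss) before appending its entry.
-- Returns the total number of log entries inspected for n fresh accesses.
lookupsForTransaction :: Int -> Int
lookupsForTransaction n = snd (foldl' step ([], 0) [1 .. n])
  where
    step (logEntries, cost) v =
      (v : logEntries, cost + length logEntries)

main :: IO ()
main = do
  -- cost grows quadratically: n*(n-1)/2 scans for n fresh TVars
  print (lookupsForTransaction 10)   -- 45
  print (lookupsForTransaction 100)  -- 4950
```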
So this actually brings up a tangential question I've wondered about for a while. The lock-free data-structure paper at http://research.microsoft.com/users/Cambridge/simonpj/papers/stm/lock-free-f... shows locks and STM with very similar performance on a single processor, with STM quickly winning on multiple processors.

I understand the basic idea as being that the STM versions are optimistic: they only pay a cost on a transaction collision, while locks have to go frob some bit of memory on every read or write. I'm also guessing that STM would show less contention on a read-heavy load, since you don't need to block anyone if you are just reading (though I suppose you can get the same effect with shared-read locks in a locking style?).

But every STM operation has to modify a transaction log, which seems like it should be even more expensive than frobbing a lock bit. So if the per-operation STM overhead is higher, and blocking contention is the same (assuming the lock implementation uses shared locks for reads), I don't see how the STM implementation could be faster. I'm sure it's all more complicated than this... but where exactly is the STM performance in those papers coming from?

Also, they disclaim in the testing section that the STM implementation was immature; now that there's parallel GC, and perhaps STM doesn't do so much dynamic allocation (?), shouldn't the STM numbers be even better?
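To make the read-heavy case concrete, here is a minimal sketch using the standard Control.Concurrent.STM API (the function name sumBalances is my own, just for illustration): a read-only transaction builds no write-set, so committing it only validates that the reads were mutually consistent, and it never forces another reader to wait the way an exclusive lock would.

```haskell
import Control.Concurrent.STM

-- A read-only transaction over two shared TVars. With no writeTVar calls
-- there is nothing to publish at commit time; the runtime just checks
-- that both reads saw a consistent snapshot.
sumBalances :: TVar Int -> TVar Int -> STM Int
sumBalances a b = (+) <$> readTVar a <*> readTVar b

main :: IO ()
main = do
  a <- newTVarIO 40
  b <- newTVarIO 2
  total <- atomically (sumBalances a b)
  print total  -- 42
```

Many threads running atomically (sumBalances a b) concurrently would all proceed without blocking one another, which is the shared-read-lock-like behavior described above.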