
Slightly related: I think it would be interesting to compare a Disruptor-based concurrency communications mechanism and compare it to e.g. TChans 1. Disruptor - http://code.google.com/p/disruptor/
From: Ryan Newton
I think so. Atomically reading and writing a single memory location
(which CAS does) is just a very simple transaction. But using a CAS instruction should be more efficient, since STM has to maintain a transaction log and commit transactions, which creates some overhead.
Ah, I see. In that case, it may be worthwhile to implement the CAS instruction in terms of STM as well and measure the performance difference this makes for the final data structure. After all, STM is a lot more compositional than CAS, so I'd like to know whether the loss of expressiveness is worth the gain in performance.
There's one annoying hitch with doing apples-to-apples comparisons here.
The problem is that CAS falls short of being a hardware-accelerated version of a simple transaction (read TVar, (==) against expected value, conditionally update TVar), because CAS is based on pointer equality rather than true equality. (eq? rather than equal? for any Schemers out there.)
For this reason our "Fake" version of CAS -- for older GHCs and for performance comparison -- has to use reallyUnsafePointerEquality#:
http://hackage.haskell.org/packages/archive/IORefCAS/0.2/doc/html/Data-CAS-I...
But we do provide a "CASable" type class in that package which is precisely for comparing the performance of:
- Hardware CAS - Fake CAS -- atomicModifyIORef + ptrEquality - Foreign CAS -- gcc intrinsic + function call