Re: [Haskell-cafe] Google Summer of Code - Lock-free data

Slightly related: I think it would be interesting to compare a Disruptor-based concurrency communications mechanism and compare it to e.g. TChans 1. Disruptor - http://code.google.com/p/disruptor/
From: Ryan Newton
I think so. Atomically reading and writing a single memory location
(which CAS does) is just a very simple transaction. But using a CAS instruction should be more efficient, since STM has to maintain a transaction log and commit transactions, which creates some overhead.
Ah, I see. In that case, it may be worthwhile to implement the CAS instruction in terms of STM as well and measure the performance difference this makes for the final data structure. After all, STM is a lot more compositional than CAS, so I'd like to know whether the loss of expressiveness is worth the gain in performance.
There's one annoying hitch with doing apples-to-apples comparisons here.
The problem is that CAS falls short of being a hardware-accelerated version of a simple transaction (read TVar, (==) against expected value, conditionally update TVar), because CAS is based on pointer equality rather than true equality. (eq? rather than equal? for any Schemers out there.)
For this reason our "Fake" version of CAS -- for older GHCs and for performance comparison -- has to use reallyUnsafePointerEquality#:
http://hackage.haskell.org/packages/archive/IORefCAS/0.2/doc/html/Data-CAS-I...
But we do provide a "CASable" type class in that package which is precisely for comparing the performance of:
- Hardware CAS - Fake CAS -- atomicModifyIORef + ptrEquality - Foreign CAS -- gcc intrinsic + function call

Hi,
I was interested in Distruptor few months ago and started to write a
Haskell implementation. Soon I discovered the lack of native CAS operations
and abandoned project for a while. I don't really have time to develop it
further right now, but the code is available:
https://github.com/Tener/disruptor-hs
The code as it is now is mostly benchmarking code. I think it is worth
trying out.
You mention benchmarking TChans: one particular problem with TChans and
Chans is that they are unbounded. If the producers outpace consumers it
soon ends with memory exhaustion.
In the process of writing above code I discovered particularly simple data
structure that performs suprisingly well (full implementation:
https://github.com/Tener/disruptor-hs/blob/master/Test/ManyQueue.hs) :
type Queue a = [MVar a]
mkQueue size = cycle `fmap` (replicateM size newEmptyMVar)
The throutput is very high, it is bounded and scales well with the number
of producer/consumers threads. In the presence of multiple
consumers/producers it's not a FIFO queue though, but rather a kind of
buffer with funny ordering.
Best regards,
Krzysztof Skrzętnicki
On Fri, Mar 30, 2012 at 00:03, John Lato
Slightly related: I think it would be interesting to compare a Disruptor-based concurrency communications mechanism and compare it to e.g. TChans
1. Disruptor - http://code.google.com/p/disruptor/
From: Ryan Newton
I think so. Atomically reading and writing a single memory location
(which CAS does) is just a very simple transaction. But using a CAS instruction should be more efficient, since STM has to maintain a transaction log and commit transactions, which creates some overhead.
Ah, I see. In that case, it may be worthwhile to implement the CAS instruction in terms of STM as well and measure the performance difference this makes for the final data structure. After all, STM is a lot more compositional than CAS, so I'd like to know whether the loss of expressiveness is worth the gain in performance.
There's one annoying hitch with doing apples-to-apples comparisons here.
The problem is that CAS falls short of being a hardware-accelerated version of a simple transaction (read TVar, (==) against expected value, conditionally update TVar), because CAS is based on pointer equality rather than true equality. (eq? rather than equal? for any Schemers out there.)
For this reason our "Fake" version of CAS -- for older GHCs and for performance comparison -- has to use reallyUnsafePointerEquality#:
http://hackage.haskell.org/packages/archive/IORefCAS/0.2/doc/html/Data-CAS-I...
But we do provide a "CASable" type class in that package which is
precisely
for comparing the performance of:
- Hardware CAS - Fake CAS -- atomicModifyIORef + ptrEquality - Foreign CAS -- gcc intrinsic + function call
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On 3/30/12 4:27 AM, Krzysztof Skrzętnicki wrote:
You mention benchmarking TChans: one particular problem with TChans and Chans is that they are unbounded. If the producers outpace consumers it soon ends with memory exhaustion.
If that's the case, then you should consider TBChan[1] which is a bounded TChan, to solve exactly that problem. (There's also TMChan for closeable channels, and TBMChan for bounded closeable channels, as needed.) [1] http://hackage.haskell.org/package/stm-chans -- Live well, ~wren
participants (3)
-
John Lato
-
Krzysztof Skrzętnicki
-
wren ng thornton