
Ryan,
On Sun, Jan 24, 2016 at 12:46 AM, Thomas Koster
Using Criterion, I have been running benchmarks to measure the relative performance of STM and MVars for some simple transactions that I expect will be typical in my application. I am using GHC 7.10.2 and libraries as at Stackage LTS 3.2.
I have found that STM is faster than MVars in all my benchmarks, without exception. This seems to go against accepted wisdom [1][2][3]. I have not included my source code here to save space, but if you suspect that I am using MVars incorrectly, just say so and I will post my source code separately.
I have two questions:
1. When are MVars faster than STM? If the answer is "never", then when are MVars "better" than STM? (Choose your own definition of "better".)
2. When given two capabilities (+RTS -N2), MVars are suddenly an order of magnitude slower than with just one capability. Why?
On 25 January 2016 at 01:04, Ryan Yates
I'm sorry I don't have time right now for a proper response (buried under paper deadlines). There are certainly times when one will be faster then the other and the reasons are quite complicated. To complicate matters further it is very difficult to get benchmarks that don't lie about performance in this space. There are also alternative implementations that change the balance drastically. The only broad advice I can give is to benchmark the target application with both implementations to see how all the implications fall out.
That is fair. From what I can tell, the time spent in the runtime dominates my user time, so I am basically benchmarking the GHC runtime, which I am not qualified to do :) I had only hoped to be able to decide on MVar vs STM before getting into the nitty gritty.
A broad description of the differences in implementation would be that MVars have a fairness guarantee (that does not come for free) for waking waiting threads. STM does not have this fairness which can lead to problems for programs that have quick transactions that always win over occasional long transactions (there are ways to avoid with a different implementation or with the cost of shifted to the programmer). My guess is in your particular benchmark the unfairness of STM works to your advantage and all the work is happening sequentially while the MVar version's fairness incurs frequent cache misses.
Fairness may actually be very important to my application. Unlike my benchmark, the complexity of real transactions can vary enormously. Let me think about this. Thanks for your response. -- Thomas Koster