
Recently I had an interesting discussion on MVars with the cats-effect library designers. Cats-effect brings the MVar synchronization primitive, along with other IO stuff, to the Scala programming language. I tried to persuade them to include some of Control.Concurrent.MVar's functions in the library, but failed completely. Moreover, now I think that MVar is a poor choice for a basic synchronization primitive. You may find the discussion here: https://github.com/typelevel/cats-effect/issues/451 (and even try to advocate there); tl;dr follows. Anyway, what is so wrong with MVar?

1. It's complex. Each MVar has 2 state transitions, each of which may block.

2. It does not play well in the presence of asynchronous exceptions. More specifically, `take` and `put` operations should be balanced (each `take` must be followed by a `put`); this forces the programmer to mask asynchronous exceptions during MVar acquisition, and since `take` may block, this will delay task cancellation (a sketch of the problem follows at the end of this message). Well, you may argue that `takeMVar` is actually interruptible, but I'm going to show an easier approach which renders the interruptibility magic unnecessary.

What could be the sensible alternative? A guy from cats-effect suggested IVar + an atomic reference (IORef). This pattern separates the concern of blocking (synchronization) from atomic mutation, so everything can be represented as an atomic reference with an IVar inside. Just look at this beautiful mutex implementation: https://github.com/ovotech/fs2-kafka/blob/master/src/main/scala/fs2/kafka/in... (By "beautiful" I mean the approach itself, of course, not Scala's syntax. Scala is one of the ugliest girls, after C++, that I was forced to date by my employer for money. Thankfully he didn't force me to do the same things with her grandma Java.)

For people who don't read Scala, the approach is fairly simple. Each thread that wants to touch the mutex creates an IVar, atomically swaps it into the IORef while masked (note that IORef operations are non-blocking), unmasks, and waits for the previous IVar to become available *unmasked*. Then it either performs its operations, or fails due to interruption or an exception, and triggers the newly installed IVar either way. It just works, without any "interruptible" magic. So, which benefits do we get?

1. Simpler implementation of the basic primitives. Obviously IORef is fairly simple. IVar might also be simpler than MVar, and possibly faster (just "possibly"; I don't know how it's implemented, but I guess fewer state transitions imply less logic).

2. Simplified deadlock analysis. Really, we have only IVar, with only one state transition and only one potentially blocking operation.

3. Magicless support for interruptions. We don't need separate mask/uninterruptibleMask anymore, because all updates are non-blocking and all waits are unmasked.
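A minimal Haskell sketch of the balancing problem from point 2 above (`naiveModify` and `safeModify` are illustrative names; the safe version is essentially what base's `modifyMVar_` already does):

import Control.Concurrent.MVar
import Control.Exception (mask, onException)

-- If an asynchronous exception lands between the take and the put,
-- the MVar is left empty and every later take blocks forever.
naiveModify :: MVar a -> (a -> IO a) -> IO ()
naiveModify mv f = do
  x  <- takeMVar mv
  x' <- f x
  putMVar mv x'

-- The conventional fix: mask, and restore the old value on exception.
-- takeMVar may still block under mask, which is the delayed-cancellation
-- issue above (softened by takeMVar being interruptible).
safeModify :: MVar a -> (a -> IO a) -> IO ()
safeModify mv f = mask $ \restore -> do
  x  <- takeMVar mv
  x' <- restore (f x) `onException` putMVar mv x
  putMVar mv x'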

Interesting. I have indeed seen enough cases of "Blocked indefinitely on
MVar" to suggest they might be difficult to handle correctly!
On Wed, Dec 19, 2018, 23:02 Станислав Черничкин wrote: [...]

MVar is very low level. You almost never want to use it directly unless you're building a higher-level synchronization mechanism from scratch.

On Wed, Dec 19, 2018 at 4:34 PM Bryan Richter wrote:
Interesting. I have indeed seen enough cases of "Blocked indefinitely on MVar" to suggest they might be difficult to handle correctly!
On Wed, Dec 19, 2018, 23:02 Станислав Черничкин wrote: [...]
-- brandon s allbery kf8nh allbery.b@gmail.com

A few thoughts:

* IIRC, MVars (in Haskell anyway) also have some fairness guarantees, which I don't see otherwise mentioned in the discussion so far. Compare to STM, where large transactions are susceptible to starvation.

* If the goal is to have a simpler base primitive on which to build, everything has to boil down to compare-and-swap, load-linked/store-conditional, etc. -- whatever the machine provides. When you start talking about even mutexes and semaphores, you're already dealing with higher-level abstractions, so thinking about what's the right "primitive" seems silly to me at that point; you should be thinking about what the high-level (user) code should be able to do, and then figure out how to meet in the middle.

* If what you're after is *composable* concurrency, simpler locks won't get you that. STM is the best general solution I've seen for this, and if you're doing concurrency stuff in Haskell, I would say use STM unless you have a specific reason to do something else (a TMVar sketch follows this message).

* Re: async exceptions, you're in the same situation with MVars as you are with all resource-acquiring IO actions in Haskell. `withMVar` and friends cover async-exception safety, and you can't get rid of mask and friends by swapping out MVars with another primitive anyway, because you still need them for acquiring other (non-concurrency-related) resources, like files.

-Ian
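As a minimal sketch of the STM suggestion (assuming the stm package; `Lock` and `withLock` are illustrative names, not a library API):

import Control.Concurrent.STM
import Control.Exception (bracket_)

-- A lock as a TMVar (): takeTMVar retries (blocks) until the token is
-- present, and both operations compose with any other STM logic.
type Lock = TMVar ()

newLock :: IO Lock
newLock = newTMVarIO ()

-- Exception-safe critical section.
withLock :: Lock -> IO a -> IO a
withLock l = bracket_ (atomically (takeTMVar l))
                      (atomically (putTMVar l ()))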

IIRC, MVars (in Haskell anyway) also have some fairness guarantees, which I don't see otherwise mentioned in the discussion so far.
Looking at the given mutex implementation, I guess fairness is controlled by the primitive itself. More particularly:

1. Each thread trying to obtain the mutex waits on one IVar exclusively (no competing threads wait on a shared IVar, hence there is no room for an "unfair" schedule).

2. IVars are swapped in by an atomic `getAndSet`, which can likely be implemented over CAS or LL/SC and will be as fair as the hardware can be.
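In Haskell terms that swap is a one-liner (a sketch; `getAndSet` is an illustrative name, and GHC implements atomicModifyIORef' over CAS):

import Data.IORef

-- Atomically install a new value and return the old one; lock-free.
getAndSet :: IORef a -> a -> IO a
getAndSet ref new = atomicModifyIORef' ref (\old -> (new, old))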
If the goal is to have a simpler base primitive on which to build, everything has to boil down to compare-and-swap, load-linked/store-conditional, etc
I had this discussion with the Cats guys, but I still can't understand how it's possible to implement generic non-busy waiting with CAS/LL-SC only. OK, somewhere at a deeper level it's actually CAS/LL-SC, but one of the threads should be able to HLT (a privileged instruction); otherwise non-busy waiting would not be possible. In other words, a generic locking strategy should allow you to deadlock. How would you implement a non-busy deadlock without HLT? Probably I'm missing something, and I'll be grateful if anyone clarifies, because I've been told multiple times that CAS/LL-SC alone is enough, but I can't figure it out myself.
If what you're after is *composable* concurrency, simpler locks won't get you that.
Composability is not the point at all currently; I'm not after composability, I'm after completeness. Let me speculate a bit and presume that any sufficiently complex composable system (i.e. a system which holds some "correctness" properties over the allowed composition operations) will necessarily be incomplete (i.e. some "correct" expressions will be inexpressible due to the lack of composition rules, and there is no way to supplement the system without making it contradictory). This is my very personal interpretation of Gödel's theorem, and pure speculation, but look: every non-Turing-complete language will prohibit some correct programs; on the other hand, every Turing-complete one will allow you to write non-terminating programs (contradictions). I believe the same property holds for concurrency. If some system prevents deadlocks, it will also prohibit some correct synchronization strategies (once again, this is my speculation; maybe there is a proof of the contrary, but I haven't found one yet). From this standpoint, it's worth having a complete and contradictory (uncomposable) API at the low level, and several composable implementations over it, each with its own restrictions.
async exceptions, you're in the same situation with MVars as you are with all resource-acquiring IO actions in Haskell.
In my opinion they are completely different. In one case we must provide guaranteed resource release, so we want to make resource acquisition atomic. In the other case we want to make the wait operation interruptible (hence non-atomic, unless acquisition consists of only one operation). Consider:

mask $ \restore -> do
  x <- resourceAcquisition
  interruptibleOperation
  onException (restore $ f x) (resourceRelease1 x)
  resourceRelease2 x

If we are interrupted at interruptibleOperation, we reach neither resourceRelease1 nor resourceRelease2 (that is why uninterruptibleMask was introduced). It's tempting to consider an MVar a resource, but resources don't like interruptions; the two should be kept separate.
On Thu, 20 Dec 2018 at 01:16, Ian Denhardt wrote:
[...]
-- Sincerely, Stanislav Chernichkin.

Quoting Станислав Черничкин (2018-12-25 18:04:47):

> Ok, somewhere at a deeper level it's actually CAS/LL-SC, but one of
> the threads should be able to HLT (privileged instruction).

Right, so you have to build on primitives exposed by the OS instead. If you've got userspace threads you can "block" by scheduling another thread, if you have one, but indeed you can't not burn CPU cycles without the OS.

> From this standpoint, it's worth having a complete and contradictory
> (uncomposable) API at the low level, and several composable
> implementations over it.

How low level are we talking? My first instinct would be "just expose whatever the OS and machine give you."

> (hence non-atomic, unless acquisition consists of only one operation).

But takeMVar *is* only one operation. I'm not following the distinction?

-Ian

On 26.12.18 at 00:04, Станислав Черничкин wrote:
Ok, somewhere at a deeper level it's actually CAS/LL-SC, but one of the threads should be able to HLT (a privileged instruction); otherwise non-busy waiting would not be possible.
HLT stops the entire processor, including operating system. No way a user process should ever execute *that*. What actually happens is that threads are simply not scheduled, with strongly implementation-dependent means of making them runnable again (some schedulers require that you state runnability upfront, in the form of a mutex, futex, semaphore, monitor, or whatever; others allow one thread to change the runnability of another thread post-hoc).
Composability is not the point at all currently; I'm not after composability, I'm after completeness. Let me speculate a bit and presume that any sufficiently complex composable system (i.e. a system which holds some "correctness" properties over the allowed composition operations) will necessarily be incomplete (i.e. some "correct" expressions will be inexpressible due to the lack of composition rules, and there is no way to supplement the system without making it contradictory).
This is my very personal interpretation of Gödel's theorem, and pure speculation,
The computer-science equivalent of Gödel's theorem is "undecidability". That's well-trodden ground. In practice, the options are:

1. Giving up correctness is not very useful. (There are extremely rare exceptions; be prepared to strongly defend any suggestion of that kind.)

2. Giving up completeness can still be useful. You do not get true universality, but you get a correctness guarantee.

3. You can still have both correctness and completeness. If your heuristics are good, you'll get a result in the vast majority of cases in practice. The cases that don't work out will simply run into an endless loop; just add a timeout to the computation (see the sketch after this list). Obviously, you don't want such an approach for time-critical stuff like thread scheduling, but it's been used successfully, e.g. for type systems that are more powerful than Hindley-Milner.

4. Do not solve the problem at all. E.g. don't prevent deadlocks by inspecting code and determining whether it can deadlock or not; detect deadlocks after they happen and report them as errors.
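A concrete illustration of option 3, using GHC's System.Timeout (the 100 ms budget is an arbitrary choice):

import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- Bound a possibly non-terminating computation by wall-clock time
-- instead of proving termination up front.
main :: IO ()
main = do
  r <- timeout 100000 (threadDelay 1000000 >> return "done")
  putStrLn (maybe "gave up: timed out" id r)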
but look, every non-Turing complete language will prohibit correct programs, on the other hand, every Turing complete will allow you to write non-terminating programs (contradictions). I believe, same property holds with concurrency.
You get into undecidability issues at the point where an algorithm needs to analyze a program for termination. So people have been using algorithms that don't inspect programs but inspect data (Haskell's infinite data structures would be categorized as programs in this discussion, because they are code that returns finite portions of the infinite data structure). Even with finite data, problems can be undecidable. It's rare and usually doesn't happen, but it can happen if the data structure somehow expresses a Turing-complete language. Note that compilers and such never actually analyze the full program; they just apply patterns that a programmer has decided preserve semantics. (Mistakes here usually lead to compiler bugs.)
If some system prevents deadlocks, it will also prohibit some correct synchronization strategies (once again, this is my speculation; maybe there is a proof of the contrary, but I haven't found one yet).
Correct, but this is only a problem in practice if you need a locking scheme that can express undecidable locking strategies. I am not a grand master of locking strategies, but I see locking strategies move from semaphores (undecidable) towards higher-level constructs that do not allow all kinds of locking anymore. I don't think that futures etc. exclude deadlocks by construction, but I do see that deadlocks have become a less common problem.
From this standpoint, it's worth having a complete and contradictory (uncomposable) API at the low level, and several composable implementations over it, each with its own restrictions.
That's one of today's difficult goals: Designing a locking API that's both composable and useful. I'm not sure whether such a thing has even been thought of yet.

Станислав Черничкин wrote:
Just look at this beautiful mutex implementation https://github.com/ovotech/fs2-kafka/blob/master/src/main/scala/fs2/kafka/in...
As far as I can see, this only works because Java/Scala don't have (or at least very strongly discourage) asynchronous exceptions. Here's my attempt to translate the code into Haskell:

import Control.Concurrent
import Control.Concurrent.MVar -- should be an IVar
import Control.Exception (bracket)
import Control.Monad (void)
import Data.IORef

type Mutex = IORef (MVar ())

newMutex :: IO Mutex
newMutex = do
    next <- newMVar ()
    newIORef next

withMutex :: Mutex -> IO () -> IO ()
withMutex m act = do
    next <- newEmptyMVar
    bracket (atomicModifyIORef m (\curr -> (next, curr))) -- atomic swap
            (\_ -> putMVar next ()) $ \curr -> do
        readMVar curr
        -- readMVar is no longer a combination of takeMVar/putMVar
        -- since base 4.7, so we can faithfully emulate an IVar
        act

Now if the `readMVar` is interrupted by an asynchronous exception, subsequent threads will be woken up, violating the mutual exclusion property. For example:

mkThread lock nm = do
    tid <- forkIO $ withMutex lock $ do
        putStrLn $ unwords ["thread", nm, "running"]
        threadDelay 200000
        putStrLn $ unwords ["thread", nm, "stopping"]
    yield
    return tid

main = do
    lock <- newMutex
    threadA <- mkThread lock "A"
    threadB <- mkThread lock "B"
    threadC <- mkThread lock "C"
    killThread threadB
    threadDelay 1000000

Output:

thread A running
thread C running
thread C stopping
thread A stopping

Oops. This is awkward to fix. Basically, when abandoning the lock before it has been released by the previous owner, we need a new thread to wait for the 'current' IVar and notify the 'next' one, since the current thread is being interrupted. So `withMutex` will end up with code like this:

withMutex :: Mutex -> IO () -> IO ()
withMutex m act = do
    next <- newEmptyMVar
    bracket (atomicModifyIORef m (\curr -> (next, curr)))
            (cleanup next) $ \curr -> do
        readMVar curr
        act
  where
    cleanup :: MVar () -> MVar () -> IO ()
    cleanup next curr = do
        b <- tryReadMVar curr  -- has the previous owner released the lock?
        case b of
            Just _  -> putMVar next ()
            Nothing -> void $ forkIO $ do
                readMVar curr
                putMVar next ()

This loses a lot of elegance.

On the low-level implementation side, both MVars and IVars need to maintain a list of waiting threads; both require logic to wake up threads (IVars will wake all threads; when putting a value, MVars will wake up threads reading the MVar, up to the first thread (if any) that actually takes the MVar value). I believe MVars are not much more difficult to implement than IVars. (This assumes a global memory; IVars may be simpler in a distributed setting.)

For users, MVars are dangerous if used without restrictions, but we have easy-to-understand patterns, for example for using an MVar as a mutex (newMVar, withMVar), or as an IVar (newEmptyMVar, putMVar, readMVar).

To summarize, IVars may be harder to misuse, but MVars provide tangible benefits as a primitive, especially in the presence of asynchronous exceptions.

Cheers,

Bertram

P.S.:
1. [MVars are] complex. Each MVar has 2 state transitions, each may block.
It seems worth noting that the IVar state transition also blocks.
2. [MVars do not] play well in presence of asynchronous exceptions.
I can't help smirking about this claim.
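For concreteness, here is the write-once IVar emulation that the translation above relies on, as a standalone sketch (`IVar` is an illustrative name, not a library type):

import Control.Concurrent.MVar

-- Write-once variable: readers block until the single put, then all
-- proceed; readMVar (base >= 4.7) does not empty the MVar.
newtype IVar a = IVar (MVar a)

newIVar :: IO (IVar a)
newIVar = IVar <$> newEmptyMVar

putIVar :: IVar a -> a -> IO ()
putIVar (IVar m) x = do
  ok <- tryPutMVar m x
  if ok then return () else error "IVar written twice"

readIVar :: IVar a -> IO a
readIVar (IVar m) = readMVar m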

On Dec 28, 2018, at 12:44 PM, Bertram Felgenhauer via Haskell-Cafe wrote:

> This is awkward to fix. Basically, when abandoning the lock before it
> has been released by the previous owner, we need a new thread to wait
> for the 'current' IVar and notify the 'next' one, since the current
> thread is being interrupted.
I think that work can be delegated to the waiting thread, by making locks (really barriers) optionally chain to a parent barrier that also needs to be waited for (recursively). This is cheap, because unless threads are actually interrupted, the chain is always one deep. When a thread is interrupted, the next thread will wait for 2 barriers, ...

-- Viktor.

module Main (main) where

import Control.Concurrent
import Control.Concurrent.MVar -- should be an IVar
import Control.Exception (bracket)
import Data.IORef

-- Really a recursive barrier
newtype Lock = Lock (MVar (Maybe Lock))

type Mutex = IORef Lock
type Chain = IORef (Maybe Lock)

newMutex :: IO Mutex
newMutex = Lock <$> newMVar Nothing >>= newIORef

withMutex :: Mutex -> IO a -> IO a
withMutex m = bracket swapShared signalPrivate . (\act -> (>> act) . waitChain . snd)
  where
    -- Return a new IORef containing the old barrier from the mutex, and a
    -- new barrier that has been atomically swapped into the old mutex.
    swapShared :: IO (Lock, Chain)
    swapShared = Lock <$> newEmptyMVar >>= \b' ->
        atomicModifyIORef m (\b -> (b', b)) >>= \b ->
        newIORef (Just b) >>= \chain ->
        return (b', chain)

    signalPrivate :: (Lock, Chain) -> IO ()
    signalPrivate (Lock b, chain) = readIORef chain >>= putMVar b

    -- The last barrier that we were waiting on (if we're interrupted)
    -- will be left in our chain as a "continuation" for whoever next
    -- gets the mutex. It may be already signalled by the time they see
    -- it, and that's OK. On normal return it will be 'Nothing'.
    waitChain :: Chain -> IO ()
    waitChain c = readIORef c >>= go
      where
        go = mapM_ $ \(Lock a) -> readMVar a >>= \b -> writeIORef c b >> go b

mkThread :: Mutex -> String -> IO ThreadId
mkThread m name = do
    tid <- forkIO $ withMutex m $ do
        putStrLn $ unwords ["thread", name, "running"]
        threadDelay 200000
        putStrLn $ unwords ["thread", name, "stopping"]
    yield
    return tid

main :: IO ()
main = do
    m <- newMutex
    _ <- mkThread m "A"
    threadB <- mkThread m "B"
    _ <- mkThread m "C"
    killThread threadB
    threadDelay 1000000

Dear Viktor,

Viktor Dukhovni wrote:

> I think that work can be delegated to the waiting thread, by making
> locks (really barriers) optionally chain to a parent barrier that also
> needs to be waited for (recursively). This is cheap, [...]
Thanks! Using the next waiting thread for cleanup work instead of spawning a fresh one is much more elegant indeed. Cheers, Bertram

MVar and IVar things are from dataflow programming, I believe, from the Id90 programming language (read about it, it's fascinating). They were used there with great success (linear scaling on CM-5, no less; they had to invent throttling to tame the huge parallelism available). MVars were used for synchronization, and IVars to provide a kind of call-by-need in a lenient language. Mind that Id90 was not as sophisticated as Haskell is today, and these things were used as a coordination tool between a huge number of small threads of execution, with no I/O, where all threads must run to completion. Thus they are, as usually happens, not general-purpose, yet very useful. I consider them "baby-STM".

But what it takes to consider them harmful is beyond me.
On Thu, 20 Dec 2018 at 00:02, Станислав Черничкин wrote:
[...]

Having no sense of either history or implementation.
On Sat, Dec 29, 2018 at 5:43 AM Serguey Zefirov wrote:
[...]
-- brandon s allbery kf8nh allbery.b@gmail.com

On 29.12.18 at 11:43, Serguey Zefirov wrote:
> MVar and IVar things are from dataflow programming, I believe, from
> the Id90 programming language (read about it, it's fascinating). They
> were used there with great success (linear scaling on CM-5, no less;
> they had to invent throttling to tame the huge parallelism available).

So they were forced to go sublinear; that makes the claim pretty shady IMHO.
Claims of linearly scaling parallelism are usually wrong.

First, there's always coordination overhead. Even if tasks distribute perfectly, it's there; not much, but if you didn't notice it, either you didn't scale up far enough to see the effect, or your measurements are off.

Second, there's always a bottleneck somewhere. E.g. for GPU parallelism, there's a pretty hard limit on the amount of data you can transfer over the bus. You *can* work around that by setting up a hierarchical compute-node organization, which was all the rage for a few years, but it never took off; I guess the constant factors have been too high.

So, forgive me if I am sceptical about that Id90 claim of linear scaling.

Regards,
Jo

Hello, I’m the author of the `MVar` implementation in Cats-Effect.
Recently I had an interesting discussion on MVars with the cats-effect library designers. Cats-effect brings the MVar synchronization primitive, along with other IO stuff, to the Scala programming language. I tried to persuade them to include some of Control.Concurrent.MVar's functions in the library, but failed completely. Moreover, now I think that MVar is a poor choice for a basic synchronization primitive.
I believe `MVar` is a superb choice for synchronisation, because it behaves like a blocking queue, which in computer science is a pretty fundamental tool. It is true that in Cats-Effect an `MVar` might not be a primitive, but as I explained in that thread, on top of the JVM the real primitives are low-level building blocks like `AtomicReference`.
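In Haskell terms, that blocking-queue behaviour is easy to see (a minimal sketch):

import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)

-- An MVar is a one-element blocking queue: put blocks while it is full
-- and take blocks while it is empty, so producer and consumer proceed
-- in lockstep.
main :: IO ()
main = do
  chan <- newEmptyMVar
  _ <- forkIO $ forM_ [1 .. 5 :: Int] (putMVar chan)
  replicateM_ 5 (takeMVar chan >>= print)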
1. It’s complex. Each MVar has 2 state transitions, each may block.
Blocking is a feature. If you have to build that logic via `Ref` (`IORef`), you’d have to model the state machine by yourself and that’s complexity being pushed to the user.
2. It does not play well in presence of asynchronous exceptions. More specifically, `take` and `put` operations should be balanced (each `take` must be followed by `put`)
The `take` followed by `put` rule is only for your "modify" use-case. The problem is that a `modify` function described via `take` + `put` is not an atomic operation, and this is a problem, but only if any of the actors accessing the `MVar` are doing so in a different order. This isn't a problem for other use-cases, though.
this forces the programmer to mask asynchronous exceptions during MVar acquisition, and since `take` may block, this will delay task cancellation.
I don’t have much experience with Haskell’s async exceptions, but if you mean that `take` must be executed as `uncancelable`, via `bracket`, then this is a problem that we can fix in Cats-Effect 2.x, as mentioned in that thread.
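For reference, the Haskell counterpart carries the same caveat: base packages take-then-put as `modifyMVar_`, which is atomic only under the same ordering discipline. A one-line sketch:

import Control.Concurrent.MVar

-- modifyMVar_ is take-then-put with exception safety; it is atomic
-- only if no other thread does a bare putMVar on the same MVar.
increment :: MVar Int -> IO ()
increment mv = modifyMVar_ mv (return . (+ 1))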
What could be the sensible alternative? A guy from cats-effect suggested IVar + an atomic reference (IORef). This pattern separates the concern of blocking (synchronization) from atomic mutation, so everything can be represented as an atomic reference with an IVar inside. Just look at this beautiful mutex implementation: https://github.com/ovotech/fs2-kafka/blob/master/src/main/scala/fs2/kafka/in...
As I said above, most things on top of the JVM can be implemented via an `AtomicReference` due to its memory model and this is no exception. But it’s not elegant, or simple ;-)
(By "beautiful" I mean the approach itself, of course, not Scala's syntax. Scala is one of the ugliest girls, after C++, that I was forced to date by my employer for money. Thankfully he didn't force me to do the same things with her grandma Java.)
That's unnecessary and TBH a turnoff.

Cheers,

-- Alexandru Nedelcu
https://alexn.org

And one more thing: you mentioned the `Synchronized` implementation at https://github.com/ovotech/fs2-kafka/blob/master/src/main/scala/fs2/kafka/in...

N.B. I happen to believe this is harmful, because:

1. Mutexes are in general bad, a solution of last resort, because they prevent threads from making progress in case the scheduler pauses the OS thread that holds the lock; for this reason we want non-blocking, even wait-free, algorithms whenever possible.

2. This implementation itself has performance characteristics that are less than ideal; it could be much better, as it could use platform intrinsics for spin-locking, and it could be biased for single-threaded uses.

On point number 2, this is important because it shows that `Ref` isn't very adequate for building `Synchronized`. On point number 1, this extends to `MVar` usage too. If you have blocking behaviour in a way that prevents threads from making progress, such solutions don't scale, and it doesn't matter how you build it (`MVar` or `IORef` or whatever); it's still going to be a bottleneck.

-- Alexandru Nedelcu
https://alexn.org

On 2 Jan 2019, at 2:44, Alexandru Nedelcu wrote:
[...]
participants (10): Alexandru Nedelcu, Bertram Felgenhauer, Brandon Allbery, Bryan Richter, Ian Denhardt, Joachim Durchholz, Serguey Zefirov, Viktor Dukhovni, William Fearon, Станислав Черничкин