Proposal: Replace QSem, QSemN, SampleVar in base

This is my first proposal to change some things in "base" and I trust that my request is not going to be perfect. The packages that I am proposing to change are all in Control.Concurrent and build on MVars: QSem, QSemN, and SampleVar. I wish to change them because this will fix an important bug they have in common. Who needs to approve of the following proposed change? Additional URLS: http://hackage.haskell.org/trac/ghc/ticket/3160 http://hackage.haskell.org/package/SafeSemaphore http://haskell.org/haskellwiki/SafeConcurrent All three of QSem, QSemN, and SampleVar create an abstraction and attempt to maintain an invariant or behavior. QSem and QSemN present the abstraction of a conserved quantity. SampleVar wishes to maintain a single updatable value. All three have in common that waiting on a semaphore or read the sample var will block if there is no quantity or value present. There can be many thread blocked in this way. All three abstractions fail to maintain their integrity in the case where one of the blocked waiters gets interrupted. This is not a random or rare failure, they abstraction always fails after the first waiter dies. Thus the root cause of the problem is that these modules were written with the unenforced and undocumented assumption that none of the waiters can be interrupted. The specific failure with QSem and QSemN is that signaling the semaphore will pass quantity to the already interrupted waiter. This quantity leak cannot be prevented by the user of module with clever exception handling. This leak will likely cause some or all future attempts to wait on the semaphore to block indefinitely. The specific failure with SampleVar is that writeSampleVar and emptySampleVar will trust the no longer correct count of blocked readers. These methods can then block indefinitely. This is triggered by: newEmptySampleVar, readSampleVar, kill the blocked reader, writeSampleVar, writeSampleVar. The writeSampleVar should never more than momentarily block, so this indefinite block turns the SampleVar into acting like a regular MVar. Also, isEmptySampleVar will no longer ever read full once the SampleVar is broken. All of these three modules can be replaced with MSem, MSemN, and MSampleVar in the SafeSemaphore package (on hackage). These allow the same operations as those in base but without the above flaw. If a waiter dies in the new modules then the waiter will have had no effect on the quantity or value. Slightly surprisingly, these improved modules all have simpler implementations. I have also expanded the APIs to expose additional useful operations that can be performed. Finally, I have changed the names in the new modules to be less verbose as the community style has evolved to use qualified imports. There is a vanishingly small change someone depends on the flawed behavior of the base modules. I propose to ignore this possibility and replace the old modules with thin layers around SafeSemaphore.

On 12/04/11 14:48, Chris Kuklewicz wrote:
This is my first proposal to change some things in "base" and I trust that my request is not going to be perfect.
The packages that I am proposing to change are all in Control.Concurrent and build on MVars: QSem, QSemN, and SampleVar. I wish to change them because this will fix an important bug they have in common. Who needs to approve of the following proposed change?
Additional URLS: http://hackage.haskell.org/trac/ghc/ticket/3160 http://hackage.haskell.org/package/SafeSemaphore http://haskell.org/haskellwiki/SafeConcurrent
All three of QSem, QSemN, and SampleVar create an abstraction and attempt to maintain an invariant or behavior. QSem and QSemN present the abstraction of a conserved quantity. SampleVar wishes to maintain a single updatable value.
All three have in common that waiting on a semaphore or read the sample var will block if there is no quantity or value present. There can be many thread blocked in this way.
All three abstractions fail to maintain their integrity in the case where one of the blocked waiters gets interrupted. This is not a random or rare failure, they abstraction always fails after the first waiter dies. Thus the root cause of the problem is that these modules were written with the unenforced and undocumented assumption that none of the waiters can be interrupted.
The specific failure with QSem and QSemN is that signaling the semaphore will pass quantity to the already interrupted waiter. This quantity leak cannot be prevented by the user of module with clever exception handling. This leak will likely cause some or all future attempts to wait on the semaphore to block indefinitely.
The specific failure with SampleVar is that writeSampleVar and emptySampleVar will trust the no longer correct count of blocked readers. These methods can then block indefinitely. This is triggered by: newEmptySampleVar, readSampleVar, kill the blocked reader, writeSampleVar, writeSampleVar. The writeSampleVar should never more than momentarily block, so this indefinite block turns the SampleVar into acting like a regular MVar. Also, isEmptySampleVar will no longer ever read full once the SampleVar is broken.
All of these three modules can be replaced with MSem, MSemN, and MSampleVar in the SafeSemaphore package (on hackage). These allow the same operations as those in base but without the above flaw. If a waiter dies in the new modules then the waiter will have had no effect on the quantity or value. Slightly surprisingly, these improved modules all have simpler implementations. I have also expanded the APIs to expose additional useful operations that can be performed. Finally, I have changed the names in the new modules to be less verbose as the community style has evolved to use qualified imports.
There is a vanishingly small change someone depends on the flawed behavior of the base modules. I propose to ignore this possibility and replace the old modules with thin layers around SafeSemaphore.
I don't think there's any need to introduce new types. Just fix the exception safety in the existing implementations. Could you make a patch? Cheers, Simon

So I have two replies from Simon and Ian suggesting a minimal behind the scenes patch for QSem (possible) or splitting them out of base. I have considered both of these. Over the releases of GHC there have been several bits separated out of base, each time requiring another round of cabal file updating. If people want to carve more out of base this is an independent decision to updating QSem/QSemN/SampleVar. Thanks for looking through the code, Ian, and finding where it was used. I'll go look at that to see how safe the usage is. With this encouragement, I'll prepare a patch to GHC to replace the guts of the modules and update the documentation. I'll patch against ghc-7.0.3, and I don't expect conflicts on the these on files. Cheers, Chris

On Tue, Apr 12, 2011 at 02:48:03PM +0100, Chris Kuklewicz wrote:
This is my first proposal to change some things in "base" and I trust that my request is not going to be perfect.
The packages that I am proposing to change are all in Control.Concurrent and build on MVars: QSem, QSemN, and SampleVar. I wish to change them because this will fix an important bug they have in common.
Have you considered putting them in a separate package instead, where interested people could better care for them? As far as I can see, the only things that use them in a GHC tree (other than a couple of tests and benchmarks) are mergeIO and nmergeIO in Control.Concurrent, both of which look like they haven't been touched for some time, and if still wanted could also be moved out into the other package. Thanks Ian
participants (3)
-
Chris Kuklewicz
-
Ian Lynagh
-
Simon Marlow