Race conditions with threadWait(Read/Write) and closeFdWith

I was very tangentially involved with a use-after-close IO descriptor fault recently, and I came to realize that descriptor indexes are typically allocated by the smallest available index. Previously, I had erroneously believed that indexes were allocated sequentially by an ever-increasing counter until wrap-around occurs.

Obviously, the smallest-available-index allocation strategy makes use-after-close faults rather more significant: in a server application that is constantly closing and allocating descriptors, it becomes rather more likely that an erroneous operation will actually have a tangible effect, and not simply return an EBADF or similar errno.

So in my low-level linux-inotify binding, I felt it wise to add protection against use-after-close faults. The approach I'm currently investigating is the obvious one: to (conceptually) represent the inotify socket by an "MVar (Maybe Fd)" value.

However, if there are no file system events to read from the socket, we want to call threadWaitRead. So somewhere there's some code that's essentially this:

readMVar inotifyfd >>= maybe (throwIO ...) threadWaitRead

So the problem is: what happens when the thread executing this snippet yields between the readMVar and the threadWaitRead? It would then be possible for another thread (or two) to close the inotify descriptor and then allocate another descriptor with the same index. The first thread would then end up blocking until the new descriptor becomes readable, after which it would re-read the MVar and throw an exception saying that the descriptor has been closed.

This is a _relatively_ benign effect, but still, it does prevent the waiting thread from throwing an exception at the earliest possible moment, and there are several particularly unlucky scenarios I can think of where such a thread could be waiting on the wrong descriptor for a very long time.
Not to mention that this is a whole family of race conditions which I haven't really explored, and the other possible manifestations might not be as benign.

Somebody did also point me to the new threadWaitReadSTM primitives, but these don't cover this use case, although the original proposal (properly implemented, of course) would:

http://www.haskell.org/ghc/docs/7.8.3/html/libraries/base-4.7.0.1/Control-Co...
https://ghc.haskell.org/trac/ghc/ticket/7216

In any case, my linux-inotify binding is available on GitHub; the current main branch is unreleased and also has code that attempts to make reading from the descriptor a thread-safe operation. I'm not too enthusiastic about this unreleased code at the moment, and am considering throwing it out. However, I'd still very much like to add protection against use-after-close faults, and I'd certainly prefer being able to concurrently call both close and read on the descriptor with complete safety.

https://github.com/lpsmith/linux-inotify

Best, Leon
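The pattern under discussion can be sketched as follows. This is a minimal illustration, not linux-inotify's actual API; the `Inotify` wrapper and `waitRead` are hypothetical names:

```haskell
import Control.Concurrent (threadWaitRead)
import Control.Concurrent.MVar (MVar, newMVar, readMVar)
import Control.Exception (throwIO, try)
import System.IO.Error (userError)
import System.Posix.Types (Fd)

-- Hypothetical wrapper (not the library's actual API): the inotify
-- descriptor guarded by an MVar, with Nothing meaning "already closed".
newtype Inotify = Inotify (MVar (Maybe Fd))

-- The racy wait: the thread can be descheduled between readMVar and
-- threadWaitRead, during which another thread may close the Fd and the
-- kernel may hand out the same index to a fresh descriptor.
waitRead :: Inotify -> IO ()
waitRead (Inotify ref) =
    readMVar ref
        >>= maybe (throwIO (userError "inotify: closed")) threadWaitRead
```

The window between the `readMVar` and the `threadWaitRead` is exactly where the use-after-close race lives.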

On Mon, Sep 1, 2014 at 4:17 AM, Leon Smith
I was very tangentially involved with a use-after-close IO descriptor fault recently, and I came to realize that descriptor indexes are typically allocated by the smallest available index
You do realize that Fd is a system file descriptor, and the only way you're going to change its allocation strategy is a kernel patch?

-- 
brandon s allbery kf8nh / sine nomine associates
allbery.b@gmail.com / ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad
http://sinenomine.net

Right, I'm not suggesting changing the allocation strategy. (Although I
think that would be a simpler and more elegant solution...) I'm
pointing out a problem and seeing if anybody has any insight into it,
without patching any kernels. ;-)
best,
Leon
On Mon, Sep 1, 2014 at 8:45 AM, Brandon Allbery
On Mon, Sep 1, 2014 at 4:17 AM, Leon Smith wrote:

I was very tangentially involved with a use-after-close IO descriptor fault recently, and I came to realize that descriptor indexes are typically allocated by the smallest available index
You do realize that Fd is a system file descriptor, and the only way you're going to change its allocation strategy is a kernel patch?

The documentation for threadWaitRead states:
Block the current thread until data is available to read on the given
file descriptor (GHC only).
This will throw an IOError if the file descriptor was closed while this
thread was blocked. To safely close a file descriptor that has been used
with threadWaitRead, use closeFdWith.
So what should happen in your situation is that, when a separate thread
closes the fd, threadWaitRead will throw an exception promptly, not
continue to wait on a now-closed fd.
Note the docs for closeFdWith:
Close a file descriptor in a concurrency-safe way (GHC only). If you
are using threadWaitRead or threadWaitWrite to perform blocking I/O,
you *must* use this function to close file descriptors, or blocked
threads may not be woken.
So I think the short answer is: if you're using fds and threadWaitRead,
and you always use closeFdWith, you should be protected against operating
on a re-used fd index, even without wrapping your fds in a Maybe.
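In code, the discipline described here might look like the following sketch (`closeInotify` is an illustrative name, not part of any library):

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import GHC.Conc (closeFdWith)
import System.Posix.IO (closeFd, createPipe)
import System.Posix.Types (Fd)

-- Sketch of a concurrency-safe close for an MVar-guarded descriptor.
-- closeFdWith first delivers an IOError to every thread already blocked
-- in threadWaitRead or threadWaitWrite on this fd, then runs the
-- supplied close action.  (It cannot, however, help a thread that has
-- read the fd but not yet registered interest -- the race this thread
-- goes on to discuss.)
closeInotify :: MVar (Maybe Fd) -> IO ()
closeInotify ref = modifyMVar_ ref $ \mfd ->
    case mfd of
        Nothing -> return Nothing          -- already closed: a no-op
        Just fd -> do
            closeFdWith closeFd fd
            return Nothing
```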
I seem to recall that there are some other issues that arise if you attempt
to mix these low-level fd calls with higher-level fd operations, but that
was two IO managers ago so those concerns may be out of date.
John L.
On Mon, Sep 1, 2014 at 4:17 PM, Leon Smith
I was very tangentially involved with a use-after-close IO descriptor fault recently, and I came to realize that descriptor indexes are typically allocated by the smallest available index. Previously, I had erroneously believed that indexes were allocated sequentially by an ever-increasing counter until wrap-around occurs.
Obviously, the smallest-available-index allocation strategy makes use-after-close faults rather more significant: in a server application that is constantly closing and allocating descriptors, it becomes rather more likely that an erroneous operation will actually have a tangible effect, and not simply return an EBADF or similar errno.
So in my low-level linux-inotify binding, I felt it wise to add protection against use-after-close faults. The approach I'm currently investigating is the obvious one: to (conceptually) represent the inotify socket by a "MVar (Maybe Fd)" value.
However, if there are no file system events to read from the socket, we want to call threadWaitRead. So somewhere there's some code that's essentially this:
readMVar inotifyfd >>= maybe (throwIO ...) threadWaitRead
So the problem is, what happens when the thread executing this snippet yields in between the readMVar and the threadWaitRead? It would then be possible for another thread (or two) to close the inotify descriptor, and then allocate another descriptor with the same index. The first thread would then end up blocking until the new descriptor becomes readable, after which the thread would re-read the MVar and throw an exception that the descriptor has been closed.
This is a _relatively_ benign effect, but still, it does prevent the waiting thread from throwing an exception at the earliest possible moment, and there are several particularly unlucky scenarios I can think of where such a thread could be waiting on the wrong descriptor for a very long time. Not to mention that this is a whole family of race conditions which I haven't really explored, and the other possible manifestations might not be as benign.
Somebody did also point me to the new threadWaitReadSTM primitives, but these don't cover this use case, although the original proposal (properly implemented, of course) would:
http://www.haskell.org/ghc/docs/7.8.3/html/libraries/base-4.7.0.1/Control-Co... https://ghc.haskell.org/trac/ghc/ticket/7216
In any case, my linux-inotify binding is available on GitHub; the current main branch is unreleased and also has code that attempts to make reading from the descriptor a thread-safe operation. I'm not too enthusiastic about this unreleased code at the moment, and am considering throwing it out. However, I'd still very much like to add protection against use-after-close faults, and I'd certainly prefer being able to concurrently call both close and read on the descriptor with complete safety.
https://github.com/lpsmith/linux-inotify
Best, Leon
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

On Mon, Sep 1, 2014 at 7:13 PM, John Lato
I seem to recall that there are some other issues that arise if you attempt to mix these low-level fd calls with higher-level fd operations, but that was two IO managers ago so those concerns may be out of date.
There are *always* issues with mixing low-level and high-level I/O, especially if the high-level I/O is buffered. It's best to not mix them. (Note that using handleToFd will safely give you something that you can do low-level I/O with and will close the Handle; fdToHandle can't stop you from mixing them, though.)

-- 
brandon s allbery kf8nh / sine nomine associates
allbery.b@gmail.com / ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad
http://sinenomine.net

On Tue, Sep 2, 2014 at 7:20 AM, Brandon Allbery
On Mon, Sep 1, 2014 at 7:13 PM, John Lato wrote:

I seem to recall that there are some other issues that arise if you attempt to mix these low-level fd calls with higher-level fd operations, but that was two IO managers ago so those concerns may be out of date.
There are *always* issues with mixing low-level and high-level I/O, especially if the high-level I/O is buffered. It's best to not mix them. (Note that using handleToFd will safely give you something that you can do low level I/O with and will close the Handle; fdToHandle can't stop you from mixing them, though.)
Well, yes. But fd operations are all pretty low-level relative to Handle ops. I specifically meant mixing functions in GHC.Conc.IO (threadWaitRead etc) with functions from System.Posix.IO (e.g. fdReadBuf). Which you'd think should be relatively simple to have work together (and maybe it is, provided you're familiar with fd's and non-blocking IO, etc). If you want to use fdToHandle and call functions on both the fd and the Handle, good luck with that.

On Mon, Sep 1, 2014 at 7:13 PM, John Lato
So what should happen in your situation is that, when a separate thread closes the fd, threadWaitRead will throw an exception promptly, not continue to wait on a now-closed fd.
Right, which is what would happen most of the time. But in the race condition I'm pointing out, the thread yields *before* the call to threadWaitRead, so there is no interest yet registered in the file descriptor, so closeFdWith does not raise an exception. When the thread that calls threadWaitRead is resumed, it ends up waiting on an entirely different descriptor that happens to share the same index as the old descriptor.

Best, Leon

On Sep 2, 2014 3:59 AM, "Leon Smith"
On Mon, Sep 1, 2014 at 7:13 PM, John Lato wrote:

So what should happen in your situation is that, when a separate thread closes the fd, threadWaitRead will throw an exception promptly, not continue to wait on a now-closed fd.
Right, which is what would happen most of the time. But in the race condition I'm pointing out, the thread yields *before* the call to threadWaitRead, so there is no interest yet registered in the file descriptor, so closeFdWith does not raise an exception. When the thread that calls threadWaitRead is resumed, it ends up waiting on an entirely different descriptor that happens to share the same index as the old descriptor.

I was thinking you could do away entirely with the Maybe wrapper. But I guess that might not be possible in your case.

Have you actually observed this behavior? I suspect that the thread will never yield between readMVar and threadWaitRead because there are no allocations. I suppose an async exception could arise, so it might be correct to run that line with exceptions masked.

John

On 02-09-2014 14:01, John Lato wrote:
Have you actually observed this behavior? I suspect that the thread will never yield between readMVar and threadWaitRead because there are no allocations. I suppose an async exception could arise, so it might be correct to run that line with exceptions masked.
As I understand it, the problem isn't limited to the thread yielding. If the program is running on more than one capability, it's conceivable that another thread running on another capability will (a) close that Fd and (b) open another Fd that will receive the same value, both between the readMVar and the threadWaitRead calls.

I don't see how one could allow concurrent readers and "closers" without leaving this small opening. The best workaround I can think of is to create a blocking close operation that waits for readers using a semaphore.

Cheers,
-- Felipe.

On Tue, Sep 2, 2014 at 1:31 PM, Felipe Lessa
I don't see how one could allow concurrent readers and "closers" without leaving this small opening. The best workaround I can think of is to create a blocking close operation that waits for readers using a semaphore.
Well yes, I can't think of a simple lock-based solution to this problem, because you don't want to hold any kind of lock while you are blocked on the file descriptor.

It would be solvable, though, if we had a thread-safe

threadWaitReadMVar :: MVar Fd -> IO ()

You should be able to implement this relatively easily if you dig deeper into the IO manager itself, but we already have threadWaitReadSTM (added for different reasons, and it doesn't cover this use case). How many variations on this theme are we going to need?

We could implement threadWaitReadMVar if we had a non-blocking way of registering interest in a file descriptor, and then later actually blocking on it. So let's say something like

registerWaitRead :: Fd -> IO (IO ())

threadWaitReadMVar fd = join $ withMVar fd registerWaitRead

which, ignoring asynchronous exceptions for the moment, should be adequate for the task. I suppose that means you could instead do

threadWaitReadMVar fd = (atomically . fst) =<< withMVar fd threadWaitReadSTM

which seems like an odd use of STM, but it does seem like a solution. So I guess the earlier part of this email (as well as earlier emails) is in fact wrong: threadWaitReadSTM does cover this use case. And STM might also offer a nicer way of making multiple reads from the inotify binding thread-safe as well.

I suppose that leaves the question: if linux-inotify adopted this approach, what should be done (if anything) about GHC 7.0 through 7.6? Perhaps it would be possible to implement something like registerWaitRead using the lower-level semi-internal interface to the IO manager, or perhaps I should just require GHC 7.8.

Best, Leon
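Spelled out with its imports, the threadWaitReadSTM version might look like this sketch. The key point is that interest is registered while the MVar is held (so no close can slip in between fetching the fd and registering), while the actual blocking happens after the lock is released:

```haskell
import Control.Concurrent (threadWaitReadSTM)
import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import GHC.Conc (atomically)
import System.Posix.IO (createPipe, fdWrite)
import System.Posix.Types (Fd)

-- Register interest in the descriptor inside the critical section,
-- then block outside it.  threadWaitReadSTM returns the STM action to
-- wait on, plus a cancellation action we ignore here.
threadWaitReadMVar :: MVar Fd -> IO ()
threadWaitReadMVar ref = do
    (wait, _cancel) <- withMVar ref threadWaitReadSTM
    atomically wait
```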

On Tue, Sep 2, 2014 at 12:26 PM, Leon Smith
On Tue, Sep 2, 2014 at 1:31 PM, Felipe Lessa wrote:

I don't see how one could allow concurrent readers and "closers" without leaving this small opening. The best workaround I can think of is to create a blocking close operation that waits for readers using a semaphore.
Well yes, I can't think of a simple lock-based solution to this problem, because you don't want to hold any kind of lock while you are blocked on the file descriptor.
If you wanted to go this route, you could use an MVar (Maybe (Int,Fd)), where the Int is a count of interested threads. Instead of using readMVar before threadWaitRead, you would use modifyMVar to atomically increment the counter and retrieve the fd. Then, after threadWaitRead returns, decrement the counter. You'd need to make sure that you never close an fd when the counter is greater than 0. This would work better with a TMVar, because then the close operation could block until the counter has changed.
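That counting scheme might be sketched as follows (illustrative names; asynchronous exceptions are handled with bracket, and a real implementation would need the close side to refuse to run, or with a TMVar to block, until the count reaches zero):

```haskell
import Control.Concurrent (threadWaitRead)
import Control.Concurrent.MVar (MVar, modifyMVar, modifyMVar_, newMVar, readMVar)
import Control.Exception (bracket, throwIO)
import System.IO.Error (userError)
import System.Posix.IO (createPipe, fdWrite)
import System.Posix.Types (Fd)

-- The descriptor plus a count of the threads currently waiting on it.
type Guarded = MVar (Maybe (Int, Fd))

-- Atomically bump the reader count while fetching the fd, wait, then
-- decrement again.  A close operation must never run while the count
-- is non-zero.
waitReadCounted :: Guarded -> IO ()
waitReadCounted ref = bracket acquire (const release) threadWaitRead
  where
    acquire = modifyMVar ref $ \m -> case m of
        Nothing      -> throwIO (userError "descriptor closed")
        Just (n, fd) -> return (Just (n + 1, fd), fd)
    release = modifyMVar_ ref $ \m -> case m of
        Nothing      -> return Nothing
        Just (n, fd) -> return (Just (n - 1, fd))
```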
It would be solvable though if we had a thread-safe
threadWaitReadMVar :: MVar Fd -> IO ()
You should be able to implement this relatively easily if you dig deeper into the IO manager itself, but we already have threadWaitReadSTM (added for different reasons, and it doesn't cover this use case). How many variations on this theme are we going to need?
We could implement threadWaitReadMVar if we had a non-blocking way of registering interest in a file descriptor, and then later actually blocking on it. So let's say something like
registerWaitRead :: Fd -> IO (IO ())
threadWaitReadMVar fd = join $ withMVar fd registerWaitRead
Which, ignoring asynchronous exceptions for the moment, should be adequate for the task. I suppose that means you could instead do
threadWaitReadMVar fd = (atomically . fst) =<< withMVar fd threadWaitReadSTM
Which seems like an odd use of STM, but it does seem like a solution. So I guess the earlier part of this email (as well as earlier emails) is in fact wrong: threadWaitReadSTM does cover this use case. And STM might also offer a nicer way of making multiple reads from the inotify binding thread-safe as well.
This should work even in the presence of async exceptions, barring bugs in withMVar and threadWaitReadSTM. You can implement your registerWaitRead using forkIO and MVars, but I think the only reason to do so would be for compatibility with older ghcs. It's probably more sensible to just copy the definition of threadWaitReadSTM in that case, unless you want to target pre-STM compiler versions. John

On Wed, Sep 3, 2014 at 10:21 AM, John Lato
On Tue, Sep 2, 2014 at 12:26 PM, Leon Smith wrote:

On Tue, Sep 2, 2014 at 1:31 PM, Felipe Lessa wrote:

I don't see how one could allow concurrent readers and "closers" without leaving this small opening. The best workaround I can think of is to create a blocking close operation that waits for readers using a semaphore.
Well yes, I can't think of a simple lock-based solution to this problem, because you don't want to hold any kind of lock while you are blocked on the file descriptor.
If you wanted to go this route, you could use an MVar (Maybe (Int,Fd)), where the Int is a count of interested threads. Instead of using readMVar before threadWaitRead, you would use modifyMVar to atomically increment the counter and retrieve the fd. Then, after threadWaitRead returns, decrement the counter. You'd need to make sure that you never close an fd when the counter is greater than 0. This would work better with a TMVar, because then the close operation could block until the counter has changed.
Or a reader/writer lock. Hold the lock as a reader across the blocking operation.

Alexander
It would be solvable though if we had a thread-safe
threadWaitReadMVar :: MVar Fd -> IO ()
You should be able to implement this relatively easily if you dig deeper into the IO manager itself, but we already have threadWaitReadSTM. (For different reasons, which doesn't cover this use case.) How many variations on this theme are we going to need.
We could implement threadWaitReadMVar if we had a non-blocking way of registering interest in a file descriptor, and then later actually blocking on it. So let's say something like
registerWaitRead :: Fd -> IO (IO ())
threadWaitReadMVar fd = join $ withMVar fd registerWaitRead
Which, ignoring asynchronous exceptions for the moment, should be adequate for the task. I suppose that means you could instead do
threadWaitReadMVar fd = (atomically . fst) =<< withMVar fd threadWaitReadSTM
Which seems like an odd use of STM, but it does seem like a solution. So I guess the earlier part of this email (as well as earlier emails) is in fact wrong: threadWaitReadSTM does cover this use case. And STM might also offer a nicer way of making multiple reads from the inotify binding thread-safe as well.
This should work even in the presence of async exceptions, barring bugs in withMVar and threadWaitReadSTM. You can implement your registerWaitRead using forkIO and MVars, but I think the only reason to do so would be for compatibility with older ghcs. It's probably more sensible to just copy the definition of threadWaitReadSTM in that case, unless you want to target pre-STM compiler versions.
John

On Wed, Sep 3, 2014 at 4:21 AM, John Lato
On Tue, Sep 2, 2014 at 12:26 PM, Leon Smith wrote:

On Tue, Sep 2, 2014 at 1:31 PM, Felipe Lessa wrote:

I don't see how one could allow concurrent readers and "closers" without leaving this small opening. The best workaround I can think of is to create a blocking close operation that waits for readers using a semaphore.
Well yes, I can't think of a simple lock-based solution to this problem, because you don't want to hold any kind of lock while you are blocked on the file descriptor.
If you wanted to go this route, you could use an MVar (Maybe (Int,Fd)), where the Int is a count of interested threads. Instead of using readMVar before threadWaitRead, you would use modifyMVar to atomically increment the counter and retrieve the fd. Then, after threadWaitRead returns, decrement the counter. You'd need to make sure that you never close an fd when the counter is greater than 0. This would work better with a TMVar, because then the close operation could block until the counter has changed.
I think the semantics should probably be for a concurrent close to immediately close the descriptor, with exceptions raised in any threads that are blocked on the descriptor. It seems that any sort of use-case along these lines is going to result in an exception eventually... so you might as well make it prompt. So I still don't see any lock-based solution for my intended semantics.
threadWaitReadMVar fd = (atomically . fst) =<< withMVar fd threadWaitReadSTM
This should work even in the presence of async exceptions, barring bugs in withMVar and threadWaitReadSTM. You can implement your registerWaitRead using forkIO and MVars, but I think the only reason to do so would be for compatibility with older ghcs. It's probably more sensible to just copy the definition of threadWaitReadSTM in that case, unless you want to target pre-STM compiler versions.
I think you are right; I suppose an async exception could result in interest registered in the descriptor without ever waiting on it, but this really isn't a big deal, especially as the callback will be removed from the IO manager once the descriptor becomes readable or the descriptor is closed. And it shouldn't prevent the higher-level inotify "descriptor" from being garbage-collected either, as the IO manager doesn't know anything about that, so the finalizer can still close the descriptor. Best, Leon

On 03-09-2014 14:40, Leon Smith wrote:
I think the semantics should probably be for a concurrent close to immediately close the descriptor, with exceptions raised in any threads that are blocked on the descriptor. It seems that any sort of use-case along these lines is going to result in an exception eventually... so you might as well make it prompt. So I still don't see any lock-based solution for my intended semantics.
How far are you willing to go? :)

Instead of an Int, you could use [ThreadId] as a set (or any other data structure). The only problem I can see with this approach is:

- Thread A adds itself to the [ThreadId] and waits for the Fd.
- Thread B starts closing the Fd.
- Thread A is awakened from the Fd.
- Thread B closes the Fd and sends async exceptions to all [ThreadId], including thread A.
- Thread A tries to modify the [ThreadId] again to remove itself, but sees Nothing instead.

At this point, thread A has not yet received the async exception, but it *knows* that the exception is coming! Poor thread A :(.

Cheers!
-- Felipe.
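The [ThreadId]-as-a-set variant might be sketched like this (illustrative names throughout; the comment in the deregistration step marks exactly the window described above):

```haskell
import Control.Concurrent
    (ThreadId, myThreadId, threadWaitRead, throwTo)
import Control.Concurrent.MVar
    (MVar, modifyMVar, modifyMVar_, newMVar, readMVar)
import Control.Exception (bracket, throwIO)
import Data.List (delete)
import GHC.Conc (closeFdWith)
import System.IO.Error (userError)
import System.Posix.IO (closeFd, createPipe, fdWrite)
import System.Posix.Types (Fd)

type Tracked = MVar (Maybe ([ThreadId], Fd))

-- A reader registers itself before blocking and deregisters afterwards.
waitReadTracked :: Tracked -> IO ()
waitReadTracked ref = bracket enter (const leave) threadWaitRead
  where
    enter = do
        me <- myThreadId
        modifyMVar ref $ \m -> case m of
            Nothing         -> throwIO (userError "closed")
            Just (tids, fd) -> return (Just (me : tids, fd), fd)
    leave = do
        me <- myThreadId
        modifyMVar_ ref $ \m -> case m of
            -- The corner case: a reader already woken by the fd can
            -- observe Nothing here while the closer's async exception
            -- is still in flight.
            Nothing         -> return Nothing
            Just (tids, fd) -> return (Just (delete me tids, fd))

-- Close promptly, then interrupt every reader that was still registered.
closeTracked :: Tracked -> IO ()
closeTracked ref = do
    tids <- modifyMVar ref $ \m -> case m of
        Nothing         -> return (Nothing, [])
        Just (tids, fd) -> do
            closeFdWith closeFd fd
            return (Nothing, tids)
    mapM_ (\t -> throwTo t (userError "closed")) tids
```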

On Tue, Sep 2, 2014 at 1:01 PM, John Lato
I was thinking you could do away entirely with the Maybe wrapper. But I guess that might not be possible in your case.
Actually, the approach currently on GitHub uses -1 (an invalid descriptor) to represent Nothing, and does away with Maybe altogether, but that's just an implementation detail.
Have you actually observed this behavior? I suspect that the thread will never yield between readMVar and threadWaitRead because there are no allocations. I suppose an async exception could arise, so it might be correct to run that line with exceptions masked.
Not without inserting a superfluous readMVar between reading the Fd and the threadWaitRead, so that a demonstration can precisely control the interleaving of several threads. You are probably correct that there aren't any yield points between the readMVar and the start of the threadWaitRead call, but are there any potential yield points inside threadWaitRead itself, before interest in the file descriptor is properly registered with the IO manager?

And I agree with Felipe: even if correct, this sort of assumption seems much too strong and fragile to safely make in a library that I hope not to have to maintain too much. As for async exceptions, the binding definitely needs work to make it safe in their presence, but that's another issue entirely. =)

Best, Leon
participants (5)
- Alexander Kjeldaas
- Brandon Allbery
- Felipe Lessa
- John Lato
- Leon Smith