On Tue, Aug 19, 2014 at 11:36 PM, Nikolay Amiantov <nikoamia@gmail.com> wrote:

Hello Cafe,

I'm using FFI to interact with a library whose calls, when they fail, leave
the reason in some kind of "errno"-like variable which is retrieved via
another call. AFAIU, this is not thread-safe in Haskell even if
thread-local storage is used inside the library, because Haskell uses
its own thread management and the Haskell thread running on a given OS
thread might be switched out between the actual call and the retrieval
of the errno value.

Good question!

This should be somehow handled already in Haskell (errno is
widely used with syscalls in Linux, for example), but the source of
Foreign.C.Error suggests that this is not handled in any way at all.
For example, throwErrnoIf is implemented as such:

throwErrno loc = do
    errno <- getErrno
    ioError (errnoToIOError loc errno Nothing Nothing)

throwErrnoIf pred loc f = do
    res <- f
    if pred res then throwErrno loc else return res

So, the question is: how is it ensured that this block is "atomic" in
the sense that at most one Haskell thread is executing this whole
function at any given moment in any given OS thread?

I do not know the RTS very well, but I think this might be unsafe. If the Haskell thread is bound, then getErrno is guaranteed to be executed on the same OS thread as the foreign call, but otherwise no such guarantee is given.
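
To make the bound-thread case concrete, here is a minimal sketch. It assumes a POSIX system and uses libc's close(2) on an invalid descriptor as a stand-in for the library call in question; callAndFetchErrno is a name made up for this example. It only illustrates pinning the call and the errno read to one OS thread and says nothing about what the RTS does for unbound threads.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (rtsSupportsBoundThreads, runInBoundThread)
import Foreign.C.Error (Errno (..), eBADF, getErrno)
import Foreign.C.Types (CInt (..))

-- Stand-in for the real library call: close(2) on an invalid
-- descriptor fails and sets errno to EBADF.
foreign import ccall unsafe "unistd.h close"
  c_close :: CInt -> IO CInt

-- Hypothetical helper: run the foreign call and the errno read inside
-- one bound thread, so both execute on the same OS thread.
callAndFetchErrno :: IO (CInt, Errno)
callAndFetchErrno = bound $ do
  res <- c_close (-1)
  err <- getErrno          -- read errno immediately after the call
  return (res, err)
  where
    -- runInBoundThread needs the threaded RTS; without it there is
    -- only one OS thread anyway, so run the action directly.
    bound act
      | rtsSupportsBoundThreads = runInBoundThread act
      | otherwise               = act

main :: IO ()
main = do
  (res, Errno code) <- callAndFetchErrno
  let Errno expected = eBADF
  putStrLn ("result = " ++ show res
            ++ ", errno is EBADF: " ++ show (code == expected))
```

For a library with its own thread-local error variable, the same shape should apply with the library's "get last error" call in place of getErrno, assuming the library really does keep that state per OS thread.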

However, judging from http://blog.ezyang.com/2013/01/the-ghc-scheduler/, a switch between the call and getErrno might be unlikely in practice.

If GC happens, "Threads are put in front (pushOnRunQueue) if: ... In the threaded runtime, if a thread was interrupted because another Capability needed to do a stop-the-world GC (see commit 6d18141d8);"

However, the same post indicates that you can force a context switch using signals.

"Threads are put in back (appendToRunQueue) in the case of pre-emption, or if it’s new; particularly, if: ...A thread was pre-empted via the context switch flag (e.g. incoming message from another thread, the timer fired, the thread cooperatively yielded, etc; see also [8] on how this interacts with heap overflows);"

Reading errno directly after the FFI call, with no allocation in between, can rule out a switch caused by a heap overflow, but the async exception and timer issues still seem possible.

I would also like to see a good explanation of this.

Alexander