Hi David,

Interesting.
I don't have an answer, but here are a few thoughts.

Your case is:
  * consecutive FFI calls
  * on the same Haskell Thread

The cases for consecutive FFI calls are:
  (1) do { safe_ffiCall1;   safe_ffiCall2 }
  (2) do { safe_ffiCall1;   unsafe_ffiCall2 }
  (3) do { unsafe_ffiCall1; safe_ffiCall2 }
  (4) do { unsafe_ffiCall1; unsafe_ffiCall2 }

I think the answer is 'no' at least for case (4).
There is no memory barrier between unsafe_ffiCall1 and unsafe_ffiCall2.
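
To make the four cases concrete, here is a minimal sketch. The names sinSafe and sinUnsafe are hypothetical; they import the C function sin from math.h purely as a stand-in for real FFI calls, and the sketch shows only the shape of each case, not any ordering guarantee.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CDouble (..))

-- The same C function imported twice, once per calling convention.
-- A "safe" call goes through the RTS, which releases the capability
-- for the duration of the call. An "unsafe" call is a direct C call
-- with none of that bookkeeping.
foreign import ccall safe   "math.h sin" sinSafe   :: CDouble -> IO CDouble
foreign import ccall unsafe "math.h sin" sinUnsafe :: CDouble -> IO CDouble

-- The four combinations of two consecutive calls on one Haskell thread.
case1, case2, case3, case4 :: IO (CDouble, CDouble)
case1 = do { a <- sinSafe 0;   b <- sinSafe 0;   return (a, b) }
case2 = do { a <- sinSafe 0;   b <- sinUnsafe 0; return (a, b) }
case3 = do { a <- sinUnsafe 0; b <- sinSafe 0;   return (a, b) }
case4 = do { a <- sinUnsafe 0; b <- sinUnsafe 0; return (a, b) }

main :: IO ()
main = mapM_ (>>= print) [case1, case2, case3, case4]
```

In case (4) nothing in the source asks the RTS to do anything between the two calls, which is why I doubt there is a barrier there.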


Apologies if I'm missing context.
Although a Haskell thread can migrate to a different OS thread at any point,
you can put a memory barrier primitive (such as the "mfence" instruction [1][2][3])
at each target point, before or after each FFI call.

Of course, it is expensive to do this for every FFI call.
You should also abstract away from the CPU hardware.
(I found a useful explicit memory barrier API [4].)
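
As a rough sketch of what I mean, assuming the atomic-primops package [4] is installed: storeLoadBarrier is exported by its Data.Atomics module, while withBarrier and sinUnsafe are hypothetical names made up for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Data.Atomics (storeLoadBarrier)  -- from atomic-primops [4]
import Foreign.C.Types (CDouble (..))

-- Hypothetical stand-in for a real FFI call.
foreign import ccall unsafe "math.h sin" sinUnsafe :: CDouble -> IO CDouble

-- Run an action with an explicit barrier before and after it.
-- The barrier is issued by whichever OS thread is running this
-- Haskell thread at that moment.
withBarrier :: IO a -> IO a
withBarrier call = do
  storeLoadBarrier
  r <- call
  storeLoadBarrier
  return r

main :: IO ()
main = do
  a <- withBarrier (sinUnsafe 0)
  b <- withBarrier (sinUnsafe 0)
  print (a, b)
```

This hides the CPU-specific instruction behind a library call, but it still pays for the barrier on every call, and I am not sure it settles the thread migration question by itself.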


I feel that an _exact_ memory barrier on out-of-order CPUs,
multi-core systems, memory-mapped I/O, ... is very expensive.
It can only be guaranteed by an explicit hardware memory barrier mechanism.

It is also difficult to guarantee an exact memory barrier in all cases
by combining several implicit mechanisms.


BTW, do you truly need a memory barrier?
In C, too, an exact memory barrier is expensive.


And maybe the ghc-devs are very busy shipping GHC 7.10.2 :-)


[1]: Chapter 8.2, http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf
[2]: MFENCE, http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
[3]: Chapter 7.5.5, http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
[4]: https://hackage.haskell.org/package/atomic-primops


Cheers,
Takenobu


2015-06-02 22:26 GMT+09:00 David Turner <dct25-561bs@mythic-beasts.com>:
Hi Takenobu,

My question is more about consecutive FFI calls on the same Haskell
thread, of which there are I suppose 8 cases in your model: the thread
is {unbound,bound}, the first call is {safe,unsafe} and the second is
{safe,unsafe}. If the thread is bound, there's no problem as the two
calls happen on the same OS thread. No memory barriers are needed. If
the thread is unbound, the two calls may occur on distinct OS threads.
Although the first call must have returned before the second is made,
it doesn't immediately follow that there has been a memory barrier in
between. I'm not sure it matters whether either call is safe or
unsafe. As a Haskell thread can migrate to a different OS thread at
any point, I don't think it's possible to put appropriate memory
barriers in the source.
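
The bound/unbound distinction above can be observed from Haskell with a minimal sketch like the following. It uses only Control.Concurrent from base; the MVar plumbing is just scaffolding for the example, and forkOS is only attempted when the program is linked with -threaded.

```haskell
module Main (main) where

import Control.Concurrent
  ( forkIO, forkOS, isCurrentThreadBound, rtsSupportsBoundThreads
  , newEmptyMVar, putMVar, takeMVar )

main :: IO ()
main = do
  result <- newEmptyMVar

  -- forkIO creates an unbound thread: the RTS may run it on any OS thread.
  _ <- forkIO (isCurrentThreadBound >>= putMVar result)
  unbound <- takeMVar result
  putStrLn ("forkIO thread bound? " ++ show unbound)

  -- forkOS creates a bound thread: its FFI calls all run on one OS thread.
  -- It needs the threaded RTS, so check before calling it.
  if rtsSupportsBoundThreads
    then do
      _ <- forkOS (isCurrentThreadBound >>= putMVar result)
      bound <- takeMVar result
      putStrLn ("forkOS thread bound? " ++ show bound)
    else putStrLn "not linked with -threaded, skipping forkOS"
```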

I've been looking at the GHC source and commentary and believe the
answer is 'yes', but can anyone from ghc-dev comment on the following?

If a Haskell thread moves to a different OS thread then
yieldCapability() will at some point be called. This function normally
calls ACQUIRE_LOCK, which is either pthread_mutex_lock() or
EnterCriticalSection() in the threaded runtime (on Linux and Win32
respectively). It looks like both of these count as full memory
barriers. I think in the (rare) case where yieldCapability() only does
a GC and then exits, the fact that it's always called in a loop means
that eventually *some* Task or other emits a memory barrier.

Thanks in advance,

David








On 30 May 2015 at 04:10, Takenobu Tani <takenobu.hs@gmail.com> wrote:
> Hi David,
>
> I'm not 100% sure, especially about the semantics, and I'm still studying this too.
> I don't have an answer, but I'll describe the related matters in order to
> organize my thoughts.
>
> First:
>   "memory barrier" ... is a mechanism for controlling the order of memory accesses.
>   "bound thread"   ... is a mechanism for associating FFI calls with a
> specific OS thread.
>
> And:
>   "memory barrier"  ... depends on the CPU hardware architecture (x86, ARM,
> ...).
>   "OS level thread" ... depends on the OS (Linux, Windows, ...).
>
> Last:
> There are four cases for an FFI call [1]:
>   (1) safe ffi call   on unbound thread(forkIO)
>   (2) unsafe ffi call on unbound thread(forkIO)
>   (3) safe ffi call   on bound thread(main, forkOS)
>   (4) unsafe ffi call on bound thread(main, forkOS)
>
> I think (2) and (4) may have no memory ordering guarantee,
> because they might be inlined and optimized.
>
> If (1) and (3) always use the pthread API (or a memory barrier API) for the thread/HEC
> context switch,
> then they are guaranteed.
> But I think that would not cover every case.
>
>
> I feel that ordering issues are very difficult.
> I think they can be solved safely with explicit notation,
> such as explicit memory barriers, STM, ...
>
>
> If I have misunderstood, please correct me :-)
>
>
> [1]:
> http://takenobu-hs.github.io/downloads/haskell_ghc_illustrated.pdf#page=98
>
> Cheers,
> Takenobu
>
>
>
> 2015-05-29 1:24 GMT+09:00 David Turner <dct25-561bs@mythic-beasts.com>:
>>
>> Hi,
>>
>> If I make a sequence of FFI calls (on a single Haskell thread) but
>> which end up being called from different OS threads, is there any kind
>> of ordering guarantee given? More specifically, is there a full memory
>> barrier at the point where a Haskell thread migrates to a new OS
>> thread?
>>
>> Many thanks,
>>
>> David
>
>