Hi David,
I'm not 100% sure, especially semantics, and I'm studying too.
I don't have an answer, but I describe the related matters in order to organize my head.
At first:
"memory barrier" ... is order control mechanism between memory accesses.
"bound thread" ... is association mechanism between ffi calls and a specified thread.
And:
"memory barrier" ... is depend on cpu hardware architecture(x86, ARM, ...).
"OS level thread" ... is depend on OS(Linux, Windows, ...).
Last:
There are four cases about ffi call [1]:
(1) safe ffi call on unbound thread(forkIO)
(2) unsafe ffi call on unbound thread(forkIO)
(3) safe ffi call on bound thread(main, forkOS)
(4) unsafe ffi call on bound thread(main, forkOS)
I think, maybe (2) and (4) have not guarantee with memory ordering.
Because they might be inlined and optimized.
If (1) and (3) always use pthread api (or memory barrier api) for thread/HEC context switch,
they are guarantee.
But I think that it would not guarantee the full case.
I feel that order issues are very difficult.
I think order issues can be safely solved by explicit notation,
like explicit memory barrier notation, STM,...
If I have misunderstood, please teach me :-)
Cheers,
Takenobu