
When a "bound" foreign exported function is invoked [by foreign code], the implementation checks whether a Haskell thread is associated with the current OS thread. If there is one, this Haskell thread is used to execute the callback. If there is none, a new Haskell thread is created and associated with the native thread. This is the only situation where a Haskell thread is associated with a native thread. The new associated Haskell thread is then used to execute the callback. When the callback finishes, the Haskell thread is terminated, the association is dissolved, but the OS thread continues to run.
This is the bit I have trouble with too. If the OS thread trying to call a foreign export has an associated Haskell thread, then it must be because the Haskell thread called out to C in the first place. This Haskell thread will be waiting for the call to return, and has all the state associated with its current execution context, so we can't just use that thread to run the foreign export.

I don't see the problem with forking a new Haskell thread for each foreign export, and associating it with the current native thread if the foreign export is marked "bound". It does mean we can get multiple Haskell threads bound to the same native thread, but only one can be runnable at any one time (this is an important invariant from the point of view of the implementation, I believe).

Cheers,
Simon
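
To make the scenario above concrete, here is a minimal sketch, assuming GHC and the `Control.Concurrent` API. It does not use the proposed "bound" annotation on foreign exports; instead it gets a call-in without any C code by wrapping a Haskell action as a C function pointer and calling straight back through it. The names `mkCallback`, `callThrough` and `observeCallback` are hypothetical, chosen only for illustration. The caller is blocked in the foreign call while the callback runs, so the callback has to run in a different Haskell thread, which the sketch checks by comparing thread ids.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (ThreadId, isCurrentThreadBound, myThreadId)
import Data.IORef (newIORef, readIORef, writeIORef)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- "wrapper" import: turns a Haskell action into a C function pointer.
-- (Hypothetical name; the import form itself is standard FFI.)
foreign import ccall "wrapper"
  mkCallback :: IO () -> IO (FunPtr (IO ()))

-- "dynamic" import: calls through a C function pointer. The call must be
-- "safe", because the callee re-enters Haskell.
foreign import ccall safe "dynamic"
  callThrough :: FunPtr (IO ()) -> IO ()

-- Call out of Haskell and immediately back in, recording which Haskell
-- thread made the call-out, which one ran the call-in, and whether the
-- latter reports itself as bound.
observeCallback :: IO (ThreadId, ThreadId, Bool)
observeCallback = do
  caller <- myThreadId
  ref <- newIORef Nothing
  cb <- mkCallback $ do
    callee <- myThreadId
    bound <- isCurrentThreadBound
    writeIORef ref (Just (callee, bound))
  callThrough cb          -- the calling thread waits here for the return
  freeHaskellFunPtr cb
  result <- readIORef ref
  case result of
    Just (callee, bound) -> return (caller, callee, bound)
    Nothing -> ioError (userError "callback did not run")

main :: IO ()
main = do
  (caller, callee, bound) <- observeCallback
  putStrLn ("thread that called out:  " ++ show caller)
  putStrLn ("thread that ran call-in: " ++ show callee)
  putStrLn ("call-in thread bound:    " ++ show bound)
  putStrLn ("distinct Haskell threads: " ++ show (caller /= callee))
```

The value reported by `isCurrentThreadBound` depends on the runtime: without `-threaded` the documentation says it is always `False`, so the sketch prints it rather than assuming a result.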