
After writing a fairly long, detailed reply (attached at end), I decided it would be simpler to write my take on what the design should be.

Goals
~~~~~

Since foreign libraries sometimes exploit thread local state, it is necessary to provide some control over which thread is used to execute foreign code. In particular, it is important that it should be possible for Haskell code to arrange that a sequence of calls to a given library is performed by the same native thread, and that if an external library calls into Haskell, then any outgoing calls from Haskell are performed by the same native thread.

This specification is intended to be implementable both by multithreaded Haskell implementations and by single-threaded implementations, and so it does not comment on which particular OS thread is used to execute Haskell code.

Design
~~~~~~

Haskell threads may be associated at thread creation time with either zero or one native threads. There are only two ways to create Haskell threads, so there are two cases to consider:

1) forkNativeThread :: IO () -> IO ()

   The fresh Haskell thread is associated with a fresh native thread.

   (ToDo: do we really need to use a fresh native thread or would a pool of threads be ok? The issue could be avoided by separating creation of the native thread from the 'associate' operation.)

2) Calls to a threadsafe foreign export allocate a fresh Haskell thread which is then associated with the calling native thread.

Calls to threadsafe foreign imports by threads which have an associated native thread are performed by that native thread.

Calls to any other foreign imports (i.e., 'safe' or 'unsafe' calls) may be made in other threads or, if it exists, in the associated native thread, at the implementation's discretion.

[ToDo: can Haskell threads with no associated thread make foreign calls using a thread associated with some other thread?]
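To make the "zero or one native threads" association concrete, here is a minimal sketch. It is not the forkNativeThread proposed above: it assumes GHC's Control.Concurrent, where forkOS plays a similar role and isCurrentThreadBound reports whether the current Haskell thread has an associated OS thread. The helper names boundIn and forkNative are made up for this illustration.

```haskell
import Control.Concurrent
  ( ThreadId, forkIO, forkOS, isCurrentThreadBound
  , newEmptyMVar, putMVar, takeMVar, rtsSupportsBoundThreads )

-- Run isCurrentThreadBound inside a thread created by the given fork
-- function and hand the answer back to the caller.
boundIn :: (IO () -> IO ThreadId) -> IO Bool
boundIn fork = do
  result <- newEmptyMVar
  _ <- fork (isCurrentThreadBound >>= putMVar result)
  takeMVar result

-- Hypothetical stand-in for forkNativeThread: use forkOS when the RTS
-- supports bound threads, otherwise fall back to an ordinary forkIO.
forkNative :: IO () -> IO ThreadId
forkNative
  | rtsSupportsBoundThreads = forkOS
  | otherwise               = forkIO

main :: IO ()
main = do
  green  <- boundIn forkIO      -- no associated native thread
  native <- boundIn forkNative  -- one associated native thread, if supported
  putStrLn ("forkIO thread bound:     " ++ show green)
  putStrLn ("forkNative thread bound: " ++ show native)
```

The fallback is there because forkOS fails on a runtime without OS-thread support, which is the single-threaded case the Goals section wants to keep implementable.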

Issues
~~~~~~

If multiple Haskell threads are associated with a single native thread, only one can make a foreign call at a time, but what if the thread calls back into Haskell? Obviously, this callback is allowed to call out to C too.

There is a third way to spawn a thread in GHC: running a finalizer attached to a weak pointer. There really ought to be a way to associate a native thread with finalizers. I guess the same applies to ForeignPtr finalizers, which are (implicitly) foreign function calls.

--
Alastair Reid                 alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited  http://www.reid-consulting-uk.ltd.uk/alastair/

A foreign exported callback that is called from C code executing in an OS thread that is not associated with a "native" haskell thread is executed in a new green haskell thread.
Wouldn't it be better to always execute it in the native thread, and give that thread equal status with native threads forked by Haskell code? As I understand the design at present, the only way to cause Haskell code to get executed in a particular native thread is to have Haskell fork the thread. This is fine for Haskell applications calling C libraries but doesn't seem too useful for C applications calling Haskell libraries. If the reason for wanting it to be executed in a green thread is efficiency, it seems we could say that 'threadsafe' calls will use the current calling thread and 'safe' and 'unsafe' calls may use a cheaper thread.
If a "native" haskell thread enters a foreign imported function that is marked as "safe" or "threadsafe", all other Haskell threads keep running. If the imported function is marked as "unsafe", no other threads are executed until the call finishes.
It seems like this is the wrong way around. 'unsafe' is meant to be the one that maximizes performance so surely that's the one where other threads are allowed to keep running if they want (and the implementation supports it). Also, the current ffi spec is written such that a compiler can choose to treat all 'unsafe' calls as 'safe' calls if it wishes. Your language seems to imply that programmers can rely on unsafe calls _not_ behaving like safe calls. What's the purpose of the restriction? Is it to make things cheaper or is it to provide programmers with guarantees about preemption and reentrancy?
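For reference, this is what the two annotations look like on actual imports. The sketch assumes the Haskell 2010 FFI syntax, which has only 'safe' and 'unsafe' ('threadsafe' as discussed in this thread is not part of it), and it binds two libm functions purely as an example. It shows only the declarations, not any particular scheduling behaviour.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble)

-- 'unsafe': the cheap form. The foreign function must not call back
-- into Haskell.
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -> CDouble

-- 'safe': the foreign function is allowed to call back into Haskell,
-- at the price of more bookkeeping around the call.
foreign import ccall safe "math.h cos"
  c_cos :: CDouble -> IO CDouble

main :: IO ()
main = do
  print (c_sin 0)
  c_cos 0 >>= print
```

Nothing in the declarations says whether other Haskell threads keep running during the call, which is exactly the point in question here.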
If a "green" haskell thread enters a foreign imported function marked as "threadsafe", a new OS thread is spawned that keeps executing other green haskell threads while the foreign function executes.
Do you intend that a fresh thread should be spawned each time or is it ok to maintain a pool of previously used threads?
Native haskell threads continue to run in their own OS threads. If a "green" haskell thread enters a foreign imported function marked as "safe", all other green threads are blocked.
Why are they blocked? Is it to provide some kind of guarantee about preemption and reentrancy or is it to make the implementation simpler?
Some people may want to run finalizers in specific OS threads (are finalizers predictable enough for this to be useful?).
As I remarked in my reply to Nicholas, this isn't too bad a restriction because we can arrange for all calls to be made by a worker thread which does all OpenGL interaction.
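A minimal sketch of that worker-thread arrangement, assuming GHC's Control.Concurrent. The names Worker, startWorker and runOn are hypothetical, and myThreadId merely stands in for a real library call such as an OpenGL function.

```haskell
import Control.Concurrent
  ( Chan, forkIO, forkOS, myThreadId, newChan, newEmptyMVar, putMVar
  , readChan, rtsSupportsBoundThreads, takeMVar, writeChan )
import Control.Exception (SomeException, throwIO, try)
import Control.Monad (forever, join)

-- A worker is a queue of jobs, all executed by one dedicated thread.
newtype Worker = Worker (Chan (IO ()))

startWorker :: IO Worker
startWorker = do
  jobs <- newChan
  -- Prefer a thread tied to one OS thread; fall back to forkIO when the
  -- runtime has no support for that.
  let fork = if rtsSupportsBoundThreads then forkOS else forkIO
  _ <- fork (forever (join (readChan jobs)))
  pure (Worker jobs)

-- Run an action on the worker's thread and wait for its result.
-- Exceptions are caught on the worker and rethrown in the caller, so a
-- failing job does not kill the worker.
runOn :: Worker -> IO a -> IO a
runOn (Worker jobs) action = do
  reply <- newEmptyMVar
  writeChan jobs (try action >>= putMVar reply)
  outcome <- takeMVar reply
  case outcome of
    Left err -> throwIO (err :: SomeException)
    Right x  -> pure x

main :: IO ()
main = do
  worker <- startWorker
  t1 <- runOn worker myThreadId
  t2 <- runOn worker myThreadId
  putStrLn ("same thread for both calls: " ++ show (t1 == t2))
```

The cost is one round trip through the queue per call, which is why this is a workaround rather than a substitute for being able to name the thread directly.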
Everyone would want SMP if it came for free (but SMP seems to be too hard to do at the moment...)
Of course, the semantics should be written so that SMP will just work. For that matter, I'd like it to be possible to implement this spec in Hugs. Hugs is internally single-threaded but this spec is concerned with what happens when Haskell calls out to C and we could arrange to switch into the appropriate native thread when Hugs calls out to C.
Other things I'm not sure about: What should we do if a foreign function spawns a new OS thread and executes a haskell callback in that OS thread? Should a new native haskell thread that executes in the OS thread be created? Should the new OS thread be blocked and the callback executed in a green thread? What does the current threaded RTS do? (I assume the non-threaded RTS will just crash?)
I think that if the thread makes [threadsafe?] foreign calls, they should be executed in that same native thread. (What OS thread is actually used to execute Haskell code is not observable so we needn't talk about it.) Think of it from the point of view of an external application calling a library that just happens to be written in Haskell. If I call some function in the library and it makes a call into some other library (or back into the application), I expect the call to be in the same thread unless there is explicit documentation to the contrary.
Some (not very concrete) examples: 1.) Let's assume a piece of C code --- let's call it foo() --- was called by a green haskell thread. If foo() now invokes a haskell function, the haskell function might be executed in a different OS thread than foo(). This means that if the haskell code calls another C function, bar(), then bar() doesn't have access to the same thread-local state as foo(). For example, if foo() sets up an OpenGL context, then bar() can't use it.
I think this is a mistake. There should be a way (e.g., marking both the foreign import and the foreign export as 'threadsafe') to force the use of the same native thread for the call-out as was used for the call-in.
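The call-out / call-in round trip can be tried without writing any C, using the FFI's "wrapper" and "dynamic" imports. This is only a sketch under the assumption of GHC's present-day FFI, where 'threadsafe' does not exist; viaCallback is a made-up helper. GHC's documentation describes call-ins as running in a bound thread on a runtime that supports them, so the second result below depends on how the program is built.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (isCurrentThreadBound)
import Control.Exception (bracket)
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

type Callback = CInt -> IO CInt

-- Turn a Haskell function into a C function pointer (the call-in side).
foreign import ccall "wrapper"
  mkCallback :: Callback -> IO (FunPtr Callback)

-- Call through a C function pointer (the call-out side).
foreign import ccall safe "dynamic"
  callCallback :: FunPtr Callback -> Callback

-- Haskell calls out through the pointer, which calls back into Haskell.
-- bracket frees the wrapper even if the call throws.
viaCallback :: Callback -> CInt -> IO CInt
viaCallback f x =
  bracket (mkCallback f) freeHaskellFunPtr (\fp -> callCallback fp x)

main :: IO ()
main = do
  doubled <- viaCallback (\n -> pure (2 * n)) 21
  print doubled
  -- Ask, from inside the callback, whether it has an associated OS thread.
  bound <- viaCallback (\_ -> fromIntegral . fromEnum <$> isCurrentThreadBound) 0
  putStrLn ("callback ran in a bound thread: " ++ show (bound == 1))
```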