RE: Native Threads in the RTS

| I've now written up a slightly more formal proposal for native threads.
| (OK, it's only a tiny bit more formal...)
| I doubt I have explained everything clearly, please tell me which
| points are unclear. And of course please tell me what you like/don't
| like about it.

Great, thanks. I hope you'll keep it up to date so that by the time the discussion converges it can serve as a specification and rationale. We can put it in CVS too... Simon will think of where!

Ultimately it'd be worth integrating with http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/rts-libs/multi-thread.html

Simon

| A "native" haskell thread and all foreign imported functions that it
| calls are executed in its associated OS thread.

This part is ok

| A foreign exported
| callback that is called from C code executing in that OS thread is
| executed in the native haskell thread.

This is the bit I don't understand. Is the only scenario you have in mind here

    native Haskell thread calls C which calls Haskell

and you want all that in the same native thread?

What about this?

    native Haskell thread calls C which installs a pointer to a foreign-exported Haskell function in some C data structure

    Later... some other Haskell thread calls C which waits for an event which calls the callback

So the callback was installed by a native thread, but won't be executed by it. Is that ok?

Anyway I think it would be worth explaining what is guaranteed a bit more clearly.

| If a "green" haskell thread enters a foreign imported function marked
| as "safe", all other green threads are blocked. Native haskell threads
| continue to run in their own OS threads.

No, I don't think so. The reason that 'safe' is cheaper than 'threadsafe' is that the current worker OS thread does not need to release the Big Lock it holds on the Haskell heap, thereby allowing other green threads to run. Instead, it holds the lock, executes the call, and returns. At least I think this is the idea, but it's all jolly slippery.

| Other things I'm not sure about:

Presumably if a native thread spawns a thread using forkIO, it gets just a green thread? If it used forkNativeThread it gets a distinct native thread. Better say this.

> Great, thanks. I hope you'll keep it up to date so that by the time the discussion converges it can serve as a specification and rationale. We can put it in CVS too... Simon will think of where!
Until then, I'll play the role of a "human CVS server".
> Ultimately it'd be worth integrating with http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/rts-libs/multi-thread.html
Of course. Some parts should be part of the user documentation, while others should probably be considered implementation details.
> | A foreign exported
> | callback that is called from C code executing in that OS thread is
> | executed in the native haskell thread.
>
> This is the bit I don't understand. Is the only scenario you have in mind here
>
>     native Haskell thread calls C which calls Haskell
>
> and you want all that in the same native thread?
Yes, exactly.
> What about this?
>
>     native Haskell thread calls C which installs a pointer to a foreign-exported Haskell function in some C data structure
>
>     Later... some other Haskell thread calls C which waits for an event which calls the callback
>
> So the callback was installed by a native thread, but won't be executed by it. Is that ok?
Definitely. It's the same way it works in C. What thread some code executes in depends on what thread the code is called from.
> Anyway I think it would be worth explaining what is guaranteed a bit more clearly.
I'm not sure how... to me it looks like I already specified this exactly ;-). Anyway, I've added some examples to the proposal to clarify what I mean.
> | If a "green" haskell thread enters a foreign imported function marked
> | as "safe", all other green threads are blocked. Native haskell threads
> | continue to run in their own OS threads.
>
> No, I don't think so. The reason that 'safe' is cheaper than 'threadsafe' is that the current worker OS thread does not need to release the Big Lock it holds on the Haskell heap, thereby allowing other green threads to run. Instead, it holds the lock, executes the call, and returns. At least I think this is the idea, but it's all jolly slippery.
I thought that was "unsafe"? The "safe" version still does quite a lot (after all, callbacks are allowed, and so is GC). In addition, "threadsafe" may start a new OS thread in order to keep executing green threads. On the other hand, we might simply leave it unspecified: if people want to know what happens to other threads, they should use "threadsafe" or "unsafe". The exact behaviour of "safe" seems to be an implementation detail.
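For readers who have not seen the annotations being argued about, here is a minimal sketch of where the safety level sits in a foreign import. It uses only `safe` and `unsafe`, the two levels defined by the Haskell 2010 FFI; the `threadsafe` level discussed in this thread is left out. The names `sinUnsafe` and `sinSafe` are made up for the illustration, and the C function `sin` is chosen only because it is always available.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- 'unsafe': the cheapest kind of call. The C code must not call back
-- into Haskell while it runs.
foreign import ccall unsafe "math.h sin"
  c_sin_unsafe :: CDouble -> CDouble

-- 'safe': the runtime does extra bookkeeping around the call so that
-- callbacks into Haskell (and garbage collection) are permitted.
foreign import ccall safe "math.h sin"
  c_sin_safe :: CDouble -> CDouble

-- Hypothetical wrappers that hide the C types.
sinUnsafe, sinSafe :: Double -> Double
sinUnsafe = realToFrac . c_sin_unsafe . realToFrac
sinSafe   = realToFrac . c_sin_safe . realToFrac
```

Both imports call the same C function and return the same result; the annotation only changes what the runtime is allowed to assume while the call is in progress, which is exactly the part this thread is trying to pin down.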
> | Other things I'm not sure about:
>
> Presumably if a native thread spawns a thread using forkIO, it gets just a green thread? If it used forkNativeThread it gets a distinct native thread. Better say this.

"The main program and all haskell threads forked using forkIO are green threads. Threads forked using forkNativeThread :: IO () -> IO () are native threads."

I thought that was clear enough... I've added a note.

Cheers,
Wolfgang

*****************

Native Threads Proposal, version 2

Some "foreign" libraries (for example OpenGL) rely on a mechanism called thread-local storage. The meaning of an OpenGL call therefore usually depends on which OS thread it is called from. Therefore, some kind of direct mapping from Haskell threads to OS threads is necessary in order to use the affected foreign libraries.

Executing every haskell thread in its own OS thread is not feasible for performance reasons. However, performance of native OS threads is not too bad as long as there aren't too many, so I propose that some threads get their own OS threads, and some don't:

Every Haskell Thread can be either a "green" thread or a "native" thread.

For each "native" thread, there is exactly one OS thread created by the RTS. For a green thread, it is unspecified which OS thread it is executed in.

The main program and all haskell threads forked using forkIO are green threads. Threads forked using forkNativeThread :: IO () -> IO () are native threads. (Note: The type of the current haskell thread does _not_ matter when forking new threads)

Execution of a green thread might move from one OS thread to another at any time. A "green" thread is never executed in an OS thread that is reserved for a "native" thread.

A "native" haskell thread and all foreign imported functions that it calls are executed in its associated OS thread. A foreign exported callback that is called from C code executing in that OS thread is executed in the native haskell thread.

A foreign exported callback that is called from C code executing in an OS thread that is not associated with a "native" haskell thread is executed in a new green haskell thread.

Only one OS thread can execute Haskell code at any given time.

If a "native" haskell thread enters a foreign imported function that is marked as "safe" or "threadsafe", all other Haskell threads keep running. If the imported function is marked as "unsafe", no other threads are executed until the call finishes.

If a "green" haskell thread enters a foreign imported function marked as "threadsafe", a new OS thread is spawned that keeps executing other green haskell threads while the foreign function executes. Native haskell threads continue to run in their own OS threads.

If a "green" haskell thread enters a foreign imported function marked as "safe", all other green threads are blocked. It is implementation dependent whether native haskell threads continue to run in their own OS threads. If the imported function is marked as "unsafe", no other threads are executed until the call finishes.

Finalizers are always run in green threads.

Issues deliberately not addressed in this proposal:

Some people may want to run several Haskell threads in a dedicated OS thread (this is what has been called "thread groups" before).
Some people may want to run finalizers in specific OS threads (are finalizers predictable enough for this to be useful?).
Everyone would want SMP if it came for free (but SMP seems to be too hard to do at the moment...)

Other things I'm not sure about:

What should we do if a foreign function spawns a new OS thread and executes a haskell callback in that OS thread? Should a new native haskell thread that executes in the OS thread be created? Should the new OS thread be blocked and the callback executed in a green thread? What does the current threaded RTS do? (I assume the non-threaded RTS will just crash?)

Some (not very concrete) examples:

1.) Let's assume a piece of C code --- let's call it foo() --- was called by a green haskell thread. If foo() now invokes a haskell function, the haskell function might be executed in a different OS thread than foo(). This means that if the haskell code calls another C function, bar(), then bar() doesn't have access to the same thread-local state as foo(). For example, if foo() sets up an OpenGL context, then bar() can't use it.

2.) If foo() was invoked by a native haskell thread, it is guaranteed that all haskell functions invoked by foo() run in the same native haskell thread and therefore in the same OS thread. Now if the haskell code again calls bar(), then bar() is executed in the same OS thread as foo() and the native haskell thread. This means that bar() has access to the same thread-local state as foo() (---> OpenGL works).

3.) A piece of C code creates a new OS thread and calls a haskell function in that new OS thread. I don't think it makes sense to run the haskell function in an existing haskell thread, so we'll create a new one. What kind of haskell thread (native or green) should the haskell function run in? I'm slightly in favour of a new native thread (after all, the C code might have its reasons for spawning a new OS thread).
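The green/native distinction in the proposal can be tried out with a small sketch. It rests on an assumption that is not part of the proposal text: that `forkOS` and `isCurrentThreadBound` from `Control.Concurrent` in present-day GHC are close enough to the proposed native threads to stand in for them. The `forkNativeThread` below is therefore a hypothetical wrapper, not the function as proposed, and `boundIn` and `demo` are names invented for the illustration.

```haskell
import Control.Concurrent
  ( forkIO, forkOS, isCurrentThreadBound, rtsSupportsBoundThreads
  , newEmptyMVar, putMVar, takeMVar )

-- Hypothetical stand-in for the proposed forkNativeThread.
-- Assumption: a GHC "bound" thread plays the role of a "native" thread.
-- Without OS-thread support in the runtime it degrades to forkIO.
forkNativeThread :: IO () -> IO ()
forkNativeThread act
  | rtsSupportsBoundThreads = () <$ forkOS act
  | otherwise               = () <$ forkIO act

-- Fork a child with the given function and report whether the child
-- is tied to an OS thread of its own.
boundIn :: (IO () -> IO ()) -> IO Bool
boundIn fork = do
  result <- newEmptyMVar
  fork (isCurrentThreadBound >>= putMVar result)
  takeMVar result

-- (result for a "native" child, result for a "green" child)
demo :: IO (Bool, Bool)
demo = do
  native <- boundIn forkNativeThread
  green  <- boundIn (\act -> () <$ forkIO act)
  pure (native, green)
```

When linked with `-threaded`, `demo` should report a bound thread only for the first child. Without OS-thread support the wrapper falls back to `forkIO`, so neither child is bound.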

Hello,

I read your proposal. It's great but I have a few remarks:

* I think that, if it is not too complicated, it would be great to put many threads in the OpenGL OS thread. The goal of Concurrent Haskell was to allow concurrency for expressivity. It would be a pity to lose this in parts of programs for technical reasons. Having this possibility would also be a plus for the language: Haskell would be the only language to have safe multithreaded OpenGL programming.

* Another problem can arise: if one renders in two different OpenGL windows, one may want to use different threads for these renderings. However, this is impossible for the moment because it would imply that user threads know when a switch has occurred to a thread rendering in another context, and swap the OpenGL context. This implies either allowing arbitrary code to be executed on a switch, or introducing a notion of a "family" of threads. When a switch occurs to a member of a family different from the last member of that family executed, some user-defined code should be executed (in that case the code performs a context switch). Families and their members would be defined by the user.

  It seems that families and OS threads are independent: there could be either multiple OS threads for a family or multiple families for an OS thread. I think a thread could be a member of multiple families (even if I can't see a pertinent example so far). I also think that multiple threads can be the same member of a family (multiple threads drawing in the same window).

* A simple way of introducing these new notions (family, members, OS threads) would be to define them as first-class values. This would ease the problem of callbacks: an OpenGL callback would simply be a call to an operator telling the OpenGL thread to actually execute the callback. "threadsafe" could be shorthand for wrapping a callback in a call to the thread currently executing (at the time of the call to the threadsafe function) and giving it to the external function.

* To protect OpenGL operations, it would perhaps be useful to introduce a mechanism forbidding switches between members of a family during a critical section. (I don't know what a context switch between a glBegin and a glEnd does.)

I don't know if my proposal is pertinent but it addresses some problems that would arise. It's quite complicated but I think that there is no extra cost for people who don't need to use it.

Best regards,
Nicolas Oury

Nicolas Oury wrote:
> * I think that, if it is not too complicated, it would be great to put many threads in the OpenGL OS thread. The goal of Concurrent Haskell was to allow concurrency for expressivity. It would be a pity to lose this in parts of programs for technical reasons. Having this possibility would also be a plus for the language: Haskell would be the only language to have safe multithreaded OpenGL programming.
You can safely render into two different OpenGL contexts from two different OS threads. I don't think that rendering into the same context from two green threads would work - the OpenGL interface is far too thread-based for this to be useful.
> * Another problem can arise: if one renders in two different OpenGL windows, one may want to use different threads for these renderings. However, this is impossible for the moment because it would imply that user threads know when a switch has occurred to a thread rendering in another context, and swap the OpenGL context. This implies either allowing arbitrary code to be executed on a switch, [...]
If we want to render into two different OpenGL windows in parallel, we can use two OS threads. OpenGL keeps a reference to its current OpenGL context on a per-OS-thread basis (some old OpenGL implementations might not support this, but I think we can ignore them).
> [...] some user-defined code should be executed (in that case the code performs a context switch) [...]
Haskell Code won't work here [after all, we're between two haskell threads...]. C code would be no problem. I actually proposed something like this as a stopgap measure for making OpenGL work with the threaded RTS in summer, but I was convinced by others on this list that this is a "hackish" solution that relies on internals of the RTS far too much.
> It seems that families and OS threads are independent: there could be either multiple OS threads for a family or multiple families for an OS thread. I think a thread could be a member of multiple families (even if I can't see a pertinent example so far). I also think that multiple threads can be the same member of a family (multiple threads drawing in the same window).
What would it mean if a thread was a member of more than one thread family? Would it mean that it might execute in several different OS threads? Also, how would these thread groups interact with the existing threaded RTS? Would the existing features still be available without additional effort? Would they be implemented on top of these thread families? I'm not quite convinced that the thread families approach would be worth the additional complexity. What would it be used for?
> * To protect OpenGL operations, it would perhaps be useful to introduce a mechanism forbidding switches between members of a family during a critical section. (I don't know what a context switch between a glBegin and a glEnd does.)
We already have MVars, they can be used for things like that.

Can anybody else think of reasons why we should need a more complicated design where threads are put into different "families" or "groups", where each thread group executes in exactly one OS thread? This has been proposed at least twice now, but I fail to see the advantages (it looks more flexible, but what's it _for_?).

Some disadvantages of a thread groups approach are:

*) More complexity.
*) Foreign calls will block other threads in the same group.
*) It would be even less meaningful for a haskell implementation that always uses OS threads --- the native/green threads proposal could be implemented as a no-op (forkNativeThread = forkIO) without breaking programs that use it.

Somebody else please fill in the advantages.

Cheers,
Wolfgang
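The remark that MVars already cover critical sections can be made concrete with a small sketch. Nothing here touches OpenGL: the read and write of an `IORef` merely stand in for a glBegin/glEnd pair, and `critical` and `demo` are names invented for the illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)
import Data.IORef

-- Run an action while holding the lock. withMVar puts the lock back
-- even if the action throws an exception.
critical :: MVar () -> IO a -> IO a
critical lock act = withMVar lock (const act)

-- Four threads each perform 1000 read-modify-write steps on a shared
-- counter, every step inside the critical section.
demo :: IO Int
demo = do
  lock    <- newMVar ()
  counter <- newIORef (0 :: Int)
  done    <- newEmptyMVar
  forM_ [1 .. 4 :: Int] $ \_ -> forkIO $ do
    replicateM_ 1000 $ critical lock $ do
      n <- readIORef counter       -- stands in for glBegin
      writeIORef counter (n + 1)   -- stands in for glEnd
    putMVar done ()
  replicateM_ 4 (takeMVar done)    -- wait for all four threads
  readIORef counter
```

The lock does not stop the scheduler from switching threads inside the section. It only guarantees that no other thread that also goes through `critical` can enter while one is inside, so it protects the section only if every participating thread uses it.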

Nicolas Oury wrote:
> * I think that, if it is not too complicated, it would be great to put many threads in the OpenGL OS thread. The goal of Concurrent Haskell was to allow concurrency for expressivity. It would be a pity to lose this in parts of programs for technical reasons. Having this possibility would also be a plus for the language: Haskell would be the only language to have safe multithreaded OpenGL programming.
Note that you can achieve the goal of having multithreaded OpenGL programming using Wolfgang's proposal. All you need to do is fork a worker thread which will make all the OpenGL calls. Any green thread wanting to make a call just sends a request to the worker and waits for a response.

The code would look something like this:

    import Control.Concurrent.MVar

    type Response = ()
    type Job = IO Response
    type Ch = (MVar Job, MVar Response)

    -- the worker thread runs this
    worker :: Ch -> IO ()
    worker ch@(jobs, responses) = do
      j <- takeMVar jobs         -- wait for the next job and remove it
      a <- j                     -- run it in the worker's own thread
      putMVar responses a        -- hand the result back to the caller
      worker ch

    -- green threads call this to get work done
    doit :: Ch -> Job -> IO Response
    doit (jobs, responses) j = do
      putMVar jobs j
      takeMVar responses

Things get a little tedious if you want multiple return types.

--
Alastair Reid alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited
http://www.reid-consulting-uk.ltd.uk/alastair/
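One possible way around the tedium of multiple return types, offered as a sketch and not as part of the message above: let every request carry its own reply box, so the job queue only ever holds `IO ()` closures. The names `Worker`, `startWorker`, `runOn` and `demo` are invented for the illustration, and `forkIO` is only a placeholder for whatever call ends up creating the thread that owns the OpenGL context.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forever, join)

-- A job is a closure that already knows where to deliver its result,
-- so the queue itself needs no result type.
newtype Worker = Worker (MVar (IO ()))

-- Start the worker loop: repeatedly take a job and run it.
-- Placeholder: a real program would fork this loop in the thread that
-- owns the foreign library's thread-local state.
startWorker :: IO Worker
startWorker = do
  jobs <- newEmptyMVar
  _ <- forkIO (forever (join (takeMVar jobs)))
  pure (Worker jobs)

-- Run an action on the worker and wait for its result, whatever its type.
runOn :: Worker -> IO a -> IO a
runOn (Worker jobs) act = do
  reply <- newEmptyMVar                  -- private to this request
  putMVar jobs (act >>= putMVar reply)
  takeMVar reply

-- Two requests with different result types sent to the same worker.
demo :: IO (Int, String)
demo = do
  w <- startWorker
  n <- runOn w (pure (6 * 7))
  s <- runOn w (pure "rendered")
  pure (n, s)
```

Because each caller waits on its own `MVar`, two callers cannot pick up each other's responses. The sketch does no exception handling: if a job throws, the worker loop ends and the caller waits forever, so a real version would need to catch the exception and pass it back.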

After writing a fairly long, detailed reply (attached at end), I decided it would be simpler to write my take on what the design should be.

Goals
~~~~~

Since foreign libraries sometimes exploit thread local state, it is necessary to provide some control over which thread is used to execute foreign code. In particular, it is important that it should be possible for Haskell code to arrange that a sequence of calls to a given library are performed by the same native thread and that if an external library calls into Haskell, then any outgoing calls from Haskell are performed by the same native thread.

This specification is intended to be implementable both by multithreaded Haskell implementations and by single-threaded implementations and so it does not comment on which particular OS thread is used to execute Haskell code.

Design
~~~~~~

Haskell threads may be associated at thread creation time with either zero or one native threads. There are only two ways to create Haskell threads so there are two cases to consider:

1) forkNativeThread :: IO () -> IO ()

   The fresh Haskell thread is associated with a fresh native thread.

   (ToDo: do we really need to use a fresh native thread or would a pool of threads be ok? The issue could be avoided by separating creation of the native thread from the 'associate' operation.)

2) Calls to a threadsafe foreign export allocate a fresh Haskell thread which is then associated with the calling native thread.

Calls to threadsafe foreign imports by threads which have an associated native thread are performed by that native thread.

Calls to any other foreign imports (i.e., 'safe' or 'unsafe' calls) may be made in other threads or, if it exists, in the associated native thread at the implementation's discretion.

[ToDo: can Haskell threads with no associated thread make foreign calls using a thread associated with some other thread?

Issues
~~~~~~

If multiple Haskell threads are associated with a single native thread, only one can make a foreign call at a time, but what if the thread calls back into Haskell? Obviously, this callback is allowed to call out to C too.

There is a third way to spawn a thread in GHC: running a finalizer attached to a weak pointer. There really ought to be a way to associate a native thread with finalizers. I guess the same applies to ForeignPtr finalizers which are (implicitly) foreign function calls.

--
Alastair Reid alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited
http://www.reid-consulting-uk.ltd.uk/alastair/
> A foreign exported callback that is called from C code executing in an OS thread that is not associated with a "native" haskell thread is executed in a new green haskell thread.
Wouldn't it be better to always execute it in the native thread, and give that thread equal status with native threads forked by Haskell code? As I understand the design at present, the only way to cause Haskell code to get executed in a particular native thread is to have Haskell fork the thread. This is fine for Haskell applications calling C libraries but doesn't seem too useful for C applications calling Haskell libraries. If the reason for wanting it to be executed in a green thread is efficiency, it seems we could say that 'threadsafe' calls will use the current calling thread and 'safe' and 'unsafe' calls may use a cheaper thread.
> If a "native" haskell thread enters a foreign imported function that is marked as "safe" or "threadsafe", all other Haskell threads keep running. If the imported function is marked as "unsafe", no other threads are executed until the call finishes.
It seems like this is the wrong way around. 'unsafe' is meant to be the one that maximizes performance so surely that's the one where other threads are allowed to keep running if they want (and the implementation supports it). Also, the current ffi spec is written such that a compiler can choose to treat all 'unsafe' calls as 'safe' calls if it wishes. Your language seems to imply that programmers can rely on unsafe calls _not_ behaving like safe calls. What's the purpose of the restriction? Is it to make things cheaper or is it to provide programmers with guarantees about preemption and reentrancy?
> If a "green" haskell thread enters a foreign imported function marked as "threadsafe", a new OS thread is spawned that keeps executing other green haskell threads while the foreign function executes.
Do you intend that a fresh thread should be spawned each time or is it ok to maintain a pool of previously used threads?
> Native haskell threads continue to run in their own OS threads. If a "green" haskell thread enters a foreign imported function marked as "safe", all other green threads are blocked.
Why are they blocked? Is it to provide some kind of guarantee about preemption and reentrancy or is it to make the implementation simpler?
> Some people may want to run finalizers in specific OS threads (are finalizers predictable enough for this to be useful?).
As I remarked in my reply to Nicolas, this isn't too bad a restriction because we can arrange for all calls to be made by a worker thread which does all OpenGL interaction.
> Everyone would want SMP if it came for free (but SMP seems to be too hard to do at the moment...)
Of course, the semantics should be written so that SMP will just work. For that matter, I'd like it to be possible to implement this spec in Hugs. Hugs is internally single-threaded but this spec is concerned with what happens when Haskell calls out to C and we could arrange to switch into the appropriate native thread when Hugs calls out to C.
> Other things I'm not sure about: What should we do if a foreign function spawns a new OS thread and executes a haskell callback in that OS thread? Should a new native haskell thread that executes in the OS thread be created? Should the new OS thread be blocked and the callback executed in a green thread? What does the current threaded RTS do? (I assume the non-threaded RTS will just crash?)
I think that if the thread makes [threadsafe?] foreign calls, they should be executed in that same native thread. (What OS thread is actually used to execute Haskell code is not observable so we needn't talk about it.) Think of it from the point of view of an external application calling a library that just happens to be written in Haskell. If I call some function in the library and it makes a call into some other library (or back into the application), you expect the call to be in the same thread unless there is explicit documentation to the contrary.
> Some (not very concrete) examples: 1.) Let's assume a piece of C code --- let's call it foo() --- was called by a green haskell thread. If foo() now invokes a haskell function, the haskell function might be executed in a different OS thread than foo(). This means that if the haskell code calls another C function, bar(), then bar() doesn't have access to the same thread-local state as foo(). For example, if foo() sets up an OpenGL context, then bar() can't use it.
I think this is a mistake. There should be a way (e.g., marking both the foreign import and the foreign export as 'threadsafe') to force the use of the same native thread for the call-out as was used for the call-in.

After sending this mail this morning, I realized that threadsafety is largely orthogonal to the choice of which thread to run in. For example, I might want to make an 'unsafe' call in a particular native thread.

So my proposed spec should add a second, orthogonal choice of ffi call types ('native'|'green') which may be specified _in addition to_ the current 'threadsafe'|'safe'|'unsafe'.

-- Alastair

ps Better names than 'native' and 'green' surely exist. Something which conveys the idea that the thread will be remembered for later use seems appropriate but no good words spring to mind.

On 26 Nov 2002, Alastair Reid wrote:
> ps Better names than 'native' and 'green' surely exist. Something which conveys the idea that the thread will be remembered for later use seems appropriate but no good words spring to mind.
Perhaps "bound" and "free"?

On Tue, 2002-11-26 at 08:32, Dean Herington wrote:
> On 26 Nov 2002, Alastair Reid wrote:
> > ps Better names than 'native' and 'green' surely exist. Something which conveys the idea that the thread will be remembered for later use seems appropriate but no good words spring to mind.
> Perhaps "bound" and "free"?
Won't that be confused with bound and free as used in lambda calculus? Or am I missing the point and you mean that they are analogous to bound and free in lambda calculus? (If the latter, I need to reread my lambda calculus books. :) )
-- Seth Kurtzberg
participants (6): Alastair Reid, Dean Herington, Nicolas Oury, Seth Kurtzberg, Simon Peyton-Jones, Wolfgang Thaller