FFI: number of worker threads?

Hello,

The paper "Extending the Haskell FFI with Concurrency" mentions the following in Section 6.3: "GHC's run-time system employs one OS thread for every bound thread; additionally, there is a variable number of so-called "worker" OS threads that are used to execute the unbounded (lightweight) threads."

How does the runtime system determine the number of worker threads? Is the number hardcoded in the RTS or dynamically adjustable? Can a programmer specify it as an RTS option or change it using an API?

I would like to use a large number (say, 2000) of unbounded threads, each calling a blocking, safe foreign function via FFI import. What is supposed to happen if all the worker threads are used up? I tried this in the recent GHC 6.5 and got some kind of "runaway worker threads?" RTS failure message when more than 32 threads were used. Is this a current limitation of the RTS, or should I file a bug report?

Thanks, Peng

I have a related question. The docs state that in some environments O/S threads are used when the -threaded flag is used with ghc, and non-O/S threads are used otherwise (presumably these are non-preemptive). Does this apply as well to the worker threads that are the subject of this email?

Hello Seth,

Wednesday, June 21, 2006, 7:18:48 AM, you wrote:

Seth and Li, look at http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/rts-libs/multi-thread.html --- it may answer some of your questions. (The page http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/ contains commentaries about GHC internals.)
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

New worker threads are spawned as needed. You'll need as many of them as you have simultaneously-blocked foreign calls. If you have 2000 simultaneously-blocked foreign calls, you'll need 2000 OS threads to support them, which probably won't work.

If you think you have only a handful of simultaneously-blocked foreign calls, but you still get "runaway worker threads", please do make a reproducible test case and file a bug report.

Simon M will probably reply to Seth's questions about thread IDs (I assume you mean Haskell thread IDs?) in due course.

Once you get answers, can I ask either or both of you to type what you learned into the GHC user-documentation wiki? That way things improve! The place to start is http://haskell.org/haskellwiki/GHC under "Collaborative documentation". There's already a page for "Concurrency" and one for "FFI", so you can add to those.

Thanks, Simon
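To make this concrete, here is a minimal sketch of the situation being discussed (the code is illustrative, not from the thread; it assumes a Unix system, since it imports sleep() from unistd.h, and works best compiled with -threaded): each lightweight thread sits in a blocking safe foreign call, so each simultaneously-blocked call occupies one worker OS thread.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import Foreign.C.Types (CUInt)

-- A 'safe' import tells the RTS the call may block, so it can hand the
-- call to a worker OS thread and keep running other Haskell threads.
foreign import ccall safe "unistd.h sleep"
    c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
    done <- newEmptyMVar
    -- A small count for illustration (the thread discusses 2000); each
    -- of these threads blocks in a foreign call, so each needs its own
    -- worker OS thread while it is blocked.
    forM_ [1 .. 4 :: Int] $ \_ -> forkIO $ do
        _ <- c_sleep 1
        putMVar done ()
    replicateM_ 4 (takeMVar done)
    putStrLn "all blocking calls returned"
```

With -threaded the four sleeps overlap on four worker threads; without it, safe calls still complete, but the program serializes.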

On 6/21/06, Simon Peyton-Jones
New worker threads are spawned as needed. You'll need as many of them as you have simultaneously-blocked foreign calls. If you have 2000 simultaneously-blocked foreign calls, you'll need 2000 OS threads to support them, which probably won't work.
2000 OS threads definitely sounds scary, but it can work. Linux NPTL threads scale well up to 10K threads, and the stack address space would be sufficient on 64-bit systems. I am thinking about p2p applications where each peer maintains a huge number of TCP connections to other peers, but most of these connections are idle. Unfortunately the default GHC RTS multiplexes I/O using "select", which is O(n) and appears to have an FD_SETSIZE limit of 1024.

That makes me wonder whether the current design of the GHC RTS is optimal in the long run. As software and hardware evolve, we will have efficient OS threads (like NPTL) and huge (64-bit) address spaces. My guess is:

(1) It is always a good idea to multiplex GHC user-level threads on OS threads, because it improves performance.

(2) It may not be optimal to multiplex nonblocking I/O inside the GHC RTS, because it is unrealistic to have an event-driven I/O interface that is both efficient (like AIO/epoll) and portable (like select/poll). What is worse, nonblocking I/O still blocks on disk accesses. On the other hand, POSIX threads are portable and can be implemented efficiently on many systems. At least on Linux, NPTL easily beats "select"!

My wish is to have a future GHC implementation that (a) uses blocking I/O directly provided by the OS, and (b) provides more control over OS threads and the internal worker thread pool. Using blocking I/O would simplify the current design and allow the programmer to take advantage of high-performance OS threads. If non-blocking I/O is really needed, the programmer can use customized, Claessen-style threads wrapped in modular libraries---some of my preliminary tests show that Claessen-style threads can do a much better job of multiplexing asynchronous I/O.
If you think you have only a handful of simultaneously-blocked foreign calls, but you still get "runaway worker threads", please do make a reproducible test case and file a bug report.
Yes, I will try to make a reproducible test case soon.
Once you get answers, can I ask either or both of you to type what you learned into the GHC user-documentation wiki? That way things improve! The place to start is http://haskell.org/haskellwiki/GHC under "Collaborative documentation". There's already a page for "Concurrency" and one for "FFI", so you can add to those. Thanks
Certainly!

On Wed, 2006-06-21 at 12:31 -0400, Li, Peng wrote:
(1) It is always a good idea to multiplex GHC user-level threads on OS threads, because it improves performance.
Indeed.
(2) It may not be optimal to multiplex nonblocking I/O inside the GHC RTS, because it is unrealistic to have an event-driven I/O interface that is both efficient (like AIO/epoll) and portable (like select/poll). What is worse, nonblocking I/O still blocks on disk accesses. On the other hand, POSIX threads are portable and can be implemented efficiently on many systems. At least on Linux, NPTL easily beats "select"!
On Linux, epoll scales very well with minimal overhead. Using multiple OS threads to do blocking I/O would not scale in the case of lots of idle socket connections: you'd need one OS thread per socket.

The I/O is actually no longer done inside the RTS; it's done by a Haskell worker thread. So it should be easier now to use platform-specific select() replacements. It's already different between Unix and Win32.

So I'd suggest the best approach is to keep the existing multiplexing non-blocking I/O system and start to take advantage of more scalable I/O APIs on the platforms we really care about (either select/poll replacements or AIO).

Duncan

On 6/21/06, Duncan Coutts
On linux, epoll scales very well with minimal overhead. Using multiple OS threads to do blocking IO would not scale in the case of lots of idle socket connections, you'd need one OS thread per socket.
On Linux, OS threads can also scale very well. I have done an experiment using pipes and NPTL where most connections are idle---performance scales linearly up to 32K file descriptors and 16K threads.
The IO is actually no longer done inside the RTS, it's done by a Haskell worker thread. So it should be easier now to use platform-specific select() replacements. It's already different between unix/win32.
So I'd suggest the best approach is to keep the existing multiplexing non-blocking IO system and start to take advantage of more scalable IO APIs on the platforms we really care about (either select/poll replacements or AIO).
It is easy to take advantage of epoll---it shouldn't be that hard to bake it in. The question is about flexibility: do we want it to be edge-triggered or level-triggered?

Even with epoll built in, disk performance cannot keep up with NPTL unless AIO is also built in. But AIO is more complicated: it bypasses OS caching, and Linux AIO even requires the use of certain kinds of file systems. My idea is that not everybody needs high-performance, asynchronous or nonblocking I/O. For those who really need it, it is worth (or necessary) writing their own event loops, and event-driven programming in Haskell is not that difficult using CPS monads.
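The Claessen-style threads mentioned here can be sketched in a few lines, following "A Poor Man's Concurrency Monad" (all names below are illustrative, not from any particular library): a thread is a continuation-passing computation over a tiny action language, and the user-level scheduler is exactly the place where an epoll- or AIO-based event loop could decide which threads are runnable.

```haskell
import Control.Monad (ap, liftM)

-- A thread is a CPS computation producing an Action.
newtype Thread a = Thread { runThread :: (a -> Action) -> Action }

data Action = Atom (IO Action)     -- one atomic IO step, then continue
            | Fork Action Action   -- spawn a second thread
            | Stop                 -- thread finished

instance Functor Thread where fmap = liftM
instance Applicative Thread where
    pure x = Thread ($ x)
    (<*>)  = ap
instance Monad Thread where
    Thread m >>= f = Thread $ \k -> m (\a -> runThread (f a) k)

-- Lift an IO action into the thread monad as one atomic step.
atom :: IO a -> Thread a
atom io = Thread $ \k -> Atom (fmap k io)

fork :: Thread () -> Thread ()
fork t = Thread $ \k -> Fork (runThread t (const Stop)) (k ())

-- Round-robin scheduler: this loop is where a real event loop
-- (select, epoll, AIO completions, ...) would plug in.
schedule :: [Action] -> IO ()
schedule []       = return ()
schedule (a : as) = case a of
    Atom step  -> step >>= \a' -> schedule (as ++ [a'])
    Fork a1 a2 -> schedule (as ++ [a1, a2])
    Stop       -> schedule as

run :: Thread () -> IO ()
run t = schedule [runThread t (const Stop)]

main :: IO ()
main = run $ do
    fork (atom (putStrLn "child"))
    atom (putStrLn "main")
```

Running main prints "child" then "main": the scheduler runs the forked thread's first atom before returning to the parent.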

On Wed, 2006-06-21 at 17:55 +0100, Duncan Coutts wrote:
So I'd suggest the best approach is to keep the existing multiplexing non-blocking IO system and start to take advantage of more scalable IO APIs on the platforms we really care about (either select/poll replacements or AIO).
FYI: Felix provides a thread for socket I/O. Only one is required. It uses epoll/kqueue/IO completion ports/select, depending on OS support. Similarly, there is one thread to handle timer events. We're working on a similar arrangement for asynchronous file I/O, which many OSes (Windows and Linux, at least) support.

There are only a limited number of devices you can connect to a computer. You shouldn't need more than a handful of threads, unless the OS design is entirely lame... in which case you'd not be trying to run high-performance, heavily loaded applications on that system in the first place.

Our biggest headache is non-reentrant APIs such as OpenGL, which are only re-entrant on a per-process basis. This doesn't play well with pre-emptive threading, and it's also not good for cooperative threading. The only real solution here is to run a server thread and a thread-safe abstraction layer, which cooperate to do context switches when necessary.

-- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net

Hello Peng, Wednesday, June 21, 2006, 8:31:41 PM, you wrote:
My wish is to have a future GHC implementation that (a) uses blocking I/O directly provided by the OS, and (b) provides more control over OS threads and the internal worker thread pool. Using blocking I/O will simplify the current design and allow the programmer to take advantage of high-performance OS threads. If non-blocking I/O is really needed, the programmer can use customized, Claessen-style threads wrapped in modular libraries---some of my preliminary tests show that Claessen-style threads can do a much better job to multiplex asynchronous I/O.
All I/O is done by library procedures, so we can use other implementations (other libraries) without waiting for GHC changes (and GHC, IMHO, will be changed to include some such library instead of adding new features to the current, already very complex I/O implementation).

One such library is Einar's network-alt, which uses select/epoll/kqueue to overlap network I/O. Another is my own Streams library, which implements all the layers of I/O functionality and only needs read()/write() behavior implemented in some way. Currently it uses direct read()/write() calls, but if someone makes alternative fdGetBuf/fdPutBuf implementations, it will work with those.

Read: http://haskell.org/haskellwiki/Library/Streams
Download: http://www.haskell.org/library/Streams.tar.gz
Installation: run "make install"

PS: a new version of the library is coming soon, but in this particular area nothing has changed.

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Li, Peng wrote:
My wish is to have a future GHC implementation that (a) uses blocking I/O directly provided by the OS, and (b) provides more control over OS threads and the internal worker thread pool. Using blocking I/O will simplify the current design and allow the programmer to take advantage of high-performance OS threads. If non-blocking I/O is really needed, the programmer can use customized, Claessen-style threads wrapped in modular libraries---some of my preliminary tests show that Claessen-style threads can do a much better job to multiplex asynchronous I/O.
I've read your paper, and I expect many others here have read it too. The results are definitely impressive.

Ultimately what we want is a more flexible I/O library (e.g. streams) that lets you choose the low-level I/O mechanism for each individual stream while maintaining the same high-level interface. If you want to use blocking I/O and OS threads to do the multiplexing, you could do that. Similarly, if you want epoll underneath, we should provide a way to do that. I imagine most people will want epoll (or equivalent) by default, because that will give the best performance, but we'll have the portable fallback of OS threads if the system doesn't have an epoll equivalent, or we haven't implemented it. Am I right in thinking this will address your concerns? Mainly you are worried that you have no choice but to use the supplied select() implementation on Unix systems, right?

There's one further advantage to using epoll as opposed to blocking read/write: the extra level of indirection means that the runtime can easily interrupt a thread that is blocked on I/O, because typically it will in fact be blocked on an MVar waiting to be unblocked by the thread performing epoll. It is much harder to interrupt a thread blocked in an OS call. (On Windows, where we currently use blocking read/write, throwTo doesn't work when the target thread is blocked on I/O, whereas it does work on Unix systems, where we multiplex I/O using select().) This means that you can implement blocking I/O with a timeout in Haskell, for example.

Cheers, Simon
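The timeout mentioned here can indeed be sketched in a few lines of Haskell (the name timeoutIO and the two-racing-threads design below are illustrative, not Simon's implementation; later versions of base ship a more careful System.Timeout.timeout):

```haskell
import Control.Concurrent (forkIO, killThread, newEmptyMVar, putMVar,
                           takeMVar, threadDelay)

-- Run an IO action, giving up (and killing the worker thread) if it
-- blocks longer than n microseconds. This relies on throwTo being able
-- to interrupt a thread blocked on I/O, which is the point made above.
timeoutIO :: Int -> IO a -> IO (Maybe a)
timeoutIO n act = do
    result <- newEmptyMVar
    worker <- forkIO (act >>= putMVar result . Just)
    killer <- forkIO (threadDelay n >> putMVar result Nothing)
    r <- takeMVar result         -- whichever thread finishes first wins
    killThread worker
    killThread killer
    return r
```

For example, timeoutIO 50000 (threadDelay 300000 >> return 1) yields Nothing, while timeoutIO 300000 (return 42) yields Just 42.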

On Wed, 21 Jun 2006, Simon Peyton-Jones wrote:
New worker threads are spawned as needed. You'll need as many of them as you have simultaneously-blocked foreign calls. If you have 2000 simultaneously-blocked foreign calls, you'll need 2000 OS threads to support them, which probably won't work.
Does the RTS use select() to multiplex network I/O instead of spawning threads?
Tony.
--
f.a.n.finch

Tony Finch wrote:
Does the RTS use select() to multiplex network IO instead of spawning threads?
Yes. Cheers, Simon

Bulat Ziganshin wrote:
Wednesday, June 21, 2006, 7:18:48 AM, you wrote:
Seth and Li, look at http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/rts-libs/multi-thread.html
That's a good page, but unfortunately it's about two generations out of date. Some of it is still relevant, but many of the details have changed. We do plan to write up the current runtime architecture, probably as a paper at some point.
it may answer some of your questions
(page http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/ contains commentaries about GHC internals)
And ideally all that material should be on the wiki. Cheers, Simon

| > (page http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/ contains
| > commentaries about GHC internals)
|
| And ideally all that material should be on the wiki.

In fact we are looking for volunteers to do the HTML -> Wiki translation. No new writing required! Simon

Hello Simon, Wednesday, June 21, 2006, 4:10:46 PM, you wrote: | >> (page http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/ contains | >> commentaries about GHC internals)
| | And ideally all that material should be on the wiki.
In fact we are looking for volunteers to do the HTML -> Wiki translation. No new writing reqd!
I thought about writing a program that automates this task; it would be great for some of my own docs. I asked on the Haskell lists once about such a program, but nobody knew of one. Maybe someone can ask in a more general forum? If such a program can't be found, I will lazily volunteer to write it (not right now, but I have added it to my Haskell to-do list). -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Seth Kurtzberg wrote:
I have a related question. The docs state that in some environments O/S threads are used when the -threaded flag is used with ghc, and non-O/S threads are used otherwise (presumably these are non-preemptive). Does this apply as well to the worker threads that are the subject of this email?
It sounds like the docs are a bit unclear. Which bit of doc in particular are you referring to? forkIO always creates a lightweight thread. With -threaded, if a thread makes a safe foreign call, then that call might execute concurrently with other threads, because another OS thread (a worker thread) takes over in the runtime. In the HEAD (which will be 6.6), we now allow multiple OS threads in the runtime, so you also get to run multiple Haskell threads simultaneously, which is particularly useful if you have more than one CPU. Cheers, Simon
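A tiny runnable check of the point about forkIO (the example is illustrative; it uses isCurrentThreadBound from Control.Concurrent):

```haskell
import Control.Concurrent (forkIO, isCurrentThreadBound,
                           newEmptyMVar, putMVar, takeMVar)

-- forkIO always creates a lightweight Haskell thread; it is never bound
-- to a particular OS thread (that is what forkOS is for).
main :: IO ()
main = do
    done <- newEmptyMVar
    _ <- forkIO (isCurrentThreadBound >>= putMVar done)
    bound <- takeMVar done
    print bound  -- False: the forkIO thread is unbound
```

This prints False with or without -threaded; what -threaded changes is whether a safe foreign call from such a thread can run concurrently on a worker OS thread.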

Simon,
Thanks for the response.
The doc I was referring to is the library haddock doc for Control.Concurrent.
Seth

Another related question. I have some threaded applications running which are servers and run continuously. A thread is spawned for each new connection, and the thread exits when the client terminates.
I've noticed that the thread ID increases. On one process I checked today I am up to thread number 3300. The number of running threads is not increasing; only six threads are running on this particular process. The threads are cleaned up and exit. The thread _ID_ is continually increasing.
Is this going to cause a problem when the thread ID exceeds some value? Do I have to force the server process to recycle periodically?
These processes are designed to run continuously, and are running in a fairly demanding commercial environment for extended periods of time. They have proven to be very stable and reliable. I'm hopeful that it will not be necessary to recycle to force the thread ID to restart.
Seth

Seth Kurtzberg wrote:
Is this going to cause a problem when the thread ID exceeds some value? Do I have to force the server process to recycle periodically?
The thread ID assigned to new threads will wrap around when it reaches 2147483647. In 6.6 we made thread IDs 64 bits, so you get a bit longer before they wrap around. Even if you manage to wrap the thread ID, it'll only be a problem if you actually compare ThreadIds. Cheers, Simon

Simon Marlow wrote:
The thread ID assigned to new threads will wrap around when it reaches 2147483647. In 6.6 we made thread IDs 64 bits
oops, I lie. It's still 32 bits in 6.6. Cheers, Simon

Simon,
Thanks for the info. I don't compare thread IDs. At the moment I merely print out the thread ID in a trace message. Shortly I will be using the thread ID when a need arises to kill a thread. It sounds like the rollover is harmless for these situations.
When you talk about comparing thread IDs, are you thinking that one might compare two thread IDs to see which one was more recently spawned? I can see a situation where you would compare thread IDs to determine whether two related values "belong", in some sense, to the same thread. I'm curious why one might compare thread IDs in such a way that the rollover would cause the comparison to produce the "wrong" answer.
Seth

Seth Kurtzberg wrote:
When you talk about comparing thread IDs, are you thinking that one might compare two thread IDs to see which one is more recently spawned? I can see where you might have a situation where you would compare thread IDs to determine whether two somehow related values "belong" in some sense to the same thread. I'm curious about why one might compare thread IDs in such a way that the rollover would cause the comparison to produce the "wrong" answer.
The runtime doesn't currently check that it isn't reusing thread IDs, so if the thread ID wraps around it is possible that you end up with two threads with the same ID, so then comparing IDs becomes meaningless. Cheers, Simon
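For illustration (the example is a sketch, not from the thread), ThreadId's Eq and Ord instances are exactly the operations that a wrapped-around, reused ID would make meaningless:

```haskell
import Control.Concurrent (forkIO, myThreadId)

main :: IO ()
main = do
    t1 <- forkIO (return ())
    t2 <- forkIO (return ())
    me <- myThreadId
    -- Two distinct threads get distinct IDs... unless the counter has
    -- wrapped and an ID has been reused, which the RTS does not check.
    print (t1 == t2)  -- False
    print (me == me)  -- True
```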

Li, Peng wrote:
I would like to use a large number (say, 2000) of unbounded threads, each calling a blocking, safe foreign function via FFI import. What is supposed to happen if all the worker threads are used up? I tried this in the recent GHC 6.5 and got some kind of "runaway worker threads?" RTS failure message when more than 32 threads are used. Is it a current limitation of the RTS, or should I file a bug report for it?
As mentioned by Simon PJ, the number of worker threads grows as needed (but doesn't decrease---that would be a useful improvement). The message you're seeing is due to an arbitrary limit I put on the number of workers in order to catch runtime bugs before they kill the machine. Perhaps I should remove the limit, or at least make it very much larger. Cheers, Simon
participants (8)
- Bulat Ziganshin
- Duncan Coutts
- Li, Peng
- Seth Kurtzberg
- Simon Marlow
- Simon Peyton-Jones
- skaller
- Tony Finch