RE: [Haskell-cafe] Re: Bound threads

On 01 March 2005 11:21, Marcin 'Qrczak' Kowalczyk wrote:
Marcin 'Qrczak' Kowalczyk writes: Why is the main thread bound?
I can answer myself: if the main thread is unbound, the end of the program can be reached in a different OS thread, which may be a problem if we want to return cleanly to the calling code.
I've now implemented a threaded runtime in my language Kogut, based on the design of Haskell. The main thread is bound. The thread which holds the capability performs I/O multiplexing itself, without a separate service thread.
We found that doing this was excessively complex (well, I thought so anyway). The problem is, when there is no other work to do but there are outstanding I/O requests to wait for, the thread holding the capability has to wait in select(). But then, another OS thread making an external call into Haskell has to somehow indicate that it needs the capability, and wake up the thread blocked in select(). It can do this with a pipe, but then you still have the problem that each call-in requires a context switch, which is sub-optimal if you're implementing a library in Haskell to be called from another language.
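The wakeup-through-a-pipe arrangement described above can be sketched roughly as follows. This is only an illustration in Python, not code from GHC or Kogut; the names `Waker` and `wait_for_io` are made up for the example, and a POSIX system is assumed.

```python
import os
import select

class Waker:
    """Self-pipe used to interrupt a thread blocked in select() (hypothetical helper)."""

    def __init__(self):
        self.rfd, self.wfd = os.pipe()
        # Non-blocking on both ends so wake() and drain() never stall.
        os.set_blocking(self.rfd, False)
        os.set_blocking(self.wfd, False)

    def wake(self):
        """Called by another OS thread that needs the capability."""
        try:
            os.write(self.wfd, b"\0")
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    def drain(self):
        """Discard all pending wakeup bytes."""
        try:
            while os.read(self.rfd, 4096):
                pass
        except BlockingIOError:
            pass  # nothing left to read

def wait_for_io(waker, fds, timeout=None):
    """Block in select() on fds plus the wakeup pipe.

    Returns (ready_fds, woken), where woken says whether another
    thread asked us to leave select().
    """
    ready, _, _ = select.select(list(fds) + [waker.rfd], [], [], timeout)
    woken = waker.rfd in ready
    if woken:
        waker.drain()
    return [fd for fd in ready if fd != waker.rfd], woken
```

Every call-in pays for a write to the pipe plus a wakeup of the thread sitting in select(), which is the context-switch cost mentioned above.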
Producer/consumer ping-pong is 15 times slower between threads running on different OS threads than on two unbound threads.
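For readers who want to try a ping-pong test of this kind, here is a minimal sketch in Python, assuming a mutex and a condition variable as the only synchronization. The function name `ping_pong` is invented for the example. Python threads are always separate OS threads, so this corresponds to the slow case only, and the numbers are not comparable with the ones quoted in this thread.

```python
import threading
import time

def ping_pong(iterations):
    """Bounce a token between two threads; return seconds per round trip."""
    cond = threading.Condition()  # a mutex plus a condition variable
    turn = ["ping"]               # whose move it is

    def player(me, other):
        for _ in range(iterations):
            with cond:
                while turn[0] != me:
                    cond.wait()
                turn[0] = other   # hand the token over
                cond.notify()     # wake the only possible waiter

    a = threading.Thread(target=player, args=("ping", "pong"))
    b = threading.Thread(target=player, args=("pong", "ping"))
    start = time.perf_counter()
    a.start()
    b.start()
    a.join()
    b.join()
    return (time.perf_counter() - start) / iterations
```

Calling `ping_pong(100000)` gives a per-iteration time that is dominated by thread wakeups rather than by the locking itself.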
You might also want to measure throughput of call-ins.

Cheers,
Simon

"Simon Marlow"
I've now implemented a threaded runtime in my language Kogut, based on the design of Haskell. The main thread is bound. The thread which holds the capability performs I/O multiplexing itself, without a separate service thread.
We found that doing this was excessively complex (well, I thought so anyway).
Indeed, my brain is melting, but I did it :-)

I think our approaches are incomparable in terms of additional overhead; it depends on the program.

I have added some optimizations:

If a thread which wants to perform a safe C call sees that there are no other threads running, waiting for I/O, or waiting for a timeout, and that it is the thread which handles Unix signals, then it doesn't notify or start another thread to enter the scheduler. When the C call returns, it doesn't have to wake up the scheduler in this case.

Even if other threads are running, if there is currently no scheduler doing epoll/poll/select, then a returning C call doesn't wake up the scheduler. It only links itself to a list which will be examined by the scheduler.

* * *

There are interesting complications with fork. POSIX only provides a fork which causes the other pthreads to evaporate in the child process. This is exactly what is wanted if the fork is soon followed by exec, but can be disastrous if the program tries to use other threads in the meantime. Depending on the system, pthread_join on a thread which existed before the fork either reports that it has returned, or hangs, or fails with ESRCH or EINVAL. And there is no way to fork while keeping other threads running (there has been a proposal for forkall, but it was rejected).

This means that a fork in an unfortunate state, e.g. while some thread was holding a mutex, will leave the mutex permanently locked; pthread_atfork is supposed to protect against that.

It also means that if our language tries to continue running its threads after the fork, then there is no way to do this if they are bound to other OS threads. And the worker pool is useless; it had better be emptied before the fork to reduce resource leaks.

There is no semantics of fork wrt. threads which would be correct in all cases.
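To make the evaporating-threads behaviour and the pthread_atfork idea concrete, here is a small sketch in Python, assuming a POSIX system. It uses `os.register_at_fork` as a rough analogue of pthread_atfork. The names `state_lock` and `fork_and_probe` are invented for the example and have nothing to do with the Kogut runtime.

```python
import os
import threading

# Hypothetical lock guarding some shared state.
state_lock = threading.Lock()

# Analogue of pthread_atfork(prepare, parent, child): take the lock before
# the fork so that no other thread can be holding it at that moment, then
# release it on both sides.
os.register_at_fork(
    before=state_lock.acquire,
    after_in_parent=state_lock.release,
    after_in_child=state_lock.release,
)

def fork_and_probe():
    """Fork and return (threads_seen_in_child, lock_free_in_child)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: report what we see, then exit immediately
        try:
            os.close(r)
            n = threading.active_count()
            free = state_lock.acquire(blocking=False)
            os.write(w, f"{n} {int(free)}".encode())
        finally:
            os._exit(0)
    os.close(w)
    data = os.read(r, 64).decode()
    os.close(r)
    os.waitpid(pid, 0)
    n, free = data.split()
    return int(n), bool(int(free))
```

If `fork_and_probe()` is called while other Python threads are running, the child should report a single thread, and the lock should be usable there because the handlers bracketed the fork. Recent Python versions print a DeprecationWarning for a fork in a multi-threaded process, which is the same hazard discussed above.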
Shortly before implementing bound threads I designed and implemented semantics for three variants of fork, which were easy when I had full control over what happens with my threads in the child process (well, the third was a challenge to implement):

- ForkProcessCloneThreads - easiest to describe, but the least useful. Threads continue to run in both processes.

- ForkProcessKillThreads - other threads are atomically killed in the child process, similarly to raw POSIX. This is used before exec. If the program attempts to wait for the threads, the behavior is defined: they look as if they failed with a ThreadKilled exception, even though they were killed without a chance to recover (this is a different exception than the one used for cancellation, which signifies that the thread could recover).

- ForkProcess - the safest default: all threads are sent "signals" (in the sense of asynchronous communication in my language) which cause them to be suspended when they have signal handling unblocked (this roughly corresponds to Haskell's blocking of asynchronous exceptions, but e.g. a thread holding a mutex has signals blocked by default). This includes chasing newly created threads. When all threads are suspended, we do ForkProcessCloneThreads. Then in the parent process threads are resumed, and in the child they are cancelled in a polite way so they can release resources.

Bound threads introduced problems. They can be partially solved, e.g. the worker pool, the wakeup pipe and the epoll descriptor are correctly recreated. But there is simply no way to return from callbacks because the corresponding C contexts no longer exist. So I made it work as follows: all threads except the thread performing the fork become unbound. They have a chance to handle the thread cancellation exception until they return from their innermost callbacks. At that point they are killed. If ForkProcessAllThreads is done while some threads were executing non-blocking foreign code, they are killed as well.
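The suspend-then-fork step of the ForkProcess variant can be illustrated with a cooperative sketch in Python, assuming a POSIX system. This is not the Kogut implementation: the class `ForkGate` and its methods are invented here, and explicit `safepoint()` calls stand in for "signals unblocked". Python cannot clone threads into the child, so only the parent side (suspend everyone, fork, resume) is modelled.

```python
import os
import threading

class ForkGate:
    """Parks cooperating threads at safe points while a fork is in progress."""

    def __init__(self):
        self._cond = threading.Condition()
        self._workers = 0      # threads registered with the gate
        self._parked = 0       # threads currently suspended at a safe point
        self._forking = False

    def register(self):
        with self._cond:
            self._workers += 1

    def unregister(self):
        with self._cond:
            self._workers -= 1
            self._cond.notify_all()  # the forker may be waiting on us

    def safepoint(self):
        """Called by workers wherever suspension is safe (no locks held)."""
        with self._cond:
            if not self._forking:
                return
            self._parked += 1
            self._cond.notify_all()
            while self._forking:
                self._cond.wait()
            self._parked -= 1

    def fork(self):
        """Suspend all registered workers, fork, then resume them.

        Must be called from a thread that is not itself registered.
        """
        with self._cond:
            self._forking = True
            while self._parked < self._workers:
                self._cond.wait()
            pid = os.fork()
            if pid == 0:
                # The workers do not exist in the child; reset the counters.
                self._workers = 0
                self._parked = 0
            self._forking = False
            self._cond.notify_all()
        return pid
```

Workers are expected to call `register()` once, `safepoint()` regularly, and `unregister()` on exit. A thread that registers while a fork is pending is waited for as well, which loosely mirrors the chasing of newly created threads.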
Besides this, there are "at fork" handlers, similar to pthread_atfork but scoped over the forking action.

* * *

I measured the speed of some syscalls on my system, to see what is worth optimizing:

- pthread_mutex_lock + unlock (NPTL)   0.1 us
- pthread_sigmask                      0.3 us
- setitimer                            0.3 us
- read + write through a pipe          2.5 us
- gettimeofday                         1.9 us

A producer/consumer test in my language (which uses mutexes and condition variables) needs 1.4 us for one iteration if both threads are unbound.

--
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/