Summary so far (was: HOpenGL and --enable-threaded-rts)
This discussion is getting rather long, so I thought I'd summarise (as much for my benefit as everyone else's). Please let me know if I get anything wrong.

It turns out that some C libraries designed to be used from multi-threaded programs make use of thread-local state. This is at odds with GHC's new extension to support using OS threads to multiplex calls to blocking foreign functions - this is the extension we call the "threaded RTS", which is off by default but turned on if you configure GHC with --enable-threaded-rts.

The threaded RTS is important if you want to call foreign functions that might block - without it, a blocking foreign call would block all the other Haskell threads until the call returns.

The problems arise because GHC's threaded RTS doesn't make any distinction between OS threads; as far as it is concerned any OS thread is as good as any other. We hadn't considered the use of thread-local state by external C libraries when we designed this (obviously :-{).

Ok, so what can we do?

1. Swap the thread-local state in
==================================

Wolfgang's proposed fix is to allow the right thread-local state to be swapped in at the right moment, just before running a Haskell thread. I don't think this will work in general, because part of the thread-local state is the thread ID of the OS thread itself, which can't be swapped in. Also, Sven pointed out that swapping in the context in the GLUT case can have other drastic performance implications.

2. Every Haskell thread has its own OS thread
=============================================

Some other folk proposed moving to a 1-1 correspondence between Haskell threads and OS threads. I think this is a poor solution simply because of the overhead - Haskell threads are very lightweight (1000s of threads is entirely reasonable), but OS threads tend to be much heavier. For example, I'm sure this would kill the performance of the Haskell web server.

3. Some Haskell threads have their own OS thread
================================================

Another solution is to fix a 1-1 correspondence between Haskell threads and OS threads for some Haskell threads only, perhaps selected by a different version of forkIO. We think this is implementable and has zero overhead if you don't use it, but it does require that the user of the external binding remembers to use the right flavour of forkIO. Callbacks have to create a new Haskell thread which is bound to the current OS thread. Alastair points out that it might be significant which Haskell thread runs a particular finalizer.

4. Thread groups
================

Claus's suggestion is similar, but gives the Haskell programmer more control over the mapping between OS threads and Haskell threads. I must admit I'd been wondering about something similar myself. He suggests that every Haskell thread is bound to a specific OS thread, but that more than one Haskell thread can map to the same OS thread (a thread group). This is slightly less convenient for the Haskell programmer - one has to be careful to fork a new thread group to avoid being blocked by a foreign call.

------------------

We can afford to discuss this a while longer, because Simon & I are currently focussed on the next release (I don't want to hold up 5.04 for a fix, and it wouldn't be a disaster if we had to go straight to 5.06 in a couple of months or so).

Personally I can't decide whether (3) or (4) is the better solution. I'm pretty sure (1) and (2) aren't viable, though.

Cheers,
Simon
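
To make option (3) a bit more concrete, here is a minimal sketch of what "a different version of forkIO" can look like. It assumes the bound-thread API found in later versions of GHC's Control.Concurrent (forkOS, isCurrentThreadBound, rtsSupportsBoundThreads); whether that API matches the design being discussed in this thread is an assumption, and the helper names boundWhenForkedWith and demo are made up for illustration.

```haskell
import Control.Concurrent
  (forkIO, forkOS, isCurrentThreadBound, rtsSupportsBoundThreads)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- | Fork a thread with the given fork primitive and report whether the new
-- Haskell thread is bound to one particular OS thread.
-- (Hypothetical helper, not part of any library.)
boundWhenForkedWith :: (IO () -> IO a) -> IO Bool
boundWhenForkedWith fork = do
  result <- newEmptyMVar
  _ <- fork (isCurrentThreadBound >>= putMVar result)
  takeMVar result

-- | Compare the lightweight flavour (forkIO) with the bound flavour (forkOS).
-- The second component is Nothing when the program was not linked against
-- the threaded RTS, because forkOS is not usable there.
demo :: IO (Bool, Maybe Bool)
demo = do
  light <- boundWhenForkedWith forkIO
  heavy <- if rtsSupportsBoundThreads
             then fmap Just (boundWhenForkedWith forkOS)
             else return Nothing
  return (light, heavy)
```

Calling demo from a program's main shows the trade-off described above: threads made with forkIO stay cheap and unbound, and only the threads that talk to a library with thread-local state need the heavier flavour.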
Simon Marlow wrote:
> This discussion is getting rather long, so I thought I'd summarise (as much for my benefit as everyone else's). Please let me know if I get anything wrong.
I haven't found anything wrong.
> I'm pretty sure (1) and (2) aren't viable, though.
I basically agree. In the presence of (3) or (4) [or (5) ;-) ], my own hack looks like - well - a hack. I'll definitely use it as a short-term solution for my own toy projects, though. I won't commit any code, but if anyone else needs a short-term solution for HOpenGL, they can ask me.
> Personally I can't decide whether (3) or (4) is the better solution.
I'd say, let's go for (5) - that is, some blend of (3) and (4). (4) almost sounds like (3) could be implemented on top of it. The simplicity of (3) is needed in most cases, the power of something like (4) in some.

Some other random thoughts:

Does the "OS-thread"-binding have to be a permanent attribute of a thread, or can we also have something like:

    inOSThread theGLUTThread $ do ...

This could be useful for finalizers, or when some thread-sensitive API is used only "some of the time". The problem is that the Haskell thread in question could be blocked indefinitely until the OS thread decides to return to Haskell code.

I do also like the idea of a forkHeavyIOThread primitive.

If something like (4) is implemented, it should still be possible to say that a Haskell thread can run in any OS thread. We never know what new ways of juggling threads (SMP, distributed systems?) will be supported in the future for code that "doesn't mind" being executed in different threads. The current limitations of the RTS (i.e. Haskell code can only run in one OS thread at one time) should be a well-documented implementation detail, not a fundamental assumption for the threading primitives.

The "don't care" thread group could still support features like the current threaded RTS. The documentation will just have to make clear that it can't be predicted which OS thread those Haskell threads will be run in, and that some libraries don't like that.

CU,
Wolfgang
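
The inOSThread idea above can be approximated in user code with a worker thread and a job queue. The following is only a sketch under assumptions: OSThread, newOSThread and this definition of inOSThread are hypothetical names, not an existing GHC API, and the sketch relies on forkOS from later versions of Control.Concurrent to pin the worker to one OS thread (falling back to forkIO where bound threads are not supported).

```haskell
import Control.Concurrent (forkIO, forkOS, rtsSupportsBoundThreads)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)
import Control.Monad (forever, join)

-- | Hypothetical handle for a thread that runs submitted actions.
newtype OSThread = OSThread (Chan (IO ()))

-- | Start a worker that executes queued actions one at a time.
-- With the threaded RTS the worker is bound to a single OS thread.
newOSThread :: IO OSThread
newOSThread = do
  jobs <- newChan
  let fork = if rtsSupportsBoundThreads then forkOS else forkIO
  _ <- fork (forever (join (readChan jobs)))
  return (OSThread jobs)

-- | Run an action on the worker and wait for its result.
-- Exceptions raised by the action are re-thrown in the caller.
inOSThread :: OSThread -> IO a -> IO a
inOSThread (OSThread jobs) action = do
  reply <- newEmptyMVar
  writeChan jobs (try action >>= putMVar reply)
  outcome <- takeMVar reply
  case outcome of
    Left err    -> throwIO (err :: SomeException)
    Right value -> return value
```

With a handle such as theGLUTThread obtained from newOSThread, a call like inOSThread theGLUTThread (return ()) blocks the caller until the worker has finished every job queued before it. That is the "blocked indefinitely" concern in its plainest form: if the worker is stuck in a long foreign call, every caller waits.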
Participants (2): Simon Marlow, Wolfgang Thaller