RE: [HOpenGL] HOpenGL and --enable-threaded-rts
| > Another possibility that Simon and I have discussed is to provide a
| > sort of forkIO that says "create a Haskell thread permanently bound to
| > an OS thread". Much more expensive than normal forkIO. More like
| > having a permanent secretary at your beck and call, rather than the
| > services of a typist from the typing pool.
|
| So calling C from this thread would happen inside that OS thread,
| callbacks would happen in that OS thread, and the RTS would continue to
| run while that OS thread is blocked?
| How much overhead would that create? I wouldn't like much additional
| overhead for my OpenGL programs :-(. Would this require an OS mutex lock
| for every heap allocation, or is there a better way?

The idea would be that we'd keep the invariant that only one OS thread can be executing Haskell at any moment. So no mutex locks on allocation or thunk entry.

Suppose we call one of a Haskell-thread-with-a-dedicated-OS-thread a "Haskell super-thread".

At any moment, a vanilla OS worker thread is executing Haskell threads. It keeps picking a new Haskell thread, running it for a while, then picking a new one.

OK, so it decides that the next Haskell thread to run is a super-thread. The OS thread hands off control to the dedicated OS thread bound to the super-thread. The dedicated OS thread runs the super-thread. When its time slice is up, the dedicated OS thread picks a new Haskell thread to run, but it doesn't run it! No, it hands off control to a vanilla OS worker thread instead.

The intention is zero overhead if there are no super-threads. There'll be an OS thread switch whenever the super-thread is scheduled, but that's life.

Haven't worked out the details, but it looks possible.

Sigbjorn: any comments? You'd implemented all this thread-y stuff.

Simon
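The hand-off scheme described above can be illustrated with a small pure model. This is only a sketch under stated assumptions, not the RTS scheduler: the types `OSThread` and `HsThread` and the functions `runsOn`, `timeline` and `handOffs` are hypothetical names invented for this illustration. A schedule is modelled as a list of time slices, which encodes the invariant that only one OS thread executes Haskell at any moment.

```haskell
-- Hypothetical model of the "super-thread" hand-off; none of these names
-- exist in GHC. A schedule is a list of time slices, one Haskell thread each.

-- | The OS thread that holds the right to execute Haskell code.
data OSThread
  = Worker          -- a vanilla OS worker thread
  | Dedicated Int   -- the OS thread permanently bound to super-thread n
  deriving (Eq, Show)

-- | A Haskell thread: ordinary, or a super-thread with its own OS thread.
data HsThread
  = Vanilla String
  | Super String Int
  deriving (Eq, Show)

-- | Which OS thread must execute a given Haskell thread.
runsOn :: HsThread -> OSThread
runsOn (Vanilla _) = Worker
runsOn (Super _ n) = Dedicated n

-- | Pair every time slice with the OS thread that executes it.
timeline :: [HsThread] -> [(HsThread, OSThread)]
timeline ts = zip ts (map runsOn ts)

-- | Count OS-thread hand-offs, assuming a worker thread is running before
-- the first slice. A hand-off happens whenever consecutive slices need
-- different OS threads.
handOffs :: [HsThread] -> Int
handOffs ts = length (filter id (zipWith (/=) owners (tail owners)))
  where
    owners = Worker : map runsOn ts
```

In this model, `handOffs [Vanilla "a", Vanilla "b"]` is 0 (no super-threads, no extra switches), while `handOffs [Vanilla "a", Super "gl" 1, Vanilla "b"]` is 2: one switch into the dedicated OS thread and one back to a worker.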
Pardon me if I chime in without knowing the implementation details, but I'm an HOpenGL user who tends to like Haskell concurrency (if only we had Erlang-style distribution in the main-ghc line..:).

And I do have the feeling that the suggested solutions keep getting more complex, moving away from programming in the language into controlling its implementation.

I still like Sven's idea - no one has asked for a new OS thread for the callbacks, so there shouldn't be one, so there shouldn't be a problem with the admittedly not nice use of OS-thread-local storage.

The argument against that suggestion was that --enable-threaded-rts explicitly asks ghc to run Haskell threads in >= 2 OS threads, to reduce problems with blocking external calls. That alone still isn't a problem, but the default to run callbacks in a different OS-thread is (so the RTS doesn't seem to manage OS-threads as resources, but allocates them according to a pre-defined scheme that works in many sane situations).

Incidentally, while GLUT callbacks are made, the original Haskell thread should be calling out to the GLUT mainloop, which doesn't return, also suggesting that one should reuse that OS thread.

Others have suggested a 1-1 correspondence between OS and Haskell threads, but I don't believe that would be a good idea with current OSs. But if you mix all this together, you get another option:

- introduce Haskell thread groups (HTG)
- a HTG is a collection of Haskell threads that share one OS thread (alternatively, each HTG could have two OS-threads - one for *all* outgoing and incoming external calls, one for internal processing)
- add a forkGroup, which starts a new Haskell thread in its own HTG (using different OS-threads is the only difference between Haskell threads in different HTGs)

Now, there is no need for the --enable-threaded-rts option: if the Haskell programmer expects problems with blocking external calls, she can simply run several HTGs.

There is no problem with OpenGL's peculiarities, as long as callbacks really call back, into the same HTG, that is. This probably needs some additional bookkeeping, to make callbacks HTG/OS-thread-specific.

And, best of all, there are fewer low-level controls out of reach of the Haskell program. Of course, there must be disadvantages as well..

Claus
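The bookkeeping that the thread-group proposal would need can be sketched as a pure registry. This is a hypothetical model only: `forkGroup` is the name proposed in the message, but its signature here, and every other name (`Registry`, `forkInGroup`, `registerCallback`, `callbackGroup`), is an assumption made for illustration and not an existing GHC API. The sketch records which group each Haskell thread belongs to, so that a callback can be routed back into the group (and hence the OS thread) that registered it.

```haskell
-- Hypothetical bookkeeping for Haskell thread groups (HTGs); not a GHC API.

newtype GroupId    = GroupId Int    deriving (Eq, Show)
newtype HsThreadId = HsThreadId Int deriving (Eq, Show)
newtype CallbackId = CallbackId Int deriving (Eq, Show)

data Registry = Registry
  { nextGroup  :: Int                        -- fresh group counter
  , nextThread :: Int                        -- fresh thread counter
  , membership :: [(HsThreadId, GroupId)]    -- which HTG a thread lives in
  , callbacks  :: [(CallbackId, GroupId)]    -- which HTG a callback re-enters
  } deriving Show

emptyRegistry :: Registry
emptyRegistry = Registry 0 0 [] []

-- | Start a new Haskell thread in its own, new HTG.
forkGroup :: Registry -> (HsThreadId, GroupId, Registry)
forkGroup r = (t, g, r')
  where
    t  = HsThreadId (nextThread r)
    g  = GroupId (nextGroup r)
    r' = r { nextGroup  = nextGroup r + 1
           , nextThread = nextThread r + 1
           , membership = (t, g) : membership r }

-- | An ordinary fork: the child stays in its parent's HTG.
forkInGroup :: HsThreadId -> Registry -> Maybe (HsThreadId, Registry)
forkInGroup parent r = do
  g <- lookup parent (membership r)
  let t = HsThreadId (nextThread r)
  pure (t, r { nextThread = nextThread r + 1
             , membership = (t, g) : membership r })

-- | Remember that a callback was registered from the given thread's HTG.
registerCallback :: CallbackId -> HsThreadId -> Registry -> Maybe Registry
registerCallback cb t r = do
  g <- lookup t (membership r)
  pure (r { callbacks = (cb, g) : callbacks r })

-- | The HTG (and hence OS thread) a callback must run in.
callbackGroup :: CallbackId -> Registry -> Maybe GroupId
callbackGroup cb r = lookup cb (callbacks r)
```

A callback registered by any thread of a group resolves to that group, regardless of how many other groups have been forked in the meantime. That is the property "callbacks really call back, into the same HTG" relies on.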
Simon Peyton-Jones wrote:

> The idea would be that we'd keep the invariant that only one OS thread can be executing Haskell at any moment. So no mutex locks on allocation or thunk entry.

Ah that sounds good. I think I can live with some OS-thread-switching. I think this is the way to go. Is anyone available to implement this? (How long will I have to keep using my own ugly hack?)

> Suppose we call one of a Haskell-thread-with-a-dedicated-OS-thread a "Haskell super-thread"

I'd suggest "heavy threads" vs. "light threads" :-).

Claus Reinke wrote:

> I still like Sven's idea - no one has asked for a new OS thread for the callbacks, so there shouldn't be one, so there shouldn't be a problem with the admittedly not nice use of OS-thread-local storage.

The problem with that is that it will fail miserably (read: crash) if another Haskell thread does something that _requires_ an OS thread switch (like calling a foreign import threadsafe function).

Sven Panne wrote:

> Furthermore, it would be extremely system-dependent and would cost a *vast* amount of performance: Switching OpenGL contexts can require a round-trip to a remote server, can trigger swapping a few MB of textures into your favourite graphics card, etc.

I don't think so. Context-switching is a client-side affair. After all, a single-threaded two-window OpenGL application has to switch between its two contexts all the time.

Cheers,
Wolfgang
By the way, don't forget that it might matter which thread runs a finalizer. This is probably not often an issue, since finalizers tend not to do very much, but it would be good if the design could provide sufficient control over which thread runs a finalizer.

-- Alastair Reid
participants (4)
- Alastair Reid
- Claus Reinke
- Simon Peyton-Jones
- Wolfgang Thaller