Re: [HOpenGL] HOpenGL and --enable-threaded-rts

18 Jun 2002

      ...
Yes, these are all problems.  However, there is a nice abstraction
of the OS thread API in GHC's RTS, thanks to Sigbjorn.  So I'm sure
this API could be extended to include some thread-local state
operations.

A further piece of what one might call thread-local state is
'recursive locks' like those found in Java.  With normal locks, if a
thread executes this:

   take(lock);
   take(lock);
   ...
   release(lock);
   release(lock);

then the 2nd call to take will block because the lock is already held,
even though the holder is the calling thread itself.
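
As a minimal sketch of that behaviour, assuming POSIX threads (the
function name is made up for illustration), the second take can be
attempted with trylock so that the failure is reported instead of
hanging the program:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: take a plain (non-recursive) mutex, then try to
   take it again from the same thread.  Returns the result of the
   second attempt. */
int second_take_on_normal_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;
    int rc, second;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL);
    rc = pthread_mutex_init(&lock, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&lock);               /* first take succeeds */
    second = pthread_mutex_trylock(&lock);   /* a blocking take would hang here */
    pthread_mutex_unlock(&lock);             /* undo the first take */
    pthread_mutex_destroy(&lock);

    return second;   /* POSIX specifies EBUSY for an already-locked mutex */
}
```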

With recursive locks, the implementation of take records who has the
lock and just increments a counter if the same thread takes the lock
again.  Likewise, release decrements the counter and only releases the
lock when the counter reaches 0.
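
Here is a rough sketch of such a lock in C, assuming POSIX threads.
The names rec_lock, rec_take and rec_release are invented for this
example (POSIX also offers PTHREAD_MUTEX_RECURSIVE, which does the
same job).  Note that the owner check uses the OS thread identity, so
the lock only behaves as intended if both takes really happen on the
same OS thread:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Illustrative recursive lock: records the owner and a nesting count. */
typedef struct {
    pthread_mutex_t guard;   /* protects owner and count */
    pthread_cond_t  freed;   /* signalled when count drops to 0 */
    pthread_t       owner;   /* meaningful only while count > 0 */
    int             count;   /* how many times the owner has taken the lock */
} rec_lock;

void rec_lock_init(rec_lock *l)
{
    pthread_mutex_init(&l->guard, NULL);
    pthread_cond_init(&l->freed, NULL);
    l->count = 0;
}

void rec_take(rec_lock *l)
{
    pthread_t self = pthread_self();

    pthread_mutex_lock(&l->guard);
    if (l->count > 0 && pthread_equal(l->owner, self)) {
        l->count++;                      /* same thread again: just count */
    } else {
        while (l->count > 0)             /* held by another thread: wait */
            pthread_cond_wait(&l->freed, &l->guard);
        l->owner = self;
        l->count = 1;
    }
    pthread_mutex_unlock(&l->guard);
}

void rec_release(rec_lock *l)
{
    pthread_mutex_lock(&l->guard);
    /* Only the owning thread may release. */
    assert(l->count > 0 && pthread_equal(l->owner, pthread_self()));
    if (--l->count == 0)                 /* really released at depth 0 */
        pthread_cond_signal(&l->freed);
    pthread_mutex_unlock(&l->guard);
}
```

If the runtime were to run the second rec_take on a different OS
thread from the first, the owner check would fail and that call would
wait for a release that never comes.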

And arbitrary user code and libraries are free to implement all kinds
of code that depends on the current thread id.  I'll bet you're going
to see a lot of cases like this.
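
One common shape for such code is a per-thread "current context"
slot.  The sketch below assumes POSIX thread-specific data; the key
and function names are hypothetical.  It stores a value on the calling
thread and then reads the slot from a second thread, which does not
see it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread "current context" slot. */
static pthread_key_t context_key;

/* Stores whatever the executing thread finds in its own slot. */
static void *read_context(void *out)
{
    *(void **)out = pthread_getspecific(context_key);
    return NULL;
}

/* Returns 1 if a context set on the calling thread is visible there
   but invisible from another thread, 0 otherwise, -1 on setup failure. */
int context_is_per_thread(void)
{
    static int context = 42;             /* stands in for library state */
    void *seen_here = NULL;
    void *seen_elsewhere = &context;     /* overwritten by the other thread */
    pthread_t other;
    int rc;

    rc = pthread_key_create(&context_key, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_setspecific(context_key, &context);

    read_context(&seen_here);            /* same thread: sees the context */
    if (pthread_create(&other, NULL, read_context, &seen_elsewhere) != 0) {
        pthread_key_delete(context_key);
        return -1;
    }
    pthread_join(other, NULL);           /* other thread: slot is empty */

    pthread_key_delete(context_key);
    return seen_here == (void *)&context && seen_elsewhere == NULL;
}
```

A library written this way silently misbehaves if a call that was
expected on the thread holding the context arrives on a different one.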
...
The only viable solution I can see is to provide a way to capture
the calling thread (as a first class entity) when you call into
Haskell and to explicitly specify which thread to use when you call
out from Haskell.  (Hmmm, sounds like callcc for C :-))
...
The trouble is, that is *way* too much overhead for a C call.
HOpenGL does lots of these, and I strongly suspect that adding a
full OS-thread context switch (well two, including the return) for
each one would be a killer.

So don't do it for every foreign function call - only do it for the
ones that request it.  Here's the implementation I imagine:

For foreign imports and exports that have not requested any special
thread behaviour, do exactly what GHC currently does.  Overhead == 0.

For foreign exports that have requested thread capture, the call goes
like this:

  1) Get a thread from GHC's thread pool

  2) Allocate a first-class C-thread object on the Haskell heap
     and fill in the details for this thread.

  3) Add the normal foreign export function arguments plus a pointer
     to the C thread object to the GHC thread and make it runnable.

  4) Block the calling C thread so that it is ready to use later.

  5) When the exported function returns, do so on the thread recorded
     in the first-class C-thread object.

For foreign imports that have requested explicit thread choice, the
call goes like this:

  1) Get the C-thread object (it's an argument to the Haskell function
     so this is easy), perform suitable sanity checks (i.e., not
     already in use).

  2) Marshal the arguments to the C function into the C-thread.

  3) Unblock the C thread, block the Haskell thread waiting for
     response.

  4) When the C function returns, context switch back to a GHC thread.
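
The handoff in these steps can be modelled in plain C, assuming POSIX
threads.  This is only a sketch of the idea, not GHC's RTS: the
c_thread type and every function name below are invented, and the
"Haskell thread" is just whichever thread calls call_on.  The captured
thread sits in c_thread_park (step 4 of the export case) until someone
asks it to run a function:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef int (*c_fun)(int);

/* Illustrative stand-in for the first-class C-thread object. */
typedef struct {
    pthread_mutex_t m;
    pthread_cond_t  cv;
    enum { IDLE, REQUESTED, DONE, STOP } state;
    c_fun     fn;        /* marshalled function */
    int       arg;       /* marshalled argument */
    int       result;
    pthread_t ran_on;    /* which OS thread actually ran fn */
} c_thread;

void c_thread_init(c_thread *t)
{
    pthread_mutex_init(&t->m, NULL);
    pthread_cond_init(&t->cv, NULL);
    t->state = IDLE;
}

/* Body of the captured thread: block until a call is requested. */
void *c_thread_park(void *p)
{
    c_thread *t = p;

    pthread_mutex_lock(&t->m);
    for (;;) {
        while (t->state != REQUESTED && t->state != STOP)
            pthread_cond_wait(&t->cv, &t->m);
        if (t->state == STOP)
            break;
        c_fun fn = t->fn;
        int arg = t->arg;
        pthread_mutex_unlock(&t->m);
        int r = fn(arg);                 /* runs on the captured thread */
        pthread_mutex_lock(&t->m);
        t->result = r;
        t->ran_on = pthread_self();
        t->state = DONE;
        pthread_cond_broadcast(&t->cv);  /* wake the waiting caller */
    }
    pthread_mutex_unlock(&t->m);
    return NULL;
}

/* Steps 1-4 of the import case, seen from the calling thread. */
int call_on(c_thread *t, c_fun fn, int arg)
{
    int r;

    pthread_mutex_lock(&t->m);
    assert(t->state == IDLE);            /* 1) sanity check: not in use */
    t->fn = fn;                          /* 2) marshal the arguments */
    t->arg = arg;
    t->state = REQUESTED;
    pthread_cond_broadcast(&t->cv);      /* 3) unblock the C thread ... */
    while (t->state != DONE)             /*    ... and wait for the response */
        pthread_cond_wait(&t->cv, &t->m);
    r = t->result;                       /* 4) back on the calling thread */
    t->state = IDLE;
    pthread_mutex_unlock(&t->m);
    return r;
}

void c_thread_stop(c_thread *t)
{
    pthread_mutex_lock(&t->m);
    t->state = STOP;
    pthread_cond_broadcast(&t->cv);
    pthread_mutex_unlock(&t->m);
}

/* Example C function to call. */
int add_one(int x) { return x + 1; }
```

Each call costs two condition-variable handoffs, which is the extra
context switching discussed above; calls that never go through
call_on pay nothing.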

Overhead for foreign export is higher.  Overhead for foreign import is
not much different from existing safe foreign imports.  Overhead only
occurs if you request this feature.

-- 
Alastair Reid        reid@cs.utah.edu        http://www.cs.utah.edu/~reid/