
This sounds like a lot of work and a porting nightmare. (What do you mean Linux/Win32/HPUX/... doesn't have thread manipulation function X? It's available on FreeBSD/Win32/... What if there are other forms of thread-local state, e.g., errno? What about setjmp/longjmp?)
Yes, these are all problems. However, there is a nice abstraction of the OS thread API in GHC's RTS, thanks to Sigbjorn. So I'm sure this API could be extended to include some thread-local state operations.
The only viable solution I can see is to provide a way to capture the calling thread (as a first-class entity) when you call into Haskell and to explicitly specify which thread to use when you call out from Haskell. (Hmmm, sounds like callcc for C :-))
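To make the idea concrete, here is a rough sketch in C with POSIX threads of what "a thread as a first-class entity" might look like. It is not GHC's RTS API: `captured_thread`, `capture_thread`, `call_on`, `release_thread`, `set_context` and `get_context` are all hypothetical names invented for illustration. The sketch also does not capture the calling thread itself. It assumes a dedicated worker thread standing in for it, and a C11 compiler for `_Thread_local`.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is part of GHC's RTS. */
typedef void *(*call_fn)(void *);

typedef struct {
    pthread_t       tid;
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    call_fn         fn;      /* pending call, NULL when the slot is free */
    void           *arg;
    void           *result;
    int             done;    /* result is ready for the caller */
    int             quit;    /* the worker should exit once idle */
} captured_thread;

/* The worker runs every submitted call on its own OS thread. */
static void *worker_loop(void *p) {
    captured_thread *t = p;
    pthread_mutex_lock(&t->lock);
    for (;;) {
        while (t->fn == NULL && !t->quit)
            pthread_cond_wait(&t->cond, &t->lock);
        if (t->fn == NULL)
            break;                        /* quit requested, nothing pending */
        call_fn fn = t->fn;
        void *arg = t->arg;
        pthread_mutex_unlock(&t->lock);
        void *r = fn(arg);                /* sees this thread's local state */
        pthread_mutex_lock(&t->lock);
        t->result = r;
        t->fn = NULL;
        t->done = 1;
        pthread_cond_broadcast(&t->cond); /* handoff 2: worker -> caller */
    }
    pthread_mutex_unlock(&t->lock);
    return NULL;
}

/* Start the worker; returns 0 on success, as pthread_create does. */
int capture_thread(captured_thread *t) {
    t->fn = NULL; t->arg = NULL; t->result = NULL;
    t->done = 0; t->quit = 0;
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->cond, NULL);
    return pthread_create(&t->tid, NULL, worker_loop, t);
}

/* Run fn(arg) on the given thread and block until it returns. */
void *call_on(captured_thread *t, call_fn fn, void *arg) {
    assert(t != NULL && fn != NULL);
    pthread_mutex_lock(&t->lock);
    while (t->fn != NULL || t->done)      /* wait until the slot is free */
        pthread_cond_wait(&t->cond, &t->lock);
    t->fn = fn;
    t->arg = arg;
    pthread_cond_broadcast(&t->cond);     /* handoff 1: caller -> worker */
    while (!t->done)
        pthread_cond_wait(&t->cond, &t->lock);
    void *r = t->result;
    t->done = 0;
    pthread_cond_broadcast(&t->cond);     /* let the next caller in */
    pthread_mutex_unlock(&t->lock);
    return r;
}

/* Ask the worker to exit, wait for it, and free its resources. */
void release_thread(captured_thread *t) {
    pthread_mutex_lock(&t->lock);
    t->quit = 1;
    pthread_cond_broadcast(&t->cond);
    pthread_mutex_unlock(&t->lock);
    pthread_join(t->tid, NULL);
    pthread_cond_destroy(&t->cond);
    pthread_mutex_destroy(&t->lock);
}

/* Stand-in for thread-local library state such as errno. */
static _Thread_local intptr_t current_context = 0;

void *set_context(void *arg) { current_context = (intptr_t)arg; return NULL; }
void *get_context(void *arg) { (void)arg; return (void *)current_context; }
```

With this, `call_on(&t, set_context, (void *)(intptr_t)42)` followed by `call_on(&t, get_context, NULL)` yields 42, because both calls run on the same thread, while calling `get_context(NULL)` directly from another thread still sees 0. Note that every `call_on` involves two thread handoffs, one to the worker and one back.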
The trouble is, that is *way* too much overhead for a C call. HOpenGL does lots of these, and I strongly suspect that adding a full OS-thread context switch (well two, including the return) for each one would be a killer.
Cheers,
Simon