
#7602: Threaded RTS performing badly on recent OS X (10.8?)
---------------------------------+------------------------------------------
Reporter: simonmar | Owner:
Type: bug | Status: new
Priority: normal | Milestone: _|_
Component: Runtime System | Version: 7.6.1
Keywords: | Os: Unknown/Multiple
Architecture: Unknown/Multiple | Failure: None/Unknown
Difficulty: Unknown | Testcase:
Blockedby: | Blocking:
Related: |
---------------------------------+------------------------------------------
Comment(by thoughtpolice):
Alright, I think my patch is almost working, but in the meantime I've
verified with a small snippet the behavior I think we want. Simon, can you
please tell me if this approach would be OK?
Essentially, there is a small set of predefined TLS keys in the OS X C
library for various Apple-internal things. There are about 100 of these
special keys. With them, it's possible to use special inline variants
of ```pthread_getspecific``` and ```pthread_setspecific``` that read and
write directly at a fixed offset from the ```%gs``` segment base.
Performance-wise, this should be very close to Linux's implementation.
One of the users of these keys in modern OS X's libc is WebKit. pthread
has a specific range of keys (5 to be exact) dedicated to WebKit. These are
used in JavaScriptCore's FastMalloc allocator for performance-critical
sections - likely for their GC! But only a single key is actually used by
WebKit, and I can find no other references to it on the internet.
You can see this here:
http://www.opensource.apple.com/source/Libc/Libc-825.25/pthreads/pthread_mac...
This defines the inline get/set routines for special TLS keys. If you
scroll down a little you can see the ```JavaScriptCore``` keys (keys 90-94
to be exact.)
Now, look here:
http://code.google.com/codesearch#mcaWan7Aaio/trunk/WebKit-r115846/Source/WTF/wtf/FastMalloc.cpp&q=__PTK_FRAMEWORK_JAVASCRIPTCORE_KEY0&type=cs&l=453
And you can see there's a special stubbed out ```pthread_getspecific```
and ```pthread_setspecific``` routine for this exact purpose.
Therefore, I propose we steal one of the high TLS keys that are dedicated
to WebKit's JS engine and use it for the GC. Unfortunately,
```pthread_machdep.h``` is not installed by default in modern versions of
Xcode, so we must inline the definitions ourselves for the necessary
architectures.
The following example demonstrates the use of these special keys:
{{{
#include