Yet another top-level IO proposal

Fellow Haskellers,

I have a proposal I would like to enter into the eternal top-level IO debate. The proposal involves a minor language extension and some runtime support for thread-local reference cells. I believe it has the potential to meet many of the needs of those requesting top-level IO.

My apologies for this rather lengthy message, but given the volatility of discussion on this matter in the past, it seemed best to lay out my thoughts as precisely as possible at the beginning. So without further ado, the proposal.

The language extension:

-- Add a new keyword 'threadlocal' and a top-level declaration with the syntax

     threadlocal <identifier> <type expression> <initializer expression>

-- The declaration consists of three parts: 1) the name of the reference cell, 2) the type of data stored in the cell, and 3) an initializer action.

-- The name of the cell declared with 'threadlocal' shares the function namespace and introduces a CAF with the type 'TLRef a', where 'a' stands for the type given in the declaration.

-- The initializer action must be an expression with type 'TL a', where 'a' stands for the type given in the 'threadlocal' declaration, and 'TL' is the thread-local initialization monad (a little like the ACIO monad, more on this below).

The semantics:

-- Each thread in a program (the main thread and threads sparked from forkIO and forkOS) has a "bank" of thread-local variables. Writes to and reads from a thread-local cell only touch the bank of the thread performing the write or read.

-- For any given bank, a thread-local cell may be "empty" (which means it holds no value) or "full" with a value of its declared type.

-- There is a phantom bank of thread-local values belonging to no thread, in which the value of every thread-local cell is "empty". This represents the state of thread-local variables before program start.

-- Whenever a thread is sparked (including the main thread) and before it begins executing, its thread-local variables are initialized. For each declared thread-local variable (in the transitive closure of imported modules), the declared initialization action is run and the generated value initializes the thread-local cell for that thread. The initializer actions are run in an unspecified order.

-- The primitives of the TL monad are strictly limited and include only actions which have no observable side effects (a proposed list of primitives is given below). A TL action may read from (but NOT write to) thread-local cells in the bank of the sparking thread (the bank of the thread calling forkIO, or the special phantom bank for the main thread).

-- Any exceptions generated during a thread-local initialization action are propagated to the thread which called forkIO/forkOS or, in the case of the main thread, directly to the runtime system, just as though an uncaught exception bubbled off the main thread.

-- New IO primitives are added to read from, write to, and clear (set to empty) thread-local variables.

Advantages:

This proposal seems to hit most of the use cases that I recall having seen (including the very important allocate-a-top-level-concurrency-variable use case) and seems to provide a nice way to reinterpret some of the "magic" currently in the standard libraries. In addition, this proposal does not suffer from the module loading order problem that some previous proposals have: because thread-local initializer actions depend only on the "previous" bank of values, the order in which they are run makes very little difference (it matters only for primitives that read clock time or some such). The value of a thread-local cell is always well-defined, even before the main thread starts. Values in a thread-local have a well-defined lifetime that is tied to the owning thread. I think that efficient implementation is possible (maybe we can play some copy-on-write games?).
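
To make the bank semantics and the order-independence claim concrete, here is a minimal pure model in ordinary Haskell. It is only a sketch under stated assumptions: cells are identified by strings and hold Int values, and the names Bank, Initializer, inheritOr, spawnBank and writeCell are hypothetical, not part of the proposal.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Hypothetical model: a bank maps cell names to values.
-- A name that is absent from the list models an "empty" cell.
type Bank = [(String, Int)]

-- An initializer may read the sparking thread's bank but cannot write to it.
type Initializer = Bank -> Int

-- The bank that exists before the main thread starts: every cell is empty.
phantomBank :: Bank
phantomBank = []

-- Copy the parent's value when the cell is full there; otherwise use a default.
inheritOr :: Int -> String -> Initializer
inheritOr def name parent = fromMaybe def (lookup name parent)

-- Initialize a new thread's bank by running every declared initializer
-- against the parent's bank.
spawnBank :: [(String, Initializer)] -> Bank -> Bank
spawnBank decls parent = [ (name, ini parent) | (name, ini) <- decls ]

-- Model of a write: replace (or fill) one cell in a bank.
writeCell :: String -> Int -> Bank -> Bank
writeCell name v bank = (name, v) : filter ((/= name) . fst) bank

-- Two illustrative declarations (assumed names and defaults).
declarations :: [(String, Initializer)]
declarations =
  [ ("verbosity", inheritOr 1 "verbosity")
  , ("retries",   inheritOr 3 "retries")
  ]

main :: IO ()
main = do
  let mainBank  = spawnBank declarations phantomBank           -- defaults
      mainBank' = writeCell "verbosity" 5 mainBank             -- main overrides one cell
      childBank = spawnBank declarations mainBank'             -- child inherits the override
      reordered = spawnBank (reverse declarations) mainBank'   -- other initializer order
  print mainBank
  print childBank
  -- Each initializer sees only the parent's bank, so the order in which
  -- initializers run does not change any cell's value.
  print (all (\(n, _) -> lookup n reordered == lookup n childBank) declarations)
```

Because every initializer is a function of the parent's bank alone, running them in a different order yields the same cell values, which is the property the paragraph above relies on.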

I especially like that variables are only as "global" as desired for any given task; if a library writer uses thread-locals for some manner of shared "global" state, later users are always able to write programs that use more than one instance of the "global" state without needing to alter the library.

Disadvantages:

-- Requires a language extension (but I don't know of a serious alternate proposal that doesn't).
-- Requires non-trivial runtime system support.
-- Not sure what effect this has on garbage collection.
-- Adds overhead to thread creation (this could perhaps be mitigated by introducing new primitives that distinguish heavyweight threads, which have their own thread-local banks, from lightweight threads, which do not).
-- It's a bit complicated.
-- You can shoot yourself in the foot (true of most of the other proposals).

Some representative use cases:

-- Implicit parameter style use case: You want to provide a default value that you expect will be rarely changed. Threading the parameter deeply through the code obfuscates meaning, and the code is in the IO monad.
   Solution:
   * Define a thread-local variable for the value and set the initializer to set the variable to some default value if it was empty, or to copy the parent thread's value otherwise.
   * If desired, change the variable's value early in main (before any other threads are sparked); the new value will be propagated to all new threads and be available in main.
   * If desired, different threads can set different values of the parameter, which will then be propagated to their sub-threads.

-- Top-level synchronization variable use case: You need an MVar to manage some "global" resource.
   Solution:
   * Define a thread-local variable to hold the MVar. Define the initializer to create a new MVar if the TLRef was empty, and to copy the parent thread's value otherwise.
   * Read the MVar from the thread-local var, and use it as usual.
   * This also allows you to partition the program into sandboxes which use distinct MVars to manage distinct pools of the "global" resource, without needing to change or complicate any underlying libraries.

-- Running time statistics use case: You want to easily keep track of how long each thread in a program has been running in wall-clock time.
   Solution:
   * Define a thread-local variable with an initializer that reads the current wall-clock time.
   * Calculate the thread running time by taking the difference between the current wall-clock time and the time in the thread-local var.

-- Give better semantics to standard handles use case: You want to make the handling of stdin, stdout and stderr in System.IO less "magic" and baked-in.
   Solution:
   * Define a thread-local variable for each of stdin, stdout, and stderr. The initializer action creates appropriate handles for each one in the case that the thread-local was empty, and copies the parent's value otherwise.
   * Nice feature: allows you to override the values of stdin, stdout and stderr for sub-threads, shell style.

-- Give better semantics to getArgs use case: You want getArgs/withArgs and getProgName/withProgName to have better semantics.
   Solution:
   * Define a thread-local variable for the list of arguments and for the program name. Define an initializer which reads these values from C land when the thread-local value is empty and copies the parent thread's value otherwise.
   * Allows you to spark multiple threads in a single program with different "command line" arguments.

The proposed list of new primitives:

    -- thread-locals maintained by the standard libs
    currentWorkingDirectory :: TLRef FilePath
    stdin  :: TLRef Handle
    stdout :: TLRef Handle
    stderr :: TLRef Handle

    -- in the TL monad
    readTL :: TLRef a -> TL a
      -- ^ Reads a thread-local variable in the bank of the parent thread.
      --   Returns bottom if the cell is empty.
    tryReadTL :: TLRef a -> TL (Maybe a)
      -- ^ Reads a thread-local variable in the bank of the parent thread.
      --   Returns Nothing if the cell is empty.
    getClocktimeTL :: TL ClockTime
    getCPUTimeTL :: TL Integer
    getDefaultStdinHandle :: TL Handle
    getDefaultStdoutHandle :: TL Handle
    getDefaultStderrHandle :: TL Handle
    getDefaultArgs :: TL [String]
    getDefaultProgramName :: TL String
    newIORefTL :: a -> TL (IORef a)
    newMVarTL :: a -> TL (MVar a)
    newEmptyMVarTL :: TL (MVar a)
    newSTRefTL :: a -> TL (STRef a)

    -- in the IO monad
    readThreadLocal :: TLRef a -> IO a
      -- ^ bottom on empty
    tryReadThreadLocal :: TLRef a -> IO (Maybe a)
      -- ^ Nothing on empty
    writeThreadLocal :: a -> TLRef a -> IO ()
    clearThreadLocal :: TLRef a -> IO ()
      -- ^ Reset the thread-local to empty
    clearBank :: IO ()
      -- ^ Clear all thread-local cells in the current thread

------------------------------------------------------------------------

In order to get discussion flowing and experiment with the semantics, I have put together a demonstration module which implements thread-local variables using the standard "unsafePerformIO" hacks. The module and an example usage are attached.

If there is any interest in these ideas, I will post this proposal to the wiki.

Please respond with thoughts and comments,
Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
          -- TMBG
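
The attached demonstration module is not reproduced in this archive. As a rough illustration only, the behaviour of the IO-side primitives and of initialization at fork time can be approximated in plain Haskell by passing cells explicitly. Everything here (Cell, newCell, tryReadCell, writeCell, clearCell, forkWithCell) is a hypothetical stand-in, not the proposed API and not the author's module; real thread-locals would need no explicit cell argument.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for one cell in one thread's bank.
-- Nothing models "empty"; Just v models "full".
newtype Cell a = Cell (IORef (Maybe a))

newCell :: Maybe a -> IO (Cell a)
newCell v = Cell <$> newIORef v

-- Counterpart of tryReadThreadLocal: Nothing on empty.
tryReadCell :: Cell a -> IO (Maybe a)
tryReadCell (Cell r) = readIORef r

-- Argument order mirrors the proposed writeThreadLocal.
writeCell :: a -> Cell a -> IO ()
writeCell v (Cell r) = writeIORef r (Just v)

-- Counterpart of clearThreadLocal: reset the cell to empty.
clearCell :: Cell a -> IO ()
clearCell (Cell r) = writeIORef r Nothing

-- Fork a child that gets its own cell. The initializer sees the parent's
-- current value (or Nothing) and is evaluated in the forking thread, so an
-- exception in it surfaces there rather than in the child.
forkWithCell :: (Maybe a -> a) -> Cell a -> (Cell a -> IO ()) -> IO ()
forkWithCell ini parent body = do
  parentVal <- tryReadCell parent
  v <- evaluate (ini parentVal)
  child <- newCell (Just v)
  _ <- forkIO (body child)
  return ()

main :: IO ()
main = do
  mainCell <- newCell (Just "main-config")
  done <- newEmptyMVar
  forkWithCell (fromMaybe "default") mainCell $ \cell -> do
    inherited <- tryReadCell cell
    writeCell "child-config" cell   -- affects only the child's cell
    putMVar done inherited
  inherited <- takeMVar done        -- wait for the child to finish
  print inherited
  tryReadCell mainCell >>= print    -- the parent's cell is unchanged
  clearCell mainCell
  tryReadCell mainCell >>= print    -- now empty
```

The child starts with a copy of the parent's value, and its later write does not leak back into the parent's cell, which is the isolation the bank semantics describe.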

Hello,

Well it seems like you haven't started another flame war (yet :-).

I'm afraid I haven't properly understood your proposal, because I don't have much time right now. It seems to be a bit like George Russell's proposal (aka "execution contexts").

Personally I have never felt the need for thread-local state, but I have often needed to use the unsafePerformIO hack to create *unique state* for APIs that are both sane from a user's point of view and are also invulnerable to accidental or malicious state "spoofing". So thread-local state isn't really what I want (it's a sure way to guarantee that spoofing will occur :-)

You seem to indicate that this is still possible with your scheme, but I'm not sure of the details.

Maybe you should put all this on the wiki page. I'd like to see how/if you could implement the hypothetical device driver API I put there, or even just use the "oneShot" function or similar at the top level.

Regards
--
Adrian Hey

> Hello,
> Well it seems like you haven't started another flame war (yet :-).
Indeed; I am a little surprised to hear the silence.
> I'm afraid I haven't properly understood your proposal, because I don't have much time right now. It seems to be a bit like George Russell's proposal (aka "execution contexts").
Somewhat. However, execution contexts don't have the initializer concept.
> Personally I have never felt the need for thread-local state, but I have often needed to use the unsafePerformIO hack to create *unique state* for APIs that are both sane from a user's point of view and are also invulnerable to accidental or malicious state "spoofing". So thread-local state isn't really what I want (it's a sure way to guarantee that spoofing will occur :-)
> You seem to indicate that this is still possible with your scheme, but I'm not sure of the details.
If you declare a thread-local in a module but don't export it, then you have exclusive control over what happens to that thread-local. You set the initializer and no one else can touch it. If you set an initializer that copies parent values and never write to the cell, you effectively have a variable that is set exactly once at program start. The only way to alter this from outside the module is to use the "clearBank" primitive, which resets all thread-locals to empty. It may be that this primitive is too dangerous to include.

On the other hand, I'm not convinced that absolutely unique state is that great. Suppose I want to run multiple copies of my Haskell OS in an emulator so I can test the TCP/IP stack I just wrote? I'll need some way to keep the "unique" state for each OS separate.
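
For illustration, the set-once pattern described in this reply (an unexported cell whose initializer copies the parent's value and is never written afterwards) might look roughly like this in the proposed syntax. This is a hypothetical sketch, not the attached implementation and not valid Haskell today: 'threadlocal' is the proposed keyword, tryReadTL, newMVarTL and readThreadLocal come from the proposal's primitive list, and the names DeviceDriver, deviceLock, inheritOrCreate and withDevice are invented for the example.

```haskell
-- Hypothetical: uses the proposed 'threadlocal' declaration and TL monad.
module DeviceDriver (withDevice) where   -- deviceLock is NOT exported

import Control.Concurrent.MVar (MVar, withMVar)

-- One lock per "world": created for the first thread, inherited by the rest.
threadlocal deviceLock (MVar ()) inheritOrCreate

inheritOrCreate :: TL (MVar ())
inheritOrCreate = do
  parent <- tryReadTL deviceLock      -- reads the sparking thread's bank
  case parent of
    Just lock -> return lock          -- copy the parent thread's value
    Nothing   -> newMVarTL ()         -- phantom bank: allocate the lock once

-- The only way clients can touch the lock is through this function.
withDevice :: IO a -> IO a
withDevice action = do
  lock <- readThreadLocal deviceLock
  withMVar lock (\_ -> action)
```

Since the module never writes to deviceLock and does not export it, every thread descended from main would share the same MVar, barring a call to clearBank.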
> Maybe you should put all this on the wiki page. I'd like to see how/if you could implement the hypothetical device driver API I put there, or even just use the "oneShot" function or similar at the top level.
I've attached a hypothetical implementation in the proposed syntax.
> Regards
> --
> Adrian Hey
Robert Dockins