
On 29 March 2006 09:11, John Meacham wrote:
It would be nice if we could deprecate the not very informative 'safe' and 'unsafe' names and use more descriptive ones that tell you what is actually allowed:
'reentrant' - routine might call back into the Haskell run-time
'blockable' - routine might block indefinitely
I've been meaning to bring this up. First, I don't think 'blockable' is the right term here. This relates to Malcolm's point too:
Another piece of terminology to clear up. By "non-blocking foreign call", you actually mean a foreign call that *can* block. As a consequence of the fairness policy, you wish to place the requirement on implementations that such a blocking foreign call _should_not_ block progress of other Haskell threads. The thread-nature of the foreign call is "blocking". The Haskell-API nature is desired to be "non-blocking".
Malcolm correctly notes that when I say "non-blocking" I'm referring to the behaviour from Haskell's point of view, not a property of the foreign code being invoked.

In fact, whether the foreign code being invoked blocks or not is largely immaterial. The property we want to capture is just this:

During execution of the foreign call, other Haskell threads should make progress as usual.

It doesn't matter whether the foreign call "blocks" or not (although that is a common use for this feature). I'd rather call it 'concurrent', to indicate that the foreign call runs concurrently with other Haskell threads.

Back to 'reentrant' vs. 'blockable'. I'm not convinced that 'blockable unsafe' is that useful. The reason is that if other Haskell threads continue running during the call, at some point a GC will be required, at which point the runtime needs to traverse the stack of the thread involved in the foreign call, which means the call is subject to the same requirements as a 'reentrant' call anyway. I don't think it's necessary to add this finer distinction.

Unless perhaps you have in mind an implementation that doesn't do GC in the traditional way... but then I'm concerned that this is requiring programmers to make a distinction in their code to improve performance for a minority implementation technique, and that's not good language design.

If 'reentrant' in its full glory is too hard to implement, then by all means don't implement it, and emit a run-time error if someone tries to use it.

Cheers,
Simon
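To make that behaviour concrete, here is a minimal sketch using GHC's existing 'safe' annotation as a stand-in for the proposed 'concurrent' one. It assumes a GHC-style implementation compiled with -threaded; 'sleep' is the standard C library function.

{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever)
import Foreign.C.Types (CUInt(..))

-- 'safe' is GHC's current spelling of the 'concurrent' idea: other Haskell
-- threads keep running while the foreign call is in progress.
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
  -- This thread keeps printing while the blocking foreign call below runs,
  -- demonstrating "other Haskell threads make progress as usual".
  _ <- forkIO $ forever $ do
         putStrLn "still making progress"
         threadDelay 500000          -- 0.5 seconds
  _ <- c_sleep 3                      -- blocking foreign call, 3 seconds
  putStrLn "foreign call returned"

Built with ghc -threaded, the forked thread continues to print while c_sleep is in progress.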

On Wed, Mar 29, 2006 at 11:15:27AM +0100, Simon Marlow wrote:
On 29 March 2006 09:11, John Meacham wrote:
It would be nice if we could deprecate the not very informative 'safe' and 'unsafe' names and use more descriptive ones that tell you what is actually allowed:
'reentrant' - routine might call back into the Haskell run-time
'blockable' - routine might block indefinitely
I've been meaning to bring this up. First, I don't think 'blockable' is the right term here. This relates to Malcolm's point too:
yeah, I am not happy with that term either. 'blocking'? 'canblock'?
Another piece of terminology to clear up. By "non-blocking foreign call", you actually mean a foreign call that *can* block. As a consequence of the fairness policy, you wish to place the requirement on implementations that such a blocking foreign call _should_not_ block progress of other Haskell threads. The thread-nature of the foreign call is "blocking". The Haskell-API nature is desired to be "non-blocking".
Malcolm correctly notes that when I say "non-blocking" I'm referring to the behaviour from Haskell's point of view, not a property of the foreign code being invoked.
In fact, whether the foreign code being invoked blocks or not is largely immaterial. The property we want to capture is just this:
During execution of the foreign call, other Haskell threads should make progress as usual.
It doesn't matter whether the foreign call "blocks" or not (although that is a common use for this feature). I'd rather call it 'concurrent', to indicate that the foreign call runs concurrently with other Haskell threads.
'concurrent' sounds fine to me; I have little preference, other than please not 'threadsafe', a word so overloaded as to be meaningless :)
Back to 'reentrant' vs. 'blockable'. I'm not convinced that 'blockable unsafe' is that useful. The reason is that if other Haskell threads continue running during the call, at some point a GC will be required, at which point the runtime needs to traverse the stack of the thread involved in the foreign call, which means the call is subject to the same requirements as a 'reentrant' call anyway. I don't think it's necessary to add this finer distinction.

Unless perhaps you have in mind an implementation that doesn't do GC in the traditional way... but then I'm concerned that this is requiring programmers to make a distinction in their code to improve performance for a minority implementation technique, and that's not good language design.
It has nothing to do with performance; they are fundamentally different concepts that just happen, by coincidence, to have the same solution in GHC. There is no fundamental relation between the two. This is one of those things that I said was "GHC-centric even though no one realizes it" :)

In any cooperative/event-loop based system, 'blockable unsafe' can be implemented by (1) spawning a new system thread, (2) calling the routine in it, and (3) having the routine write a value to a pipe when done; the pipe is integrated into the standard event loop of the run-time.

However, 'blockable safe' or 'blockable reentrant' now implies that a call may come back into the Haskell run-time _on another OS-level thread_, which implies we have to set up pthread_mutexes everywhere, and perhaps switch to a completely different run-time, or at least to different incoming-foreign-call boilerplate. Note that none of this has anything to do with the GC (though likely implementations will have to do something special with their GC stack too), and there are a lot of other possible models of concurrency that we have not even thought of yet.
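A user-level sketch of that scheme (not an actual run-time implementation): the possibly-blocking routine runs on its own OS thread, and completion is signalled through a pipe that an event loop could watch. 'blockingCall' is a hypothetical stand-in for the foreign routine; the example assumes a POSIX system and GHC's -threaded run-time for forkOS.

import Control.Concurrent (forkOS)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Utils (with)
import System.Posix.IO (createPipe, fdReadBuf, fdWriteBuf)

blockingCall :: IO Int
blockingCall = return 42   -- pretend this is a blocking, non-reentrant C call

runBlocking :: IO Int
runBlocking = do
  (readEnd, writeEnd) <- createPipe
  result <- newEmptyMVar
  _ <- forkOS $ do
         r <- blockingCall
         putMVar result r
         -- Write one byte to wake up whatever is watching the pipe.
         _ <- with (0 :: Word8) $ \buf -> fdWriteBuf writeEnd buf 1
         return ()
  -- A real run-time would fold 'readEnd' into its standard event loop
  -- (select/poll); here we simply block on it directly.
  _ <- allocaBytes 1 $ \buf -> fdReadBuf readEnd buf 1
  takeMVar result

main :: IO ()
main = runBlocking >>= print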
If 'reentrant' in its full glory is too hard to implement, then by all means don't implement it, and emit a run-time error if someone tries to use it.
But reentrant is perfectly fine, and blocking is perfectly fine; it is the combination that is not. Giving up the ability to have Haskell callbacks from C code is not so good. Besides, for a language standard we should avoid any implementation details, so specifying _exactly_ what we mean is a good thing. The fact that reentrant and blocking produce the same code in GHC is _very much_ an implementation detail.

John

--
John Meacham - ⑆repetae.net⑆john⑈
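For reference, the 'reentrant' case being defended here is the ordinary callback pattern: C receives a function pointer and calls back into Haskell. In the sketch below the C routine 'apply_callback' is hypothetical (it would have to be written and linked separately); the "wrapper" import and FunPtr machinery are standard FFI.

{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt(..))
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

type Callback = CInt -> IO CInt

-- Turn a Haskell function into a C-callable function pointer.
foreign import ccall "wrapper"
  mkCallback :: Callback -> IO (FunPtr Callback)

-- Must be a reentrant ('safe') import: the call re-enters the Haskell
-- run-time when the C side invokes the callback.
foreign import ccall safe "apply_callback"
  c_applyCallback :: FunPtr Callback -> CInt -> IO CInt

main :: IO ()
main = do
  cb <- mkCallback (\x -> return (x + 1))
  r  <- c_applyCallback cb 41
  print r
  freeHaskellFunPtr cb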

Malcolm correctly notes that when I say "non-blocking" I'm referring to the behaviour from Haskell's point of view, not a property of the foreign code being invoked. In fact, whether the foreign code being invoked blocks or not is largely immaterial. The property we want to capture is just this: During execution of the foreign call, other Haskell threads should make progress as usual.
If that is really what you want to capture, the standard terminology would be "asynchronous call" (as opposed to "synchronous call"); hence all that separation between synchronous and asynchronous concurrent languages (so "concurrent" would not be a useful qualifier). The only remaining ambiguity would be that concurrent languages (e.g., Erlang) tend to use "asynchronous calls" to mean that the _calling thread_ does not need to synchronise, whereas you want to express that the _calling RTS_ does not need to synchronise while the _calling thread_ does.

Which makes me wonder: why would one ever want the RTS to block if one of its threads makes a call? If the RTS is sequential (with or without user-level threads), it can't do anything but synchronous foreign calls, can it? And if the RTS does support non-sequential execution, I can see few reasons for it to block other threads when one thread makes a foreign call.

I think what you're after is something quite different: by default, we don't know anything about the behaviour of a foreign call, so once we pass control to foreign code, it is out of our hands until that code decides to return it to us. For a sequential RTS, that's the way it is, no way around it. For a non-sequential RTS, that need not be a problem: if the foreign call can be given its own asynchronous (OS-level) thread of control, it can take however long it needs before returning, and other (user-level) threads can continue to run, asynchronously. But that means overhead that may not always be necessary.

So what I think you're trying to specify is whether it is safe for the RTS to assume that the foreign call is just another primitive RTS execution step (it will return control, and it won't take long before doing so). The standard terminology for that is, I believe, "atomic action". In other words, if the programmer assures the RTS that a foreign call is "atomic", the RTS is free to treat it as any other RTS step (it won't block the current OS-level thread of control entirely, and it won't hog the thread for long enough to upset scheduling guarantees). If, on the other hand, a foreign call is not annotated as "atomic", there is a potential problem: a non-sequential RTS can work around that, with some overhead, while a sequential RTS can at best issue a warning and hope for the best.

So my suggestion would be to make no assumption about unannotated calls (don't rely on the programmer too much ;-), and to have optional keywords "atomic" and "non-reentrant". [One might assume that an "atomic" call should never be permitted to reenter, so the annotations could be ordered instead of accumulated, but such assumptions tend to have exceptions.]

Cheers,
Claus
It doesn't matter whether the foreign call "blocks" or not (although that is a common use for this feature). I'd rather call it 'concurrent', to indicate that the foreign call runs concurrently with other Haskell threads.
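Since the 'atomic' and 'non-reentrant' keywords are only proposed in this thread, here is a rough sketch of how the distinction maps onto today's GHC annotations; the mapping is an interpretive assumption, not part of the proposal, and the comments give the intended reading.

{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble(..), CUInt(..))

-- "atomic": the programmer promises the call returns promptly and never
-- re-enters Haskell, so the RTS may treat it as one primitive step.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> IO CDouble

-- Unannotated (no assurance given): the call may block or take arbitrarily
-- long, so a non-sequential RTS gives it its own OS-level thread of control.
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
  c_cos 0 >>= print    -- an "atomic"-style call
  _ <- c_sleep 1       -- a call that may block
  return ()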