Why do "unsafe" foreign calls block other threads?

Hey everyone,
Could someone explain to me the logic behind having "unsafe" calls block other threads from executing? It seems to me that if anything it would make more sense for "safe" calls to block other threads since the call can call back into the Haskell runtime, as opposed to "unsafe" calls which (by assertion) will never call back into Haskell and therefore should be safer to run in parallel with other threads. What am I missing here?
Cheers, Greg

It's a matter of perspective. Either the function you're FFI'ing to is safe/unsafe or your use of it is safe/unsafe. The FFI spec seems to be using the former, so if you think that the function you're calling is unsafe (i.e., can call back into Haskell) then it blocks the world.
But I do think it's unintuitive and a less ambiguous naming scheme would be nicer.
Dan
On Tue, Aug 3, 2010 at 11:54 PM, Gregory Crosswhite <gcross@phys.washington.edu> wrote:
Hey everyone,
Could someone explain to me the logic behind having "unsafe" calls block other threads from executing? It seems to me that if anything it would make more sense for "safe" calls to block other threads since the call can call back into the Haskell runtime, as opposed to "unsafe" calls which (by assertion) will never call back into Haskell and therefore should be safer to run in parallel with other threads. What am I missing here?
Cheers, Greg
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

But you've got it backwards: if the function I am calling can call back into Haskell (i.e., is marked as "safe"), then GHC *doesn't* block the world, but if the function I am calling will never call back into Haskell (i.e., is marked as "unsafe"), then GHC *does* block the world. The reasoning behind this choice of behaviors is exactly what I do not understand.
Cheers, Greg
On 08/03/10 14:58, Daniel Peebles wrote:
It's a matter of perspective. Either the function you're FFI'ing to is safe/unsafe or your use of it is safe/unsafe. The FFI spec seems to be using the former, so if you think that the function you're calling is unsafe (i.e., can call back into Haskell) then it blocks the world.
But I do think it's unintuitive and a less ambiguous naming scheme would be nicer. Dan

On Tue, Aug 3, 2010 at 3:06 PM, Gregory Crosswhite wrote:
But you've got it backwards: if the function I am calling can call back into Haskell (i.e., is marked as "safe"), then GHC *doesn't* block the world, but if the function I am calling will never call back into Haskell (i.e., is marked as "unsafe"), then GHC *does* block the world. The reasoning behind this choice of behaviors is exactly what I do not understand.
"unsafe" calls are fast but unsafe. If they call back to Haskell, or take a long time, bad things will happen.
"safe" calls are safer: They can call back to Haskell, block on IO, or just wander off and do something CPU intensive. As is often the case, you pay a little bit of efficiency for the safety, since the RTS has to line up a few ducks to allow callbacks (and presumably concurrency).
Just think of "unsafe" in relation to "unsafeIndex" or something. It's faster, but you have to be sure the index is in bounds.

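A minimal sketch of the two kinds of import discussed above, assuming GHC on a platform where the C math library is linked by default. The names sinUnsafe and sinSafe are made up for illustration; both bind the same C function and differ only in the safety annotation.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CDouble (..))

-- "unsafe": a plain C call with minimal overhead. The programmer promises
-- that sin() never calls back into Haskell and returns quickly.
foreign import ccall unsafe "math.h sin"
  sinUnsafe :: CDouble -> CDouble

-- "safe": the same C function, but the RTS does extra bookkeeping around
-- the call so that callbacks (and long-running calls) are permitted.
foreign import ccall safe "math.h sin"
  sinSafe :: CDouble -> CDouble

main :: IO ()
main = do
  print (sinUnsafe 0.5)
  print (sinSafe 0.5)
  -- Both imports compute the same value; only the calling cost differs.
  print (sinUnsafe 0.5 == sinSafe 0.5)
```

For a tiny function like sin the "unsafe" import is the usual choice; the "safe" one is shown only to make the difference in annotation visible.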
On 08/03/10 15:22, Evan Laforge wrote:
On Tue, Aug 3, 2010 at 3:06 PM, Gregory Crosswhite wrote:
But you've got it backwards: if the function I am calling can call back into Haskell (i.e., is marked as "safe"), then GHC *doesn't* block the world, but if the function I am calling will never call back into Haskell (i.e., is marked as "unsafe"), then GHC *does* block the world. The reasoning behind this choice of behaviors is exactly what I do not understand.
"unsafe" calls are fast but unsafe. If they call back to haskell, or take a long time, bad things will happen.
"safe" calls are safer: They can call back to haskell, block on IO, or just wander off and do something CPU intensive. As is often the case, you pay a little bit of efficiency for the safety, since the RTS has to line up a few ducks to allow callbacks (and presumably concurrency).
Just think of "unsafe" in relation to "unsafeIndex" or something. It's faster, but you have to be sure the index is in bounds.
Yes, but the whole reason to use "unsafe" is to get higher performance at the cost of safety. If the result of calling an "unsafe" foreign function is that you *lose* performance because the other threads have to be halted first, then this seems to defeat the whole point of marking a call as "unsafe" in the first place.
Cheers, Greg

Just think of "unsafe" in relation to "unsafeIndex" or something. It's faster, but you have to be sure the index is in bounds.
Yes, but the whole reason to use "unsafe" is to get higher performance at the cost of safety. If the result of calling an "unsafe" foreign function is that you *lose* performance because the other threads have to be halted first, then this seems to defeat the whole point of marking a call as "unsafe" in the first place.
That's why the function has to return soon and shouldn't do a lot of work.

On 08/03/10 15:33, Evan Laforge wrote:
Just think of "unsafe" in relation to "unsafeIndex" or something. It's faster, but you have to be sure the index is in bounds.
Yes, but the whole reason to use "unsafe" is to get higher performance at the cost of safety. If the result of calling an "unsafe" foreign function is that you *lose* performance because the other threads have to be halted first, then this seems to defeat the whole point of marking a call as "unsafe" in the first place.
That's why the function has to return soon and shouldn't do a lot of work.
But again, then what is the point of marking it "unsafe" if it means that you have to pay a hefty cost of waiting for all the other threads to halt? Is the cost of halting all of the other threads really less than the cost of setting up for a "safe" call? Maybe it is, and that is what I am missing here. If it is not, though, then it seems to me that marking a call as "unsafe" will *never* gain you performance in a multi-threaded environment, so that there is never any point in using it in such an environment. (Though, of course, it could gain you performance in a single-threaded environment.)
Cheers, Greg

As far as I know, it works like this:
"unsafe" calls are just executed directly, like any other C function
call; as a result, any lightweight haskell threads which were mapped
onto the OS thread in which the call is made are blocked for the
duration of the call; hence why it's a good idea that these calls
should be short ones. So the blocking is not by intent, but -is- a
direct consequence (of how unsafe calls are made and how GHC's
threading system works). Other OS threads and the haskell threads
mapped to them are not blocked, afaik.
"safe" calls spawn a new OS thread (maybe reuse an existing one if
available?), move the haskell threads over, (do various other
housekeeping?), and then make the call.
On Wed, Aug 4, 2010 at 12:41 AM, Gregory Crosswhite wrote:
On 08/03/10 15:33, Evan Laforge wrote:
Just think of "unsafe" in relation to "unsafeIndex" or something. It's faster, but you have to be sure the index is in bounds.
Yes, but the whole reason to use "unsafe" is to get higher performance at the cost of safety. If the result of calling an "unsafe" foreign function is that you *lose* performance because the other threads have to be halted first, then this seems to defeat the whole point of marking a call as "unsafe" in the first place.
That's why the function has to return soon and shouldn't do a lot of work.
But again, then what is the point of marking it "unsafe" if it means that you have to pay a hefty cost of waiting for all the other threads to halt? Is the cost of halting all of the other threads really less than the cost of setting up for a "safe" call? Maybe it is, and that is what I am missing here. If it is not, though, then it seems to me that marking a call as "unsafe" will *never* gain you performance in a multi-threaded environment, so that there is never any point in using it in such an environment. (Though, of course, it could gain you performance in a single-threaded environment.)
Cheers, Greg
-- Work is punishment for failing to procrastinate effectively.
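A rough experiment for the behaviour described above, offered as a sketch rather than a definitive test. It assumes a POSIX system (for sleep from unistd.h) and is meant to be built with GHC's -threaded flag; sleepUnsafe, sleepSafe and runWith are names invented here. A ticker thread prints while the main thread sits in a one-second foreign call. If the description above is right, the ticks should stall during the "unsafe" call and keep flowing during the "safe" one, but the exact interleaving depends on the RTS and its flags, so no output is promised. On a non-threaded RTS the timer signal may also cut sleep short.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_)
import Foreign.C.Types (CUInt (..))

-- The same C function imported twice, differing only in the annotation.
foreign import ccall unsafe "unistd.h sleep"
  sleepUnsafe :: CUInt -> IO CUInt

foreign import ccall safe "unistd.h sleep"
  sleepSafe :: CUInt -> IO CUInt

-- Fork a lightweight ticker thread, then make a one-second foreign call
-- from the main thread and see whether the ticker keeps running meanwhile.
runWith :: String -> (CUInt -> IO CUInt) -> IO ()
runWith label sleeper = do
  putStrLn ("-- " ++ label)
  done <- newEmptyMVar
  _ <- forkIO $ do
    forM_ [1 :: Int .. 5] $ \i -> do
      putStrLn ("  tick " ++ show i)
      threadDelay 200000 -- 0.2 seconds
    putMVar done ()
  _ <- sleeper 1
  putStrLn "  foreign call returned"
  takeMVar done -- wait for the ticker before moving on

main :: IO ()
main = do
  runWith "unsafe sleep" sleepUnsafe
  runWith "safe sleep" sleepSafe
```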

OHHHHHHHHHHHHHHHH, okay, that makes a LOT more sense! I had thought that "blocking" meant that *all* of the *OS* threads were halted before making the call, but if "blocking" really just means that the calling *OS* thread can't do any other work until the call returns (hence blocking the other *IO* threads mapped to that OS thread) then it all makes sense to me. :-)
Cheers, Greg
On 08/03/10 15:49, Gábor Lehel wrote:
As far as I know, it works like this:
"unsafe" calls are just executed directly, like any other C function call; as a result, any lightweight haskell threads which were mapped onto the OS thread in which the call is made are blocked for the duration of the call; hence why it's a good idea that these calls should be short ones. So the blocking is not by intent, but -is- a direct consequence (of how unsafe calls are made and how GHC's threading system works). Other OS threads and the haskell threads mapped to them are not blocked, afaik.
"safe" calls spawn a new OS thread (maybe reuse an existing one if available?), move the haskell threads over, (do various other housekeeping?), and then make the call.

On Tue, Aug 03, 2010 at 02:54:40PM -0700, Gregory Crosswhite wrote:
Could someone explain to me the logic behind having "unsafe" calls block other threads from executing? It seems to me that if anything it would make more sense for "safe" calls to block other threads since the call can call back into the Haskell runtime, as opposed to "unsafe" calls which (by assertion) will never call back into Haskell and therefore should be safer to run in parallel with other threads. What am I missing here?
It is more an accident of GHC's design than anything: the same mechanism that allowed threads to call back into the runtime also allowed them to be non-blocking, so the previously used 'safe' and 'unsafe' terms got re-used. Personally, I really don't like those terms; they are non-descriptive in terms of what they actually mean and presuppose an RTS similar to GHC's current design. 'reentrant' and 'blocking', which could be specified independently, would be better and would be more future-proof against changes in the RTS or between compilers.
John
-- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/

On 08/03/10 15:23, John Meacham wrote:
It is more an accident of ghc's design than anything, the same mechanism that allowed threads to call back into the runtime also allowed them to be non blocking so the previously used 'safe' and 'unsafe' terms got re-used. personally, I really don't like those terms, they are non-descriptive in terms of what they actually mean and presuppose a RTS similar to ghcs current design. 'reentrant' and 'blocking' which could be specified independently would be better and would be more future-proof against changes in the RTS or between compilers.
John
Okay, that makes a lot more sense. So really when marking a call "safe" or "unsafe" I shouldn't be thinking in terms of whether I want to avoid the increased overhead of allowing it to call back into Haskell, but rather I should be considering whether I want the call to block other threads or not. However, it would seem that within this scheme there is no way for me to specify that a foreign call should be both "blocking" and also allowed to call Haskell functions, which to me would be the "safe"est possible alternative.
Cheers, Greg
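The two independent properties discussed above can be written down as a small table. This is only an illustrative model: the types Callbacks, Duration and GhcSafety and the function ghcKeyword are hypothetical and are not part of GHC or any Haskell standard. It shows that three of the four combinations collapse onto "safe", which is why a call that both blocks and calls back cannot be requested separately.

```haskell
module Main (main) where

-- Hypothetical annotations modelling the two independent questions.
data Callbacks = NoCallbacks | MayCallBack
  deriving (Show, Eq, Enum, Bounded)

data Duration = ReturnsQuickly | MayBlock
  deriving (Show, Eq, Enum, Bounded)

-- The two keywords that actually exist in the FFI.
data GhcSafety = Unsafe | Safe
  deriving (Show, Eq)

-- "unsafe" is only appropriate when the call neither calls back into
-- Haskell nor takes long; every other combination has to use "safe".
ghcKeyword :: Callbacks -> Duration -> GhcSafety
ghcKeyword NoCallbacks ReturnsQuickly = Unsafe
ghcKeyword _ _ = Safe

main :: IO ()
main =
  mapM_
    putStrLn
    [ show c ++ " + " ++ show d ++ " -> " ++ show (ghcKeyword c d)
    | c <- [minBound .. maxBound]
    , d <- [minBound .. maxBound]
    ]
```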

Quoth John Meacham
It is more an accident of ghc's design than anything, the same mechanism that allowed threads to call back into the runtime also allowed them to be non blocking so the previously used 'safe' and 'unsafe' terms got re-used. personally, I really don't like those terms, they are non-descriptive in terms of what they actually mean and presuppose a RTS similar to ghcs current design. 'reentrant' and 'blocking' which could be specified independently would be better and would be more future-proof against changes in the RTS or between compilers.
Is the concurrency issue documented somewhere? What does `non blocking' mean, and why would it not just always be that way?
In my situation, thread creation and dispatch happens in foreign library code, and execution in the Haskell runtime happens _only_ via callbacks. I don't need those callbacks to compute in parallel, generally, but it would be disappointing to hear that a callback strictly blocks execution of any others for its entire duration, for example even during a potentially slow I/O.
(Will test for that, but not sure whether it would be conclusive since the system seems to be slightly broken at this point - need to disable RTS timer signals ( -V0 ) to survive externally generated thread dispatch events.)
thanks, Donn Cave, donn@avvanta.com

On Wed, Aug 4, 2010 at 1:50 AM, Donn Cave wrote:
Is the concurrency issue documented somewhere? What does `non blocking' mean, and why would it not just always be that way?
In my situation, thread creation and dispatch happens in foreign library code, and execution in the Haskell runtime happens _only_ via callbacks. I don't need those callbacks to compute in parallel, generally, but it would be disappointing to hear that a callback strictly blocks execution of any others for its entire duration, for example even during a potentially slow I/O.
(Will test for that, but not sure whether it would be conclusive since the system seems to be slightly broken at this point - need to disable RTS timer signals ( -V0 ) to survive externally generated thread dispatch events.)
This is slightly out of date (GHC couldn't multiplex Haskell threads onto multiple OS threads at the time, but now does), but basically spells out the situation:
http://www.haskell.org/~simonmar/papers/conc-ffi.pdf
As for the specific question, callbacks do happen concurrently.
thanks, Donn Cave, donn@avvanta.com
-- Work is punishment for failing to procrastinate effectively.
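Whether Haskell threads are multiplexed onto several OS threads depends on how the program was built and started. A small sketch, assuming a reasonably recent GHC and base, that asks the running RTS about its own configuration using functions from Control.Concurrent:

```haskell
module Main (main) where

import Control.Concurrent
  ( getNumCapabilities
  , isCurrentThreadBound
  , rtsSupportsBoundThreads
  )

main :: IO ()
main = do
  -- True when the program was linked against the threaded RTS (-threaded).
  putStrLn ("bound threads supported: " ++ show rtsSupportsBoundThreads)
  -- Number of capabilities, i.e. how many Haskell threads can run at once
  -- (set with the +RTS -N option).
  n <- getNumCapabilities
  putStrLn ("capabilities: " ++ show n)
  -- Whether the current Haskell thread is tied to a particular OS thread.
  bound <- isCurrentThreadBound
  putStrLn ("main thread bound: " ++ show bound)
```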

On Thu, Aug 05, 2010 at 03:48:57PM -0400, wren ng thornton wrote:
John Meacham wrote:
'reentrant' and 'blocking' which could be specified independently would be better and would be more future-proof against changes in the RTS or between compilers.
+1.
Perhaps we should propose it to the haskell' committee.
It is already on the wiki, mixed in under 'concurrency'. It was discussed at length; the general consensus was that even if concurrency isn't in the standard, we should make these annotations part of it so FFI bindings could be written that would be portable between concurrent and non-concurrent implementations. The actual naming of the annotations and what the default should be got into some serious bikeshedding which I'd hate to re-hash, but most anything would be better than 'safe' and 'unsafe' IMHO.
John
-- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/

On 2010-08-03 15:23 -0700, John Meacham wrote:
It is more an accident of ghc's design than anything, the same mechanism that allowed threads to call back into the runtime also allowed them to be non blocking so the previously used 'safe' and 'unsafe' terms got re-used. personally, I really don't like those terms, they are non-descriptive in terms of what they actually mean and presuppose a RTS similar to ghcs current design. 'reentrant' and 'blocking' which could be specified independently would be better and would be more future-proof against changes in the RTS or between compilers.
I thought "safe" meant "the foreign function is allowed to call Haskell functions", which seems to not have anything to do with whether the function is re-entrant (a very strong condition).
-- Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
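To make "allowed to call Haskell functions" concrete, here is a sketch that hands a Haskell comparison function to C's qsort. It assumes GHC and a standard C library; the names Comparator, mkComparator, c_qsort, compareInts and sortWithQsort are chosen for this example only. Because qsort calls back into Haskell, the import has to be "safe".

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Array (peekArray, withArrayLen)
import Foreign.Ptr (FunPtr, Ptr, freeHaskellFunPtr)
import Foreign.Storable (peek, sizeOf)

type Comparator = Ptr CInt -> Ptr CInt -> IO CInt

-- Turn a Haskell function into a C function pointer.
foreign import ccall "wrapper"
  mkComparator :: Comparator -> IO (FunPtr Comparator)

-- qsort invokes the comparator, i.e. it calls back into Haskell,
-- so this import must not be marked "unsafe".
foreign import ccall safe "stdlib.h qsort"
  c_qsort :: Ptr CInt -> CSize -> CSize -> FunPtr Comparator -> IO ()

-- Return a negative, zero or positive value, as qsort expects.
compareInts :: Comparator
compareInts pa pb = do
  a <- peek pa
  b <- peek pb
  pure (fromIntegral (fromEnum (compare a b) - 1))

-- Copy the list into a temporary C array, sort it in place, read it back.
sortWithQsort :: [CInt] -> IO [CInt]
sortWithQsort xs =
  withArrayLen xs $ \n ptr -> do
    cmp <- mkComparator compareInts
    c_qsort ptr (fromIntegral n) (fromIntegral (sizeOf (0 :: CInt))) cmp
    freeHaskellFunPtr cmp
    peekArray n ptr

main :: IO ()
main = sortWithQsort [3, 1, 2] >>= print
```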

On Thu, Aug 05, 2010 at 04:08:38PM -0400, Nick Bowler wrote:
On 2010-08-03 15:23 -0700, John Meacham wrote:
It is more an accident of ghc's design than anything, the same mechanism that allowed threads to call back into the runtime also allowed them to be non blocking so the previously used 'safe' and 'unsafe' terms got re-used. personally, I really don't like those terms, they are non-descriptive in terms of what they actually mean and presuppose a RTS similar to ghcs current design. 'reentrant' and 'blocking' which could be specified independently would be better and would be more future-proof against changes in the RTS or between compilers.
I thought "safe" meant "the foreign function is allowed to call Haskell functions", which seems to not have anything to do with whether the function is re-entrant (a very strong condition).
Yeah, that is probably not the right term. I was thinking 're-entrant' as in it re-enters the Haskell run-time, but that could cause confusion with other meanings of that word. Perhaps 'nocallbacks', 'nohs', or 'nonnative'.
John
-- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/
participants (8)
- Daniel Peebles
- Donn Cave
- Evan Laforge
- Gregory Crosswhite
- Gábor Lehel
- John Meacham
- Nick Bowler
- wren ng thornton