
What are the conclusions of this thread? I think, but correct me if I'm wrong, that the eventual outcome was:

- concurrent reentrant should be supported, because it is not significantly more difficult to implement than just concurrent.
- the different varieties of foreign call should all be identifiable, because there are efficiency gains to be had in some implementations.
- the default should be... concurrent reentrant, presumably, because that is the safest. (So we need to invert the notation.)

So, can I go ahead and update the wiki? I'll try to record the rationale from the discussion too.

I'd like to pull out something from the discussion that got a bit lost in the swamp: the primary use case we have for concurrent reentrant is for calling the main loop of a GUI library. The main loop usually never returns (at least, not until the application exits), hence concurrent, and it needs to invoke callbacks, hence reentrant.

Cheers, Simon
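[Editor's note: to make the proposed outcome concrete, the GUI main-loop case might be written as below. The annotation syntax is hypothetical — the notation is exactly what this thread is deciding — and `gtk_main` stands in for any GUI main loop.]

```haskell
-- Hypothetical inverted notation: the safe variety (concurrent
-- reentrant) is the default and needs no annotation.
foreign import ccall "gtk_main"
    gtkMain :: IO ()   -- returns only at application exit (concurrent),
                       -- and invokes Haskell callbacks (reentrant)

-- A call known to return promptly and never call back into Haskell
-- could carry an annotation enabling a cheaper calling convention:
foreign import ccall nonconcurrent nonreentrant "sin"
    c_sin :: Double -> Double
```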

On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
- the default should be... concurrent reentrant, presumably, because that is the safest. (so we need to invert the notation).
I think the name "concurrent" has a similar problem to "safe": it reads as an instruction to the implementation, rather than a declaration by the programmer of the properties of a particular function; as Wolfgang put it, "this function might spend a lot of time in foreign lands".

On 2006-04-11, Ross Paterson
On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
- the default should be... concurrent reentrant, presumably, because that is the safest. (so we need to invert the notation).
I think the name "concurrent" has a similar problem to "safe": it reads as an instruction to the implementation, rather than a declaration by the programmer of the properties of a particular function; as Wolfgang put it, "this function might spend a lot of time in foreign lands".
I'd like to second this. -- Aaron Denney -><-

On 11 April 2006 17:49, Aaron Denney wrote:
On 2006-04-11, Ross Paterson
wrote: On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
- the default should be... concurrent reentrant, presumably, because that is the safest. (so we need to invert the notation).
I think the name "concurrent" has a similar problem to "safe": it reads as an instruction to the implementation, rather than a declaration by the programmer of the properties of a particular function; as Wolfgang put it, "this function might spend a lot of time in foreign lands".
I'd like to second this.
I agree. So other suggestions? longrunning? mightblock or mayblock? I don't much like 'nonreentrant', it's a bit of a mouthful. Any other suggestions for that? nocallback? Cheers, Simon

Simon Marlow wrote:
I agree. So other suggestions? longrunning? mightblock or mayblock?
I don't like "*block", because the question of blocking is irrelevant to this issue. It's about whether the foreign call returns sooner or later, not about whether it spends the time until then blocked or running.

Personally, I'm still in favour of inverting this. We are not in court here, so every foreign function is guilty until proven innocent. Every foreign function might be "longrunning" unless the programmer happens to know otherwise. So maybe... "returnsquickly"?

As far as I can gather, there are three arguments *against* making longrunning/concurrent the default:

1) It's not needed often, and it might be inefficient.
2) There might be implementations that don't support it at all (I might have convinced John that everyone should support it though...).
3) There might be implementations where concurrent calls run on a different thread than nonconcurrent calls.

Now I don't buy argument 1; for me the safety/expected behaviour/easy-for-beginners argument easily trumps the performance argument.

Ad 3): For implementations that don't support bound threads, John Meacham proposed saying that nonconcurrent calls are guaranteed to be executed on the main OS thread, but no guarantees were made about concurrent calls; that makes a lot of sense implementation-wise. However, this means that calls to the main loops of GLUT, Carbon and Cocoa (and maybe others) have to be annotated "nonconcurrent"/"returnsquickly" despite the fact that they return only after a long time, just to keep access to the right thread-local state. I see a big fat #ifdef heading our way.

Cheers, Wolfgang
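[Editor's note: the #ifdef Wolfgang anticipates would look roughly like this. The annotation syntax and CPP symbol are hypothetical; the point is that the same main-loop call would need different annotations on different implementations purely to land on the right OS thread.]

```haskell
#if defined(NONCONCURRENT_IS_MAIN_OS_THREAD)
-- On this implementation only nonconcurrent calls are guaranteed to run
-- on the main OS thread, so the main loop must be (mis)annotated as such
-- to keep access to the right thread-local state:
foreign import ccall nonconcurrent "glutMainLoop" glutMainLoop :: IO ()
#else
foreign import ccall concurrent reentrant "glutMainLoop" glutMainLoop :: IO ()
#endif
```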

On Wed, Apr 12, 2006 at 12:07:06PM -0400, Wolfgang Thaller wrote:
3) There might be implementations where concurrent calls run on a different thread than nonconcurrent calls.
This is necessarily true for non-OS-threaded implementations: there is no other way to wait for an arbitrary C call to end than to spawn a thread to run it in. This doesn't have to do with bound threads; to support those you just need to make sure the other thread you run concurrent calls on is always the same thread. It is the cost of setting up the mechanism to pass control to the other thread and wait for the response that is an issue: turning a single call instruction into several system calls, some memory mashing and a context switch or two.

I object to the idea that concurrent calls are 'safer'. Getting it wrong either way is a bug. It should fail in the most obvious way rather than the way that can remain hidden for a long time.

In any case, blocking is a pretty well defined operation on operating systems: it is when the kernel can context-switch you away while you wait for a resource, and that is the main use of concurrent calls. The ability to use them for long calculations in C is a sort of bonus; the actual main use is to ensure the progress guarantee, that if the OS is going to take away the CPU because one part of your program is waiting for something, another part of your program can make progress. Which is why I'd prefer some term involving 'blocking', because that is the issue: blocking calls are exactly those you need to make concurrent in order to ensure the progress guarantee. Saying something like 'takesawhile' muddies things; what is a while? Not that concurrent calls shouldn't be used for long C calculations — it is quite a nice if uncommonly needed perk — but I don't want the report to confuse matters by conflating a concrete requirement, meeting the progress guarantee, with a qualitative one, "does this take a while".

I'd actually prefer it if there were no default and it had to be specified in neutral language, because I think this is one of those things I want FFI library writers to think about.

John -- John Meacham - ⑆repetae.net⑆john⑈
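[Editor's note: the hand-off John describes can be sketched in portable Haskell. The names are illustrative — a real runtime would do this below the Haskell level — and the `IO a` argument stands in for an arbitrary foreign call.]

```haskell
import Control.Concurrent
import Control.Monad (forever)

-- A dedicated worker thread executes "concurrent" calls, so only the
-- calling Haskell thread blocks, never the thread running the scheduler.
newtype Worker = Worker (Chan (IO ()))

startWorker :: IO Worker
startWorker = do
  chan <- newChan
  _ <- forkIO $ forever $ do
         job <- readChan chan
         job                      -- run the (foreign) call here
  return (Worker chan)

-- The overhead John mentions (system calls, memory traffic, context
-- switches) is the cost of this round trip, paid on every call.
callOnWorker :: Worker -> IO a -> IO a
callOnWorker (Worker chan) call = do
  result <- newEmptyMVar
  writeChan chan (call >>= putMVar result)
  takeMVar result                 -- blocks this Haskell thread only
```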

John Meacham
I object to the idea that concurrent calls are 'safer'. getting it wrong either way is a bug. it should fail in the most obvious way rather than the way that can remain hidden for a long time.
I wouldn't consider it a bug in an implementation if it makes a call behave like concurrent when it's specified as non-concurrent. If a library wants to make it a critical section, it should use a mutex (MVar). Or there should be another kind of foreign call which requires serialization of calls. But of which calls? It's rarely the case that a call needs to be serialized only with other calls to the same function, and also rare that it must be serialized with everything else, so the granularity of the mutex must be explicit. It's fine to code the mutex explicitly if there is a kosher way to make it global.

Non-concurrent calls which really block other threads should be treated only as an efficiency trick: in implementations where the runtime is non-reentrant and dispatches threads running Haskell code internally, making such a call without ensuring that other Haskell threads have other OS threads to run them is faster. OTOH in implementations which run Haskell threads truly in parallel, the natural default is to let C code behave concurrently; ensuring that it is serialized would require extra work which is counter-productive. For functions like sqrt() the programmer wants to say that there is no need to make the call concurrent, without also saying that it requires calls to be serialized.
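[Editor's note: the MVar-as-mutex idiom Marcin refers to looks like this in today's Haskell. The NOINLINE/unsafePerformIO trick is the de facto — if not entirely "kosher" — way to get a global lock; `withLibLock` is an illustrative name wrapping any non-thread-safe foreign function.]

```haskell
import Control.Concurrent.MVar
import System.IO.Unsafe (unsafePerformIO)

-- A global lock serializing access to one non-thread-safe library.
-- The granularity is explicit: everything run through withLibLock is
-- serialized with each other, and with nothing else.
{-# NOINLINE libLock #-}
libLock :: MVar ()
libLock = unsafePerformIO (newMVar ())

withLibLock :: IO a -> IO a
withLibLock act = withMVar libLock (\_ -> act)
```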
Which is why I'd prefer some term involving 'blocking' because that is the issue. blocking calls are exactly those you need to make concurrent in order to ensure the progress guarentee.
What about getaddrinfo()? It doesn't synchronize with the rest of the program, it will eventually complete no matter whether other threads make progress, so making it concurrent is not necessary for correctness. It should be made concurrent nevertheless because it might take a long time. It does block; if it didn't block but needed the same time for an internal computation which doesn't go back to Haskell, it would still benefit from making the call concurrent.

It is true that concurrent calls often coincide with blocking. Blocking is simply the most common reason for a single non-calling-back function to take a long time, and one which can often be predicted statically (operations on extremely long integers might take a long time too, but it would be hard to differentiate them from the majority which don't).

The name 'concurrent' would be fine with me if the default is 'not necessarily concurrent'. If concurrent calls are the default, the name 'nonconcurrent' is not so good, because it would seem to imply some serialization which is not mandatory.

-- __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/

On Thu, Apr 13, 2006 at 12:43:26AM +0200, Marcin 'Qrczak' Kowalczyk wrote:
What about getaddrinfo()? It doesn't synchronize with the rest of the program, it will eventually complete no matter whether other threads make progress, so making it concurrent is not necessary for correctness. It should be made concurrent nevertheless because it might take a long time. It does block; if it didn't block but needed the same time for an internal computation which doesn't go back to Haskell, it would still benefit from making the call concurrent.
getaddrinfo most definitely blocks, so it should be made concurrent; it uses sockets internally. The progress guarantee is meant to imply "if something can effectively use the CPU, it will be given it if nothing else is using it", not merely that everything will eventually complete. Performing a long calculation is progress, whether in Haskell or C; waiting on a file descriptor isn't. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham wrote:
This doesn't have to do with bound threads, [...]
I brought it up because the implementation you are proposing fulfills the most important feature provided by bound threads — namely, being able to access the thread-local state of the "main" OS thread (the one that runs C main()) — only for nonconcurrent calls, but not for concurrent calls. This gives people a reason to specify some calls as nonconcurrent even when they are actually expected to block and it is desirable for other threads to continue running. This creates an implementation-specific link between the concurrent/nonconcurrent question and support for OS-thread-local state. I would probably end up writing different code for different Haskell implementations in this situation.

Note that some predictable way of interacting with OS threads (and OS-thread-local state) is necessary in order to be able to use some libraries at all, so not using OS threads *at all* might not be a feasible method of implementing a general-purpose programming language (or at least, not a feasible implementation method for general-purpose implementations of general-purpose programming languages).

The whole point of having bound threads is to NOT require a 1:1 correspondence between OS threads and Haskell threads but still be able to interact with libraries that use OS-thread-local state. They allow implementers to use OS threads just for *some* threads (i.e. just where necessary), while still having full efficiency and freedom of implementation for the other ("non-bound") threads. There might be simpler schemes that can support libraries requiring OS-thread-local state for the most common use cases, but there is a danger that they will interact with the concurrent/nonconcurrent issue in implementation-specific ways if we don't pay attention.
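[Editor's note: for reference, GHC's bound-threads API (from the paper Wolfgang mentions) exposes exactly this distinction in Control.Concurrent. A minimal illustration, which requires a runtime with bound-thread support such as GHC's threaded RTS:]

```haskell
import Control.Concurrent

-- forkOS creates a *bound* thread: every foreign call it makes runs on
-- the same fresh OS thread, so OS-thread-local state persists between
-- calls. forkIO threads carry no such guarantee.
inBoundThread :: IO Bool
inBoundThread = do
  done <- newEmptyMVar
  _ <- forkOS (isCurrentThreadBound >>= putMVar done)
  takeMVar done   -- True: the forkOS-created thread is bound
```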
I object to the idea that concurrent calls are 'safer'. getting it wrong either way is a bug. it should fail in the most obvious way rather than the way that can remain hidden for a long time.
How can making a call "concurrent" rather than "nonconcurrent" ever be a bug?
in any case, blocking is a pretty well defined operation on operating systems, it is when the kernel can context switch you away waiting for a resource, which is the main use of concurrent calls. the ability to use them for long calculations in C is a sort of bonus, the actual main use is to ensure the progress guarentee,
I disagree with this. First, concurrent calls serve a real-world purpose for all interactive programs. GUI applications are soft-realtime systems; if a GUI application stops processing events for more than 2 seconds (under regular system load), I consider it buggy. Second, although blocking is well-defined for kernel operations, the documented interface of most libraries does not include any guarantees on whether they will block the process or not; sometimes the difference might be entirely irrelevant. Does it make a difference whether a drawing function in a library writes to video memory or sends an X request across the network?

Saying something "takesawhile" doesn't muddy things; it is a strictly weaker condition than whether something blocks. Calculations done by foreign calls are not a "bonus", but an important use case for concurrent calls. Think of a library call that causes a multimedia library to recompress an entire video file; the estimated time required is between a few seconds and a day. In a multithreaded program, this call needs to be concurrent. It is true that the program will still terminate even if the call is nonconcurrent, but in the real world, termination will probably occur by the user choosing to "force quit" an application that is "not responding" (also known as sending SIGTERM or even SIGKILL). Reducing the issue to the question of whether a function blocks or not is just plain wrong.
I'd actually prefer it if there were no default and it had to be specified in neutral language because I think this is one of those things I want FFI library writers to think about.
But as I have been saying, the decision that FFI library writers have to make — or rather the only decision that they *can* make — is the simple one of "can I guarantee that this call will return to its caller (or reenter Haskell via a callback) before the rest of the program is negatively affected by a pause?". If the function could block, the answer is a definite "no"; otherwise the question is inherently fuzzy. Unfortunately, I don't see a way of avoiding this fuzziness. The question "can I provide a certain guarantee or not" could be answered with "no" by default, to flatten the learning curve a bit.

My objection against having "no default" is not very strong, but I do object to specifying this "in neutral language". This situation does not call for neutral language; rather, it has to be made clear that one of the options comes with a proof obligation and the other only with a performance penalty.

Cheers, Wolfgang

P.S.: I'm sticking to the concurrent/nonconcurrent terminology for now. What terminology is really best will depend on the outcome of this discussion.

On Wed, Apr 12, 2006 at 07:35:22PM -0400, Wolfgang Thaller wrote:
John Meacham wrote:
This doesn't have to do with bound threads, [...]
I brought it up because the implementation you are proposing fulfills the most important feature provided by bound threads — namely, being able to access the thread-local state of the "main" OS thread (the one that runs C main()) — only for nonconcurrent calls, but not for concurrent calls. This gives people a reason to specify some calls as nonconcurrent even when they are actually expected to block and it is desirable for other threads to continue running. This creates an implementation-specific link between the concurrent/nonconcurrent question and support for OS-thread-local state. I would probably end up writing different code for different Haskell implementations in this situation.
Oh, I made that proposal a while ago as a first draft; bound threads should be possible whether calls are concurrent or not. I am not positive I like the GHC semantics, but bound threads themselves pose not much more overhead than supporting concurrent in the first place (which is a fairly substantial overhead to begin with). But that doesn't matter to me if there isn't a performance impact in the case where they aren't used.

However, in order to achieve that we would have to annotate the foreign functions with whether they use thread-local state. It would pretty much be vital for implementing them efficiently on a non-OS-threaded implementation of the language. You need to perform a stack-pass-the-baton dance between threads to pass the Haskell stack to the right OS thread, which is a substantial overhead you can't pay just in case the call might be running in a 'forkOS'-created thread. Checking thread-local state for _every_ foreign call is definitely not an option either (but for specifically annotated ones it is fine).

GHC doesn't have this issue because it can have multiple Haskell threads running at once on different OS threads, so it just needs to create one that doesn't jump between threads and let foreign calls proceed naturally. Non-OS-threaded implementations have the opposite problem: they need to support a Haskell thread that _can_ (and does) jump between OS threads. One pays the cost at thread creation time, the other pays the cost at foreign call time. The only way to reconcile these would be to annotate both (which is perfectly fine by me if bound threads are needed, which I presume they are).

Oddly enough, depending on the implementation it might actually be easier to just make every 'threadlocal' function fully concurrent: you have already paid the cost of dealing with OS threads.
Calculations done by foreign calls are not a "bonus", but an important use case for concurrent calls. Think of a library call that causes a multimedia library to recompress an entire video file; estimated time required is between a few seconds and a day. In a multithreaded program, this call needs to be concurrent. It is true that the program will still terminate even if the call is nonconcurrent, but in the real world, termination will probably occur by the user choosing to "force quit" an application that is "not responding" (also known as sending SIGTERM or even SIGKILL).
They are a bonus in that you can't run concurrent computing Haskell threads at the same time: you get "free" concurrent threads in other languages that you would not get if the libraries just happened to be implemented in Haskell. However, if the libraries were implemented in Haskell, you would still get concurrency on OS blocking events, because the progress guarantee says so.
The question "can I provide a certain guarantee or not" could be answered with "no" by default to flatten the learning curve a bit. My objection against having "no default" is not very strong, but I do object to specifying this "in neutral language". This situation does not call for neutral language; rather, it has to be made clear that one of the options comes with a proof obligation and the other only with a performance penalty.
You seem to be contradicting yourself: above you say a performance penalty is vitally important in the GUI case if a call takes too long, but here you call it 'just a performance penalty'. The overhead of concurrent calls is quite substantial. Who is to say whether an app that muddles along is better or worse than one that is generally snappy but has an occasional delay?

Though, I am a fan of neutral language in general. You can't crash the system like you can with unsafePerformIO, and FFI calls that take a while and aren't already wrapped by the standard libraries are relatively rare. No need for strong language.

John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham wrote:
However, in order to achieve that we would have to annotate the foreign functions with whether they use thread local state.
I am not opposed to that; however, you might not like that here again, there would be the safe, possibly inefficient default choice, which means "might access thread-local data", and the possibly more efficient annotation that comes with a proof obligation, which says "guaranteed not to access thread-local data". The main counterargument is that some libraries, like OpenGL, require many *fast* nonconcurrent, nonreentrant but TLS-using calls (and, most likely, one reentrant and possibly concurrent call for the GLUT main event loop). Using OpenGL would probably be infeasible from an implementation which requires a "notls" annotation to make foreign imports fast.
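[Editor's note: GHC's existing annotations already expose this trade-off, which gives a feel for the cost model under discussion. Here `sin` from libm stands in for a fast, TLS-free library call:]

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble)

-- 'unsafe' compiles to little more than a plain C call: what the many
-- small OpenGL-style calls need. 'safe' adds the bookkeeping that lets
-- other Haskell threads (and callbacks) run during the call.
foreign import ccall unsafe "math.h sin" fastSin :: CDouble -> CDouble
foreign import ccall safe   "math.h sin" slowSin :: CDouble -> CDouble
```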
it would pretty much be vital for implementing them efficiently on a non-OS-threaded implementation of the language.
True, with the implementation plan you've outlined so far. Have you considered hybrid models where most threads are state threads (all running in one OS thread) and a few threads (= the bound threads) are OS threads which are prevented from actually executing in parallel by a few well-placed locks and condition variables? You could basically write a wrapper around the state threads and pthreads libraries, and you'd get the best of both worlds. I feel it wouldn't be that hard to implement, either.
Oddly enough, depending on the implementation it might actually be easier to just make every 'threadlocal' function fully concurrent. you have already paid the cost of dealing with OS threads.
Depending on the implementation, yes. This is the case for the inefficient implementation we recommended for interpreters like Hugs in our bound-threads paper; there, the implementation might be constrained by the fact that Hugs implements cooperative threading in Haskell using continuation passing in the IO monad; the interpreter itself doesn't even really know about threads. For jhc, I feel that a hybrid implementation would be better.
they are a bonus in that you can't run concurrent computing haskell threads at the same time. you get "free" concurrent threads in other languages that you would not get if the libraries just happened to be implemented in haskell. However, if the libraries were implemented in haskell, you would still get concurrency on OS blocking events because the progress guarentee says so.
Hmm... it sounds like you've been assuming cooperative scheduling, while I've been assuming preemptive scheduling (at least GHC-style preemption, which only checks after x bytes of allocation). Maybe, in a cooperative system, it is a little bit of a bonus, although I'd still want it for practical reasons. I can make my Haskell computations call yield, but how do I make a foreign library (whose author will just say "let them use threads") cooperate? In a preemptive system, the ability to run a C computation in the background remains a normal use case, not a bonus.
The question "can I provide a certain guarantee or not" could be answered with "no" by default to flatten the learning curve a bit. My objection against having "no default" is not very strong, but I do object to specifying this "in neutral language". This situation does not call for neutral language; rather, it has to be made clear that one of the options comes with a proof obligation and the other only with a performance penalty.
you seem to be contradicting yourself, above you say a performance penalty is vitally important in the GUI case if a call takes too long, [...]
I am not. What I was talking about above was not performance but responsiveness; it's somewhat related to fairness in scheduling. If a foreign call takes 10 microseconds instead of 10 nanoseconds, that is a performance penalty that will matter in some circumstances and not in others (after all, people are writing real programs in Python...). If a GUI does not respond to events for more than two seconds, it is badly written. If the computer or the programming language implementation is just too slow (performance) to achieve a certain task in that time, the Right Thing To Do is to put up a progress bar and keep processing screen-update events while doing it, or even do it entirely "in the background". Of course, responsiveness is not an issue for non-interactive processes, but for GUIs it is very important.
Who is to say whether a app that muddles along is better or worse than one that is generally snappy but has an occasional delay.
I am ;-). Apart from that, I feel that is a false dichotomy, as even a factor 1000 slowdown in foreign calls is no excuse to make a GUI "generally muddle along".
Though, I am a fan of neutral language in general. you can't crash the system like you can with unsafePerformIO, FFI calls that take a while and arn't already wrapped by the standard libraries are relatively rare. no need for strong language.
To the end user, all unusable programs are equivalent :-). Also, one decision can make a library unusable for interactive programs while the other can't; the language should be strong enough to make that clear to library writers who have never heard the words "state threads" and who don't care much about concurrency in general.

As for your claim about the relative rarity of such calls, I see that your bias is very different from mine. My world consists mostly of console-based programs that compute something (compilers, etc.) and don't need any FFI or concurrency to speak of, and of interactive graphical applications (GUIs + games) for which the standard libraries are only a tiny fragment of the FFI world. In your world, network servers (or applications with a similar structure) seem to figure more prominently.

Cheers, Wolfgang

On Wed, Apr 12, 2006 at 11:37:57PM -0400, Wolfgang Thaller wrote:
John Meacham wrote:
However, in order to achieve that we would have to annotate the foreign functions with whether they use thread local state.
I am not opposed to that; however, you might not like that here again, there would be the safe, possibly inefficient default choice, which means "might access thread-local data", and the possibly more efficient annotation that comes with a proof obligation, which says "guaranteed not to access thread-local data". The main counterargument is that some libraries, like OpenGL, require many *fast* nonconcurrent, nonreentrant but TLS-using calls (and, most likely, one reentrant and possibly concurrent call for the GLUT main event loop). Using OpenGL would probably be infeasible from an implementation which requires a "notls" annotation to make foreign imports fast.
This is getting absurd: 95% of foreign imports are going to be nonreentrant, nonconcurrent, and non-thread-local-using. Worrying about the minor inconvenience of the small chance that someone might accidentally write buggy code is silly when you have 'peek' and 'poke' and the ability to just deadlock right out there in the open. The FFI is inherently unsafe. We do not need to coddle the programmer who is writing raw FFI code. _Any_ time you use the FFI there are a large number of proof obligations you are committing to that aren't necessarily apparent; why make these _very rare_ cases so visible? There is a reason they aren't named 'unsafePoke' and 'unsafePeek': the convenience of using the names poke and peek outweighs the unsafety concern, because you are already using the FFI and already know everything is unsafe and you need to be careful. These problems can't even crash the runtime, so they are way safer than a lot of the unannotated routines in the FFI.
it would pretty much be vital for implementing them efficiently on a non-OS-threaded implementation of the language.
True, with the implementation plan you've outlined so far. Have you considered hybrid models where most threads are state threads (all running in one OS thread) and a few threads (= the bound threads) are OS threads which are prevented from actually executing in parallel by a few well-placed locks and condition variables? You could basically write a wrapper around the state threads and pthreads libraries, and you'd get the best of both worlds. I feel it wouldn't be that hard to implement, either.
Well, I plan a hybrid model of some sort, simply because it is needed to support concurrent foreign calls; exactly where I will draw the line between them is still up in the air. But in any case, I really like explicit annotations on everything, as we can't predict what future implementations might come about, so we should play it safe in the standard.
Oddly enough, depending on the implementation it might actually be easier to just make every 'threadlocal' function fully concurrent. you have already paid the cost of dealing with OS threads.
Depending on the implementation, yes. This is the case for the inefficient implementation we recommended for interpreters like Hugs in our bound-threads paper; there, the implementation might be constrained by the fact that Hugs implements cooperative threading in Haskell using continuation passing in the IO monad; the interpreter itself doesn't even really know about threads. For jhc, I feel that a hybrid implementation would be better.
Yeah, what I am planning is just providing a create-new-stack primitive and a jump-to-a-different-stack (longjmp) primitive, with everything else being implemented in Haskell as part of the standard libraries (with liberal use of the FFI to call things like pthread_create and epoll). So it is actually fairly close to the Hugs implementation in that it is mostly Haskell-based, but with some better primitives to work with (from what I understand of how Hugs works).
you seem to be contradicting yourself, above you say a performance penalty is vitally important in the GUI case if a call takes too long, [...]
I am not. What I was talking about above was not performance, but responsiveness; it's somewhat related to fairness in scheduling. If a foreign call takes 10 microseconds instead of 10 nanoseconds, that is a performance penalty that will matter in some circumstances, and not in others (after all, people are writing real programs in Python...). If a GUI does not respond to events for more than two seconds, it is badly written. If the computer or the programming language implementation are just too slow (performance) to achieve a certain task in that time, the Right Thing To Do is to put up a progress bar and keep processing screen update events while doing it, or even do it entirely "in the background". Of course, responsiveness is not an issue for non-interactive processes, but for GUIs it is very important.
At some point, people might just decide that their program requires an OS-threaded implementation, and that is fine. That is why I want it as an explicit option, so the manual can say "this needs a compiler that supports the OS threading option" rather than "this needs GHC".
Who is to say whether a app that muddles along is better or worse than one that is generally snappy but has an occasional delay.
I am ;-). Apart from that, I feel that is a false dichotomy, as even a factor 1000 slowdown in foreign calls is no excuse to make a GUI "generally muddle along".
jhc has no concept of a primitive: everything you can do is imported via the FFI, from arithmetic routines to IO. It is pretty nice actually; my standard libraries can almost be compiled and tested on other Haskell compilers, and everything available is right there in the source with haddock comments (well, working on those). But a 1000x slowdown in FFI calls would mean a 1000x slowdown in pretty much everything compiled with jhc. I subscribe to the idea that nothing should be built into the compiler: if users want to define their own datatypes, those should optimize just as well as the isomorphic built-in types, and if they want their own primitives, the built-in ones should have no advantage.
As for your claim about the relative rarity of such calls, I see that your bias is very different from mine. My world consists mostly of console-based programs that compute something (compilers, etc.) and don't need any FFI or concurrency to speak of, and of interactive graphical applications (GUIs + games) for which the standard libraries are only a tiny fragment of the FFI world. In your world, network servers (or applications with a similar structure) seem to figure more prominently.
I am also thinking of ginsu (my chat client) and other interactive stuff that responds to events, for which the basic cooperative model is more than good enough. Some programs will require the OS threading option (where everything can be pretty much concurrent whether you like it or not), but there is a large and rich body of programs which won't. FWIW, (pretty much) every gtk app follows the cooperative model: they have a single event loop with cooperative scheduling, where you must explicitly code in your continuations via wacky idle callbacks and whatnot, and many, many things have been implemented with it just fine, while even the bare-minimum cooperative haskell system is worlds better off. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham writes:
Checking thread-local state for _every_ foreign call is definitely not an option either (but for specifically annotated ones it is fine).
BTW, does Haskell support foreign code calling Haskell in a thread which the Haskell runtime has not seen before? Does it work in GHC? If so, does it show the same ThreadId from that point until OS thread's death (like in Kogut), or a new ThreadId for each callback (like in Python)? -- __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/

On 13 April 2006 10:02, Marcin 'Qrczak' Kowalczyk wrote:
John Meacham writes: Checking thread-local state for _every_ foreign call is definitely not an option either (but for specifically annotated ones it is fine).
BTW, does Haskell support foreign code calling Haskell in a thread which the Haskell runtime has not seen before? Does it work in GHC?
Yes, yes.
If so, does it show the same ThreadId from that point until OS thread's death (like in Kogut), or a new ThreadId for each callback (like in Python)?
A new ThreadId, but that's not a conscious design decision, just a symptom of the fact that we don't re-use old threads. Cheers, Simon
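The fresh-thread behaviour Simon describes can be observed from within Haskell too: since threads are never re-used, every entry into the runtime gets a distinct ThreadId. A minimal sketch (using `forkIO` in place of a foreign callback, which is an assumption for the sake of a self-contained example):

```haskell
import Control.Concurrent

-- Each forkIO creates a fresh thread with a distinct ThreadId;
-- because old threads are never re-used, each callback from
-- foreign code likewise shows up under a new ThreadId.
main :: IO ()
main = do
  t1 <- forkIO (return ())
  t2 <- forkIO (return ())
  print (t1 == t2)
```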

On 4/12/06, Wolfgang Thaller wrote:
Personally, I'm still in favour of inverting this. We are not in court here, so every foreign function is guilty until proven innocent. Every foreign function might be "longrunning" unless the programmer happens to know otherwise. So maybe... "returnsquickly"?
Hear, hear:
fast - takes very little time to execute
pure - side-effect free
nocallback - does not call back into Haskell
--
Taral

Taral wrote:
fast - takes very little time to execute
I was thinking about "quick". It seems to be less literal about speed, if my feeling of English is good enough; the effect is indeed not just speed. Both fit as a description of the foreign function, and as a request to the implementation (to make the call faster at the expense of something). The trouble with both is that they raise the question "shouldn't I mark all FFI calls as fast, so they are faster?". They don't hint at the cost of the optimization: faster at the expense of what? -- __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/

if I may repeat myself (again), since my old suggestion now seems to agree with Wolfgang, Ross, and Simon: http://www.haskell.org//pipermail/haskell-prime/2006-March/001129.html
... so my suggestion would be to make no assumption about unannotated calls (don't rely on the programmer too much;), and to have optional keywords "atomic" and "non-reentrant". but yes, "non-reentrant" is rather too long - perhaps "external" (is outside Haskell and stays out)?
foreign import - we don't know anything, some implementations might not support this
foreign import atomic - function is neither blocking nor long-running
foreign import external - function has no callbacks to Haskell
cheers, claus
---
Wolfgang Thaller:
| Personally, I'm still in favour of inverting this. We are not in
| court here, so every foreign function is guilty until proven
| innocent. Every foreign function might be "longrunning" unless the
| programmer happens to know otherwise. So maybe... "returnsquickly"?
---
On 2006-04-11, Ross Paterson wrote:
On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
- the default should be... concurrent reentrant, presumably, because that is the safest. (so we need to invert the notation).
I think the name "concurrent" has a similar problem to "safe": it reads as an instruction to the implementation, rather than a declaration by the programmer of the properties of a particular function; as Wolfgang put it, "this function might spend a lot of time in foreign lands".
I'd like to second this.
I agree. So other suggestions? longrunning? mightblock or mayblock? I don't much like 'nonreentrant', it's a bit of a mouthful. Any other suggestions for that? nocallback? Cheers, Simon
_______________________________________________
Haskell-prime mailing list
Haskell-prime@haskell.org
http://haskell.org/mailman/listinfo/haskell-prime
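For reference, the distinction being named here already exists in the Haskell 98 FFI as `safe` versus `unsafe`; the thread is debating better names and a safer default. A minimal GHC-compilable sketch (binding `sin` from `math.h` is an arbitrary choice for illustration):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types

-- "unsafe": the programmer asserts the call returns quickly and
-- never calls back into Haskell, so the implementation may skip
-- the machinery for blocking and reentrancy.
foreign import ccall unsafe "math.h sin"
  sinUnsafe :: CDouble -> CDouble

-- "safe": the call may block or re-enter Haskell; this is the
-- behaviour the thread proposes naming explicitly
-- ("concurrent reentrant", "longrunning", "mightblock", ...).
foreign import ccall safe "math.h sin"
  sinSafe :: CDouble -> CDouble

main :: IO ()
main = print (sinUnsafe 0, sinSafe 0)
```

Both imports call the same C function; only the promises made to the implementation differ, which is exactly why the naming reads better as a property declaration than as an instruction.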

On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
What are the conclusions of this thread?
I think, but correct me if I'm wrong, that the eventual outcome was:
- concurrent reentrant should be supported, because it is not significantly more difficult to implement than just concurrent.
It wasn't a difficulty-of-implementation issue; it was whether there were unavoidable performance tradeoffs. I have no problem with very difficult things if they are well specified and don't require unreasonable concessions elsewhere in the design. in any case, I think the __thread local storage trick makes this fast enough to implement everywhere, and there were strong arguments for not letting it cause issues for library developers.
- the different varieties of foreign call should all be identifiable, because there are efficiency gains to be had in some implementations.
indeed.
- the default should be... concurrent reentrant, presumably, because that is the safest. (so we need to invert the notation).
well, I like to reserve the word 'safe' for things that might crash the runtime, like unsafePerformIO, so making things nonconcurrent isn't so much unsafe as a design decision. I'd prefer nonconcurrent be the default because it is the much more common case and is just as safe in that regard, IMHO.
So, can I go ahead and update the wiki? I'll try to record the rationale from the discussion too.
sure.
I'd like to pull out something from the discussion that got a bit lost in the swamp: the primary use case we have for concurrent reentrant is for calling the main loop of a GUI library. The main loop usually never returns (at least, not until the application exits), hence concurrent, and it needs to invoke callbacks, hence reentrant.
this is a pain (making various libraries' main loops play nice together). not that it is a haskell-specific problem, though I guess we have to deal with it. I was thinking of using something like http://liboop.org/ internally in jhc, but am not sure and would prefer a pure haskell solution without a compelling reason to do otherwise. John -- John Meacham - ⑆repetae.net⑆john⑈
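The main-loop use case above can be sketched in today's terms: a "concurrent" (GHC `safe`) import of a long-running foreign entry point, run in its own Haskell thread. Since a real `gtk_main` binding would not be self-contained, this sketch substitutes `sleep` from `unistd.h` as a stand-in for a long-running foreign call; a real GUI loop would also need reentrancy for its callbacks.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent
import Foreign.C.Types

-- Stand-in for a never-returning entry point like gtk_main:
-- a "safe" (concurrent) call that occupies its OS thread.
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
  done <- newEmptyMVar
  -- run the "main loop" in its own Haskell thread
  _ <- forkIO (c_sleep 1 >> putMVar done ())
  takeMVar done
  putStrLn "main loop returned"
```

In a non-concurrent implementation the `c_sleep` call would stall every other Haskell thread for its full duration, which is precisely why the GUI main loop is the motivating case for concurrent reentrant calls.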
participants (8)
- Aaron Denney
- Claus Reinke
- John Meacham
- Marcin 'Qrczak' Kowalczyk
- Ross Paterson
- Simon Marlow
- Taral
- Wolfgang Thaller