FFI proposal: allow some control over the scope of C header files

One problem that people writing FFI bindings often run into is that they do not understand exactly where C header files are required to be available.

The easy case is importing some C function defined in a well known and widely available C header file (eg gtk/gtk.h). In this case we just make sure that header is available for compiling every module in the package and add that header file to the package info file (or .cabal file) so that every module that uses the package will have the C header available. In this case there is no problem with C calls being inlined outside of the module which imported them since the C header file will be available everywhere.

The tricky case is that people often use "private" header files that are #included when compiling a module/package but are not installed along with that package and so are not #included when compiling client modules. Most of the time this works; however, the Haskell compiler is allowed to inline across modules, and if it chooses to inline the C call into a client module then things will break. Sadly it still compiles and sometimes even works, since C allows calling a C function without a prototype. However occasionally it's going to break horribly.

Allowing us to limit where the C headers will be required would be very useful. Sometimes it is very convenient to have private header files that will not be installed with the package. It is also sometimes the case that it's much more convenient to not require that the user has a set of C header files installed to be able to use a library package. Examples of this include some Windows packages, eg DirectX, where it's rather inconvenient to require that users have the MS DirectX SDK installed.

Currently GHC has a de-facto way of limiting the required scope of C header files to a module - by using the standard FFI syntax (!). I know people are already using this trick to allow the use of private header files.
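As a minimal sketch of the "trick" just mentioned: the standard FFI syntax names the header on the import itself, which ties the header requirement to the importing module. Here "math.h" and `cbrt` stand in for a private header and function, since the real private-header case can't be shown self-contained.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble)

-- Standard FFI addendum syntax: the C header is named on the import
-- itself, so the compiler knows which header this particular call
-- needs.  "math.h" stands in here for a private header that is not
-- installed with the package.
foreign import ccall unsafe "math.h cbrt"
  c_cbrt :: CDouble -> CDouble
```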
This issue also touches on the related issue that the way of specifying C header files in the FFI spec is not really optimal. GHC implements a couple of other methods and these are probably used more than the method in the FFI spec.

So I suggest that we briefly consider some possibilities for extending control over where C header files will be needed and perhaps also for specifying what C header files are needed in the first place. I think we'd want to be able to specify that a C header file not "escape" a module boundary and probably we'd also want to be able to ask that it not escape a package boundary (though this may be beyond the H' spec since Haskell does not talk about packages).

It would also be convenient to be able to specify that a module needs a particular C header file rather than having to specify it in each foreign import decl. Currently this can be done by Cabal in a compiler-specific way (it uses GHC's -#include command line mechanism).

It's a reasonable question to ask if specifying a C header file should go in the module source code or elsewhere (eg a .cabal file), since after all we don't specify search paths etc in the module. I'd say that it is right that the name of the header file be in the module source code and that the search paths etc be external.

So some syntax off the top of my head:

foreign import cheader module-local "foo/bar.h"

I think there are 3 possibilities for the C header escape/scope setting (which should probably be mandatory rather than optional):

module-local
package-local (extension for compilers that have a notion of a package)
global

This should allow us to automatically check what C headers are needed by client modules.

Duncan

Duncan Coutts:
One problem that people writing FFI bindings often run into is that they do not understand exactly where C header files are required to be available.
The easy case is importing some C function defined in a well known and widely available C header file (eg gtk/gtk.h). In this case we just make sure that header is available for compiling every module in the package and add that header file to the package info file (or .cabal file) so that every module that uses the package will have the C header available. In this case there is no problem with C calls being inlined outside of the module which imported them since the C header file will be available everywhere.
The tricky case is that people often use "private" header files that are #included when compiling a module/package but are not installed along with that package and so are not #included when compiling client modules. Most of the time this works, however the Haskell compiler is allowed to inline across modules and if it chooses to inline the C call into a client module then things will break. Sadly it still compiles and sometimes even works since C allows calling a C function without a prototype. However occasionally it's going to break horribly.
Allowing us to limit where the C headers will be required would be very useful. Sometimes it is very convenient to have private header files that will not be installed with the package. It is also sometimes the case that it's much more convenient to not require that the user has a set of C header files installed to be able to use a library package. Examples of this include some Windows packages, eg DirectX, where it's rather inconvenient to require that users have the MS DirectX SDK installed.
I understand these concerns, but they are tightly coupled to two mechanisms that are currently not really standardised: (1) cross-module function inlining and (2) command line options.
Currently GHC has a de-facto way of limiting the required scope of C header files to a module - by using the standard FFI syntax (!). I know people are already using this trick to allow the use of private header files.
This issue also touches on the related issue that the way of specifying C header files in the FFI spec is not really optimal. GHC implements a couple of other methods and these are probably used more than the method in the FFI spec.
So I suggest that we briefly consider some possibilities for extending control over where C header files will be needed and perhaps also for specifying what C header files are needed in the first place.
I think we'd want to be able to specify that a C header file not "escape" a module boundary and probably we'd also want to be able to ask that it not escape a package boundary (though this may be beyond the H' spec since Haskell does not talk about packages).
The H98 standard already specifies a NOINLINE pragma for any function: http://haskell.org/onlinereport/pragmas.html The simplest solution is to ensure that all Haskell compilers implement this pragma properly for foreign imported functions. If you want finer control over where inlining takes place, then maybe the pragma should be extended to provide that finer control.
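A minimal sketch of this NOINLINE approach (names illustrative; "math.h" again stands in for a private header): tagging the exported wrapper NOINLINE keeps the foreign calls, and hence the header requirement, from being inlined into client modules.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble)

-- "math.h" stands in for a private header.  The point is the NOINLINE
-- pragma below, which stops these foreign calls from escaping into
-- client modules that cannot see the header.
foreign import ccall unsafe "math.h fmin" c_fmin :: CDouble -> CDouble -> CDouble
foreign import ccall unsafe "math.h fmax" c_fmax :: CDouble -> CDouble -> CDouble

{-# NOINLINE clamp01 #-}
clamp01 :: CDouble -> CDouble
clamp01 = c_fmax 0 . c_fmin 1
```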
It would also be convenient to be able to specify that a module needs a particular C header file rather than having to specify it in each foreign import decl. Currently this can be done by cabal in a compiler-specific way (it uses ghc's -#include command line mechanism)
If you don't specify it in every import declaration, the compiler won't know what to include if you allow inlining and the compiler does perform cross-module inlining. Besides, the standard so far doesn't cover command line options at all. So, there is the more general question of whether it should.
It's a reasonable question to ask if specifying a C header file should go in the module source code or elsewhere (eg a .cabal file), since after all we don't specify search paths etc in the module. I'd say that it is right that the name of the header file be in the module source code and that the search paths etc be external.
So some syntax off the top of my head:
foreign import cheader module-local "foo/bar.h"
I think there are 3 possibilities for the C header escape/scope setting (which should probably be mandatory rather than optional): module-local, package-local (an extension for compilers that have a notion of a package), and global.
Is this additional complexity really necessary or would the use of NOINLINE pragmas not suffice? It's really in a library context where you want to restrict the inlining of foreign functions, but there the foreign functions are probably not much used inside the library itself, but mainly exported, so I doubt that you would get much of a performance loss by just tagging all foreign imported functions that you don't want to escape as NOINLINE. Manuel

On Fri, 2006-04-21 at 09:32 -0400, Manuel M T Chakravarty wrote:
I think we'd want to be able to specify that a C header file not "escape" a module boundary and probably we'd also want to be able to ask that it not escape a package boundary (though this may be beyond the H' spec since Haskell does not talk about packages).
The H98 standard already specifies a NOINLINE pragma for any function:
http://haskell.org/onlinereport/pragmas.html
The simplest solution is to ensure that all Haskell compilers implement this pragma properly for foreign imported functions. If you want finer control over where inlining takes place, then maybe the pragma should be extended to provide that finer control.
I don't think we need to generalise the problem to all function inlinings. There are specific practical problems caused by inlining foreign calls that are not a problem for ordinary Haskell functions.
Besides, the standard so far doesn't cover command line options at all. So, there is the more general question of whether it should.
I don't think we need to specify the command line interface. The required headers can be put in the module.
So some syntax off the top of my head:
foreign import cheader module-local "foo/bar.h"
I think there are 3 possibilities for the C header escape/scope setting (which should probably be mandatory rather than optional): module-local, package-local (an extension for compilers that have a notion of a package), and global.
Is this additional complexity really necessary or would the use of NOINLINE pragmas not suffice? It's really in a library context where you want to restrict the inlining of foreign functions, but there the foreign functions are probably not much used inside the library itself, but mainly exported, so I doubt that you would get much of a performance loss by just tagging all foreign imported functions that you don't want to escape as NOINLINE.
What I really want is for the issue of header scope to be something that can be checked by the compiler. As a distro packager I see far too many people getting it wrong because they don't understand the issue. If we could declare the intended scope of the header files then 1. people would think about it and 2. if they got it wrong it'd be checkable because the compiler would complain. As it is at the moment people don't know they're doing anything dodgy until some user of their package gets a mysterious gcc warning and possibly a segfault. If we just tell everyone that they should use NOINLINE then they won't and they'll still get it wrong.

The reason for some specific syntax rather than using NOINLINE is that the compiler will be able to track the header files needed by each module. So we can avoid the situation where a call gets made outside the scope of its defining header file - either by automatically #including the header file in the right place, or by complaining if the user does not supply the header (eg by putting it in the .cabal file). So it's not the general issue of inlining but the specific problem of what C header files are required to compile what modules.

The ideal situation I imagine is that the scope of the headers can be checked automatically, so that the compiler or Cabal will complain to a library author that their private header file needs to be marked as local to the package/module, or included in the library package file and installed with the package.

Duncan

Duncan Coutts:
On Fri, 2006-04-21 at 09:32 -0400, Manuel M T Chakravarty wrote:
I think we'd want to be able to specify that a C header file not "escape" a module boundary and probably we'd also want to be able to ask that it not escape a package boundary (though this may be beyond the H' spec since Haskell does not talk about packages).
The H98 standard already specifies a NOINLINE pragma for any function:
http://haskell.org/onlinereport/pragmas.html
The simplest solution is to ensure that all Haskell compilers implement this pragma properly for foreign imported functions. If you want finer control over where inlining takes place, then maybe the pragma should be extended to provide that finer control.
I don't think we need to generalise the problem to all function inlinings. There are specific practical problems caused by inlining foreign calls that are not a problem for ordinary Haskell functions.
Inlining of foreign functions causes extra problems, but generally inlining is a concern; so, if we can use the same mechanisms, we get a simpler language.
Besides, the standard so far doesn't cover command line options at all. So, there is the more general question of whether it should.
I don't think we need to specify the command line interface. The required headers can be put in the module.
That's ok with me. I was just pointing out that many of the problems and/or lack of understanding of users that we are seeing has to do with the use of command line options. We simply cannot address this unless the standard covers command line options.
So some syntax off the top of my head:
foreign import cheader module-local "foo/bar.h"
I think there are 3 possibilities for the C header escape/scope setting (which should probably be mandatory rather than optional): module-local, package-local (an extension for compilers that have a notion of a package), and global.
Is this additional complexity really necessary or would the use of NOINLINE pragmas not suffice? It's really in a library context where you want to restrict the inlining of foreign functions, but there the foreign functions are probably not much used inside the library itself, but mainly exported, so I doubt that you would get much of a performance loss by just tagging all foreign imported functions that you don't want to escape as NOINLINE.
What I really want is for the issue of header scope to be something that can be checked by the compiler. As a distro packager I see far too many people getting it wrong because they don't understand the issue. If we could declare the intended scope of the header files then 1. people would think about it and 2. if they got it wrong it'd be checkable because the compiler would complain.
Whether or not the compiler can check for wrong use, seems to me independent of whether we use inline pragmas or any other syntax. GHC could very well check some of these things today. It just doesn't. Do you propose to make such checks mandatory in the standard?
As it is at the moment people don't know they're doing anything dodgy until some user of their package gets a mysterious gcc warning and possibly a segfault.
If we just tell everyone that they should use NOINLINE then they won't and they'll still get it wrong.
The reason for some specific syntax rather than using NOINLINE is that the compiler will be able to track the header files needed by each module. So we can avoid the situation where a call gets made outside the scope of its defining header file - either by automatically #including the header file in the right place, or by complaining if the user does not supply the header (eg by putting it in the .cabal file).
So it's not the general issue of inlining but the specific problem of what C header files are required to compile what modules.
The ideal situation I imagine is that the scope of the headers can be checked automatically so that the compiler or cabal will complain to a library author that their private header file needs to be marked as local to the package/module or included in the library package file and installed with the package.
We are having two issues here: (1) specification of which functions need what headers and whether these functions can be inlined; (2) letting the compiler spot wrong uses of header files. These two issues are largely independent.

Re (1), I dislike new syntax (or generally any additions to the language) and prefer using existing mechanisms as far as possible. The reason is simply that Haskell is already very complicated. Haskell' will be even more complicated. Hence, we must avoid any unnecessary additions.

Re (2), I am happy to discuss what kind of checks are possible, but I am worried that it'll be hard to check for everything without assistance from Cabal, which I don't think will be part of Haskell'.

Re the concern about wrong use: FFI programming is a minefield. We will never be able to make it safe. So, I am reluctant to complicate the language just to make it (maybe) a little safer. What IMHO will be far more effective is a good tutorial on FFI programming.

Manuel

On Sun, 2006-04-23 at 17:26 -0400, Manuel M T Chakravarty wrote:
Duncan Coutts:
On Fri, 2006-04-21 at 09:32 -0400, Manuel M T Chakravarty wrote:
I think we'd want to be able to specify that a C header file not "escape" a module boundary and probably we'd also want to be able to ask that it not escape a package boundary (though this may be beyond the H' spec since Haskell does not talk about packages).
The H98 standard already specifies a NOINLINE pragma for any function:
http://haskell.org/onlinereport/pragmas.html
The simplest solution is to ensure that all Haskell compilers implement this pragma properly for foreign imported functions. If you want finer control over where inlining takes place, then maybe the pragma should be extended to provide that finer control.
I don't think we need to generalise the problem to all function inlinings. There are specific practical problems caused by inlining foreign calls that are not a problem for ordinary Haskell functions.
Inlining of foreign functions causes extra problems, but generally inlining is a concern; so, if we can use the same mechanisms, we get a simpler language.
True, though with a special case mechanism we can make automatic checks possible/easier.
Besides, the standard so far doesn't cover command line options at all. So, there is the more general question of whether it should.
I don't think we need to specify the command line interface. The required headers can be put in the module.
That's ok with me. I was just pointing out that many of the problems and/or lack of understanding of users that we are seeing has to do with the use of command line options. We simply cannot address this unless the standard covers command line options.
Under my hypothetical scheme the ghc command line method would be equivalent to putting it in the module and could be checked the same way.
What I really want is for the issue of header scope to be something that can be checked by the compiler. As a distro packager I see far too many people getting it wrong because they don't understand the issue. If we could declare the intended scope of the header files then 1. people would think about and 2. if they got it wrong it'd be checkable because the compiler would complain.
Whether or not the compiler can check for wrong use, seems to me independent of whether we use inline pragmas or any other syntax. GHC could very well check some of these things today. It just doesn't. Do you propose to make such checks mandatory in the standard?
That'd be nice, though I can see that it is more work.
We are having two issues here:
(1) Specification of which functions need what headers and whether these functions can be inlined. (2) Let the compiler spot wrong uses of header files.
These two issues are largely independent.
Yes, ok.
Re (1), I dislike new syntax (or generally any additions to the language) and prefer using existing mechanisms as far as possible. The reason is simply that Haskell is already very complicated. Haskell' will be even more complicated. Hence, we must avoid any unnecessary additions.
Sure.
Re (2), I am happy to discuss what kind of checks are possible, but I am worried that it'll be hard to check for everything without assistance from cabal, which I don't think will be part of Haskell'.
I think it can be checked without Cabal. Outline: suppose we use a module level granularity (I know jhc proposes to use a finer granularity), so we track which C header files are needed to compile which modules. A FFI decl specifying a header file makes that module need that header. Then transitively each module that imports that module needs that header too. We can only stop the header leaking out of the module/package by specifying NOINLINE on the imported function (or using some additional syntax as I originally suggested). So now it's easy to check what headers are needed to compile any module.

Then we probably need to rely on an external mechanism (eg Cabal or the user) to make sure that all these headers are available - but at least we can check that the user has done it right. So it's at this point that issues (1) & (2) become related. If we say that a header file transitively infects every client module, then it effectively bans private header files, and so we need some mechanism to limit the scope of header files to allow them again (like NOINLINE).

Eg with c2hs, I think that in theory we should be installing every .h file that c2hs generates for each module with the library package. I've never seen anyone actually do that (Cabal's c2hs support doesn't do that, for example).
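The outline above can be sketched concretely. This toy model (module names, header names, and the Scope type are all invented for illustration) propagates "global" headers transitively along the import graph, while "local" headers stop at the declaring module, on the assumption that its foreign calls are not inlined across the boundary:

```haskell
import qualified Data.Map as Map
import           Data.Map (Map)
import qualified Data.Set as Set
import           Data.Set (Set)

data Scope = Local | Global deriving (Eq, Show)

type Module = String
type Header = String

-- Toy inputs: headers declared by each module's FFI imports, with
-- their intended scope, and the module import graph (assumed acyclic).
directHeaders :: Map Module [(Header, Scope)]
directHeaders = Map.fromList
  [ ("Gtk.Internal", [("gtk/gtk.h", Global), ("private/hacks.h", Local)])
  , ("Gtk",          [])
  , ("Main",         [])
  ]

moduleImports :: Map Module [Module]
moduleImports = Map.fromList
  [ ("Gtk.Internal", [])
  , ("Gtk",          ["Gtk.Internal"])
  , ("Main",         ["Gtk"])
  ]

-- Headers a module exposes to its importers: its Global headers plus
-- whatever its own imports expose.  Local headers stop here.
exposedHeaders :: Module -> Set Header
exposedHeaders m =
  Set.fromList [h | (h, Global) <- Map.findWithDefault [] m directHeaders]
  `Set.union`
  Set.unions (map exposedHeaders (Map.findWithDefault [] m moduleImports))

-- Headers needed to compile a module: everything it declares itself
-- plus everything its imports expose.  A build tool could check these
-- against the headers actually installed/available.
headersNeeded :: Module -> Set Header
headersNeeded m =
  Set.fromList (map fst (Map.findWithDefault [] m directHeaders))
  `Set.union`
  Set.unions (map exposedHeaders (Map.findWithDefault [] m moduleImports))
```

With these inputs, the private header is needed only for Gtk.Internal itself, while gtk/gtk.h propagates all the way to Main.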
Re the concern about wrong use: FFI programming is a minefield. We will never be able to make it safe. So, I am reluctant to complicate the language just to make it (maybe) a little safer. What IMHO will be far more effective is a good tutorial on FFI programming.
While it's true that it's easy to shoot yourself in the foot with the FFI, this is a particular issue that people get wrong very often - mostly because it does work most of the time, so they never find the problem by testing (unlike most other FFI bugs, which will give you a segfault fairly quickly).

Duncan

How about just adding a couple of new pragmas:

{-# INCLUDE_PRIVATE "foo/bar.h" #-}
{-# INCLUDE_PACKAGE "foo/bar.h" #-}

Both pragmas apply to all the foreign imports in the current module, just like the existing INCLUDE pragma. Additionally, INCLUDE_PRIVATE prevents any foreign import from being inlined outside the current module, and INCLUDE_PACKAGE does the same but for the package (this requires a little more support from GHC).

We can then describe more accurately what it means to give an include file on a particular foreign import: it means the same as INCLUDE_PRIVATE, but for this foreign import only.

The problem you mentioned, namely that people use a private header file but don't export it with the package, only happens when they explicitly use {-# INCLUDE #-} or -#include flags, right? In that case, we can have Cabal check that all {-# INCLUDE #-} files are properly exported with the package, and discourage the use of explicit -#include options. Is that enough?

Cheers, Simon
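Under this proposal a binding module might look like the following sketch. Note these pragmas are proposed in the thread, not implemented; the module, header, and function names are all illustrative.

```haskell
-- INCLUDE_PRIVATE (proposed): header needed for this module only;
-- foreign calls must not be inlined into other modules.
{-# INCLUDE_PRIVATE "cbits/hacks.h" #-}
-- INCLUDE_PACKAGE (proposed): header needed anywhere in this package,
-- but not by clients of the package.
{-# INCLUDE_PACKAGE "foo/bar.h" #-}

module Foo.Internal (frob) where

import Foreign.C.Types (CInt)

-- Both pragmas above apply to every foreign import in the module.
foreign import ccall unsafe "frob" c_frob :: CInt -> CInt

frob :: Int -> Int
frob = fromIntegral . c_frob . fromIntegral
```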

Simon Marlow:
How about just adding a couple of new pragmas:
{-# INCLUDE_PRIVATE "foo/bar.h" #-} {-# INCLUDE_PACKAGE "foo/bar.h" #-}
both pragmas apply to all the foreign imports in the current module, just like the existing INCLUDE pragma. Additionally, INCLUDE_PRIVATE prevents any foreign import from being inlined outside the current module, and INCLUDE_PACKAGE does the same but for the package (this requires a little more support from GHC).
Probably I am now opening a can of worms: a Haskell ``import M (f)'' instructs the implementation to look for BOTH

- the declaration (typing) of ``f'', AND
- the implementation of ``f'' inside module ``M'', wherever that can be found with current path settings.

In contrast, a ``foreign import ccall "foo/bar.h f"'' only provides analogous help for locating DECLARATIONS. While we limit declaration tracking, shouldn't we do the same for implementation tracking? Perhaps (just a quick first attempt to get the idea across):

{-# INCLUDE_PRIVATE "foo/bar.h" "foobar.o" #-}
{-# INCLUDE_PACKAGE "foo/bar.h" "-lfoobar" #-}

I would have considered allowing just

{-# INCLUDE_PRIVATE "foo/bar.h" "foobar" #-}

and turning this into ``-lfoobar'' on the linker command line, but probably the ``foobar.o'' option is useful in some contexts.

By the way, I do not think that an implementation necessarily has to avoid inlining of limited imports; I think it also could choose to keep the necessary information around in the hidden parts of the package. Users of the package just cannot add foreign imports using those .h files, in the same way as they cannot import hidden Haskell modules. Inlining Haskell functions from hidden modules is not forbidden either (I hope...).

Cheers, Wolfram

On 24 April 2006 15:08, kahl@cas.mcmaster.ca wrote:
Perhaps (just a quick first attempt to get the idea across):
{-# INCLUDE_PRIVATE "foo/bar.h" "foobar.o" #-} {-# INCLUDE_PACKAGE "foo/bar.h" "-lfoobar" #-}
I would have considered allowing just
{-# INCLUDE_PRIVATE "foo/bar.h" "foobar" #-}
I understand your reasoning, but I think it's wrong to name libraries in the source code. Library names tend to be platform specific, and change from version to version. Unlike header files, the package implementer can't easily wrap a library dependency in a local library.
By the way, I do not think that an implementation necessarily has to avoid inlining of limited imports; I think it also could choose to keep the necessary information around in the hidden parts of the package. Users of the package just cannot add foreign imports using those .h files, in the same way as they cannot import hidden Haskell modules.
The issue is whether the .h file is available to the client at all. Making it available might impose an unnecessary burden on the client, such as having to install a development package for an external library.

Actually it just occurred to me why using NOINLINE isn't the right thing here. The compiler should be free to inline a foreign call that depends on a private header, as long as the header isn't required for compilation, such as when using the native code generator.

Cheers, Simon

Simon Marlow:
On 24 April 2006 15:08, kahl@cas.mcmaster.ca wrote:
Perhaps (just a quick first attempt to get the idea across):
{-# INCLUDE_PRIVATE "foo/bar.h" "foobar.o" #-} {-# INCLUDE_PACKAGE "foo/bar.h" "-lfoobar" #-}
I would have considered allowing just
{-# INCLUDE_PRIVATE "foo/bar.h" "foobar" #-}
I understand your reasoning, but I think it's wrong to name libraries in the source code. Library names tend to be platform specific, and change from version to version. Unlike header files, the package implementer can't easily wrap a library dependency in a local library.
Which is exactly the POV the FFI committee adopted - ie, the inclusion of libraries was considered, but rejected on the grounds Simon explained (who, as we all know, was part of that committee :)
By the way, I do not think that an implementation necessarily has to avoid inlining of limited imports; I think it also could choose to keep the necessary information around in the hidden parts of the package. Users of the package just cannot add foreign imports using those .h files, in the same way as they cannot import hidden Haskell modules.
The issue is whether the .h file is available to the client at all. Making it available might impose an unnecessary burden on the client, such as having to install a development package for an external library.
Actually it just occurred to me why using NOINLINE isn't the right thing here. The compiler should be free to inline a foreign call that depends on a private header, as long as the header isn't required for compilation, such as when using the native code generator.
Ah! That's a good point. Manuel

On Mon, 2006-04-24 at 14:44 +0100, Simon Marlow wrote:
How about just adding a couple of new pragmas:
{-# INCLUDE_PRIVATE "foo/bar.h" #-} {-# INCLUDE_PACKAGE "foo/bar.h" #-}
Sounds reasonable. This is much like my original random syntax but as a pragma.
both pragmas apply to all the foreign imports in the current module, just like the existing INCLUDE pragma. Additionally, INCLUDE_PRIVATE prevents any foreign import from being inlined outside the current module, and INCLUDE_PACKAGE does the same but for the package (this requires a little more support from GHC).
So the existing INCLUDE pragma is really INCLUDE_GLOBAL.
We can then describe more accurately what it means to give an include file on a particular foreign import: it means the same as INCLUDE_PRIVATE, but for this foreign import only.
Yes. That's what GHC currently does and the FFI spec leaves it unspecified.
The problem you mentioned, namely that people use a private header file but don't export it with the package, only happens when they explicitly use {-# INCLUDE #-} or -#include flags, right?
Yes I think so. This normally happens because users specify the header in the .cabal file and Cabal uses -#include.
In that case, we can have Cabal check that all {-# INCLUDE #-} files are properly exported with the package, and discourage the use of explicit -#include options. Is that enough?
Yes, I think that would be a great improvement and would catch most of the bugs. I was hoping we could go one step further and have GHC check that all the headers needed to compile a module are present (usually as a result of importing a module that uses INCLUDE_GLOBAL). I think it'd just be one extra bit of info for each .hi file.

So yes, I'd be satisfied with a GHC-only solution, but I brought it up here just in case anyone else thinks that this issue of header scope might be worth specifying more clearly in the Haskell' FFI. Specifically, if it'd make sense to standardise GHC & JHC's INCLUDE_* pragma(s) into proper FFI syntax (especially since most people seem to use that rather than the official syntax).

Duncan

Duncan Coutts:
On Mon, 2006-04-24 at 14:44 +0100, Simon Marlow wrote:
How about just adding a couple of new pragmas:
{-# INCLUDE_PRIVATE "foo/bar.h" #-}
{-# INCLUDE_PACKAGE "foo/bar.h" #-}
[..]
So yes, I'd be satisfied with a GHC-only solution, but I brought it up here just in case anyone else thinks that this issue of header scope might be worth specifying more clearly in the Haskell' FFI. Specifically, if it'd make sense to standardise GHC & JHC's INCLUDE_* pragma(s) into proper FFI syntax (especially since most people seem to use that rather than the official syntax).
Let me summarise a bit:

* Whether the whole issue is a problem or not is implementation specific - ie, if a compiler adds prototypes into package files and doesn't propagate #includes, we are good; similarly, a NCG doesn't care.
* NOINLINE is not the right solution, for the above reason.

This leaves me with the opinion that we should really leave this as a pragma and not make it into FFI syntax. It's a hint to some implementations and irrelevant to others. Do you agree?

Manuel

Manuel M T Chakravarty:
This leaves me with the opinion that we should really leave this as pragma and not make it into FFI syntax. It's a hint to some implementations and irrelevant to others.
Ah well, if we use that eminently sensible criterion, then the "safe/unsafe" annotation on foreign imports ought to be in a pragma too. For some implementations (yhc/nhc98) it is simply irrelevant; it is really a ghc-ism. :-)

Regards, Malcolm

On 04-May-2006, Malcolm Wallace wrote:
Manuel M T Chakravarty wrote:
This leaves me with the opinion that we should really leave this as pragma and not make it into FFI syntax. It's a hint to some implementations and irrelevant to others.
Ah well, if we use that eminently sensible criterion, then the "safe/unsafe" annotation on foreign imports ought to be in a pragma too. For some implementations (yhc/nhc98) it is simply irrelevant, it is really a ghc-ism. :-)
While that notion is indeed irrelevant to many implementations, the idea is not unique to ghc. The Mercury compiler uses annotations "will_call_mercury" and "will_not_call_mercury", which are very similar to ghc's safe/unsafe.

If that design criterion had been applied to C and C++, then the "register" and "inline" keywords would not exist. But those have been useful for many portable applications. I think it is reasonable to standardize pragmas if there is a reasonable likelihood that multiple implementations will be able to make use of them, as was the case for "register" and "inline", even if many implementations will ignore them. Whether this is likely for safe/unsafe, I don't know, but the Mercury data point suggests that it may be.

Cheers, Fergus.

-- 
Fergus J. Henderson      | "I have always known that the pursuit
Galois Connections, Inc. |  of excellence is a lethal habit"
Phone: +1 503 626 6616   |     -- the last words of T. S. Garp.

Malcolm Wallace wrote:
Manuel M T Chakravarty wrote:
This leaves me with the opinion that we should really leave this as a pragma and not make it into FFI syntax. It's a hint to some implementations and irrelevant to others.
Ah well, if we use that eminently sensible criterion, then the "safe/unsafe" annotation on foreign imports ought to be in a pragma too. For some implementations (yhc/nhc98) it is simply irrelevant, it is really a ghc-ism. :-)
I think safe/unsafe is more fundamental for two reasons:

1. nhc currently has it easier than GHC as it doesn't support concurrency. Although we didn't provide an explicit feature for concurrency in the FFI Addendum, we tried to co-exist.

2. safe/unsafe is about enabling an optimisation. Implementations are of course free to not apply that optimisation, and then they don't care about the annotation. So the real question is: if nhc wanted to achieve the same level of performance as GHC, could it still ignore the annotation?

So, I guess, I need to refine my criterion: we leave an annotation as a pragma if it is a hint to some implementations and irrelevant to others that can achieve comparable levels of performance while ignoring it. (Strictly speaking, I guess there is still an exception if it is generally *much* easier to achieve good performance when taking the annotation into account.)

Manuel

On Mon, May 08, 2006 at 05:50:47PM -0400, Manuel M T Chakravarty wrote:
1. nhc currently has it easier than GHC as it doesn't support concurrency. Although we didn't provide an explicit feature for concurrency in the FFI Addendum, we tried to co-exist.
actually, I believe all haskell implementations already have or are working on concurrency. I know Einar is pretty close to adding support to jhc, yhc has it, and hugs has a lot of the framework done so it shouldn't be too hard to bring it all the way.
2. safe/unsafe is about enabling an optimisation. Implementations are of course free to not apply that optimisation, and then they don't care about the annotation. So the real question is, if nhc would want to achieve the same level of performance as GHC, could it still ignore the annotation?
Also, at some point "optimization" problems become correctness ones if they are vital for getting usable performance.

I am not sure if you read it, but there has been _a lot_ of discussion about FFI annotations in the concurrency threads. there is a basic summary of our results on the Concurrency page on the wiki.

the basic consensus is to drop the ghc-specific safe vs unsafe and annotate ffi calls with what your actual intent is. as in 'nonreentrant' if the code doesn't call back into haskell and 'concurrent' if the haskell runtime needs to arrange to run concurrently with it. the exact names and defaults are still being worked out, but I think we have a good consensus on at least what different annotations we need in order to give compilers of all sorts of implementation models exactly what info they need.
So, I guess, I need to refine my criterion: we leave an annotation as a pragma if it is a hint to some implementations and irrelevant to others that can achieve comparable levels of performance while ignoring it. (Strictly speaking, I guess there is still an exception if it is generally *much* easier to achieve good performance when taking the annotation into account.)
it is fuzzy. some programs rely on NOINLINE for correctness, but of course they are making all sorts of assumptions about the underlying implementation so it isn't really portable anyway. for instance the NOINLINE unsafePerformIO newIORef trick for global state just doesn't work on jhc and it would be quite tricky to make it otherwise. Not that this is a new or particularly pressing issue as we will eventually hash everything out.

John

--
John Meacham - ⑆repetae.net⑆john⑈

John Meacham:
On Mon, May 08, 2006 at 05:50:47PM -0400, Manuel M T Chakravarty wrote:
1. nhc currently has it easier than GHC as it doesn't support concurrency. Although, we didn't provide an explicit features for concurrency in the FFI addendum, we tried to co-exist.
actually, I believe all haskell implementations already have or are working on concurrency. I know Einar is pretty close to adding support to jhc, yhc has it, and hugs has a lot of the framework done so it shouldn't be too hard to bring it all the way.
Excellent.
2. safe/unsafe is about enabling an optimisation. Implementations are of course free to not apply that optimisation, and then they don't care about the annotation. So the real question is, if nhc would want to achieve the same level of performance as GHC, could it still ignore the annotation?
Also, at some point "optimization" problems become correctness ones if they are vital for getting usable performance.
I agree.
I am not sure if you read it, but there has been _a lot_ of discussion about FFI annotations in the concurrency threads.
Yes, I saw that.
there is a basic summary of our results on the Concurrency page on the wiki. the basic consensus is to drop the ghc-specific safe vs unsafe and annotate ffi calls with what your actual intent is. as in 'nonreentrant' if the code doesn't call back into haskell and 'concurrent' if the haskell runtime needs to arrange to run concurrently with it. the exact names and defaults are still being worked out, but I think we have a good consensus on at least what different annotations we need in order to give compilers of all sorts of implementation models exactly what info they need.
That's great. The current FFI standard stayed away from concurrency, as there was no concurrency standard, but now that we get one, the FFI has to synchronise with that.
So, I guess, I need to refine my criterion: we leave an annotation as a pragma if it is a hint to some implementations and irrelevant to others that can achieve comparable levels of performance while ignoring it. (Strictly speaking, I guess there is still an exception if it is generally *much* easier to achieve good performance when taking the annotation into account.)
it is fuzzy. some programs rely on NOINLINE for correctness, but of course they are making all sorts of assumptions about the underlying implementation so it isn't really portable anyway. for instance the NOINLINE unsafePerformIO newIORef trick for global state just doesn't work on jhc and it would be quite tricky to make it otherwise. Not that this is a new or particularly pressing issue as we will eventually hash everything out.
IMHO, NOINLINE unsafePerformIO newIORef is outside anything guaranteed to work by our current standards. Hence, programs that rely on NOINLINE for correctness are bad programs - maybe useful, but bad! So, jhc is perfectly alright in that respect. I wish we had a nicer alternative for this dangerous idiom...

Manuel

It is my understanding that the FFI foreign imports declare an ABI and not an API, meaning the exact way to make the foreign call should be completely deterministic based on just what is in the haskell file proper. Otherwise, obviously, direct to assembly implementations would be impossible.

In this sense, include files are always potentially optional. however, due to the oddness of the C language, one cannot express certain calls without proper prototypes. current haskell implementations take the straightforward path of relying on the prototypes that are contained in the system headers, which also incidentally provides some safety net against improperly specified FFI calls. However, it would also be reasonable for an implementation to just generate its own prototypes, or use inline assembly or any other mechanism to implement the FFI ABI calls properly.

I am not sure what my point is, perhaps just that it is not really a haskell-prime language issue, but new pragmas are, so perhaps it is relevant in that regard. in any case, in jhc a {-# INCLUDE foo.h #-} pragma has the effect of adding "foo.h .." to every foreign ccall declaration in the current module. Just a handy shortcut, not that I think that behavior should be codified or anything.

John

--
John Meacham - ⑆repetae.net⑆john⑈

On 24 April 2006 23:21, John Meacham wrote:
It is my understanding that the FFI foreign imports declare an ABI and not an API, meaning the exact way to make the foreign call should be completely deterministic based on just what is in the haskell file proper. Otherwise, obviously, direct to assembly implementations would be impossible.
In this sense, include files are always potentially optional, however, due to the oddness of the C language, one cannot express certain calls without proper prototypes, current haskell implementations take the straightforward path of relying on the prototypes that are contained in the system headers, which also incidentally provides some safety net against improperly specified FFI calls. However, it would also be reasonable for an implementation to just generate its own prototypes, or use inline assembly or any other mechanism to implement the FFI ABI calls properly.
This comes up quite often. The reason that GHC doesn't generate its own prototypes is that we would have to be *sure* that there aren't any other prototypes in scope for the same function, because those prototypes might clash. We can't generate a correct prototype that is guaranteed not to clash, because the foreign import declaration doesn't contain enough information (no const annotations).

Admittedly I haven't tried this route (not including *any* external headers at all when compiling .hc files). It might be possible, but you lose the safety net of compiler-checked calls.

Cheers,
Simon

On Tue, Apr 25, 2006 at 09:40:58AM +0100, Simon Marlow wrote:
Admittedly I haven't tried this route (not including *any* external headers at all when compiling .hc files). It might be possible, but you lose the safety net of compiler-checked calls.
yeah, perhaps a hybrid approach of some sort: when building the package, use the system headers, but then include generated prototypes inside the package-file and don't propagate #includes once the package is built. or just an initial conformance check against the system headers somehow (?), but then only use your own generated ones when actually compiling haskell code. It would be nice to never need to include external headers in .hc files.

John

--
John Meacham - ⑆repetae.net⑆john⑈

On 25 April 2006 09:51, John Meacham wrote:
On Tue, Apr 25, 2006 at 09:40:58AM +0100, Simon Marlow wrote:
Admittedly I haven't tried this route (not including *any* external headers at all when compiling .hc files). It might be possible, but you lose the safety net of compiler-checked calls.
yeah, perhaps a hybrid approach of some sort, when building the package, use the system headers, but then include generated prototypes inside the package-file and don't propagate #includes once the package is built.
or just an initial conformance check against the system headers somehow (?), but then only use your own generated ones when actually compiling haskell code. It would be nice to never need to include external headers in .hc files.
Hmm, the more I think about it, the more I like this idea. It means we could essentially forget about the public/private header file stuff, we don't need the extra pragmas, and there would be no restrictions on inlining of foreign calls.

Also, I've just checked and we #include very little when compiling .hc files. Just

On Tue, 2006-04-25 at 10:16 +0100, Simon Marlow wrote:
On 25 April 2006 09:51, John Meacham wrote:
On Tue, Apr 25, 2006 at 09:40:58AM +0100, Simon Marlow wrote:
Admittedly I haven't tried this route (not including *any* external headers at all when compiling .hc files). It might be possible, but you lose the safety net of compiler-checked calls.
yeah, perhaps a hybrid approach of some sort, when building the package, use the system headers, but then include generated prototypes inside the package-file and don't propagate #includes once the package is built.
or just an initial conformance check against the system headers somehow (?), but then only use your own generated ones when actually compiling haskell code. It would be nice to never need to include external headers in .hc files.
Hmm, the more I think about it, the more I like this idea. It means we could essentially forget about the public/private header file stuff, we don't need the extra pragmas, and there would be no restrictions on inlining of foreign calls.
That would be nice. If the module that imports the C functions were compiled via-C with the headers (or some other check like c2hs does) then we'd get the safety check. Then other client modules could be compiled without a prototype at all (or one generated by the Haskell compiler).

As you say, it is a bit of a pain that users of Haskell bindings libs need to install the development versions of C libraries. For example Gtk2Hs users on windows need the full dev version of Gtk+ which is considerably larger than the runtime version.

One downside would be that we would only be able to call C functions which conform to the standard platform ABI. As it is at the moment (perhaps somewhat by accident) we can call C functions that have non-standard ABI annotations in their prototype, eg:

int foo (int) __attribute__((regparm(3)))

ok that's a silly example, but there are more sensible examples of ABI weirdness - especially on arches like mips which seem to support half a dozen different ABIs. Perhaps we don't care, I'm not sure I do.

Duncan

On 25 April 2006 10:48, Duncan Coutts wrote:
One downside would be that we would only be able to call C functions which conform to the standard platform ABI. As it is at the moment (perhaps somewhat by accident) we can call C functions that have non-standard ABI annotations in their prototype, eg:
int foo (int) __attribute__((regparm(3)))
ok that's a silly example, but there are more sensible examples of ABI weirdness - especially on arches like mips which seem to support half a dozen different ABIs. Perhaps we don't care, I'm not sure I do.
The FFI declaration is supposed to specify the ABI completely, so these differences should be reflected in the FFI syntax. As you say, it works by accident now - but only when compiling via C; when using the NCG it'll go wrong.

Cheers,
Simon

Simon Marlow:
On 25 April 2006 09:51, John Meacham wrote:
On Tue, Apr 25, 2006 at 09:40:58AM +0100, Simon Marlow wrote:
Admittedly I haven't tried this route (not including *any* external headers at all when compiling .hc files). It might be possible, but you lose the safety net of compiler-checked calls.
yeah, perhaps a hybrid approach of some sort, when building the package, use the system headers, but then include generated prototypes inside the package-file and don't propagate #includes once the package is built.
or just an intitial conformance check against the system headers somehow (?), but then only use your own generated ones when actually compiling haskell code. It would be nice to never need to include external headers in .hc files.
Hmm, the more I think about it, the more I like this idea. It means we could essentially forget about the public/private header file stuff, we don't need the extra pragmas, and there would be no restrictions on inlining of foreign calls.
That'd be great! Manuel

John Meacham:
It is my understanding that the FFI foreign imports declare an ABI and not an API, meaning the exact way to make the foreign call should be completely deterministic based on just what is in the haskell file proper. Otherwise, obviously, direct to assembly implementations would be impossible.
In this sense, include files are always potentially optional, however, due to the oddness of the C language, one cannot express certain calls without proper prototypes, current haskell implementations take the straightforward path of relying on the prototypes that are contained in the system headers, which also incidentally provides some safety net against improperly specified FFI calls. However, it would also be reasonable for an implementation to just generate its own prototypes, or use inline assembly or any other mechanism to implement the FFI ABI calls properly.
Exactly! The FFI Addendum specifically leaves the compiler complete freedom as to which method to choose. I regard this property of the specification as important and we should keep it for Haskell'. Manuel
participants (7)
- Duncan Coutts
- Fergus Henderson
- John Meacham
- kahl@cas.mcmaster.ca
- Malcolm Wallace
- Manuel M T Chakravarty
- Simon Marlow