
Amanda Clare
Are there any future plans to go with the SWIG group http://www.swig.org/ for interfacing with C code? They claim to support Perl, Python, Tcl/Tk, Ruby, Guile, MzScheme, Java, OCAML, CHICKEN, and C#, and as I understand it, you just run something like "swig -python" on your header file to create everything you need to interface C to Python.
At the moment, I'd like to generate Haskell access to all the functions and enums defined in the Oracle database C library libsqlora8 http://www.poitschke.de/libsqlora8/. Something like SWIG sounds ideal for that. c2hs looks good, but it seems that to use it I have to read and understand all about the FFI, stable pointers, foreign pointers, etc. Is the problem just too complicated in Haskell to automate completely?
The short answer is "yes." The c2hs paper http://www.cse.unsw.edu.au/~chak/papers/papers.html#c2hs says the following about SWIG:

    SWIG works well for untyped scripting languages, such as Tcl, Python, Perl, and Scheme, or C-like languages, such as Java, but the problem with typed functional languages is that the information in the C header file is usually not sufficient for determining the interface on the functional-language side. As a result, additional information has to be included into the C header file, which leads to maintenance overhead when new versions of an interfaced C library appear. This is in contrast to the use of pristine C header files complemented by a separate high-level interface specification as favoured in C->Haskell.

I have to admit, though, that I didn't look at how SWIG handles OCaml (it didn't have that support at the time the paper was written).

To illustrate the quoted text a bit, consider that many C programs just use values of type `int' to represent a Boolean value. They may then go on to include

    #define TRUE  -1
    #define FALSE 0

to make the code a bit more readable. Using `Int' for Booleans is clearly not acceptable in Haskell, but presented with a prototype of the form

    int foo (int x);

should the Haskell signature be

    foo :: Int  -> Int
    foo :: Bool -> Int
    foo :: Int  -> Bool
    foo :: Bool -> Bool

The header alone cannot tell us. As a consequence, the design of a Haskell API for a C library requires an understanding of the *semantics* of the involved types and functions. Hence, it requires human intervention.

Consequently, tool support can follow either of two routes:

(1) Automatically generate a raw and ugly interface from the C header file (which, in particular, maps every use of a C `int' to a Haskell `Int', independent of whether that `int' represents a Boolean value). Then, write a normal Haskell module that exports a nice Haskell-ised API and implements it by calling the functions from the raw and ugly interface. I call this additional code "impedance matching code."

(2) Use the C header together with some extra information that describes the mapping of C types to Haskell types to directly generate a nice Haskell-ised API.

SWIG follows Route (1), although it permits annotating C headers to get some of the benefits of Route (2).[1] C->Haskell follows Route (2). The extra information is exactly what is contained in the binding modules. I prefer this route as it leaves scope for generating some of the repetitive patterns in the impedance matching code automatically, hence leading to less overall effort.

The main advantage of Route (1) is that it lets you generate a raw and ugly interface really quickly and, if you don't care about making a proper Haskell library out of it, allows you to code your application directly on that raw interface. In other words, it reduces the barrier to entry, even if it increases the overall effort.

Consequently, I am very interested in reducing the barrier to entry for working with c2hs. To do so, I have had writing a tutorial on my list for quite a while - I just never seem to get the time to actually do it :-/ In addition, it might be worthwhile to extend the existing function hooks http://www.cse.unsw.edu.au/~chak/haskell/c2hs/docu/c2hs-3.html#ss3.7 such that supplying a Haskell type is optional and there is a default mapping to Haskell for every C type. The result would be a function binding like the one SWIG would generate.

Cheers,
Manuel

[1] IMHO annotating C headers is a big no-no. You want to work from pristine headers to simplify tracking of successive versions of the C library.
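
For illustration only, here is a minimal hand-written sketch of what Route (1) looks like in practice. It is not output from SWIG or c2hs. It uses the C standard library function `int isalpha(int c)` as a stand-in for a library function whose `int' result is really a Boolean; the Haskell names c_isalpha and isAlphaC are hypothetical, and the restriction to ASCII input is an assumption made to stay within the range for which isalpha is defined.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Data.Char (ord)
import Foreign.C.Types (CInt (..))
import Foreign.Marshal.Utils (toBool)

-- Raw and ugly layer: the kind of binding that can be derived from the
-- header alone. Every C `int' becomes a CInt, whatever it actually means.
-- (isalpha depends on the C locale; it is treated as pure for this sketch.)
foreign import ccall unsafe "ctype.h isalpha"
  c_isalpha :: CInt -> CInt

-- Impedance matching layer (hypothetical name): written by someone who
-- knows that the argument is a character code and the result a truth value.
-- toBool maps 0 to False and every other value to True, so the same
-- wrapper style also copes with libraries that define TRUE as -1.
isAlphaC :: Char -> Bool
isAlphaC ch
  | ord ch < 128 = toBool (c_isalpha (fromIntegral (ord ch)))
  | otherwise    = False  -- outside ASCII: do not hand the value to isalpha
```

Loaded into GHCi, the raw function can be called directly as c_isalpha 97, which is the quick but untyped entry point that Route (1) offers, while isAlphaC 'a' gives the Haskell-ised view. Under Route (2), the knowledge encoded by hand in isAlphaC would instead live in the binding module, so that a tool can generate the wrapper.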