Applicative functor for building C structs with Storable

Are there analogues to ByteString's Put and Get monads for constructing and dissecting binary data in a C struct, for exchange with a C program?

On 10/13/07, Henning Thielemann wrote:
Are there analogues to ByteString's Put and Get monads for constructing and dissecting binary data in a C struct, for exchange with a C program?
The memory layout of structs depends on the ABI of the system. You can take a guess and write code in the Get and Put monads. Your guess will probably work pretty well within a single architecture (e.g. x86 or x86-64).
However, the portable way to do this is to get the information from the C compiler, which is the approach that c2hs[1] and friends use. I would recommend this unless there's some good reason to think otherwise. (See [2] for the exact hook.)
[1] http://www.cse.unsw.edu.au/~chak/haskell/c2hs/
[2] http://www.cse.unsw.edu.au/~chak/haskell/c2hs/docu/c2hs-2.html#ss2.8
AGL
-- Adam Langley agl@imperialviolet.org http://www.imperialviolet.org 650-283-9641
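To make the "take a guess" option concrete, here is a minimal sketch of a hand-written Storable instance. The struct, the type Point and the helper roundTrip are hypothetical and not from this thread; the offsets are a guess that matches the x86-64 System V ABI. On 32-bit x86 a double inside a struct is typically aligned to 4 bytes, so the same guess would be wrong there, which is exactly the portability problem described above.

```haskell
import Data.Int (Int32)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (Storable (..))

-- Hypothetical C declaration mirrored here (an assumption for illustration):
--   struct point { int32_t tag; double value; };
data Point = Point { pointTag :: Int32, pointValue :: Double }
  deriving (Eq, Show)

-- Offsets and size are GUESSED for the x86-64 System V ABI:
-- tag at byte 0, 4 bytes of padding, value at byte 8, total size 16.
instance Storable Point where
  sizeOf _ = 16
  alignment _ = 8
  peek p = Point <$> peekByteOff p 0 <*> peekByteOff p 8
  poke p (Point t v) = pokeByteOff p 0 t >> pokeByteOff p 8 v

-- Write a value into temporary memory and read it back again.
roundTrip :: Point -> IO Point
roundTrip pt = alloca $ \p -> poke p pt >> peek p
```

A tool such as c2hs would obtain the offsets from the C headers instead of hard-coding them as done here.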

On Sat, 13 Oct 2007, Adam Langley wrote:
On 10/13/07, Henning Thielemann wrote:
Are there analogues to ByteString's Put and Get monads for constructing and dissecting binary data in a C struct, for exchange with a C program?
The memory layout of structs depends on the ABI of the system. You can take a guess and write code in the Get and Put monads. Your guess will probably work pretty well within a single architecture (e.g. x86 or x86-64).
However, the portable way to do this is to get the information from the C compiler, which is the approach that c2hs[1] and friends use. I would recommend this unless there's some good reason to think otherwise. (see [2] for the exact hook)
[1] http://www.cse.unsw.edu.au/~chak/haskell/c2hs/ [2] http://www.cse.unsw.edu.au/~chak/haskell/c2hs/docu/c2hs-2.html#ss2.8
I do not quite see the need for an extra tool. I thought it must be possible to ship a Haskell compiler with modules that depend on the system's C compiler, just like the modules that are implemented differently for Windows and Unix. Such modules could provide the functionality to create and inspect C structs for exchange with system libraries.

On Sun, Oct 14, 2007 at 09:19:24PM +0200, Henning Thielemann wrote:
I do not quite see the need for an extra tool. I thought it must be possible to ship a Haskell compiler with modules that depend on the system's C compiler, just like the modules that are implemented differently for Windows and Unix. Such modules could provide the functionality to create and inspect C structs for exchange with system libraries.
You say "the system's C compiler" as if there was only one. It's quite common for UNIXoid systems to have several C compilers installed simultaneously; and if you use the module corresponding to the wrong compiler, you get silent data loss. I wouldn't risk it. Stefan

On Sun, 14 Oct 2007, Stefan O'Rear wrote:
You say "the system's C compiler" as if there was only one. It's quite common for UNIXoid systems to have several C compilers installed simultaneously; and if you use the module corresponding to the wrong compiler, you get silent data loss. I wouldn't risk it.
Do different C compilers on the same platform actually use different layouts for structs? If yes, how can I find out, with which compiler a library was compiled?

On Sun, Oct 14, 2007 at 09:28:45PM +0200, Henning Thielemann wrote:
Do different C compilers on the same platform actually use different layouts for structs?
Yes, because there are tradeoffs involved. On x86, the optimal alignment for long double is 8 bytes, but a lot of people aren't crazy about 6 bytes of padding per 10 byte datum, so some compilers default to 4 byte alignment.
If yes, how can I find out, with which compiler a library was compiled?
Ask the person who did the compiling. Stefan
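The padding figures in this reply can be checked with a little arithmetic. The sketch below uses two hypothetical helpers, alignUp and paddingFor, and only assumes that array elements are placed at multiples of their alignment.

```haskell
-- Round a size or offset up to the next multiple of the alignment.
alignUp :: Int -> Int -> Int
alignUp align n = ((n + align - 1) `div` align) * align

-- Bytes of padding that follow one element of the given size
-- when elements must start at multiples of the alignment.
paddingFor :: Int -> Int -> Int
paddingFor align size = alignUp align size - size

-- A 10-byte datum with 8-byte alignment occupies 16 bytes (6 of padding);
-- with 4-byte alignment it occupies 12 bytes (2 of padding).
paddingAt8, paddingAt4 :: Int
paddingAt8 = paddingFor 8 10
paddingAt4 = paddingFor 4 10
```

Two compilers that disagree on this choice produce structs of different sizes from the same declaration.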

On Sun, 14 Oct 2007, Stefan O'Rear wrote:
On Sun, Oct 14, 2007 at 09:28:45PM +0200, Henning Thielemann wrote:
Do different C compilers on the same platform actually use different layouts for structs?
Yes, because there are tradeoffs involved. On x86, the optimal alignment for long double is 8 bytes, but a lot of people aren't crazy about 6 bytes of padding per 10 byte datum, so some compilers default to 4 byte alignment.
If yes, how can I find out, with which compiler a library was compiled?
Ask the person who did the compiling.
... but then c2hs cannot do it better (=automatically) than the "compiler dependent Haskell module" approach, can it?

On Sun, Oct 14, 2007 at 10:05:27PM +0200, Henning Thielemann wrote:
... but then c2hs cannot do it better (=automatically) than the "compiler dependent Haskell module" approach, can it?
On second thought, you're right. It would work. However, you still need the information in the header file, due to the possibility of things like __attribute__((packed)). c2hs reads and fully parses the headers themselves; so must your TH binding generator. Stefan

On Sun, Oct 14, 2007 at 09:28:45PM +0200, Henning Thielemann wrote:
You say "the system's C compiler" as if there was only one. It's quite common for UNIXoid systems to have several C compilers installed simultaneously; and if you use the module corresponding to the wrong compiler, you get silent data loss. I wouldn't risk it.
Do different C compilers on the same platform actually use different layouts for structs? If yes, how can I find out, with which compiler a library was compiled?
Not in general. The exact layout of C structs and whatnot is set forth in the ABI spec developed by the chip manufacturer. All compilers must follow it or they cannot even use the same libraries. That said, some compilers out there do sometimes deviate from the ABI, or allow it as an option, but there is generally an accepted ABI for an OS-platform pair. Otherwise things like 'libc' could not be linked against.
For instance, the ABI for x86-64 is here: http://www.x86-64.org/documentation/abi-0.98.pdf
John
-- John Meacham - ⑆repetae.net⑆john⑈

On Sun, 14 Oct 2007, John Meacham wrote:
On Sun, Oct 14, 2007 at 09:28:45PM +0200, Henning Thielemann wrote:
You say "the system's C compiler" as if there was only one. It's quite common for UNIXoid systems to have several C compilers installed simultaneously; and if you use the module corresponding to the wrong compiler, you get silent data loss. I wouldn't risk it.
Do different C compilers on the same platform actually use different layouts for structs? If yes, how can I find out, with which compiler a library was compiled?
Not in general. The exact layout of C structs and whatnot is set forth in the ABI spec developed by the chip manufacturer. All compilers must follow it or they cannot even use the same libraries. That said, some compilers out there do sometimes deviate from the ABI, or allow it as an option, but there is generally an accepted ABI for an OS-platform pair. Otherwise things like 'libc' could not be linked against.
This means that if this knowledge is baked into system-dependent instances of Storable (especially 'alignment'), then generic routines for constructing and dissecting C structs are straightforward to implement. Right?
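As a rough illustration of such a generic routine, the sketch below derives field offsets and the total struct size purely from the sizeOf and alignment methods of Storable. The names Field, layout and example are hypothetical. It models only the common "natural alignment" rule, so it does not cover packed structs, bit-fields or compilers that deviate from the platform ABI.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Int (Int32, Int8)
import Data.List (mapAccumL)
import Foreign.Storable (Storable (..))

-- A field of any Storable type; only its size and alignment are used.
data Field = forall a. Storable a => Field a

fieldSize, fieldAlign :: Field -> Int
fieldSize (Field x) = sizeOf x
fieldAlign (Field x) = alignment x

-- Round an offset up to the next multiple of the alignment.
alignUp :: Int -> Int -> Int
alignUp align n = ((n + align - 1) `div` align) * align

-- Offsets of the fields in declaration order, plus the total size,
-- which is padded to the largest field alignment.
layout :: [Field] -> ([Int], Int)
layout fields = (offsets, alignUp structAlign end)
  where
    (end, offsets) = mapAccumL place 0 fields
    place pos f =
      let off = alignUp (fieldAlign f) pos
       in (off + fieldSize f, off)
    structAlign = maximum (1 : map fieldAlign fields)

-- Models: struct { int8_t a; int32_t b; int8_t c; }
-- With GHC's Int8 and Int32 instances this is ([0,4,8], 12).
example :: ([Int], Int)
example = layout [Field (0 :: Int8), Field (0 :: Int32), Field (0 :: Int8)]
```

The result is only as trustworthy as the Storable instances: if their alignment values do not match what the C compiler used for the library, the computed offsets are silently wrong.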

Hello John, Monday, October 15, 2007, 7:51:13 AM, you wrote:
You say "the system's C compiler" as if there was only one. It's quite common for UNIXoid systems to have several C compilers installed simultaneously; and if you use the module corresponding to the wrong compiler, you get silent data loss. I wouldn't risk it.
Not in general. The exact layout of C structs and whatnot is set forth in the ABI spec developed by the chip manufacturer. All compilers must follow it or they cannot even use the same libraries. That said, some compilers out there do sometimes deviate from the ABI, or allow it as an option, but there is generally an accepted ABI for an OS-platform pair. Otherwise things like 'libc' could not be linked against.
For instance, the ABI for x86-64 is here: http://www.x86-64.org/documentation/abi-0.98.pdf
I bet that this document was written precisely because "old systems" (read: i386) had so much confusion :)) I definitely know that it was a big problem with Win32 compilers.
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Sat, 2007-10-13 at 18:25 +0200, Henning Thielemann wrote:
Are there analogues to ByteString's Put and Get monads for constructing and dissecting binary data in a C struct, for exchange with a C program?
It's certainly possible. As John Meacham says, on most platforms there is a standard ABI (i.e. standard on that platform, not standard between platforms). So one could make a lib on top of Data.Binary.Get/Put that embeds the knowledge about some particular platform's C struct ABI and allows reading and writing them. Like Data.Binary's Get/Put, it could be used just via the Applicative combinators. You could probably also get a good deal of code re-use between different platform ABIs by parametrising by things like alignment, endianness, size of primitives etc.
Duncan
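A minimal sketch of what such an Applicative interface might look like follows. It is not built on Data.Binary; it reads directly through a pointer and takes sizes and alignments from the host's Storable instances rather than from an explicit ABI parameter. All names (StructGet, field, readStruct, Header, demo) are hypothetical.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Int (Int32, Int8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (Storable (..))

-- A struct reader: given the base pointer and the current offset,
-- produce a value and the offset just past what was read.
newtype StructGet a = StructGet { runStructGet :: Ptr () -> Int -> IO (a, Int) }

instance Functor StructGet where
  fmap f (StructGet g) = StructGet $ \p off -> do
    (x, off') <- g p off
    pure (f x, off')

instance Applicative StructGet where
  pure x = StructGet $ \_ off -> pure (x, off)
  StructGet gf <*> StructGet gx = StructGet $ \p off -> do
    (f, off1) <- gf p off
    (x, off2) <- gx p off1
    pure (f x, off2)

alignUp :: Int -> Int -> Int
alignUp align n = ((n + align - 1) `div` align) * align

-- Read one field, first padding the offset to the field's alignment.
field :: forall a. Storable a => StructGet a
field = StructGet $ \p off -> do
  let dummy = undefined :: a -- only used to select the Storable instance
      aligned = alignUp (alignment dummy) off
  x <- peekByteOff p aligned
  pure (x, aligned + sizeOf dummy)

readStruct :: StructGet a -> Ptr b -> IO a
readStruct g p = fst <$> runStructGet g (castPtr p) 0

-- Models: struct { int8_t kind; int32_t len; }
data Header = Header Int8 Int32 deriving (Eq, Show)

getHeader :: StructGet Header
getHeader = Header <$> field <*> field

-- Usage: place the two fields by hand at offsets 0 and 4, then read them back.
demo :: IO Header
demo = allocaBytes 8 $ \p -> do
  pokeByteOff p 0 (3 :: Int8)
  pokeByteOff p 4 (42 :: Int32)
  readStruct getHeader p
```

Parametrising by platform, as suggested above, would mean passing alignment, endianness and primitive sizes into field explicitly instead of relying on the Storable class.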
participants (6)
- Adam Langley
- Bulat Ziganshin
- Duncan Coutts
- Henning Thielemann
- John Meacham
- Stefan O'Rear