Writing and testing a Storable instance for a record wrapping ForeignPtrs

Hi everyone,

In a fit of madness I have found myself writing Haskell code where I need to implement a Storable instance. However, by virtue of not being a C programmer, I'm fairly lost on some of the details, especially the value of the sizeOf and alignment methods.

My Haskell-level record is the following:

    data SignedMessage = SignedMessage
      { messageLength :: CSize
      , messageForeignPtr :: ForeignPtr CUChar
      , signatureForeignPtr :: ForeignPtr CUChar
      }

Here is the code of the Storable instance: https://gist.github.com/Kleidukos/31346d067f309f2a86cbd97a85c0f1e8#file-sign...

And so I used `hedgehog-classes` to test the Storable instance. However, all the tests fail with the same reason: Prelude.undefined: https://gist.github.com/Kleidukos/31346d067f309f2a86cbd97a85c0f1e8#file-unde...

The main problem (and that's certainly a red herring for me) is that the `undefined` call comes from base:Foreign.Marshal.Array. Which shouldn't be a problem, as it is not supposed to be evaluated! Yet apparently it is.

If you're interested to see the full code, it's located here: https://github.com/haskell-cryptography/libsodium-bindings/blob/add-sel-pack...

I'm not sure how to proceed from here. What would be a good angle to approach this?

Cheers, Hécate

-- Hécate ✨ 🐦: @TechnoEmpress IRC: Hecate WWW: https://glitchbra.in RUN: BSD

Hi,
    instance Storable SignedMessage where
      sizeOf (SignedMessage{messageLength}) =
        sizeOf (undefined :: CSize) + fromIntegral messageLength + sizeOf cryptoSignBytes

Your sizeOf function mustn't evaluate its argument, yet it does: matching on the SignedMessage constructor forces the value. The size of the storable data must be determined by its type alone, so you can't use messageLength here.

sizeOf would be better defined as sizeOf :: Proxy a -> Word. For now it's just a convention to ignore the first argument.

Sylvain
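To make the "size depends only on the type" convention concrete, here is a minimal sketch of a Proxy-style size query layered on top of the existing class. `sizeOfProxy` and `arrayBytes` are hypothetical helper names, not part of base, and the result is an `Int` (as in today's `sizeOf`) rather than the `Word` suggested above. It may also explain why the failing `undefined` originates in Foreign.Marshal.Array: as far as I can tell, helpers there such as `mallocArray` obtain the element size by calling `sizeOf` on an undefined value, which is only safe when the instance ignores its argument.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))
import Foreign.C.Types (CSize, CUChar)
import Foreign.Storable (Storable (..))

-- Hypothetical helper: ask for the size of a type without having a value.
-- It relies on the convention that sizeOf never inspects its argument.
sizeOfProxy :: forall a. Storable a => Proxy a -> Int
sizeOfProxy _ = sizeOf (undefined :: a)

-- Hypothetical helper: byte size of an n-element array of `a`,
-- computed purely from the element type.
arrayBytes :: forall a. Storable a => Proxy a -> Int -> Int
arrayBytes p n = n * sizeOfProxy p
```

For instance, `arrayBytes (Proxy :: Proxy CUChar) 64` gives the size of a 64-byte buffer. An instance whose `sizeOf` pattern matches on a constructor would throw here instead.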

You are strict in your definition of `sizeOf`. You probably want:

    sizeOf ~SignedMessage{messageLength} = ...
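The difference between a constructor pattern and a lazy (irrefutable) pattern can be sketched as follows. `Msg` is a hypothetical stand-in for the record in this thread, and the sizes are made-up constants. Note the third function: a lazy pattern only postpones the match, so if the result still depends on a field, the argument is forced anyway.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical stand-in for the record discussed in this thread.
-- It must be `data`, not `newtype`: newtype patterns never force anything.
data Msg = Msg Int

-- Constructor pattern: the match itself forces the argument,
-- even though no field is used.
sizeStrict :: Msg -> Int
sizeStrict (Msg _) = 8

-- Lazy pattern: the match is deferred and, since no field is used,
-- the argument is never forced.
sizeLazy :: Msg -> Int
sizeLazy ~(Msg _) = 8

-- Lazy pattern, but the result uses a field, so evaluating the result
-- forces the argument after all.
sizeLazyUsingField :: Msg -> Int
sizeLazyUsingField ~(Msg n) = 8 + n

-- True when the function can be evaluated on `undefined` without throwing.
survivesUndefined :: (Msg -> Int) -> IO Bool
survivesUndefined f = do
  r <- try (evaluate (f undefined)) :: IO (Either SomeException Int)
  pure (either (const False) (const True) r)
```

Only `sizeLazy` survives being applied to `undefined`; the other two throw.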
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

On Mon, 2022-11-28 at 17:04 +0100, Hécate wrote:
In a fit of madness I have found myself writing Haskell code where I need to implement a Storable instance. However, by virtue of not being a C programmer, I'm fairly lost on some of the details, especially the value of the sizeOf and alignment methods.
Next to the other replies, did you consider using `hsc2hs` to create these bindings? With it, you can use the `#{size ...}`, `#{alignment ...}` and `#{peek ...}`/`#{poke ...}` helpers to implement `Storable` for some C struct, without running a C compiler in your head (e.g., taking padding into account to implement `sizeOf`). See the docs at https://ghc.gitlab.haskell.org/ghc/doc/users_guide/utils.html#writing-haskel....

Note that `hsc2hs` is fully integrated in Cabal (use `Foo.hsc` as the module filename), so there is no need to invoke its CLI yourself.

You can find a simple example of a Storable instance using it at https://github.com/NicolasT/landlock-hs/blob/b5638684869ad4f85bea53f10a3f0b9....

Nicolas
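For comparison, this is roughly what "running a C compiler in your head" looks like when done by hand. The struct, its field names and the hard-coded numbers are assumptions for illustration (a typical 64-bit ABI is assumed); in a `.hsc` file the three literals would instead come from `#{size ...}`, `#{alignment ...}` and `#{peek ...}`/`#{poke ...}`, so the C compiler supplies them.

```haskell
import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (Storable (..))

-- Hypothetical C struct, assumed purely for illustration:
--   struct sample { int count; double weight; };
data Sample = Sample CInt CDouble
  deriving (Eq, Show)

instance Storable Sample where
  -- Assumed layout: 4 bytes count, 4 bytes padding, 8 bytes weight.
  sizeOf _ = 16
  -- Assumed: alignment of the widest member.
  alignment _ = 8
  -- Fields are read and written at their assumed byte offsets.
  peek p = Sample <$> peekByteOff p 0 <*> peekByteOff p 8
  poke p (Sample c w) = pokeByteOff p 0 c >> pokeByteOff p 8 w

-- Write a value into temporary memory and read it back.
roundTrip :: Sample -> IO Sample
roundTrip s = alloca $ \p -> poke p s >> peek p
```

Note that `alloca` itself calls `sizeOf` and `alignment` without a real value, which works here because both methods ignore their argument.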

On Mon, Nov 28, 2022 at 05:04:32PM +0100, Hécate wrote:
In a fit of madness I have found myself writing Haskell code where I need to implement a Storable instance. However, by virtue of not being a C programmer, I'm fairly lost on some of the details, especially the value of the sizeOf and alignment methods.
My Haskell-level record is the following:
data SignedMessage = SignedMessage { messageLength :: CSize , messageForeignPtr :: ForeignPtr CUChar , signatureForeignPtr :: ForeignPtr CUChar }
This is not the sort of object for which `Storable` makes sense. It holds ephemeral pointers to variable-sized external data, and so cannot be serialised in a modest fixed-size memory block.

The `Storable` class is for primitive data (Ints, Words, ...) and simple fixed-layout structures consisting of same (e.g. various structures passed to, or returned by, C system calls). When structures contain pointers to data, nested `peek` or `poke` calls (with associated memory allocations) may be needed to read or write the structure.

To serialise your "SignedMessage" object you may need a higher-level serialisation format: ASN.1, protobufs, ... or (for a Haskell-only format) an instance of `Data.Binary.Binary` (rather than Storable).

https://hackage.haskell.org/package/binary-0.10.0.0/docs/Data-Binary.html

What's the use case here?

-- Viktor.

Hi Viktor, hi everyone,

Thanks again for the many useful answers. I'll try to answer the questions asked in one email:

1. did you consider using `hsc2hs` to create these bindings?

I have to admit I stopped at c2hs when I found out it could not produce correct alignments for libsodium: https://github.com/haskell/c2hs/issues/272

I am now being told that hsc2hs asks the compiler directly so it's not a problem, but that's eleven months after the fact :)

2. What's the use case here?

The use case is certainly my own partial worldview of how it all works. That being said, one interesting thing is that peek & poke allow you to do IO, whereas Binary's Put and Get do not seem to allow me to do it (without cheating at least). And I couldn't find any instruction that said that it was okay to use unsafeDupablePerformIO (or similar) in Binary.

Have a nice rest of your day!

Cheers, Hécate

On Mon, Nov 28, 2022 at 06:54:39PM +0100, Hécate wrote:
2. What's the use case here?
The use case is certainly my own partial worldview of how it all works. That being said one interesting thing is that peek & poke allow you to do IO, whereas Binary's Put and Get do not seem to allow me to do it (without cheating at least). And I couldn't find any instruction that said that it was okay to use unsafeDupablePerformIO (or similar) in Binary.
The use-case for Data.Binary is for converting to and from ByteStrings (possibly lazy construction via Builders). If you want to include reading the data from a stream, there's at least:

https://hackage.haskell.org/package/ghc-9.4.2/docs/GHC-Utils-Binary.html

and equivalent options. But your answer is still at much too low a level. What kinds of messages are these? What sort of communication pattern is this serialisation in aid of?

- Haskell to FFI helper functions?
- Haskell process to Haskell process on the same machine?
- Haskell process to cache for later retrieval?
- Interprocess, possibly across languages and systems?
- ...

What are the interoperability requirements? ...

-- Viktor.

Ah sorry, I misunderstood your question.

I am writing a high-level interface to my libsodium bindings, and I'm trying to provide implementations of helpful typeclasses.

I actually don't need Storable to do the FFI stuff, thankfully, but the main case of using this library would involve sending the result of the signing operation to the network (as ByteStrings), with authorization tokens like Biscuit¹ in mind.

¹ https://www.biscuitsec.org

On Mon, Nov 28, 2022 at 08:14:15PM +0100, Hécate wrote:
Ah sorry, I misunderstood your question.
I am writing a high-level interface to my libsodium bindings, and I'm trying to provide implementations of helpful typeclasses.
I actually don't need Storable to do the FFI stuff, thankfully, but the main case of using this library would involve sending the result of the signing operation to the network (as ByteStrings), with authorization tokens like Biscuit¹ in mind.
For serialisation to external standard formats, like JSON, or binary
JSON, ... you're definitely not looking for `Storable`. Simply
returning the message as two octet-strings (ByteStrings), one for the
raw data and another for the signature is all you need. From there,
various higher-level formats are possible.
The main thing to be mindful of is that ByteStrings are limited to just
under 2^31 bytes on 32-bit machines (the length is an `Int`), so very
large messages don't fit in a ByteString on some (increasingly less
common) architectures.
Or am I missing some reason why you'd want to create a single binary
blob holding both the message and the signature?
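Returning the message and the signature as two ByteStrings could look something like the sketch below. `toByteString` and `demo` are hypothetical names, not taken from the libsodium bindings, and the sketch assumes the caller knows how many bytes sit behind the `ForeignPtr`. `packCStringLen` copies the bytes, so the result does not depend on the lifetime of the foreign buffer.

```haskell
import qualified Data.ByteString as BS
import Foreign.C.Types (CSize, CUChar)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Marshal.Array (pokeArray)
import Foreign.Ptr (castPtr)

-- Hypothetical helper: copy `len` bytes out of foreign memory.
-- The caller is assumed to pass a length that matches the allocation.
toByteString :: ForeignPtr CUChar -> CSize -> IO BS.ByteString
toByteString fptr len =
  withForeignPtr fptr $ \ptr ->
    BS.packCStringLen (castPtr ptr, fromIntegral len)

-- Self-contained demonstration: fill a 3-byte buffer, then copy it out.
demo :: IO BS.ByteString
demo = do
  fptr <- mallocForeignPtrBytes 3
  withForeignPtr fptr $ \ptr -> pokeArray ptr [1, 2, 3 :: CUChar]
  toByteString fptr 3
```

Applied once to the message pointer (with `messageLength`) and once to the signature pointer (with the fixed signature size), this would give the two octet-strings described above.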

Thanks for the knowledge regarding the size of the ByteString!

In the end, the outside serialisation isn't up to me, but rather up to the consumers. I have one potential user who told me they need the signature and the message to be separate for the protocol they are implementing.

I guess if I defer to ByteString early on it'll be easier for the library to be adapted to whatever use case comes after.

Thanks a lot Viktor, I believe I'll go to bed less ignorant tonight. :)

On Mon, Nov 28, 2022 at 08:42:40PM +0100, Hécate wrote:
Thanks for the knowledge regarding the size of the ByteString!
Eventually the outside serialisation isn't up to me, but rather the consumers. I have one potential user who told me they need the signature and the message to be separate for the protocol they are implementing.
I guess if I defer to ByteString early on it'll be easier for the library to be adapted to whatever use case comes after.
Note that "Lazy" ByteStrings are allocated as a chain of component ByteString chunks, and their total length is an Int64 on all platforms. The chunks might still be in memory, and in some cases could be truly incrementally loaded from the underlying source.

However true streaming support is perhaps best via ByteStream from e.g. https://hackage.haskell.org/package/streaming-bytestring-0.2.4/docs/Streamin... which supports monadic incremental consumption (the underlying Monad need not be IO, so pure sources are also supported).

The design space is wide. If signatures are "detached" (available independently of the message), and messages could be too large to fit in memory, then streaming the message may be attractive. But then the protocol to the consumer needs to support streaming input (e.g. HTTP PUT or POST with chunked transfer encoding).

With large messages one might even want chunk-level signatures that authenticate all the chunks transmitted so far in order, with a final empty chunk authenticating the entire message. There should be a standard format for this, but I am not aware of one. CMS sadly does not suffice.

For example, messages that are just signed, but not cryptographically tied to a particular transaction (and recipient) are potentially subject to replays. This may or may not be OK. A message with a clear transaction context (that makes out of context replays detectable), encrypted to a given set of recipients, and then signed has stronger security properties than just a signed blob that is not tied to a clear context or recipient.

It is sadly common to use cryptography pixie dust to add security theatre to application data flows without a careful analysis of the security properties required and attained. Security is difficult and brittle. :-(

-- Viktor.
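The chunked structure of lazy ByteStrings can be seen in a small sketch like the one below. The chunk contents are made up for illustration; only functions from the bytestring package are used.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import Data.Int (Int64)

-- Two strict chunks with made-up contents.
chunks :: [BS.ByteString]
chunks = [BS.pack [1, 2, 3], BS.pack [4, 5]]

-- A lazy ByteString is a chain of strict chunks; building it from
-- existing chunks does not copy them into one contiguous buffer.
message :: BL.ByteString
message = BL.fromChunks chunks

-- The total length is an Int64 regardless of the platform's Int size.
totalLength :: Int64
totalLength = BL.length message
```

`BL.toChunks message` gives back the individual strict chunks, which is what a consumer sending data piece by piece would iterate over.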
participants (5)
- Hécate
- Nicolas Trangez
- Oliver Charles
- Sylvain Henry
- Viktor Dukhovni