Storable instance of () is broken

Harendra Kumar

5 Jan 2022

8:01 a.m.

The Storable instance of () is defined in the "Foreign.Storable" module of the "base" package as follows:

    instance Storable () where
      sizeOf _ = 0
      alignment _ = 1
      peek _ = return ()
      poke _ _ = return ()

The size of () is defined as 0. It sounds absurd for a Storable to have a size of 0? This means that we can read an infinite number of () type values out of nothing (no memory location required) or store an infinite number of () type values without even requiring a memory location to write to.

This is causing a practical problem in our Storable array implementation. The array is constrained to a Storable type. Since () has a Storable instance, one can store () in the Storable array. But it causes a problem because we determine the array element size using sizeOf on the type. For () type it turns out to be 0. Essentially, the array of () would always be of size 0. Now, we cannot determine the length of the array from its byte length as you could store infinite such elements in an empty array. The Storable instance of () seems to be an oddity and makes us use a special case everywhere in the code to handle this, and this special casing makes it highly prone to errors when we change code.

Can this be fixed? Is there a compelling argument to keep it like this? A possible fix could be to represent it by a single byte in memory which can be discarded when reading or writing. Another alternative is to not provide a Storable instance for it at all. Let the users write their own if they need it.

If you think this does not have a problem, can you suggest how to elegantly handle the array implementation problem as I described above?

Thanks,
Harendra
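The length problem described in this message can be sketched as follows. This is a minimal illustration, not the poster's actual array implementation: the `Array` type, its field names, and the helpers `byteLength`, `elemCount`, and `fromByteSpan` are all hypothetical. It assumes an array that keeps only a start and an end pointer and recovers the element count by dividing the byte length by `sizeOf`.

```haskell
import Foreign.Ptr (Ptr, minusPtr, nullPtr, plusPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical array representation: only start and end pointers are kept.
data Array a = Array
  { arrStart :: !(Ptr a)
  , arrEnd   :: !(Ptr a)
  }

-- Number of bytes between the two pointers.
byteLength :: Array a -> Int
byteLength (Array s e) = e `minusPtr` s

-- Element count recovered from the byte length.
-- Returns Nothing when the element size is 0, because the count
-- cannot be recovered (the division would be by zero).
elemCount :: Storable a => Array a -> Maybe Int
elemCount arr
  | sz == 0   = Nothing
  | otherwise = Just (byteLength arr `div` sz)
  where
    sz = sizeOf (witness arr)
    -- Only the type of the result is used; sizeOf never evaluates it.
    witness :: Array a -> a
    witness _ = undefined

-- Illustration only: an array spanning n bytes from an arbitrary
-- address. The pointers are never dereferenced.
fromByteSpan :: Int -> Array a
fromByteSpan n = Array base (base `plusPtr` n)
  where
    base = nullPtr `plusPtr` 4096
```

In GHCi, `elemCount (fromByteSpan 32 :: Array Double)` gives `Just 4`, while `elemCount (fromByteSpan 0 :: Array ())` gives `Nothing`: with a zero element size, every element count maps to the same byte length.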

David Feuer

5 Jan 2022

8:13 a.m.

I don't think it's broken; I think your length calculation is broken. Instead of asking every use of () to take an extra byte, why don't you just store a word saying how long your array is?

Alternatively, you could probably avoid special cases with a newtype:

    newtype NZS a = NZS { unNZS :: a }

    instance Storable a => Storable (NZS a) where
      sizeOf
        | sizeOf (undefined @a) == 0 = const 1
        | otherwise = sizeOf .# unNZS
      alignment = alignment .# unNZS
      peekElemOff = coerce (peekElemOff @a)
      ....
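A self-contained variant of the newtype idea is sketched below. It departs from the snippet in this message in two ways that are assumptions of the sketch: it goes through `castPtr` instead of the `.#` and `coerce` helpers, and it relies on the class defaults for `peekElemOff` and `pokeElemOff`, so element offsets are computed from the wrapper's non-zero size.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Foreign.Ptr (castPtr)
import Foreign.Storable (Storable (..))

-- Wrapper whose Storable instance never reports a size of 0.
newtype NZS a = NZS { unNZS :: a }

instance Storable a => Storable (NZS a) where
  -- Report at least one byte, even for zero-sized payloads such as ().
  sizeOf _ = max 1 (sizeOf (undefined :: a))
  alignment _ = alignment (undefined :: a)
  -- Reads and writes delegate to the wrapped type at the same address.
  peek p = NZS <$> peek (castPtr p)
  poke p (NZS x) = poke (castPtr p) x
```

With this instance `sizeOf (NZS ())` is 1, while `sizeOf (NZS (0 :: Double))` stays at 8, so an array of `NZS ()` can recover its element count from its byte length without a special case.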

Harendra Kumar

11:14 a.m.

On Wed, 5 Jan 2022 at 13:44, David Feuer wrote:

...

I don't think it's broken; I think your length calculation is broken. Instead of asking every use of () to take an extra byte, why don't you just store a word saying how long your array is?

We can always say that the application is broken and not the underlying system as long as the problem can be circumvented in the application. Of course the problem can be circumvented in our code and that is what we are doing, but these semantics for the () instance make it harder and not very elegant. It is debatable what is broken. That's the reason I wanted to raise this for discussion.

We already store the length in the array, but the length is in the form of the number of bytes and not the number of elements; we store the start and end pointers in memory and compute the byte length from that. And the () Storable instance does not allow us to convert bytes back to the number of elements. And that is actually the crux of the problem.

There are reasons for storing pointers which may be irrelevant for this discussion. But the point is that this instance forces us to use a different representation where we have to store the number of elements in the array, which I think is unnecessary for Storable arrays as long as the Storable type has a concrete representation with a definite size.

I would like to think that if something is Storable then it should have a concrete memory representation. The peek and poke operations themselves indicate this fundamental nature of Storable semantics. peek means we are reading from a memory location and poke means that we are writing to a memory location. In the case of () we are just pretending to read or write on peek/poke; we are not storing or retrieving something. This, in my opinion, is not how Storable should behave.

I think this should not be about optimising for that extra byte. That optimization is bogus. We are saying look, we can store something in memory without even consuming any space. What's the point of doing that? In this case we are doing that just because we can. To take this argument further we can also say that we can compress some byte sequences and it will take up less space. But that's harder and it will completely screw up the simple semantics so we won't do that. Let's screw it only a little bit, because it's easier to do that.

If we do not want to consume space for () then better not have a Storable instance for it; leave it to the users. And if we really want to store it then let it behave in the same way as any other mere mortal storable would, so what if it takes up some space. Why not have simple, uniform semantics for all cases? I think saving that extra byte for this particular case is overkill and leads to a bigger cost in applications without actually saving anything worthwhile.

-harendra

David Feuer

11:41 a.m.

No. Consider a type like this:

    data Foo a = Foo !Int !a

    instance Storable a => Storable (Foo a) where ...

Now if a happens to be (), we pay only one word per Foo. You want us to pay more so you can do your calculation more efficiently. That doesn't seem quite fair.

You have another option: don't use (). Just write your own version with a different Storable instance and use that. Or use a newtype wrapper like I suggested.
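The elided `Storable (Foo a)` instance from David Feuer's message could be filled in along the following lines. This is only a sketch under stated assumptions: the C-struct-like layout (the `Int` first, then the payload at the next suitably aligned offset, with the total rounded up to the overall alignment) and the helpers `roundUp` and `payloadOffset` are hypothetical and were not given in the thread.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Foreign.Ptr (castPtr, plusPtr)
import Foreign.Storable (Storable (..))

data Foo a = Foo !Int !a

-- Round n up to the next multiple of align (align >= 1).
roundUp :: Int -> Int -> Int
roundUp align n = ((n + align - 1) `div` align) * align

-- Byte offset of the second field within a stored Foo a.
-- The argument is used only for its type.
payloadOffset :: forall a. Storable a => Foo a -> Int
payloadOffset _ =
  roundUp (alignment (undefined :: a)) (sizeOf (undefined :: Int))

instance Storable a => Storable (Foo a) where
  alignment _ =
    max (alignment (undefined :: Int)) (alignment (undefined :: a))
  -- Total size: payload offset plus payload size, padded to the alignment.
  sizeOf foo =
    roundUp (alignment foo) (payloadOffset foo + sizeOf (undefined :: a))
  peek p =
    Foo <$> peek (castPtr p)
        <*> peek (p `plusPtr` payloadOffset (undefined :: Foo a))
  poke p foo@(Foo n x) = do
    poke (castPtr p) n
    poke (p `plusPtr` payloadOffset foo) x
```

Under this layout `sizeOf (undefined :: Foo ())` equals `sizeOf (undefined :: Int)`, which is the "one word per Foo" in the message. If `()` occupied a byte, the same layout would pad each `Foo ()` to two words.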

Harendra Kumar

12:09 p.m.

It is hard to objectively or mathematically prove which option is better. As a library designer I do not like to think only from the perspective of my current problem at hand but in general what is the right thing. In my opinion, the right thing here is to have uniform semantics with a non-zero size for objects that are stored or retrieved from memory. If I were the owner of the base package I would do that. This optimization in my opinion is a micro-optimization which is irrelevant in the larger scheme of things. If someone wants to optimise for this case there could be ways to do that. But again it is subjective - this vs that.

It is a different discussion whether it is a good idea to change the instance because it might break things. This instance came into being in GHC 8.0. There must be a reason for that, maybe someone on this list knows. I wonder how many users there are.

-harendra

Brandon Allbery

12:15 p.m.

"Mathematically" doesn't seem the point of Storable; it's about exchanging values with foreign functions, which in practice means C. What C type does () correspond to? (It's not void, since that has no values.)

-- brandon s allbery kf8nh allbery.b@gmail.com

Sven Panne

12:18 p.m.

On Wed, 5 Jan 2022 at 13:11, Harendra Kumar <harendra.kumar@gmail.com> wrote:

...

[...] In my opinion, the right thing here is to have uniform semantics with a non-zero size for objects that are stored or retrieved from memory.

Which way is more uniform seems to depend on your POV, and base's choice was made a long time ago, so that ship has sailed...

...

If I were the owner of the base package I would do that.

Then I'm quite happy that you aren't. ;-) API breakages should have a *very* good reason, not just that it makes life easier for a single library.

...

This optimization in my opinion is a micro-optimization which is irrelevant in the larger scheme of things. If someone wants to optimise for this case there could be ways to do that. But again it is subjective - this vs that.

IIRC there were no deep thoughts or arguments about optimizations at all at that time, but I still find the Storable () instance OK in its current state today.

Harendra Kumar

12:29 p.m.

On Wed, 5 Jan 2022 at 17:48, Sven Panne wrote:

...

Then I'm quite happy that you aren't. ;-) API breakages should have a *very* good reason, not just that it makes life easier for a single library.

Sure. But I did not mean that I am hell-bent on changing it now; I think I said that changing it now is a different discussion.

-harendra

Tom Ellis

12:38 p.m.

On Wed, Jan 05, 2022 at 05:39:49PM +0530, Harendra Kumar wrote:

...

It is hard to objectively or mathematically prove which option is better.

Well, let's give it a go. One condition that instances might be required to satisfy is

    do { poke p a; peek p } == do { poke p a; evaluate a }

and evaluating these expressions should touch at most `sizeOf a` bytes starting from `p`. If further you impose the reasonable condition that `sizeOf a` should be as small as possible then this fixes the behaviour of the () instance. (Other reasonable conditions are available.)

The existing `Storable ()` instance does *not* satisfy this condition, because it is not sufficiently strict. That's orthogonal to Harendra's complaint though. It does satisfy the "as small as possible" part.

On the other hand, despite being no expert on Storable, it seems to me the class exists merely to conveniently read from and write to raw memory. The () instance is useless for this purpose so I'm not sure why it exists. If the purpose of the class were to be a general purpose serialisation API then the () instance *would* make sense, but then we'd also have an (a, b) instance too, which we don't.

If it were up to me I probably would not have allowed the `Storable ()` instance, and instead I would have designed a "serialise to a raw buffer" API *on top of* Storable (if that's what people really wanted).

Tom

(See https://www.stackage.org/haddock/lts-18.21/base-4.14.3.0/Foreign-Storable.ht...)
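The strictness point in Tom Ellis's message can be checked with a small experiment. The sketch below assumes the `Storable ()` instance quoted at the start of the thread (where `peek` and `poke` ignore their arguments); the names `pokeThenPeek` and `pokeThenEvaluate` are hypothetical and stand for the two sides of the proposed condition, instantiated with an undefined value of type ().

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (peek, poke)

-- Left-hand side of the proposed condition: poke, then peek the value back.
pokeThenPeek :: IO (Either ErrorCall ())
pokeThenPeek = try $ alloca $ \p -> do
  poke p (undefined :: ())
  r <- peek p
  evaluate r

-- Right-hand side: poke, then evaluate the value that was written.
pokeThenEvaluate :: IO (Either ErrorCall ())
pokeThenEvaluate = try $ alloca $ \p -> do
  poke p (undefined :: ())
  evaluate (undefined :: ())
```

With that instance, `pokeThenPeek` yields `Right ()` because neither `poke` nor `peek` forces the value, while `pokeThenEvaluate` yields a `Left` carrying the exception raised by `undefined`. The two sides differ, which is the sense in which the instance is "not sufficiently strict".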

Harendra Kumar

1:49 p.m.

On Wed, 5 Jan 2022 at 18:09, Tom Ellis wrote:

...

On the other hand, despite being no expert of Storable, it seems to me the class exists merely to conveniently read to and write from raw memory. The () instance is useless for this purpose so I'm not sure why it exists. If the purpose of the class were to be a general purpose serialisation API then the () instance *would* make sense, but then we'd also have an (a, b) instance too, which we don't.

Exactly!

Kim-Ee Yeoh

2:36 p.m.

On Wed, Jan 5, 2022 at 7:10 PM Harendra Kumar wrote:

...

It is hard to objectively or mathematically prove which option is better.

You’ve made good points in your posts.

It is worth keeping in mind that the number of bits needed to distinguish between n things is log2 n.

The log of 1 is 0.

...

-- Kim-Ee

Harendra Kumar

2:51 p.m.

On Wed, 5 Jan 2022 at 20:06, Kim-Ee Yeoh wrote:

...

You’ve made good points in your posts.

It is worth keeping in mind that the number of bits needed to distinguish between n things is log2 n.

The log of 1 is 0.

Exactly. And that is why we should not have a Storable instance for ().

-harendra

Kim-Ee Yeoh

3:23 p.m.

One way to make rapid progress is to adopt the approach that Henning took. Is Storable so large, or so frequently updated, a library that you cannot create your own variant of it?

-- Kim-Ee

Sven Panne

12:12 p.m.

On Wed, 5 Jan 2022 at 12:42, David Feuer wrote:

...

No. Consider a type like this:

data Foo a = Foo !Int !a

instance Storable a => Storable (Foo a) where ...

Now if a happens to be (), we pay only one word per Foo. [...]

This is exactly the kind of breakage I had in mind: with the proposed change, the storage layout would change, and the compiler wouldn't warn you about that at all. Note that I'm not arguing about memory efficiency; it's all about a subtle semantic change for the sake of a single library, changing something that has been in place for 10-20 years. That seems like an extremely bad move from the POV of the Haskell ecosystem: it's exactly this kind of ad hoc change that annoys people.
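To make the layout point concrete, here is one way the elided `Storable (Foo a)` instance could be written. It is a sketch, not David's or anyone else's actual code: the layout (the Int first, then the payload at the next suitably aligned offset) and the helpers `roundUp` and `payloadOffset` are assumptions made for illustration.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr, plusPtr)
import Foreign.Storable (Storable (..))

data Foo a = Foo !Int !a
  deriving (Eq, Show)

-- Round n up to the next multiple of a (a must be positive).
roundUp :: Int -> Int -> Int
roundUp n a = ((n + a - 1) `div` a) * a

-- Assumed layout: the payload sits just past the Int, padded to the
-- payload's alignment. The argument is used only for its type.
payloadOffset :: Storable a => a -> Int
payloadOffset x = roundUp (sizeOf (0 :: Int)) (alignment x)

instance Storable a => Storable (Foo a) where
  alignment _ = max (alignment (0 :: Int)) (alignment (undefined :: a))
  -- The total size depends directly on sizeOf of the payload type.
  sizeOf foo =
    roundUp (payloadOffset (undefined :: a) + sizeOf (undefined :: a))
            (alignment foo)
  peek p = do
    n <- peek (castPtr p)
    x <- peek (p `plusPtr` payloadOffset (undefined :: a))
    pure (Foo n x)
  poke p (Foo n x) = do
    poke (castPtr p) n
    poke (p `plusPtr` payloadOffset x) x

main :: IO ()
main = do
  -- With the current zero-sized () instance, Foo () costs one Int.
  print (sizeOf (undefined :: Foo ()) == sizeOf (0 :: Int))
  print (sizeOf (undefined :: Foo Int))
  alloca $ \p -> poke p (Foo 7 ()) >> peek p >>= print
```

Because `sizeOf (Foo ())` is computed from `sizeOf ()`, giving () a non-zero size would change this layout with no change to the instance and no compiler warning, which is the kind of silent breakage described above.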

Henning Thielemann

11:41 a.m.

On Wed, 5 Jan 2022, Harendra Kumar wrote:

...

If we do not want to consume space for () then better not have a Storable instance for it, leave it to the users. And if we really want to store it then let it behave in the same way as any other mere mortal storable would, so what if it takes up some space. Why not have simple, uniform semantics for all cases? I think saving that extra byte for this particular case is an overkill and leading to a bigger cost in applications without actually saving anything worthwhile.

Maybe the Storable class is not the right one for your application. I came to this conclusion for my llvm-tf package for exchanging data between LLVM code and Haskell. Today I distinguish between base:Storable and my own LLVM.Storable class. They share implementations for Word8, Int16 and so on, but they differ for Bool, for example (LLVM: one byte, base: four bytes), and LLVM.Storable supports tuples, whereas base:Storable does not.

One of my LLVM applications works this way: I have an Arrow with existentially quantified state. If an arrow does not need a state, the state type is (). If I combine two arrows with (***) then I need to bundle their states in a pair. Thus I need to store () and I need to store nested tuples where some members are (). () is mapped to LLVM's empty tuple {}, and {} consumes no space. By using (***), the state type grows, but the required memory may stay the same. That is, () consuming no space can be very useful.

However, that argument cannot be directly transferred to base:Storable, because base:Storable is not intended for tuples. If you find that your application needs tuples, then you had better define your own Storable class anyway.
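The idea of a separate class with tuple support and a zero-sized () can be sketched in a few lines. This is not the LLVM.Storable API from llvm-tf: the class `Store` and its methods `size`, `load` and `store` are hypothetical names, and the sketch uses only Word8 leaves with a packed layout so that alignment can be ignored.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, plusPtr)
import qualified Foreign.Storable as F

-- Hypothetical class for illustration; packed layout, no alignment handling.
class Store a where
  size  :: a -> Int                 -- must not inspect its argument
  load  :: Ptr Word8 -> IO a
  store :: Ptr Word8 -> a -> IO ()

-- The unit state occupies no space at all.
instance Store () where
  size _    = 0
  load _    = pure ()
  store _ _ = pure ()

instance Store Word8 where
  size _ = 1
  load   = F.peek
  store  = F.poke

-- Pairs are laid out back to back, so unit members add nothing.
instance (Store a, Store b) => Store (a, b) where
  size _ = size (undefined :: a) + size (undefined :: b)
  load p = do
    a <- load p
    b <- load (p `plusPtr` size (undefined :: a))
    pure (a, b)
  store p (a, b) = do
    store p a
    store (p `plusPtr` size (undefined :: a)) b

main :: IO ()
main = do
  -- A nested state type, as produced by combining arrows with (***).
  let st = (((), 7), ()) :: (((), Word8), ())
  print (size st)
  allocaBytes (size st) $ \p -> do
    store p st
    st' <- load p
    print (st' == st)
```

Here the nested state type needs only the single byte of its one Word8 member, which mirrors the observation that the state type can grow while the required memory stays the same.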

Henning Thielemann

10:04 a.m.

On Wed, 5 Jan 2022, Harendra Kumar wrote:

...

If you think this does not have a problem, can you suggest how to elegantly handle the array implementation problem as I described above?

In my package comfort-array (e.g. for Storable arrays) I use an array 'shape' to determine the size of the array: https://hackage.haskell.org/package/comfort-array
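The general idea of carrying the size alongside the buffer, rather than deriving it from the byte length, can be sketched as follows. This is not comfort-array's API: the type `Arr` and the functions `fromList` and `toList` are hypothetical, and a plain element count stands in for a richer shape type.

```haskell
module Main where

import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray, withForeignPtr)
import Foreign.Storable (Storable, peekElemOff, pokeElemOff)

-- Hypothetical array type: the element count is stored explicitly, so it
-- never has to be recovered as byteLength `div` sizeOf.
data Arr a = Arr
  { arrLength :: !Int
  , arrData   :: !(ForeignPtr a)
  }

fromList :: Storable a => [a] -> IO (Arr a)
fromList xs = do
  let n = length xs
  fp <- mallocForeignPtrArray n  -- allocates n * sizeOf bytes, possibly 0
  withForeignPtr fp $ \p ->
    mapM_ (uncurry (pokeElemOff p)) (zip [0 ..] xs)
  pure (Arr n fp)

toList :: Storable a => Arr a -> IO [a]
toList (Arr n fp) =
  withForeignPtr fp $ \p -> mapM (peekElemOff p) [0 .. n - 1]

main :: IO ()
main = do
  units <- fromList [(), (), ()]
  print (arrLength units)        -- the length survives a zero-byte buffer
  toList units >>= print
  ints <- fromList [10, 20, 30 :: Int]
  toList ints >>= print
```

With the length stored next to the buffer, zero-sized elements need no special case: no code path divides by `sizeOf`.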

Sven Panne

10:13 a.m.

On Wed, 5 Jan 2022 at 09:01, Harendra Kumar <harendra.kumar@gmail.com> wrote:

...

[...] The size of () is defined as 0. It sounds absurd for a Storable to have a size of 0?

This is not absurd at all, there is absolutely no information to be stored. Everything one needs to know is in the type here.

...

This means that we can read an infinite number of () type values out of nothing (no memory location required) or store an infinite number of () type values without even requiring a memory location to write to.

Exactly.

...

[...] Can this be fixed? Is there a compelling argument to keep it like this? [...]

There is nothing to be fixed on the Storable side of things; the fix needs to be in your code, as David has already mentioned. In addition, I would *strongly* advise leaving the Storable () instance as-is: I'm quite sure that otherwise tons of code will break in mysterious ways, undetected by any compiler.

Cheers,
   S.

Matthew Pickering

10:26 a.m.

I agree with the other replies to this thread; I am replying only to point out that the Binary instance for () is the same.

On Wed, Jan 5, 2022 at 8:01 AM Harendra Kumar wrote:

...

The Storable instance of () is defined in the "Foreign.Storable" module of the "base" package as follows:

    instance Storable () where
        sizeOf _ = 0
        alignment _ = 1
        peek _ = return ()
        poke _ _ = return ()

The size of () is defined as 0. It sounds absurd for a Storable to have a size of 0? This means that we can read an infinite number of () type values out of nothing (no memory location required) or store an infinite number of () type values without even requiring a memory location to write to.

This is causing a practical problem in our Storable array implementation. The array is constrained to a Storable type. Since () has a Storable instance, one can store () in the Storable array. But it causes a problem because we determine the array element size using sizeOf on the type. For () type it turns out to be 0. Essentially, the array of () would always be of size 0. Now, we cannot determine the length of the array from its byte length as you could store infinite such elements in an empty array. The Storable instance of () seems to be an oddity and makes us use a special case everywhere in the code to handle this, and this special casing makes it highly prone to errors when we change code.

Can this be fixed? Is there a compelling argument to keep it like this? A possible fix could be to represent it by a single byte in memory which can be discarded when reading or writing. Another alternative is to not provide a Storable instance for it at all. Let the users write their own if they need it.

If you think this does not have a problem, can you suggest how to elegantly handle the array implementation problem as I described above?

Thanks,
Harendra

_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Georgi Lyubenov

10:39 a.m.

I have an additional question:

It is true that in a strict/unboxed language, the type of () is sufficient to reproduce its value. However, here, trying to store undefined :: () is no different from trying to store () :: (). Is this difference in behaviour with other instances of Storable (where presumably trying to store undefined will blow up, as there is indeed some work to do there) intentionally ignored?

On Wed, Jan 5, 2022 at 12:26 PM Matthew Pickering <matthewtpickering@gmail.com> wrote:

...

I agree with the other replies to this thread, I just reply to point out the Binary instance for () is the same.


David Feuer

10:50 a.m.

It is a bit peculiar that that would be so. Maybe there's some efficiency reason, but it doesn't seem very strongly motivated.

On Wed, Jan 5, 2022 at 5:40 AM Georgi Lyubenov wrote:

...

I have an additional question:

It is true that in a strict/unboxed language, the type of () is sufficient to reproduce its value. However, here, trying to store undefined :: () is no different from trying to store () :: (). Is this difference in behaviour with other instances of Storable (where presumably trying to store undefined will blow up, as there is indeed some work to do there) intentionally ignored?


Sven Panne

10:51 a.m.

On Wed, 5 Jan 2022 at 11:40, Georgi Lyubenov <godzbanebane@gmail.com> wrote:

...

[...] However, here, trying to store undefined :: () is no different from trying to store () :: (). Is this difference in behaviour with other instances of Storable (where presumably trying to store undefined will blow up, as there is indeed some work to do there) intentionally ignored?

Good point, this might be seen as a bug/inconsistency of Storable (): peek and poke are not strict in their arguments, while probably all(?) other instances are. But https://www.haskell.org/onlinereport/haskell2010/haskellch37.html doesn't require this, so I would be reluctant to really call this a bug. Changing this can have "interesting" effects on the ecosystem, too, who knows? Again, this would be a change where the compiler doesn't help you.

Cheers,
   S.

Roman Cheplyaka

11:05 a.m.

On 05/01/2022 12.39, Georgi Lyubenov wrote:

...

I have an additional question:

It is true that in a strict/unboxed language, the type of () is sufficient to reproduce its value. However, here, trying to store undefined :: () is no different from trying to store () :: (). Is this difference in behaviour with other instances of Storable (where presumably trying to store undefined will blow up, as there is indeed some work to do there) intentionally ignored?

If you look at it from the strict-by-default point of view, it does appear inconsistent with the other instances.

However, if you look at it from the non-strict-by-default point of view, which is arguably more native to Haskell, then all instances follow the same principle: they are maximally non-strict. It's just when you store anything non-trivial, you are forced to be strict in order to fulfill the task, but you never add any gratuitous strictness.

Roman
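The strictness difference under discussion can be observed directly. The following sketch assumes the `Storable ()` instance as quoted in this thread (`poke _ _ = return ()`); `pokeSucceeds` is a hypothetical helper that reports whether an action finished without raising an exception.

```haskell
module Main where

import Control.Exception (SomeException, try)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (poke)

-- | Hypothetical helper: True when the action completed without an exception.
pokeSucceeds :: IO () -> IO Bool
pokeSucceeds act = do
  r <- try act :: IO (Either SomeException ())
  pure (either (const False) (const True) r)

main :: IO ()
main = do
  -- Writing an Int has to force the value, so poking undefined throws.
  intOk <- pokeSucceeds (alloca $ \p -> poke (p :: Ptr Int) undefined)
  -- With the instance quoted in this thread, the value is never inspected.
  unitOk <- pokeSucceeds (alloca $ \p -> poke (p :: Ptr ()) undefined)
  putStrLn ("poke undefined :: Int completed: " ++ show intOk)
  putStrLn ("poke undefined :: ()  completed: " ++ show unitOk)
```

Under that assumption the Int case fails while the () case completes, which is the behaviour Georgi asks about and which Roman describes as maximal non-strictness rather than an inconsistency.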

Oleg Grenrus

1:42 p.m.

https://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/Empty-Structures.html#Empty-Str...

...

GCC permits a C structure to have no members:

struct empty { };

...

The structure will have size zero. In C++, empty structures are part of the language. G++ treats empty structures as if they had a single member of type char.

So either GCC is absurd, or it isn't.

Standard C also supports zero-width bit-fields (to force alignment). Whether or not to use () for that when modelling C structs with Storable is another question, but it could be useful.

https://stackoverflow.com/questions/13802728/what-is-zero-width-bit-field

- Oleg

Fumiaki Kinoshita

1:51 p.m.

I'm the author of the instance [1]. One of my libraries uses Storable vectors to represent buffers of audio samples [2]. When the user doesn't need input, they leave the type of the input vector as `Vector ()`, which is totally valid and reasonable under the constraint that the input buffer and the output buffer have the same size.

...

This means that we can read an infinite number of () type values out of nothing (no memory location required) or store an infinite number of () type values without even requiring a memory location to write to.

Yes, we can read an infinite number of () without reading anything in memory. That's intended, and it is less arbitrary than defining () as a single-byte object.

Perhaps you may want to reconsider the design of your array implementation before roasting this instance as "broken" and "absurd".

[1] https://gitlab.haskell.org/ghc/ghc/-/commit/97843d0b10cac3912a85329ebcb8ed1a...
[2] https://github.com/fumieval/bindings-portaudio/blob/4e49c50d19d141062e7a75a5...
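Fumiaki's point that any number of () values can be written and read without touching memory can be illustrated with the array marshalling functions from base. This is a small sketch rather than code from bindings-portaudio; the one-byte scratch buffer and the count of 1000 are arbitrary choices for the illustration.

```haskell
module Main where

import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Ptr (Ptr, castPtr)

main :: IO ()
main = allocaBytes 1 $ \buf -> do
  -- View a one-byte scratch buffer as a buffer of ().
  let p = castPtr (buf :: Ptr Word8) :: Ptr ()
  -- "Store" a thousand units: every element has size 0, so nothing is written.
  pokeArray p (replicate 1000 ())
  -- "Read" them back: the count comes from the caller, not from the buffer.
  units <- peekArray 1000 p
  print (length units)
```

The element count is supplied by the caller in both directions, which is why a `Vector ()` whose length matches the output buffer is usable even though it owns no bytes.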

Harendra Kumar

2:38 p.m.

On Wed, 5 Jan 2022 at 19:22, Fumiaki Kinoshita wrote:

...

Perhaps you may want to reconsider the design of your array implementation before roasting this instance as "broken" and "absurd".

Fumiaki, thanks for the pointers to the origin of the instance. My intention was not to roast, but just to have a discussion to find out the reasoning in favour and against. I am sorry if it sounded like roasting.

-harendra

24 comments

12 participants

participants (12)

Brandon Allbery
David Feuer
Fumiaki Kinoshita
Georgi Lyubenov
Harendra Kumar
Henning Thielemann
Kim-Ee Yeoh
Matthew Pickering
Oleg Grenrus
Roman Cheplyaka
Sven Panne
Tom Ellis