proposal/RFC: add bSwap to base in Data.Bits

Hi,

I'm trying to expose some byte swapping (bSwap) capabilities in base, namely operations to swap the endianness of word{16,32,64}, i.e.:

    bswap16 :: Word16 -> Word16
    bswap16 a = (a `shiftR` 8) .|. (a `shiftL` 8)
    ...

I'd like to propose extending Bits to do so generically:

    class Bits a where
        ...
        bSwap :: a -> a

I'm attaching a patch that does as explained and adds a default implementation for bSwap, so that compatibility is assured for existing instances.

-- Vincent
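The proposal spells out only the 16-bit case and elides the rest. A minimal sketch of how the same shift-and-or idea might extend to 32 and 64 bits follows. The names swap16, swap32 and swap64 are hypothetical and chosen only for this illustration; they are not taken from the attached patch.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word16, Word32, Word64)

-- Hypothetical names (swap16, swap32, swap64), used only in this sketch.

-- Swap the two bytes of a 16-bit word. shiftL on Word16 discards the
-- bits that leave the top, so no extra masking is needed.
swap16 :: Word16 -> Word16
swap16 w = (w `shiftR` 8) .|. (w `shiftL` 8)

-- Reverse the four bytes of a 32-bit word.
swap32 :: Word32 -> Word32
swap32 w =
      (w `shiftL` 24)
  .|. ((w .&. 0x0000ff00) `shiftL` 8)
  .|. ((w `shiftR` 8) .&. 0x0000ff00)
  .|. (w `shiftR` 24)

-- Reverse the eight bytes of a 64-bit word by byte-swapping each
-- 32-bit half and exchanging the two halves.
swap64 :: Word64 -> Word64
swap64 w = (lo' `shiftL` 32) .|. hi'
  where
    lo', hi' :: Word64
    lo' = fromIntegral (swap32 (fromIntegral w))
    hi' = fromIntegral (swap32 (fromIntegral (w `shiftR` 32)))
```

For instance, swap32 0x11223344 evaluates to 0x44332211.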

On Thu, 16 May 2013, Vincent Hanquez wrote:
> I'm trying to expose some byte swapping (bSwap) capabilities in base, namely operations to swap the endianness of word{16,32,64}, i.e.:
>
>     bswap16 :: Word16 -> Word16
>     bswap16 a = (a `shiftR` 8) .|. (a `shiftL` 8)
>     ...

If at all, I'd suggest a name without the 'b', since the other functions like 'shift' do not contain a 'b' as well. You can use qualification if you want to say that the swap is meant with respect to 'bits'.

I would be strongly against naming it 'swap'. Lots of people in the community do not subscribe to the philosophy that all things should be imported qualified.

'byteSwap' or even 'byteswap' would be closer to the traditional 'bswap' name in spirit.

-Edward

On Thu, May 16, 2013 at 4:24 AM, Henning Thielemann <lemming@henning-thielemann.de> wrote:
> On Thu, 16 May 2013, Vincent Hanquez wrote:
> > I'm trying to expose some byte swapping (bSwap) capabilities in base,
> > namely operations to swap the endianness of word{16,32,64}, i.e.:
> >
> >     bswap16 :: Word16 -> Word16
> >     bswap16 a = (a `shiftR` 8) .|. (a `shiftL` 8)
> >     ...
>
> If at all, I'd suggest a name without the 'b', since the other functions like 'shift' do not contain a 'b' as well. You can use qualification if you want to say that the swap is meant with respect to 'bits'.
>
> _______________________________________________
> Libraries mailing list
> Libraries@haskell.org
> http://www.haskell.org/mailman/listinfo/libraries

On Thu, May 16, 2013 at 04:32:41AM -0400, Edward Kmett wrote:
> I would be strongly against naming it 'swap'. Lots of people in the community do not subscribe to the philosophy that all things should be imported qualified.
> 'byteSwap' or even 'byteswap' would be closer to the traditional 'bswap' name in spirit.

I thought about byteswap too, and i think it's a fine choice (although my preference still goes to bswap for now). If a majority prefers byteswap to bswap, i'm happy to change it.

-- Vincent

I'd strongly prefer "byteSwap". We already have shiftR, shiftL instead
of sh[a]r, shl; I see no reason to stick to assembly convention here.
Also "bSwap" still wouldn't make it clear that it's _byte_ swap rather
than _bit_ swap.

On 16 May 2013 10:55, Vincent Hanquez wrote:
> On Thu, May 16, 2013 at 04:32:41AM -0400, Edward Kmett wrote:
> > I would be strongly against naming it 'swap'. Lots of people in the community do not subscribe to the philosophy that all things should be imported qualified.
> > 'byteSwap' or even 'byteswap' would be closer to the traditional 'bswap' name in spirit.
>
> I thought about byteswap too, and i think it's a fine choice (although my preference still goes on bswap for now). If a majority prefer byteswap to bswap, i'm happy to change it.
>
> -- Vincent

On Thu, May 16, 2013 at 11:00:11AM +0200, Thomas Schilling wrote:
> I'd strongly prefer "byteSwap". We already have shiftR, shiftL instead of sh[a]r, shl; I see no reason to stick to assembly convention here. Also "bSwap" still wouldn't make it clear that it's _byte_ swap rather than _bit_ swap.

well, it does if you have x86 assembly knowledge :-) But in any case i'm fine with byteSwap.

-- Vincent

On Thu, May 16, 2013 at 10:24:31AM +0200, Henning Thielemann wrote:
> On Thu, 16 May 2013, Vincent Hanquez wrote:
> > I'm trying to expose some byte swapping (bSwap) capabilities in base, namely operations to swap the endianness of word{16,32,64}, i.e.:
> >
> >     bswap16 :: Word16 -> Word16
> >     bswap16 a = (a `shiftR` 8) .|. (a `shiftL` 8)
> >     ...
>
> If at all, I'd suggest a name without the 'b', since the other functions like 'shift' do not contain a 'b' as well.

apologies if i misunderstood you, but i think that's the whole point. the b is there because it's different from shift. shift works on bits, and bswap works on bytes. I don't think swap is a good name, and also Data.Tuple already has a swap that does what i think 'swap' should do.

-- Vincent

On Thu, 16 May 2013, Vincent Hanquez wrote:
> On Thu, May 16, 2013 at 10:24:31AM +0200, Henning Thielemann wrote:
> > If at all, I'd suggest a name without the 'b', since the other functions like 'shift' do not contain a 'b' as well.
>
> apologies if i misunderstood you, but i think that's the whole point. the b is there because it's different than shift. shift works on bits, and bswap works on bytes.

If it would work on bits, I would certainly not call it swap, but 'reverse'. On the MC68000 processor there was an assembly instruction "swap" that swapped upper and lower 16 bits of a 32 bit word.

However if you really only want to swap byte order (and not 16 bit words within 64 bit words and so on), then how about just using a package like 'endian':

http://hackage.haskell.org/packages/archive/data-endian/0.0.1/doc/html/Data-...

?
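To make the naming distinction above concrete, here is a small sketch of the two other operations mentioned: exchanging the 16-bit halves of a 32-bit word (what the MC68000 "swap" is described as doing) and reversing individual bits. The names swapHalves and reverseBits are hypothetical, and the bit reversal is deliberately naive.

```haskell
import Data.Bits (rotateL, setBit, testBit)
import Data.Word (Word32)

-- Hypothetical name: exchange the upper and lower 16 bits of a 32-bit
-- word. Rotating by half the width does exactly that.
swapHalves :: Word32 -> Word32
swapHalves w = w `rotateL` 16

-- Hypothetical name: reverse the order of all 32 bits, one at a time.
-- Bit i of the input becomes bit (31 - i) of the result.
reverseBits :: Word32 -> Word32
reverseBits w = foldl step 0 [0 .. 31]
  where
    step acc i
      | testBit w i = setBit acc (31 - i)
      | otherwise   = acc
```

Neither of these is a byte swap: swapHalves 0x12345678 gives 0x56781234, whereas reversing the byte order of the same value would give 0x78563412.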

Henning has a point.

EndianSensitive is arguably the more appropriate notion.

What does it mean to 'byteSwap' an 'Integer'? Or a bit vector of length n?

-Edward

On Thu, May 16, 2013 at 4:59 AM, Henning Thielemann <lemming@henning-thielemann.de> wrote:
> On Thu, 16 May 2013, Vincent Hanquez wrote:
> > On Thu, May 16, 2013 at 10:24:31AM +0200, Henning Thielemann wrote:
> > > If at all, I'd suggest a name without the 'b', since the other functions like 'shift' do not contain a 'b' as well.
> >
> > apologies if i misunderstood you, but i think that's the whole point. the b is there because it's different than shift. shift works on bits, and bswap works on bytes.
>
> If it would work on bits, I would certainly not call it swap, but 'reverse'. On the MC68000 processor there was an assembly instruction "swap" that swapped upper and lower 16 bits of a 32 bit word.
>
> However if you really only want to swap byte order (and not 16 bit words within 64 bit words and so on), then how about just using a package like 'endian':
>
> http://hackage.haskell.org/packages/archive/data-endian/0.0.1/doc/html/Data-Endian.html
>
> ?

On Thu, May 16, 2013 at 05:07:51AM -0400, Edward Kmett wrote:
> Henning has a point.
> EndianSensitive is arguably the more appropriate notion.

Yes, Bits is not necessarily the best fit in terms of naming or features, but nothing close to EndianSensitive is in base.

> What does it mean to 'byteSwap' an 'Integer'? Or a bit vector of length n?

It would mean the same as byteswapping a Word32/Word64: the 1st byte would be at the end, the second byte ...

+bSwapDefault :: (Bits a, Num a) => a -> a
+bSwapDefault = go 0
+  where
+    go !c 0 = c
+    go c w  = go ((c `unsafeShiftL` 8) .|. (w .&. 0xff)) (w `unsafeShiftR` 8)

It doesn't necessarily make complete sense to byteswap arbitrary Bits types (either for performance reasons, as with Integer, or because the size is not a multiple of 8 bits); however, it's possible to come up with a definition that somewhat makes sense generically for any Bits type.

-- Vincent

On Thu, May 16, 2013 at 11:36:46AM +0200, Vincent Hanquez wrote:
> On Thu, May 16, 2013 at 05:07:51AM -0400, Edward Kmett wrote:
> > Henning has a point.
> > EndianSensitive is arguably the more appropriate notion.
>
> Yes, Bits is not necessarily the best fit in terms of naming or features, but nothing close to EndianSensitive is in base.

There's no reason that the recommended interface to the primitives needs to be in base. It would be a little unfriendly to only export the primitives, but we could just export

    byteSwap16 :: Word16 -> Word16
    byteSwap32 :: Word32 -> Word32
    byteSwap64 :: Word64 -> Word64

and leave it up to packages like data-endian to provide a more user-friendly interface.

The problem with adding a Bytes class is that there are various questions, like "should there be an instance Bytes a => Bytes [a]?", "should there be a toWord8s :: a -> [Word8] method?", and "would Word8s be a better name for the class?". If the class goes into base, then it's a lot harder to change the answers to these questions (and, indeed, to the questions that we don't think to ask before the class is in a released GHC).

I think that adding the functions above, probably to Data.Word, would be my preference.

Oh, and for the record, I also prefer "byteSwap" to "bswap", and dislike using (Finite)Bits for this.

Thanks
Ian
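The split suggested above, with monomorphic primitives in base and a friendlier interface in a separate package, might look roughly like the sketch below. It assumes a base that exports byteSwap16, byteSwap32 and byteSwap64 from Data.Word (base 4.7 and later do). The class ByteSwap and its method swapBytes are hypothetical names and do not reflect the actual API of data-endian or any other package.

```haskell
import Data.Word (Word8, Word16, Word32, Word64,
                  byteSwap16, byteSwap32, byteSwap64)

-- Hypothetical class, standing in for whatever interface a downstream
-- package might choose to layer over the primitives.
class ByteSwap a where
  swapBytes :: a -> a

-- A single byte has nothing to reorder.
instance ByteSwap Word8  where swapBytes = id

-- The wider instances simply delegate to the monomorphic primitives.
instance ByteSwap Word16 where swapBytes = byteSwap16
instance ByteSwap Word32 where swapBytes = byteSwap32
instance ByteSwap Word64 where swapBytes = byteSwap64
```

Keeping the class outside base leaves its design questions (instances for lists, extra methods, naming) open to revision without waiting for a GHC release, which is the concern raised above.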

I agree with every one of Ian's points here.
Just bolting the functions in non-overloaded in Data.Word would be a useful
starting point and side-steps concerns of what it means when you have
(Finite)Bits instances that don't evenly divide into bytes, etc.

On Fri, May 17, 2013 at 2:37 PM, Ian Lynagh wrote:
> On Thu, May 16, 2013 at 11:36:46AM +0200, Vincent Hanquez wrote:
> > On Thu, May 16, 2013 at 05:07:51AM -0400, Edward Kmett wrote:
> > > Henning has a point.
> > > EndianSensitive is arguably the more appropriate notion.
> >
> > Yes, Bits is not necessarily the best fit in terms of naming or features, but nothing close to EndianSensitive is in base.
>
> There's no reason that the recommended interface to the primitives needs to be in base. It would be a little unfriendly to only export the primitives, but we could just export
>
>     byteSwap16 :: Word16 -> Word16
>     byteSwap32 :: Word32 -> Word32
>     byteSwap64 :: Word64 -> Word64
>
> and leave it up to packages like data-endian to provide a more user-friendly interface.
>
> The problem with adding a Bytes class is that there are various questions, like "should there be an instance Bytes a => Bytes [a]?", "should there be a toWord8s :: a -> [Word8] method?", and "would Word8s be a better name for the class?". If the class goes into base, then it's a lot harder to change the answers to these questions (and, indeed, to the questions that we don't think to ask before the class is in a released GHC).
>
> I think that adding the functions above, probably to Data.Word, would be my preference.
>
> Oh, and for the record, I also prefer "byteSwap" to "bswap", and dislike using (Finite)Bits for this.
>
> Thanks
> Ian

On 05/17/2013 07:37 PM, Ian Lynagh wrote:
> I think that adding the functions above, probably to Data.Word, would be my preference.

Well ok. that's much better for me, and personally that's the only thing i want exposed. somehow, i imagined that it would have better acceptance as something generic.

> Oh, and for the record, I also prefer "byteSwap" to "bswap", and dislike using (Finite)Bits for this.

yes, i've already renamed locally to byteSwap. I'll reshape the proposal one more time, instead of using FiniteBits then.

Thanks,
-- Vincent

On Thu, May 16, 2013 at 10:59:03AM +0200, Henning Thielemann wrote:
> If it would work on bits, I would certainly not call it swap, but 'reverse'. On the MC68000 processor there was an assembly instruction "swap" that swapped upper and lower 16 bits of a 32 bit word.
>
> However if you really only want to swap byte order (and not 16 bit words within 64 bit words and so on), then how about just using a package like 'endian':
>
> http://hackage.haskell.org/packages/archive/data-endian/0.0.1/doc/html/Data-...

I'm sorry, i completely forgot to put the rationale for this in the original proposal:

http://hackage.haskell.org/trac/ghc/ticket/7902

The point of this exercise is to expose some new primops that swap bytes efficiently (by generating the proper assembly code), rather than to use an existing package.

-- Vincent

On 2013-05-16 08:59:28 +0200, Vincent Hanquez wrote:
> I'd like to propose extending Bits to do so generically:
>
>     class Bits a where
>         ...
>         bSwap :: a -> a
>
> I'm attaching a patch that does as explained and adds a default implementation for bSwap, so that compatibility is assured for existing instances.

I'm not against providing "generic" byteswap primitives, but I'm skeptical about adding a *byte*-swap to the 'Bits' typeclass, whose declared scope is, according to the documentation:

 | The Bits class defines bitwise operations over integral types.

And I haven't seen any operation relying on the concept of bytes in 'Bits'.

Wouldn't it be more logical to add a new typeclass depending on the byte-notion such as e.g.

    class Bits a => Bytes a where ...

which could then provide byte-operations such as little/big endian swaps?

cheers,
hvr

On Thu, 16 May 2013, Herbert Valerio Riedel wrote:
> Wouldn't it be more logical to add a new typeclass depending on the byte-notion such as e.g.
>
>     class Bits a => Bytes a where ...
>
> which could then provide byte-operations such as little/big endian swaps?

I would prefer that way.

On 05/17/2013 01:28 PM, Henning Thielemann wrote:
> On Thu, 16 May 2013, Herbert Valerio Riedel wrote:
> > Wouldn't it be more logical to add a new typeclass depending on the byte-notion such as e.g.
> >
> >     class Bits a => Bytes a where ...
> >
> > which could then provide byte-operations such as little/big endian swaps?
>
> I would prefer that way.

+1, this solution seems more reasonable to me too.

Dear Vincent,

thanks for working on this proposal.

> -- | Default implementation for 'bSwap'
> --
> -- This implementation is intentionally naive. Instances are expected to provide
> -- an optimized implementation for their size.
> bSwapDefault :: (Bits a, Num a) => a -> a
> bSwapDefault = go 0
>   where
>     go !c 0 = c
>     go c w  = go ((c `unsafeShiftL` 8) .|. (w .&. 0xff)) (w `unsafeShiftR` 8)

This can't be right, since it ignores the bitSize. In fact, from

    bSwapDefault 1 = 1

one could conclude that the datatype is 1 byte wide.

    bSwapDefault (bSwapDefault 256) = 1

shows that bSwapDefault is not its own inverse.

(As with bitSize, I don't think that there is a sensible implementation of bSwap for Integer)

I agree with previous posters that 'bSwap' is a bad name; 'byteSwap' seems better.

Cheers,
Bertram
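The two counterexamples above can be checked directly. The sketch below reproduces the quoted default with the strictness annotation dropped (an editorial simplification that does not change the results) and instantiates it at Word32. The helper roundTrips is a hypothetical name added for illustration.

```haskell
import Data.Bits (Bits, unsafeShiftL, unsafeShiftR, (.&.), (.|.))
import Data.Word (Word32)

-- The quoted default, minus the bang pattern. It stops as soon as the
-- remaining input is zero, so leading zero bytes are never emitted.
bSwapDefault :: (Bits a, Num a) => a -> a
bSwapDefault = go 0
  where
    go c 0 = c
    go c w = go ((c `unsafeShiftL` 8) .|. (w .&. 0xff)) (w `unsafeShiftR` 8)

-- Hypothetical helper: does swapping twice give back the input?
roundTrips :: Word32 -> Bool
roundTrips w = bSwapDefault (bSwapDefault w) == w
```

At Word32, bSwapDefault 1 and bSwapDefault 256 both evaluate to 1, so roundTrips 256 is False. An input whose bytes are all nonzero, such as 0x11223344, does round-trip, which is why the problem only shows up for values with zero bytes at the top.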

On Thu, May 16, 2013 at 03:24:53PM +0200, Bertram Felgenhauer wrote:
> Dear Vincent,
> thanks for working on this proposal.

You're welcome.

> This can't be right, since it ignores the bitSize. In fact, from bSwapDefault 1 = 1 one could conclude that the datatype is 1 byte wide. bSwapDefault (bSwapDefault 256) = 1 shows that bSwapDefault is not its own inverse.

You're absolutely right. not sure what i was thinking when i did the patch to base after thinking it was not possible to do it without a fixed size. The function above just falls apart if the "highest" bit is not set. I'll rework this to depend on FiniteBits.

-- Vincent
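One way the FiniteBits-based rework mentioned above could look is sketched here. This is not the author's patch: byteSwapFinite and swapWord32 are hypothetical names, the sketch assumes a base that provides FiniteBits and finiteBitSize (base 4.7 and later), and it assumes the bit size is a multiple of 8.

```haskell
import Data.Bits (FiniteBits, finiteBitSize, shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word32)

-- Hypothetical generic byte swap. Unlike a loop that stops when the
-- input reaches zero, this one always moves exactly (size / 8) bytes,
-- so zero bytes at the top of the input are preserved.
-- Assumption: finiteBitSize is a multiple of 8.
byteSwapFinite :: (FiniteBits a, Num a) => a -> a
byteSwapFinite w = go 0 w (finiteBitSize w `div` 8)
  where
    go acc _ 0 = acc
    go acc x n = go ((acc `shiftL` 8) .|. (x .&. 0xff)) (x `shiftR` 8) (n - 1)

-- The generic function used at one concrete type.
swapWord32 :: Word32 -> Word32
swapWord32 = byteSwapFinite
```

With the width taken into account, swapWord32 1 is 0x01000000 and swapWord32 (swapWord32 256) is 256 again, so the function is its own inverse on Word32. It remains a naive loop; an instance backed by a primop would be expected to be much faster.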
participants (8)
- Bertram Felgenhauer
- Edward Kmett
- Henning Thielemann
- Herbert Valerio Riedel
- Ian Lynagh
- Petr Pudlák
- Thomas Schilling
- Vincent Hanquez