
Hi all, I'm doing some work with ASN.1, and I'm thinking about how to represent the type "BIT STRING". The obvious, and inefficient representation would be
type BitString = [Bool]
A reasonable sparse representation might be
type BitString = [Integer] where the integers represent the positions of the non-zero bits, but other implementations (e.g. C & C++ based ones) mostly seem to use dense representations, and there are certain predictability advantages to be had by following existing implementations in this regard.
But given that a value of type BIT STRING is allowed to have leading 0 bits, which have no semantic effect, another representation would be
type BitString = [Word64] (I'm a modern type, and believe in 64bit computing ;-), but you could use Word32 if you liked).
However, after a few moments' consideration, I realized, that if I was going to use a representation like that, I could probably use
type BitString = Integer which already has [I assume] a carefully optimized implementation. Also, it ends up being significantly more strict than a list representation, which is probably a good thing in most situations.
My question for all present is: Have I missed either a problem with using Integer, or have I overlooked a better representation? T.

It's hard to tell what the best representation is if you don't know what the operations that you are going to perform are. If all you are going to do is I/O of bitstrings, then [Bool] could be great. If you need to do bitwise boolean ops then Integer is a wise choice. -- Lennart On Sep 14, 2006, at 21:35 , Thomas Conway wrote:
Hi all,
I'm doing some work with ASN.1, and I'm thinking about how to represent the type "BIT STRING".
The obvious, and inefficient representation would be
type BitString = [Bool]
A reasonable sparse representation might be
type BitString = [Integer] where the integers represent the positions of the non-zero bits, but other implementations (e.g. C & C++ based ones) mostly seem to use dense representations, and there are certain predictability advantages to be had by following existing implementations in this regard.
But given that a value of type BIT STRING is allowed to have leading 0 bits, which have no semantic effect, another representation would be
type BitString = [Word64] (I'm a modern type, and believe in 64bit computing ;-), but you could use Word32 if you liked).
However, after a few moments' consideration, I realized, that if I was going to use a representation like that, I could probably use
type BitString = Integer which already has [I assume] a carefully optimized implementation. Also, it ends up being significantly more strict than a list representation, which is probably a good thing in most situations.
My question for all present is: Have I missed either a problem with using Integer, or have I overlooked a better representation?
T. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On 9/15/06, Lennart Augustsson
It's hard to tell what the best representation is if you don't know what the operations that you are going to perform are. If all you are going to do is I/O of bitstrings, then [Bool] could be great. If you need to do bitwise boolean ops then Integer is a wise choice.
And is I/O of bitstring represented as Integer going to be significantly worse [Bool]? It is certainly a much denser representation. Actually the main operations are probably tests (is a certain bit set), and encoding and decoding (to and from BER, PER, etc). The operations that need to be efficient are: -- test to see if the Nth bit is set testBit :: BitString -> Integer -> Bool -- set the Nth bit setBit :: BitString -> Integer -> BitString -- clear the Nth bit clearBit :: BitString -> Integer -> BitString I can see the potential for creating large intermediate Integers (i.e. 1 `shift` n), if the number of bits is large. T.

On Fri, Sep 15, 2006 at 11:35:45AM +1000, Thomas Conway wrote:
My question for all present is: Have I missed either a problem with using Integer, or have I overlooked a better representation?
Consider also (UArray Int Bool). In GHC it has an efficient implementation. Best regards Tomasz

tomasz.zielonka:
On Fri, Sep 15, 2006 at 11:35:45AM +1000, Thomas Conway wrote:
My question for all present is: Have I missed either a problem with using Integer, or have I overlooked a better representation?
Consider also (UArray Int Bool). In GHC it has an efficient implementation.
A _very_ efficient implementation: http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsieve&lang=all :) -- Don

On Fri, 15 Sep 2006, Thomas Conway wrote:
My question for all present is: Have I missed either a problem with using Integer, or have I overlooked a better representation?
With my Modula background, where a machine-oriented SET type is available, I consider using integers as bit sets (as in Data.Bits) as a hack. Assuming that integers are stored by binary numbers might be true for all todays computer systems, but I wouldn't count on that. According to its interface, Data.IntSet might be an option for you. http://haskell.org/ghc/docs/latest/html/libraries/base/Data-IntSet.html
participants (5)
-
dons@cse.unsw.edu.au
-
Henning Thielemann
-
Lennart Augustsson
-
Thomas Conway
-
Tomasz Zielonka