
Hi,

I'd like to be able to use Data.Binary (or similar) for compression. Say I have an abstract type Symbol, and for each value of Symbol I have a representation in terms of some number of bits. For compression to be efficient, commonly used Symbols should have very short representations, while less common ones can be longer.

Since an encoding like [Bool] would be really inefficient for this (at least I think it would, though some fancy fusion tricks might help), I was thinking a reasonable approach might be to use a Word8 (for example), paired with a number of bits n indicating that only the first n bits are to be written to the compressed representation. Looking at the internals of Data.Binary, it seems PutM could be used for this purpose (was something like this its original purpose?). Today, I put this together:

------------------------------------
import Data.Binary.Put
import Data.Bits
import Data.Word

type BitRep = Word8
type NBits  = Int
type MyBits = (BitRep, NBits)

-- Append the bits of the right argument to those of the left,
-- emitting a byte whenever one fills up. Bits are kept
-- left-aligned (most significant first) within the Word8;
-- this assumes n <= 8 and that unused low-order bits are zero.
(#) :: MyBits -> MyBits -> PutM MyBits
(a, m) # (b, n) =
  case (a .|. (b `shiftR` m), m + n) of
    ans@(ab, s) ->
      if s < 8
        then return ans
        else putWord8 ab >> return (b `shiftL` (8 - m), s - 8)
------------------------------------

Then it would be easy enough to map [Symbol] -> [MyBits], and use something like foldM (#) to get into the PutM monad. A couple of questions:

(1) Am I reinventing the wheel? I haven't seen anything like this, but it would be nice to be a bit more certain.

(2) This seems like it will work, but it doesn't feel as clean as the current Data.Binary interface. Is there something I'm missing that might make it easier to integrate?

(3) Right now this is just a proof of concept, but eventually I'd like to do some performance tuning, and it would be nice to have a representation that's amenable to that. Any thoughts on speeding this up while keeping the interface reasonably clean would be much appreciated.

Thanks!
-Chad
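
P.S. In case it helps, here's the rough driver I have in mind, assuming the definitions above and some hypothetical lookup table toBits (just a sketch, nothing tuned). One wrinkle is that the final partial byte has to be flushed explicitly, or its bits are silently dropped:

------------------------------------
import Control.Monad (foldM)

-- Write out any leftover bits, padding the final byte with zeros.
-- Relies on the invariant that unused low-order bits are zero.
flush :: MyBits -> Put
flush (_, 0) = return ()
flush (a, _) = putWord8 a

-- Encode a whole symbol stream, given a per-symbol bit table;
-- run the result with runPut to get a lazy ByteString.
encodeSymbols :: (a -> MyBits) -> [a] -> Put
encodeSymbols toBits syms =
  foldM (#) (0, 0) (map toBits syms) >>= flush
------------------------------------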