
[copied to Malcolm]
From the attached mail, it sounds like Simon has made some worthwhile additions to the Binary interface but left out a few things. The only omission that seems fundamental is that Simon's version supports reading/writing bytes whilst Malcolm's supports reading/writing bits.
Malcolm:
- How important is this?
- Assuming that supporting bits slows down the whole interface, is there a cunning implementation trick which would have very low overhead if you're doing a byte-aligned read/write (e.g., if all previous reads/writes have been multiples of bytes)?
- Or would it be appropriate to build one as a layer on top of the other, so that programmers can express their choice by using one type or the other? (I suggest a layered approach in the hope that this would lead to more code sharing, reduce the tendency for API divergence, etc., but I have no concrete thoughts on what a layered approach might look like.)
-- Alastair
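For illustration only, here is one way the byte-aligned fast path and the layering could fit together: a bit-level writer built on top of a byte-level one, which skips all shifting when no partial byte is pending. Every name here (ByteSink, BitSink, putBits, flushBits) is hypothetical and is neither NHC's nor GHC's implementation. Whether the alignment test is cheap enough in practice is exactly the open question.

```haskell
import Data.Bits (shiftL, testBit, (.|.))
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Byte layer: a hypothetical stand-in for a byte-oriented handle.
-- Bytes are accumulated in reverse order.
newtype ByteSink = ByteSink (IORef [Word8])

newByteSink :: IO ByteSink
newByteSink = ByteSink <$> newIORef []

putByte :: ByteSink -> Word8 -> IO ()
putByte (ByteSink r) w = modifyIORef' r (w :)

sinkBytes :: ByteSink -> IO [Word8]
sinkBytes (ByteSink r) = reverse <$> readIORef r

-- Bit layer: the byte sink plus a partially filled byte.
data BitSink = BitSink
  { bsBytes   :: ByteSink
  , bsPending :: IORef (Word8, Int)  -- (bits collected so far, how many)
  }

newBitSink :: IO BitSink
newBitSink = BitSink <$> newByteSink <*> newIORef (0, 0)

-- Write the low n bits of v (0 <= n <= 8), most significant bit first.
putBits :: BitSink -> Int -> Word8 -> IO ()
putBits bs n v = do
  (acc, k) <- readIORef (bsPending bs)
  if k == 0 && n == 8
    then putByte (bsBytes bs) v        -- aligned fast path: no shifting
    else go acc k (n - 1)
  where
    go acc k i
      | i < 0     = writeIORef (bsPending bs) (acc, k)
      | otherwise = do
          let acc' = (acc `shiftL` 1) .|. (if testBit v i then 1 else 0)
          if k + 1 == 8
            then putByte (bsBytes bs) acc' >> go 0 0 (i - 1)
            else go acc' (k + 1) (i - 1)

-- Pad any pending bits with zeros up to the next byte boundary.
flushBits :: BitSink -> IO ()
flushBits bs = do
  (acc, k) <- readIORef (bsPending bs)
  if k == 0
    then return ()
    else do
      putByte (bsBytes bs) (acc `shiftL` (8 - k))
      writeIORef (bsPending bs) (0, 0)
```

A program that only ever writes whole bytes always takes the first branch of putBits, so the extra cost over calling putByte directly is one read of the pending state and one comparison. Code that never needs bits can use ByteSink directly and avoid even that.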
I was wondering if it was on the list of things to do to get a Binary module into the standard libraries. I know SimonM has a version for GHC and there's an NHC version (I think the original). I don't know about Hugs.
I ask because by putting it in the standard libs, library developers could feel more pressured to release their data structures with Binary instances.
Indeed. The only reason I didn't put my version into the libraries yet was because it differed somewhat from the NHC version, and I thought it would be a good idea to discuss what the interface should look like first.
FYI, the main differences between GHC's Binary library and NHC's are described below. Keep in mind that GHC's Binary library is heavily tuned for speed, because we use it for reading/writing interface files.
GHC's Binary class:
  class Binary a where
    put_ :: BinHandle -> a -> IO ()
    put  :: BinHandle -> a -> IO (Bin a)
    get  :: BinHandle -> IO a
NHC's Binary class:
  class Binary a where
    put    :: BinHandle -> a -> IO (Bin a)
    get    :: BinHandle -> IO a
    getF   :: BinHandle -> Bin a -> (a, Bin b)

    putAt  :: BinHandle -> Bin a -> a -> IO ()
    getAt  :: BinHandle -> Bin a -> IO a
    getFAt :: BinHandle -> Bin a -> a
- For GHC, I added the put_ method. The reason is efficiency: you can often write a tail-recursive definition of put_, but not put, and one rarely needs the return value of put (I found). Each function has a default definition defined in terms of the other (in fact, I think I use put_ exclusively in GHC, and put could be taken out of the class altogether).
- For GHC, I didn't implement getF. Instead, I have explicit lazyGet and lazyPut operations, to give me more control over the laziness: I only want laziness in a few well-defined places.
- I implemented putAt and getAt as functions rather than class methods. There are lots of instances of Binary, so you save a few dictionary fields, and I didn't come across a case where I needed to override either of these.
- GHC's library also works in terms of bytes rather than bits, again for efficiency (time over space). There are putByte and getByte functions for writing your own instances of Binary, whereas NHC has putBits and getBits.
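As a rough illustration of the points above, here is a minimal sketch of a GHC-style class: put_ and put default to each other, putAt and getAt are ordinary functions rather than methods, and instances are written with putByte and getByte. The in-memory BinHandle, openBinMem, tellBin, seekBin and the list encoding are assumptions made for this example and are not GHC's actual implementation.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Toy in-memory handle (an assumption): a byte list plus a current offset.
data BinHandle = BinHandle { bhBytes :: IORef [Word8], bhPos :: IORef Int }

-- A typed position in the stream.
newtype Bin a = BinPtr Int deriving (Eq, Show)

openBinMem :: IO BinHandle
openBinMem = BinHandle <$> newIORef [] <*> newIORef 0

tellBin :: BinHandle -> IO (Bin a)
tellBin bh = BinPtr <$> readIORef (bhPos bh)

seekBin :: BinHandle -> Bin a -> IO ()
seekBin bh (BinPtr p) = writeIORef (bhPos bh) p

-- Write one byte at the current offset, overwriting or appending.
putByte :: BinHandle -> Word8 -> IO ()
putByte bh w = do
  p <- readIORef (bhPos bh)
  modifyIORef' (bhBytes bh) (\bs -> take p bs ++ [w] ++ drop (p + 1) bs)
  writeIORef (bhPos bh) (p + 1)

-- Read one byte at the current offset (errors if past the end).
getByte :: BinHandle -> IO Word8
getByte bh = do
  p  <- readIORef (bhPos bh)
  bs <- readIORef (bhBytes bh)
  writeIORef (bhPos bh) (p + 1)
  return (bs !! p)

class Binary a where
  put_ :: BinHandle -> a -> IO ()
  put  :: BinHandle -> a -> IO (Bin a)
  get  :: BinHandle -> IO a
  -- put_ and put each default to the other, so an instance defines one.
  put_ bh a = put bh a >> return ()
  put  bh a = do { p <- tellBin bh; put_ bh a; return p }
  {-# MINIMAL (put_ | put), get #-}

instance Binary Word8 where
  put_ = putByte
  get  = getByte

instance Binary Bool where
  put_ bh b = putByte bh (if b then 1 else 0)
  get  bh   = (/= 0) <$> getByte bh

-- Lists are written as a tag byte per cell (1 = cons, 0 = nil).
instance Binary a => Binary [a] where
  put_ bh []     = putByte bh 0
  -- The recursive put_ is the last action, with no result to build.
  put_ bh (x:xs) = putByte bh 1 >> put_ bh x >> put_ bh xs
  get bh = do
    tag <- getByte bh
    if tag == 0 then return [] else (:) <$> get bh <*> get bh

-- Plain functions, not class methods: no extra dictionary fields.
putAt :: Binary a => BinHandle -> Bin a -> a -> IO ()
putAt bh p x = seekBin bh p >> put_ bh x

getAt :: Binary a => BinHandle -> Bin a -> IO a
getAt bh p = seekBin bh p >> get bh
```

With these definitions, p <- put bh [True, False, True] records where the list starts, and getAt bh p reads it back later. None of the instances has to define put, because the default wraps put_ with tellBin.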
There are more differences in the rest of the interface, but these are the most fundamental ones.
Cheers,
Simon