
3) I think we can all agree that we should buffer BinIOs. There are a few questions, given this:
a) Should multiple threads be allowed to write the same BinHandle simultaneously? If not, is an error thrown or is the behaviour just left "unspecified"?
b) Should multiple threads be allowed to read from the same BinHandle simultaneously? If not, ...
c) Should one thread be allowed to write and another to read from the same BinHandle simultaneously? If not, ...
I believe GHC has a reader-writer lock on Handles so the answer is that one thread blocks if another is already using it in a conflicting way.
Basically, I suggest doing whatever normal file Handles do.
This is a tricky one. Doing whatever normal Handles do is the "right" way to approach this, but I fear it might be expensive. Handles have a single file pointer (if they have a file pointer at all), a buffer, and some other state. The Handle itself is protected by a lock, so that only one thread can access the state at a time.

Currently, a BinIO handle caches the file pointer for speed, and doesn't protect this with a lock. BinIO handles might also need a cache. The "right" thing to do is to push this inside the Handle: use the Handle's buffer as the cache. Provide something like

    hOpenBin  :: FilePath -> OpenMode -> IO Handle
    hPutBits  :: Handle -> Int -> Word8 -> IO ()
    hGetBits  :: Handle -> Int -> IO Word8
    hSeekBits :: Handle -> Integer -> IO ()

I don't know whether this would be acceptably fast or not. (I'll try to do some perf measurements on BinIO vs. BinMem later today; that should give us a rough idea.)

What about BinMem? Currently a BinMem is basically a flat array and a pointer. It has no lock; if you write or read from two threads simultaneously you can get race conditions. However, even with a lock, reading from two threads simultaneously isn't likely to be a good idea because of the shared file pointer. This is why I suggested having dupBin:

    dupBin :: BinHandle -> IO BinHandle

which essentially gives you another file pointer to work with, so that two threads can safely read the same BinHandle at different points. (Writing is still problematic: use BinIO if you want multithreaded writing.) dupBin can be implemented for Handles, and hence for BinIO too. It's fairly straightforward and seems useful anyway.

Summary:

  - Reading/writing the same BinHandle from two threads isn't useful
    unless the threads can have their own file pointers.
    ==> need dupBin

  - Caching of the data in a BinIO should be done in the Handle,
    unless that's too expensive.

(Hal: for now, just continue with what you had planned; if we decide to make some of these changes we can refactor later.)
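
To make the dupBin idea concrete, here is a minimal sketch for the in-memory case only. It assumes a simplified, hypothetical BinMem (a shared byte array plus a per-handle position in an IORef); the names fromBytes, getByte, seekBin and demo are invented for illustration, and the real BinHandle type will differ. The duplicate shares the storage but gets its own position, so two readers do not disturb each other.

```haskell
import Data.Array.IO (IOUArray, newListArray, readArray)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Hypothetical, simplified BinMem: a flat array and a pointer.
data BinMem = BinMem
  { bmArr :: IOUArray Int Word8  -- storage, shared between duplicates
  , bmPos :: IORef Int           -- read position, private to each handle
  }

-- Build a handle over a list of bytes, positioned at the start.
fromBytes :: [Word8] -> IO BinMem
fromBytes ws = do
  arr <- newListArray (0, length ws - 1) ws
  pos <- newIORef 0
  pure (BinMem arr pos)

-- Read one byte and advance this handle's position only.
-- Note: no lock here, so a single handle is still not thread-safe.
getByte :: BinMem -> IO Word8
getByte (BinMem arr pos) = do
  i <- readIORef pos
  w <- readArray arr i
  writeIORef pos (i + 1)
  pure w

-- Move this handle's position to an absolute byte offset.
seekBin :: BinMem -> Int -> IO ()
seekBin (BinMem _ pos) = writeIORef pos

-- Same storage, fresh position starting where the original currently is.
dupBin :: BinMem -> IO BinMem
dupBin (BinMem arr pos) = do
  i <- readIORef pos
  pos' <- newIORef i
  pure (BinMem arr pos')

-- Two handles over the same data, read at different points.
demo :: IO (Word8, Word8)
demo = do
  h1 <- fromBytes [10, 20, 30, 40]
  h2 <- dupBin h1
  seekBin h2 2
  a <- getByte h1  -- h1 is still at offset 0
  b <- getByte h2  -- h2 reads from offset 2
  pure (a, b)
```

Each thread would call dupBin once and then read through its own handle. Writes through the shared array are still unsynchronised in this sketch, which matches the caveat above about multithreaded writing.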
Cheers, Simon