
On Fri, 1 Aug 2003, Simon Marlow wrote:
I wanted to float a generalisation of this scheme, though. I'm wondering whether it might be a good idea to make InputStream and OutputStream into type classes, the advantage being that this makes streams more extensible - one example is that memory-mapped files fit neatly into this framework. I already have 6 examples of things that can have streams layered on top (or *are* streams), and there are almost certainly more.
I think this is unambiguously superior to my design because it's user-extensible. I can easily imagine a user wanting to put a text reader on top of a user-defined instance of InputStream, for example. It also allows particular kinds of streams to expose additional structure, which is good. My only concern is that the additional structure might not be known at type-check time. In particular, the lookupXputStream functions can't return any particular type of stream, as far as I can tell -- certainly not a FileXputStream.
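To make the type-check-time concern concrete, here is a minimal sketch of one way a lookup function could return "some stream" without naming its type. It assumes a cut-down InputStream class with only streamGet. ListInputStream, AnyInputStream and lookupInputStreamSketch are hypothetical names invented for illustration, not part of the proposal.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Cut-down, assumed version of the proposed class.
class InputStream s where
  streamGet :: s -> IO Word8

-- Hypothetical in-memory stream backed by a list of octets.
newtype ListInputStream = ListInputStream (IORef [Word8])

instance InputStream ListInputStream where
  streamGet (ListInputStream ref) = do
    bytes <- readIORef ref
    case bytes of
      []       -> ioError (userError "end of stream")
      (b : bs) -> writeIORef ref bs >> return b

-- Existential wrapper: the concrete stream type is hidden, so only
-- the class methods remain usable on the wrapped value.
data AnyInputStream = forall s. InputStream s => AnyInputStream s

instance InputStream AnyInputStream where
  streamGet (AnyInputStream s) = streamGet s

-- Hypothetical stand-in for a lookupXputStream-style function. The
-- caller cannot recover ListInputStream from the result type.
lookupInputStreamSketch :: [Word8] -> IO AnyInputStream
lookupInputStreamSketch bytes =
  AnyInputStream . ListInputStream <$> newIORef bytes
```

The caller can still call streamGet on the result, but any extra structure the underlying stream exposes is lost behind the wrapper, which is exactly the worry above.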
Here's some signatures for you to peruse:
class Stream s where
  closeStream :: s -> IO ()
I guess "open" and "close" do make sense for streams.
streamSetBuffering :: s -> BufferMode -> IO ()
This is not a design issue, but not all kinds of buffering make sense for all kinds of streams (line buffering doesn't seem sensible for file streams, and any buffering on a memory array is pointless). The supplied buffering should presumably be only a suggestion.
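As a rough illustration of "buffering as a suggestion", the sketch below assumes a class holding only the two buffering methods and reuses BufferMode from System.IO. MemoryStream, FileLikeStream and newFileLikeStream are hypothetical, and the policy of downgrading line buffering to block buffering is just one possible choice.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO (BufferMode (..))

-- Assumed subset of the proposed Stream class.
class Stream s where
  streamSetBuffering :: s -> BufferMode -> IO ()
  streamGetBuffering :: s -> IO BufferMode

-- Hypothetical memory-array stream: buffering is pointless here, so
-- every request is accepted but ignored.
data MemoryStream = MemoryStream

instance Stream MemoryStream where
  streamSetBuffering _ _ = return ()
  streamGetBuffering _   = return NoBuffering

-- Hypothetical file-like stream: a line-buffering request is treated
-- as a hint and replaced by block buffering (an assumed policy).
newtype FileLikeStream = FileLikeStream (IORef BufferMode)

newFileLikeStream :: IO FileLikeStream
newFileLikeStream = FileLikeStream <$> newIORef (BlockBuffering Nothing)

instance Stream FileLikeStream where
  streamSetBuffering (FileLikeStream ref) LineBuffering =
    writeIORef ref (BlockBuffering Nothing)
  streamSetBuffering (FileLikeStream ref) mode = writeIORef ref mode
  streamGetBuffering (FileLikeStream ref) = readIORef ref
```

Under this reading, a caller that cares about the outcome has to ask streamGetBuffering what mode is actually in effect instead of assuming the request was honoured.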
streamGetBuffering :: s -> IO BufferMode
streamFlush :: s -> IO ()
Does streamFlush make sense for input streams? In the case of a file stream it could discard buffered data, but for other streams I'm not sure what it would do.
isEOS :: s -> IO Bool
This has a clear meaning for input streams (no more data), but for output streams it could mean many different things (connection closed by listener, no more disk space, no more memory buffer space), and, more seriously, these conditions can't in general be detected synchronously unless the stream happens to be unbuffered.
class InputStream s where
  streamGet :: s -> IO Word8
  streamReadBuffer :: s -> Integer -> Buffer -> IO ()
I used "read" and "write" exclusively for files and "get" and "put" exclusively for streams to emphasize that these are completely different operations. Writing a file is like writing on a piece of paper; you know where your data is going and how to get it back with a read. But output streams are like pneumatic tubes that whisk your octets away to parts unknown. I would even go so far as to use names like push/pull or send/receive or speak/listen for streams.
streamReadBuffer :: s -> Integer -> Buffer -> IO ()
streamGetBuffer :: s -> Integer -> IO ImmutableBuffer
This brings up (again) an important issue: what's the most practical way of providing a memory buffer for file/stream operations? There doesn't seem to be a clean answer to this in Haskell. It seems like we'll need more variants than just these two.

[snip]
data MappedFileInputStream   -- instance Stream, InputStream
data MappedFileOutputStream  -- instance Stream, OutputStream
I don't think these are necessary; you can use ArrayXputStream.

[snip]
-- Pipes
data Pipe  -- a pipe with a read and a write end
instance Stream Pipe
instance InputStream Pipe
instance OutputStream Pipe
createPipe :: IO Pipe
closePipe :: Pipe -> IO ()
I strongly believe that createPipe should return an (InputStream,OutputStream) pair, not a single object supporting both interfaces. The streams associated with a pipe represent the ends of the pipe, not the pipe itself. This is true conceptually and also in practice: pipes are only useful if you separate the two ends and give them to two different threads.
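A minimal sketch of the pair-returning shape, assuming cut-down InputStream and OutputStream classes with one method each. streamPut is an assumed name, since the output class is not quoted here. PipeReadEnd and PipeWriteEnd are hypothetical, and a Chan stands in for a real OS pipe.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (replicateM)
import Data.Word (Word8)

-- Cut-down, assumed versions of the proposed classes.
class InputStream s where
  streamGet :: s -> IO Word8

class OutputStream s where
  streamPut :: s -> Word8 -> IO ()

-- Each end of the pipe gets its own type, so neither end can be used
-- in the wrong direction. A Chan stands in for the real pipe.
newtype PipeReadEnd  = PipeReadEnd (Chan Word8)
newtype PipeWriteEnd = PipeWriteEnd (Chan Word8)

instance InputStream PipeReadEnd where
  streamGet (PipeReadEnd ch) = readChan ch

instance OutputStream PipeWriteEnd where
  streamPut (PipeWriteEnd ch) = writeChan ch

-- Returns the two ends rather than one object with both interfaces.
createPipe :: IO (PipeReadEnd, PipeWriteEnd)
createPipe = do
  ch <- newChan
  return (PipeReadEnd ch, PipeWriteEnd ch)

-- Hand the write end to another thread and read three octets back.
pipeDemo :: IO [Word8]
pipeDemo = do
  (r, w) <- createPipe
  _ <- forkIO (mapM_ (streamPut w) [10, 20, 30])
  replicateM 3 (streamGet r)
```

With this shape the writer thread only ever holds a PipeWriteEnd, so accidentally reading from it is a type error and not a runtime surprise.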
-- Sockets:
data Socket
instance Stream Socket
instance InputStream Socket
instance OutputStream Socket
Same objection here, although the reason is a bit different. Each TCP connection consists of two independent unidirectional channels; they're only created together for reasons of efficiency (and security?). There are a total of four ends, of which you get two and the remote host gets the other two. I admit that in this case a natural analogy with a telephone handset suggests that the two streams should be kept together; but that's what tuples are for. The only object I can think of that could legitimately be an instance of both InputStream and OutputStream is a LIFO buffer, assuming there's any use for such a thing.

-- Ben