RE: Raw I/O library proposal, second (more pragmatic) draft

1 Aug 2003

On Fri, 1 Aug 2003, Simon Marlow wrote:
...
I wanted to float a generalisation of this scheme, though.  I'm
wondering whether it might be a good idea to make InputStream and
OutputStream into type classes, the advantage being that this makes
streams more extensible - one example is that memory-mapped files fit
neatly into this framework.  I already have 6 examples of things that
can have streams layered on top (or *are* streams), and there are almost
certainly more.

I think this is unambiguously superior to my design because it's
user-extensible. I can easily imagine a user wanting to put a text reader
on top of a user-defined instance of InputStream, for example. It also
allows particular kinds of streams to expose additional structure, which
is good.
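
To make the extensibility point concrete, here is a minimal sketch of a text reader layered on a user-defined stream. The class is cut down to two methods, and the superclass link, ListStream, newListStream and getLineFrom are names invented for this illustration rather than part of the proposal. Bytes are treated as Latin-1 to keep it short.

```haskell
import Data.Char (chr)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Cut-down versions of the proposed classes (superclass is an assumption).
class Stream s where
  isEOS :: s -> IO Bool

class Stream s => InputStream s where
  streamGet :: s -> IO Word8

-- A user-defined stream (hypothetical): bytes served from an in-memory list.
newtype ListStream = ListStream (IORef [Word8])

newListStream :: [Word8] -> IO ListStream
newListStream = fmap ListStream . newIORef

instance Stream ListStream where
  isEOS (ListStream ref) = null <$> readIORef ref

instance InputStream ListStream where
  streamGet (ListStream ref) = do
    bytes <- readIORef ref
    case bytes of
      []       -> ioError (userError "streamGet: end of stream")
      (b : bs) -> writeIORef ref bs >> return b

-- A text reader that works on any InputStream, including user-defined ones.
-- It reads up to (and consumes) the next newline, or stops at end of stream.
getLineFrom :: InputStream s => s -> IO String
getLineFrom s = go []
  where
    go acc = do
      eos <- isEOS s
      if eos
        then return (reverse acc)
        else do
          b <- streamGet s
          let c = chr (fromIntegral b)
          if c == '\n' then return (reverse acc) else go (c : acc)
```

getLineFrom never mentions ListStream, so it would work unchanged on a file, socket or memory-mapped stream that provides the same two instances.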

My only concern is that the additional structure might not be known at
type-check time. In particular, the lookupXputStream functions can't
return any particular type of stream, as far as I can tell -- certainly
not a FileXputStream.
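
One possible way to express this, sketched under the assumption that GHC's existential types are acceptable: the lookup function returns a wrapper that hides the concrete stream type. The signature of lookupInputStream below is a guess made for illustration, and AnyInputStream, ConstStream and CounterStream are hypothetical names. The caller keeps the InputStream interface but loses any extra structure, which is exactly the concern.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Cut-down version of the proposed class.
class InputStream s where
  streamGet :: s -> IO Word8

-- Wrapper hiding the concrete stream type (hypothetical name).
data AnyInputStream = forall s. InputStream s => AnyInputStream s

instance InputStream AnyInputStream where
  streamGet (AnyInputStream s) = streamGet s

-- Two unrelated stream types, invented for this example.
newtype ConstStream = ConstStream Word8

instance InputStream ConstStream where
  streamGet (ConstStream b) = return b

newtype CounterStream = CounterStream (IORef Word8)

instance InputStream CounterStream where
  streamGet (CounterStream ref) = do
    n <- readIORef ref
    writeIORef ref (n + 1)
    return n

-- The result type cannot name ConstStream or CounterStream, because which
-- one comes back depends on a run-time value.
lookupInputStream :: String -> IO (Maybe AnyInputStream)
lookupInputStream "zeros" = return (Just (AnyInputStream (ConstStream 0)))
lookupInputStream "counter" = do
  ref <- newIORef 0
  return (Just (AnyInputStream (CounterStream ref)))
lookupInputStream _ = return Nothing
```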
...
Here's some signatures for you to peruse:

class Stream s where
      closeStream        :: s -> IO ()

I guess "open" and "close" do make sense for streams.
...
      streamSetBuffering :: s -> BufferMode -> IO ()

This is not a design issue, but not all kinds of buffering make sense for
all kinds of streams (line buffering doesn't seem sensible for file
streams, and any buffering on a memory array is pointless). The supplied
buffering should presumably be only a suggestion.
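
As a rough sketch of "only a suggestion": the stream could map the requested mode to the one it will actually use. StreamKind and effectiveBuffering are hypothetical names; BufferMode is the existing type from System.IO. The particular fallbacks chosen here are assumptions, not anything the proposal specifies.

```haskell
import System.IO (BufferMode (..))

-- Hypothetical classification of streams, for illustration only.
data StreamKind = FileStream | TerminalStream | MemoryStream
  deriving (Eq, Show)

-- Map a requested buffering mode to the mode the stream would adopt.
effectiveBuffering :: StreamKind -> BufferMode -> BufferMode
-- Buffering a memory array gains nothing, so always run unbuffered.
effectiveBuffering MemoryStream _ = NoBuffering
-- Line buffering is not sensible for a file; fall back to block buffering
-- with an implementation-chosen size.
effectiveBuffering FileStream LineBuffering = BlockBuffering Nothing
-- Otherwise honour the request.
effectiveBuffering _ requested = requested
```

A caller that cares could then compare the result of streamGetBuffering with what it asked for.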
...
      streamGetBuffering :: s -> IO BufferMode
      streamFlush        :: s -> IO ()

Does streamFlush make sense for input streams? In the case of a file
stream it could discard buffered data, but for other streams I'm not sure
what it would do.
...
      isEOS              :: s -> IO Bool

This has a clear meaning for input streams (no more data), but for output
streams it could mean many different things (connection closed by
listener, no more disk space, no more memory buffer space), and, more
seriously, these conditions can't in general be detected synchronously
unless the stream happens to be unbuffered.
...
class InputStream s where
      streamGet         :: s -> IO Word8
      streamReadBuffer  :: s -> Integer -> Buffer -> IO ()

I used "read" and "write" exclusively for files and "get" and "put"
exclusively for streams to emphasize that these are completely different
operations. Writing a file is like writing on a piece of paper; you know
where your data is going and how to get it back with a read. But output
streams are like pneumatic tubes that whisk your octets away to parts
unknown. I would even go so far as to use names like push/pull or
send/receive or speak/listen for streams.
...
      streamReadBuffer  :: s -> Integer -> Buffer -> IO ()
      streamGetBuffer   :: s -> Integer -> IO ImmutableBuffer

This brings up (again) an important issue: what's the most practical way
of providing a memory buffer for file/stream operations? There doesn't
seem to be a clean answer to this in Haskell. It seems like we'll need
more variants than just these two.

[snip]
...
data MappedFileInputStream   -- instance Stream, InputStream
data MappedFileOutputStream  -- instance Stream, OutputStream

I don't think these are necessary; you can use ArrayXputStream.

[snip]
...
-- Pipes
data Pipe  -- a pipe with a read and a write end
instance Stream Pipe
instance InputStream Pipe
instance OutputStream Pipe
createPipe :: IO Pipe
closePipe  :: Pipe -> IO ()

I strongly believe that createPipe should return an
(InputStream,OutputStream) pair, not a single object supporting both
interfaces. The streams associated with a pipe represent the ends of the
pipe, not the pipe itself. This is true conceptually and also in practice:
pipes are only useful if you separate the two ends and give them to two
different threads.
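
A small sketch of the pair-returning version, using a Chan as a stand-in for a real OS pipe. PipeReader, PipeWriter, pipeGet, pipePut and demo are names invented here, and closing is left out for brevity.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (replicateM)
import Data.Word (Word8)

-- The two ends are distinct types, so neither can be misused as the other.
newtype PipeReader = PipeReader (Chan Word8)
newtype PipeWriter = PipeWriter (Chan Word8)

-- Returns the two ends of the pipe, not an object standing for the pipe.
createPipe :: IO (PipeReader, PipeWriter)
createPipe = do
  ch <- newChan
  return (PipeReader ch, PipeWriter ch)

-- Blocks until a byte is available.
pipeGet :: PipeReader -> IO Word8
pipeGet (PipeReader ch) = readChan ch

pipePut :: PipeWriter -> Word8 -> IO ()
pipePut (PipeWriter ch) = writeChan ch

-- Typical use: hand the write end to another thread, keep the read end.
demo :: IO [Word8]
demo = do
  (r, w) <- createPipe
  _ <- forkIO (mapM_ (pipePut w) [1, 2, 3])
  replicateM 3 (pipeGet r)
```

The forked thread only ever sees the PipeWriter, so the type checker enforces the separation of the two ends.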
...
-- Sockets:
data Socket
instance Stream Socket
instance InputStream Socket
instance OutputStream Socket

Same objection here, although the reason is a bit different. Each TCP
connection consists of two independent unidirectional channels; they're
only created together for reasons of efficiency (and security?). There are
a total of four ends, of which you get two and the remote host gets the
other two. I admit that in this case a natural analogy with a telephone
handset suggests that the two streams should be kept together; but that's
what tuples are for.

The only object I can think of that could legitimately be an instance of
both InputStream and OutputStream is a LIFO buffer, assuming there's any
use for such a thing.
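
For what it's worth, a sketch of such a LIFO buffer. It has only one end, so a single object carrying both instances is honest. The classes are cut down to one method each; streamPut is an assumed name for the output operation, and LifoBuffer and newLifoBuffer are hypothetical.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Cut-down classes; streamPut is an assumed method name.
class InputStream s where
  streamGet :: s -> IO Word8

class OutputStream s where
  streamPut :: s -> Word8 -> IO ()

-- A stack of octets: puts and gets happen at the same end.
newtype LifoBuffer = LifoBuffer (IORef [Word8])

newLifoBuffer :: IO LifoBuffer
newLifoBuffer = LifoBuffer <$> newIORef []

instance OutputStream LifoBuffer where
  -- Push the octet on top of the stack.
  streamPut (LifoBuffer ref) b = modifyIORef ref (b :)

instance InputStream LifoBuffer where
  -- Pop the most recently written octet; fail if the buffer is empty.
  streamGet (LifoBuffer ref) = do
    bytes <- readIORef ref
    case bytes of
      []       -> ioError (userError "streamGet: buffer empty")
      (b : bs) -> writeIORef ref bs >> return b
```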

-- Ben

Ben Rudiak-Gould