Streams internal structure

21 Apr 2006

      Hello Simon,

Thursday, April 20, 2006, 12:57:10 PM, you wrote:
...
...
...
Believe me I've looked in detail at your streams library.
...
The main problem is that it doesn't have enough type structure.  There
are many combinations of stream transformers that don't make sense, and
should therefore be ruled out by the type system.  There are operations
that don't work on some streams.  There should at the least be a type 
distinction between directly accessible memory streams, byte streams, 
and text streams.  Additionally I would add separate classes for 
seekable and buffered streams.  I believe these changes would improve 
performance by reducing the size of dictionaries.
well, your last answer shows that you don't understand my problems.
i'm entirely want to have precise classes, but when i run into
IMPLEMENTATION, Haskell restrictions bite me again and again. for
beginning, i should describe class structure what was modified
according to your critics (i yet not released version with these
modifications, but it available as
http://freearc.narod.ru/Binary_20060329184510.rar)

currently, Stream lib includes class Streams, what includes
"unclassified", minor operations, such as vIsEOF or vShow, and classes
BlockStream, MemoryStream, ByteStream, TextStream

class BlockStream supports reading and writing of memory blocks:

class (Stream m h) => BlockStream m h | h->m where
    vGetBuf :: h -> Ptr a -> Int -> m Int
    vPutBuf :: h -> Ptr a -> Int -> m ()

it's a natural class for low-level, "raw" streams such as FD or
network sockets:

instance BlockStream IO FD where
    vGetBuf = fdGetBuf
    vPutBuf = fdPutBuf

vGetBuf, as you suggested, should implement non-blocking reads, while
the vPutBuf should write the whole buffer.

MemoryStream class supports low-level, "raw" streams whose data are
ALREADY RESIDE IN MEMORY. examples are MemBuf, MemoryMappedFile,
circular buffer for inter-thread communication what i plan to
implement. the main difference against BlockStream is what data are
not read in user-supplied buffer, but already somewhere in memory, and
MemoryStream functions just give to user address and size of next part
of data (for reading) or next place to fill up (for writing). so:

class (Stream m h) => MemoryStream m h | h->m where
    -- | Receive next buffer which contains data / should be filled with data
    vReceiveBuf :: h -> ReadWrite -> m (Ptr a, Int)
    -- | Release buffer after reading `len` bytes / Send buffer filled with `len` bytes
    vSendBuf    :: h -> Ptr a -> Int -> Int -> m ()

data ReadWrite = READING | WRITING | UNKNOWN  deriving (Eq, Show)

this scheme will allow to create MemBuf not as one huge buffer, as in
current implementation and all other Binary/... libraries, but as list
of buffers of some fixed size. the same applies to MemoryMappedFile -
this scheme allows to map just fixed-size buffer each time instead of
mapping the whole file

next level is the ByteStream class, what is just the way to quickly
read/write one byte at time:

class (Stream m h) => ByteStream m h | h->m where
    vGetByte :: h -> m Word8
    vPutByte :: h -> Word8 -> m ()

each buffering transformer implements ByteStream via BlockStream or
MemoryStream:

instance (BlockStream IO h) => ByteStream IO (BufferedBlockStream h)
instance (MemoryStream IO h) => ByteStream IO (BufferedMemoryStream h)
instance (MemoryStream IO h) => ByteStream IO (UncheckedBufferedMemoryStream h)

These data type constructors (BufferedBlockStream...) joins raw stream
with buffer, r/w pointer and other data required to implement
buffering:

data BufferedBlockStream h = BBuf h                  -- raw stream
                                  (IOURef BytePtr)   -- buffer
                                  (IOURef BytePtr)   -- buffer end
                                  (IOURef BytePtr)   -- r/w pointer
                                  ....
type BytePtr = Ptr Word8

Next level is class TextStream that implements text I/O operations:

class (Stream m h) => TextStream m h | h->m where
    vGetChar :: h -> m Char
    vGetLine :: h -> m String
    vGetContents :: h -> m String
    vPutChar :: h -> Char -> m ()
    vPutStr :: h -> String -> m ()

Encoding transformer attaches encoding to the ByteStream that allows
to implement text I/O:

instance (ByteStream m h) => TextStream m (WithEncoding m h)

data WithEncoding m h = WithEncoding h (CharEncoding m)

where type "CharEncoding m" provides vGetByte->vGetChar and vPutByte->vPutChar
transformers operating in monad m

all these can be named a canonical Streams hierarchy and it already
works. i will be glad to add BufferedStream and SeekableStream classes
and split BlockStream..TextStream to the reading and writing ones, but
this is, as i said, limited by implementation issues

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin@gmail.com

Streams internal structure

Bulat Ziganshin