Restructured version of the Streams library

Hi,
I wrote an IO library, inspired by, and partly based on, Bulat Ziganshin's Streams library. The main aim of the library is to experiment with different design choices, especially emphasizing the following:
* Many small classes
* Iconv-friendly interface
The source code and documentation can be found at: http://yogimo.sakura.ne.jp/ssc/
== Type structure
The library tries to assign a different class for a different concept. For example, it distinguishes read-only streams from write-only ones, and seekable streams from non-seekable ones. As a result, the library provides a small and clean interface, so it ensures more type safety, and type signatures of stream-operating functions are able to carry more information.
== Iconv friendliness
Although Unicode is quickly becoming popular, there are still many widely-used, character-set-specific encodings. Therefore i18n-aware applications may have to handle a locale-specific encoding, which is not statically known. Then, using an external conversion routine, such as iconv, will be virtually the only possible way to go. The library's interface interacts well with iconv and other C routines. This also means that adding zlib functionality can be easy, for example.
== Performance
The library is currently not very fast. For example, typical repetition of putStr/getLine is 1.5x slower than the corresponding standard Handle operations. When you omit locking, the library performs roughly equally well to the standard Handle.
==
Any comments are welcome.
Regards,
Takano Akio

Takano Akio wrote:
I wrote an IO library, inspired by, and partly based on, Bulat Ziganshin's Streams library. The main aim of the library is to experiment with different design choices, especially emphasizing the following:
* Many small classes
* Iconv-friendly interface
The source code and documentation can be found at: http://yogimo.sakura.ne.jp/ssc/
After my initial reaction of "oh no, not another one" :-) I took a brief look and I like it very much. I admit I don't fully understand the structure, especially the need for the Controlled classes.
It would probably help to have some notes on the design: what is the motivation for the presence of each class. Going beyond just file descriptors, what kinds of stream do you envisage being instances of which classes.
Also, file descriptors are not platform independent, so it would be better to stick to the notion of a "file" as the underlying device for file I/O, and have separate types for sockets, pipes, and other objects.
We need to know whether the design can be efficiently implemented. Maybe Bulat could comment on that?
Cheers,
Simon

[I replied to Simon privately by mistake. Sorry for the confusion.]
Thank you for the comment.
On Mon, 12 Jun 2006 10:18:01 +0100, Simon Marlow wrote:
After my initial reaction of "oh no, not another one" :-) I took a brief look and I like it very much. I admit I don't fully understand the structure, especially the need for the Controlled classes.
It would probably help to have some notes on the design: what is the motivation for the presence of each class. Going beyond just file descriptors, what kinds of stream do you envisage being instances of which classes.
I added some explanations to the library overview. I also wrote a rationale for the Controlled class. It is essentially there to make the implementation of buffering transformers easy, if I'm not missing anything. (I'm not entirely sure about it, because the class was introduced before a big design change was made.) I'm now thinking about whether it is worth complicating the class structure for this.
Also, file descriptors are not platform independent, so it would be better to stick to the notion of a "file" as the underlying device for file I/O, and have separate types for sockets, pipes, and other objects.
I like the idea. Thank you!
Regards,
Takano Akio

Hello Takano, Sunday, June 11, 2006, 1:18:35 PM, you wrote:
I wrote an IO library, inspired by, and partly based on, Bulat Ziganshin's Streams library. The main aim of the library is to experiment with different design choices, especially emphasizing the following:
* Many small classes
* Iconv-friendly interface
The source code and documentation can be found at: http://yogimo.sakura.ne.jp/ssc/
first i considered your library as ready-to-use alternative to Streams and was discouraged. but then i understood that it is only proof-of-concept for alternative class structure and realized that for this purpose it serves very well - everything works just as it should work, every bit of Simon's critique about monolithic Stream class was taken into account
== Type structure
The library tries to assign a different class for a different concept. For example, it distinguishes read-only streams from write-only ones, and seekable streams from non-seekable ones. As a result, the library provides a small and clean interface, so it ensures more type safety, and type signatures of stream-operating functions are able to carry more information.
moreover, you've implemented this w/o duplicating functionality in read, write and read/write classes and virtually w/o speed penalty, by creating "manager" (you call it Controller) class to switch between reading and writing Port behavior
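[Editorial sketch] The class split discussed above could look roughly like the following. This is only an illustration under stated assumptions: the names ReadStream, WriteStream, SeekStream, MemSource and MemSink are hypothetical and are not taken from the library, and the Controlled/manager mechanism is left out because the thread does not describe it in detail.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef, modifyIORef')
import Data.Word (Word8)

-- All names below are hypothetical; the library's real classes may differ.

-- | Streams that can be read from.
class ReadStream s where
  readByte :: s -> IO (Maybe Word8)   -- Nothing at end of stream

-- | Streams that can be written to.
class WriteStream s where
  writeByte :: s -> Word8 -> IO ()

-- | Streams whose position can be queried and changed.
class SeekStream s where
  tell   :: s -> IO Int
  seekTo :: s -> Int -> IO ()

-- | A read-only, seekable in-memory source.
data MemSource = MemSource [Word8] (IORef Int)

newMemSource :: [Word8] -> IO MemSource
newMemSource bytes = MemSource bytes <$> newIORef 0

instance ReadStream MemSource where
  readByte (MemSource bytes posRef) = do
    pos <- readIORef posRef
    case drop pos bytes of
      []      -> return Nothing
      (b : _) -> writeIORef posRef (pos + 1) >> return (Just b)

instance SeekStream MemSource where
  tell (MemSource _ posRef) = readIORef posRef
  seekTo (MemSource _ posRef) = writeIORef posRef

-- | A write-only, non-seekable sink that collects bytes.
newtype MemSink = MemSink (IORef [Word8])

newMemSink :: IO MemSink
newMemSink = MemSink <$> newIORef []

instance WriteStream MemSink where
  writeByte (MemSink ref) b = modifyIORef' ref (b :)

sinkContents :: MemSink -> IO [Word8]
sinkContents (MemSink ref) = reverse <$> readIORef ref

-- | The signature states exactly what is required of each argument.
copyAll :: (ReadStream r, WriteStream w) => r -> w -> IO ()
copyAll src dst = loop
  where
    loop = readByte src >>= maybe (return ()) (\b -> writeByte dst b >> loop)

-- | Requires seeking, so applying it to a MemSink is a type error.
rewind :: SeekStream s => s -> IO ()
rewind s = seekTo s 0
```

With this split, a function such as copyAll cannot accidentally seek or write to its source, and rewind is rejected at compile time for streams that have no SeekStream instance.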
== Iconv friendliness
Although Unicode is quickly becoming popular, there are still many widely-used, character-set-specific encodings. Therefore i18n-aware applications may have to handle a locale-specific encoding, which is not statically known. Then, using an external conversion routine, such as iconv, will be virtually the only possible way to go. The library's interface interacts well with iconv and other C routines. This also means that adding zlib functionality can be easy, for example.
about your example of using UTF8+Latin1 encoding on one stream: it shouldn't work for any encoding used AFTER UTF8 (i don't have Unix, so i haven't compiled your lib or run actual tests). imagine that the first 10 bytes in the file represent 8 UTF8 chars. after decoding, these chars will be represented by 32 bytes (8 UTF-32 values). when you do sCleanup, the library just can't know that those 32 bytes read from the intermediate buffer correspond to 10 bytes in the lower-level stream
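[Editorial sketch] The arithmetic in this objection can be checked with a few lines of Haskell. The function names and the two sample strings are made up for illustration; only the UTF-8 length rule and the 4-bytes-per-character size of UTF-32 are standard facts.

```haskell
import Data.Char (ord)

-- | Number of bytes UTF-8 uses to encode one character.
utf8Length :: Char -> Int
utf8Length c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- | Bytes the text occupies in the underlying file (UTF-8).
encodedBytes :: String -> Int
encodedBytes = sum . map utf8Length

-- | Bytes the text occupies in the decoded buffer (UTF-32).
decodedBytes :: String -> Int
decodedBytes s = 4 * length s

-- Two hypothetical 8-character prefixes of a file.
-- sampleA contains two 2-byte characters (e-acute and a-grave).
sampleA, sampleB :: String
sampleA = "d\233j\224 vu!"
sampleB = "deja vu!"
```

Here encodedBytes sampleA is 10 and encodedBytes sampleB is 8, while decodedBytes gives 32 for both. The size of the decoded buffer therefore does not determine how many bytes were consumed from the lower-level stream.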
== Performance
The library is currently not very fast. For example, typical repetition of putStr/getLine is 1.5x slower than the corresponding standard Handle operations. When you omit locking, the library performs roughly equally well to the standard Handle.
to make performance maximal, you should use INLINE pragmas, unboxed references and many other small toys of True Glasgow Inliner :) of course, you don't need to do so in a proof-of-concept lib
the only overhead imposed by your design itself is checking one IOURef (unboxed variable) per operation, and even that is required only for controlled r/w streams
i also want to specially mention your implementation of the locking transformer. if i understood correctly, it allows adding locking even when a stream is used by two transformer paths (such as UTF8+Latin1), in contrast to my library
i still don't understand your whole class structure and therefore can't compare it exactly to my current design (where i split the Stream class only into BlockStream/MemoryStream/CharStream/ByteStream with corresponding operations inside each class), so i can't decide whether i'd like to go in your direction or not :)
--
Best regards,
Bulat mailto:Bulat.Ziganshin@gmail.com

Hi Bulat,
On Tue, 13 Jun 2006 20:08:16 +0400, Bulat Ziganshin wrote:
first i considered your library as ready-to-use alternative to Streams and was discouraged. but then i understood that it is only proof-of-concept for alternative class structure and realized that for this purpose it serves very well - everything works just as it should work, every bit of Simon's critique about monolithic Stream class was taken into account
Yes, I deliberately published the library at an early stage, in order to show the design to you and other people interested in the area. If the design is used by the Streams library, I will consider it a success and be satisfied. Otherwise, I will develop it into a fully-featured library.
moreover, you've implemented this w/o duplicating functionality in read, write and read/write classes and virtually w/o speed penalty, by creating "manager" (you call it Controller) class to switch between reading and writing Port behavior
I have now got rid of the Controlled class. This change introduces an overhead when more than one read/write buffering transformer is used on the same stream, but I think that is a rare case.
about your example of using UTF8+Latin1 encoding on one stream: it shouldn't work for any encoding used AFTER UTF8 (i don't have Unix, so i haven't compiled your lib or run actual tests). imagine that the first 10 bytes in the file represent 8 UTF8 chars. after decoding, these chars will be represented by 32 bytes (8 UTF-32 values). when you do sCleanup, the library just can't know that those 32 bytes read from the intermediate buffer correspond to 10 bytes in the lower-level stream
Thank you for pointing out the design flaw. I'm going to fix it.
i also want to specially mention your implementation of the locking transformer. if i understood correctly, it allows adding locking even when a stream is used by two transformer paths (such as UTF8+Latin1), in contrast to my library
Yes, that is my intention.
Regards,
Takano Akio
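[Editorial sketch] One way to read the locking point above is that the lock sits on the shared lower-level stream, so every transformer path built on top of it is serialized by the same lock. The sketch below only illustrates that reading; Sink, Locked, plainPath and upperPath are hypothetical names, and the library's actual locking transformer may be structured differently.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import Data.Char (toUpper)
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- | Hypothetical in-memory sink standing in for the lower-level stream.
newtype Sink = Sink (IORef String)

-- | Locking wrapper: every write to the wrapped sink holds the MVar.
data Locked = Locked (MVar ()) Sink

newLocked :: IO Locked
newLocked = Locked <$> newMVar () <*> (Sink <$> newIORef "")

-- | Append to the sink while holding the lock.
writeLocked :: Locked -> String -> IO ()
writeLocked (Locked lock (Sink ref)) s =
  withMVar lock $ \_ -> modifyIORef' ref (++ s)

-- | Read everything written so far, also under the lock.
lockedContents :: Locked -> IO String
lockedContents (Locked lock (Sink ref)) =
  withMVar lock $ \_ -> readIORef ref

-- | Two hypothetical "transformer paths" over the same locked sink:
-- one passes text through unchanged, the other upper-cases it first.
plainPath, upperPath :: Locked -> String -> IO ()
plainPath = writeLocked
upperPath l = writeLocked l . map toUpper
```

Because both paths go through writeLocked, threads using different paths still cannot interleave inside a single write to the shared sink.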

Hello Takano, Wednesday, June 14, 2006, 4:42:25 PM, you wrote:
work, every bit of Simon's critique about monolithic Stream class was taken into account
Yes, I deliberately published the library at an early stage, in order to show the design to you and other people interested in the area. If the design is used by the Streams library, I will consider it a success and be satisfied. Otherwise, I will develop it into a fully-featured library.
how about redesigning Streams 0.2 library? i've just uploaded current state-of-the-art as http://freearc.narod.ru/StreamsBeta.tar.gz
it also requires http://freearc.narod.ru/ArrayRef.tar.gz to be installed
we can ask Simon Marlow to establish darcs repository for the lib
about your example of using UTF8+Latin1 encoding on one stream: it shouldn't work for any encoding used AFTER UTF8 (i don't have Unix, so i haven't compiled your lib or run actual tests). imagine that the first 10 bytes in the file represent 8 UTF8 chars. after decoding, these chars will be represented by 32 bytes (8 UTF-32 values). when you do sCleanup, the library just can't know that those 32 bytes read from the intermediate buffer correspond to 10 bytes in the lower-level stream
Thank you for pointing out the design flaw. I'm going to fix it.
how? :) i think it is impossible
--
Best regards,
Bulat mailto:Bulat.Ziganshin@gmail.com

Hi Bulat,
On Wed, 14 Jun 2006 19:22:18 +0400, Bulat Ziganshin wrote:
how about redesigning Streams 0.2 library? i've just uploaded current state-of-the-art as http://freearc.narod.ru/StreamsBeta.tar.gz
it also requires http://freearc.narod.ru/ArrayRef.tar.gz to be installed
we can ask Simon Marlow to establish darcs repository for the lib
You mean we could cooperate on redesigning your library? That would be great!
Thank you for pointing out the design flaw. I'm going to fix it.
how? :) i think it is impossible
I think it can be done by providing a special input-only buffering transformer that remembers the position at the point of the last "fill" operation. If you want to discard the buffer, you can simply seek to the remembered position. This should work well unless the content of the file changes.
Regards,
Takano Akio
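[Editorial sketch] The proposal above, an input-only buffer that records the position of its last fill and discards by seeking back to it, might be sketched as follows. Everything here is an assumption made for illustration: Source is an in-memory stand-in for a seekable file, and the names Buffered, fill, getCharB, discard and chunkSize do not come from the library.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- | Hypothetical seekable source standing in for a file.
data Source = Source String (IORef Int)

newSource :: String -> IO Source
newSource content = Source content <$> newIORef 0

sourceTell :: Source -> IO Int
sourceTell (Source _ posRef) = readIORef posRef

sourceSeek :: Source -> Int -> IO ()
sourceSeek (Source _ posRef) = writeIORef posRef

-- | Read up to n items, advancing the source position.
sourceRead :: Source -> Int -> IO String
sourceRead (Source content posRef) n = do
  pos <- readIORef posRef
  let chunk = take n (drop pos content)
  writeIORef posRef (pos + length chunk)
  return chunk

-- | Input-only buffer that remembers where its last fill started.
data Buffered = Buffered
  { bufSource  :: Source
  , bufPending :: IORef String  -- read ahead but not yet consumed
  , bufFillPos :: IORef Int     -- source position when the last fill began
  }

newBuffered :: Source -> IO Buffered
newBuffered src = Buffered src <$> newIORef [] <*> newIORef 0

-- | Read-ahead size; an arbitrary small value for the example.
chunkSize :: Int
chunkSize = 4

-- | Record the current source position, then read ahead one chunk.
fill :: Buffered -> IO ()
fill b = do
  pos <- sourceTell (bufSource b)
  writeIORef (bufFillPos b) pos
  chunk <- sourceRead (bufSource b) chunkSize
  writeIORef (bufPending b) chunk

-- | Take one item from the buffer, filling it first if it is empty.
getCharB :: Buffered -> IO (Maybe Char)
getCharB b = do
  pending <- readIORef (bufPending b)
  case pending of
    (c : rest) -> writeIORef (bufPending b) rest >> return (Just c)
    [] -> do
      fill b
      refilled <- readIORef (bufPending b)
      case refilled of
        (c : rest) -> writeIORef (bufPending b) rest >> return (Just c)
        []         -> return Nothing

-- | Discard the buffer: drop pending data and seek back to the fill position.
discard :: Buffered -> IO ()
discard b = do
  pos <- readIORef (bufFillPos b)
  writeIORef (bufPending b) []
  sourceSeek (bufSource b) pos
```

Note that seeking to the fill position also rewinds past items already consumed from that fill. The thread does not say how those items would be skipped again, so the sketch leaves that step out.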
participants (3)
* Bulat Ziganshin
* Simon Marlow
* Takano Akio