
Hello,

One of my favorite features from Python is its notion of "file-like objects". For those of you unfamiliar with Python, here's a quick backgrounder: anything can present a file-like interface. It could be a file, a socket, an interface to a gzip/gunzip processor, a character set codec, an in-memory buffer, etc. File-like objects are used in Python for everything from gzip files to certain network data. The memory buffers are great for unit testing.

As I've started working on some gzip stuff in Haskell, I've developed a prototype for a similar concept in Haskell. One of my main goals is to keep it as compatible with standard Haskell as I possibly can. To that end, a Handle is still acceptable as a parameter to any of the functions I'm proposing.

My proposal is here:

http://www.complete.org/~jgoerzen/t/MissingH.IO.HVIO.html

I'm aware that others have been working on IO proposals; specifically, Simon Marlow's here:

http://www.haskell.org/~simonmar/io/System.IO.html

I'm especially interested in their feedback on my proposal, and I'm wondering:

* Do you think HVIO has merit?
* Would I be better advised to try to implement some existing ideas instead?
* Are there any other implementations of these things that are ready to use? (With code)

One note: I'm trying to keep the interface and behavior as identical to the normal System.IO interface as possible. So I'm still using Strings, raising the same exceptions, etc. In fact, the Handle instance of these classes looks like:

    instance HVIOReader Handle where
        vGetChar = hGetChar
        ...

-- John
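A minimal sketch of the file-like idea from the message above. This is not the MissingH.IO.HVIO code: the class name Reader, the method vIsEOF, the MemReader type, and the helpers newMemReader and readAll are all hypothetical names chosen for illustration. Only the vGetChar = hGetChar line follows the post. The sketch assumes nothing beyond base.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO (Handle, hGetChar, hIsEOF)
import System.IO.Error (eofErrorType, mkIOError)

-- Hypothetical class: anything that can be read one Char at a time.
class Reader h where
  vGetChar :: h -> IO Char
  vIsEOF   :: h -> IO Bool

-- A plain Handle is a Reader by delegating to System.IO.
instance Reader Handle where
  vGetChar = hGetChar
  vIsEOF   = hIsEOF

-- Hypothetical in-memory reader backed by a String.
newtype MemReader = MemReader (IORef String)

newMemReader :: String -> IO MemReader
newMemReader s = MemReader <$> newIORef s

instance Reader MemReader where
  vGetChar (MemReader ref) = do
    s <- readIORef ref
    case s of
      -- Raise an EOF error, as hGetChar does at end of file.
      []       -> ioError (mkIOError eofErrorType "vGetChar" Nothing Nothing)
      (c : cs) -> writeIORef ref cs >> return c
  vIsEOF (MemReader ref) = null <$> readIORef ref

-- Works on any Reader: a Handle, a MemReader, or a future gzip stream.
readAll :: Reader h => h -> IO String
readAll h = do
  eof <- vIsEOF h
  if eof
    then return []
    else do
      c  <- vGetChar h
      cs <- readAll h
      return (c : cs)
```

Code written against readAll can be unit tested with a MemReader and later run unchanged on stdin or an opened file.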

John Goerzen wrote:
My proposal is here:
http://www.complete.org/~jgoerzen/t/MissingH.IO.HVIO.html
I'm aware that others have been working on IO proposals; specifically, Simon Marlow's here:
The proposal on Simon M's page was originally my design, though Simon made many improvements. You can read my rationale for the original design in these mailing-list messages:

http://www.haskell.org/pipermail/haskell/2003-July/012312.html
http://www.haskell.org/pipermail/libraries/2003-July/001255.html
http://www.haskell.org/pipermail/libraries/2003-July/001257.html
http://www.haskell.org/pipermail/libraries/2003-July/001273.html
http://www.haskell.org/pipermail/libraries/2003-August/001319.html
http://www.haskell.org/pipermail/libraries/2003-August/001336.html
http://www.haskell.org/pipermail/libraries/2003-August/001366.html

I had to abandon many of the original ideas because the Posix and Win32 APIs can't support them. (Some examples of things you should be able to do, but can't in Posix or Win32: given a directory handle and the name of a file in the directory, open that file; given a file handle with read access, acquire write access if available; conduct atomic filesystem transactions.) The most important idea that survives is the separation of files from input streams and output streams.

Given this background you can probably guess that I'm not too keen on the traditional open/read/write/seek/close model; I don't think it's a good abstraction for anything, even files. I love the idea of gzip and gunzip as transformations on streams, though, and streams backed by memory buffers appear in my proposal too.
* Would I be better advised to try to implement some existing ideas instead?
Yes, you should definitely spend your time implementing my pet idea, not yours. :-)
* Are there any other implementations of these things that are ready to use? (With code)
Simon wrote a prototype implementation of his/my proposal:

http://www.mail-archive.com/haskell-cafe@haskell.org/msg05138.html

-- Ben

On Fri, Dec 17, 2004 at 12:09:34AM +0000, Ben Rudiak-Gould wrote:
Given this background you can probably guess that I'm not too keen on the traditional open/read/write/seek/close model; I don't think it's a good abstraction for anything, even files. I love the idea of gzip and gunzip as transformations on streams, though, and streams backed by memory buffers appear in my proposal too.
The traditional model may be limited, but it's still sufficient for many purposes, and since it's what is supported by the OS, there's something to be said for using it.

As a non-library-developer, I must admit that I thought the proposal was pretty nice-looking. In general, I like the idea of creating a standard interface having multiple back ends, and certainly don't see how (for example) viewing a String as a non-seekable read-only file would be at all a bad interface. Of course, you could view it as a seekable read-only file, if you were willing to give up the guarantee of memory-efficient lazy consumption of said string.

Presumably you'd need separate classes for binary vs text streams? hGetBuf certainly couldn't be used on a String, since a String has no binary representation...

--
David Roundy
http://www.darcs.net

What about Peter Simons' BlockIO library? It's very fast and has a reasonably simple interface.

Keean.

Keean Schupke writes:
What about Peter Simons' BlockIO library?
I think BlockIO solves a different problem than the other proposals do. BlockIO doesn't really abstract different types of input sources; it does the opposite: it restricts I/O handlers to operate on raw memory buffers. A 'Handler' in BlockIO is a function with, approximately, this type:

    (Ptr Word8, Int) -> StateT st IO ()

Every time new input is available, this stateful computation is called by the main loop to process the data, and this generic main loop can then read from all kinds of input sources.

So instead of making a 'Handle' more abstract, BlockIO basically says: Handles don't exist, all we know are blocks of input data.

A feature like 'gzip'ing a stream of data, for example, would be implemented as a monadic transformer for computations of the type given above. It wouldn't know anything about I/O either; it would operate on raw memory like the rest of the handlers do.

The most obvious advantage of this design is speed. Personally, I also think it is "more functional" to pass a handler function for data obtained from the outside world instead of passing an abstract entity which can be "read from" (with all kinds of nasty error conditions).

This design, however, is obviously meant for data streams: the concept of seeking simply doesn't exist. So although you can run a BlockIO handler from any conceivable input source, you cannot fit any conceivable input/output model into BlockIO.

Peter
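To make the handler-driven design above concrete, here is a small sketch that uses only base. It is not BlockIO's API. To avoid depending on a StateT implementation, the handler is written in explicit state-passing style, (Ptr Word8, Int) -> st -> IO st, which carries the same information as the StateT type quoted above. The names Handler, runBlocks and countNewlines are hypothetical, and the "input source" is simply a list of byte blocks copied into temporary buffers.

```haskell
import Control.Monad (foldM)
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray, withArrayLen)
import Foreign.Ptr (Ptr)

-- Explicit state-passing stand-in for `(Ptr Word8, Int) -> StateT st IO ()`.
type Handler st = (Ptr Word8, Int) -> st -> IO st

-- Hypothetical main loop: copy each block into a temporary buffer and
-- hand the (pointer, length) pair to the handler, threading the state.
runBlocks :: Handler st -> st -> [[Word8]] -> IO st
runBlocks handler = foldM step
  where
    step st block =
      withArrayLen block $ \len ptr -> handler (ptr, len) st

-- Example handler: count newline bytes (10) seen so far.
-- It knows nothing about where the block came from.
countNewlines :: Handler Int
countNewlines (ptr, len) n = do
  bytes <- peekArray len ptr
  return (n + length (filter (== 10) bytes))
```

The handler never sees a Handle; swapping the list of blocks for a socket or file reader would only change the driver, which is the separation the message describes.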

Peter Simons wrote:
This design, however, is obviously meant for data streams: the concept of seeking simply doesn't exist. So although you can run a BlockIO handler from any conceivable input source, you cannot fit any conceivable input/output model into BlockIO.
It seems to me a 'block' system _could_ abstract many IO types; it's just that the block size would change and there would need to be a way of seeking. It seems to me seeking could be easily added to the design. One possibility would be to pass a block list to withBuffer:

    withBuffer :: [Int] -> (Buffer -> IO a) -> IO a

This is quite flexible, because the 'return' value could be another list of blocks. If we rearrange the arguments we can get:

    withBuffer :: (Buffer -> IO a) -> [Int] -> IO a

Then it should be possible to write:

    mfix (withBuffer m)

Which would require each application of 'm' to return a list of blocks to process...

One remaining limitation is that the block processing must be independent of the block's location... however, if we allow the possibility of passing a state as well, the block processing function can depend on location. This can be achieved by using the class definition of IO and allowing monad transformers (like StateT) to be used:

    withBuffer :: MonadIO m => (Buffer -> m a) -> [Int] -> m a

Regards,
Keean.
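One way to picture the idea of a handler that returns further blocks to process is the sketch below. It deliberately replaces the mfix formulation with an explicit work queue, so it is a variation on the proposal rather than an implementation of it. Buffer, processBlocks, toyFetch and collect are hypothetical names, and the toy Buffer carries a "next block" pointer only so the example has something to seek to.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import Data.Maybe (maybeToList)

-- Hypothetical buffer: a payload plus the index of a follow-up block.
data Buffer = Buffer { payload :: String, nextBlock :: Maybe Int }

-- Process a queue of block indices. The handler may return more
-- indices, which are appended to the queue (this is the "seek").
processBlocks
  :: MonadIO m
  => (Int -> IO Buffer)   -- how to fetch block number i
  -> (Buffer -> m [Int])  -- handler, returning further blocks to visit
  -> [Int]                -- initial blocks
  -> m ()
processBlocks _ _ [] = return ()
processBlocks fetch handler (b : bs) = do
  buf  <- liftIO (fetch b)
  more <- handler buf
  processBlocks fetch handler (bs ++ more)

-- Toy block store: block 0 points at 2, block 2 points at 1, 1 ends.
toyFetch :: Int -> IO Buffer
toyFetch 0 = return (Buffer "first" (Just 2))
toyFetch 2 = return (Buffer "second" (Just 1))
toyFetch _ = return (Buffer "last" Nothing)

-- Handler that records payloads in visiting order.
collect :: IORef [String] -> Buffer -> IO [Int]
collect ref buf = do
  modifyIORef ref (++ [payload buf])
  return (maybeToList (nextBlock buf))
```

Because the handler runs in any MonadIO, it could equally be a StateT computation that tracks the current offset, which is the location dependence mentioned at the end of the message.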

Thank you, Ben, Simon PJ, Simon M, David, and the others who sent such helpful responses.

It seems that there is rough consensus that Ben/SimonM's ideas are the right way forward. Obviously these ideas represent a significant shift from the way things are now.

I spent some time looking over Simon's example. My first reaction was: "this is almost as complex as Java." Then I saw the System.IO.Text module, and how it could be used in a pretty much self-contained fashion with the same ease as the current IO system, and I felt much better about it.

I like the ideas. There are two concerns, though, before I dive right into something like this.

First, if someone were to make a working, useful package out of this, is it likely that it would become the "standard" (whatever that means) IO system in Haskell anytime in the near future? I ask because I don't want to put a lot of time into developing an IO library, and code that works with it, only to have nobody use my code because it's incompatible with everything they're doing.

Second is my own level of expertise. I frankly don't understand how much of that code could even compile (example: I couldn't find setNonBlockingFD anywhere in my docs; maybe it's from one of those GHC.* areas), and I don't really understand the whole array/buffer situation either. I spent some time reading docs, and I'm still not sure exactly how one builds a mutable, resizable array. I've also never done anything but the most trivial FFI work.

I'm willing to learn, but since there's a lot there that's new to me, I'm not terribly confident that I would write correct, useful code. I'm especially unsure of how to make it work with the non-GHC compilers/interpreters out there, given all the GHC pragmas in the code.

At Fri, 17 Dec 2004 10:48:34 -0600, John Goerzen wrote:
First, if someone were to make a working, useful package out of this, is it likely that it would become the "standard" (whatever that means) IO system in Haskell anytime in the near future? I ask because I don't want to put a lot of time into developing an IO library, and code that works with it, only to have nobody use my code because it's incompatible with everything they're doing.
I am quite ignorant about the different proposals coming along these days, but I would hope that anything that hopes to be the next generation of IO would address:

(1) binary IO (especially dealing with bit-fields, like what you would see in network packets or binary data formats such as .swf).

(2) Internationalization / Unicode

I am not saying that it must *solve* these problems -- just that it needs to have a plan for how to add these later.

Thanks!
Jeremy, the ignorant, Shaw.
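As a small illustration of the bit-field handling mentioned in point (1), the sketch below splits and rebuilds the first byte of an IPv4 header, whose high four bits hold the version and whose low four bits hold the header length in 32-bit words. The function names are made up for this example, and it covers only a single byte, not a full packet parser.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Split a byte into its high and low 4-bit fields,
-- e.g. the IPv4 version and header-length fields.
versionAndIHL :: Word8 -> (Word8, Word8)
versionAndIHL b = (b `shiftR` 4, b .&. 0x0F)

-- Inverse: pack two 4-bit values into one byte.
-- The low field is masked to 4 bits; excess high bits are shifted out.
packNibbles :: Word8 -> Word8 -> Word8
packNibbles hi lo = (hi `shiftL` 4) .|. (lo .&. 0x0F)
```

For the common first byte 0x45 this yields version 4 and a header length of 5 words. An IO library with a plan for binary data would need some way to hand such bytes to code like this without going through Char.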
participants (6)
- Ben Rudiak-Gould
- David Roundy
- Jeremy Shaw
- John Goerzen
- Keean Schupke
- Peter Simons