
On 22 April 2005 12:05, Peter Simons wrote:
Simon Marlow writes:
Does anyone have any *objections* to introducing
System.IO.hGetWord8 :: Handle -> IO Word8
System.IO.hPutWord8 :: Word8 -> Handle -> IO ()
I don't mind having these functions, but to be honest I doubt that they would be useful for real life applications. Reading a byte at a time is a performance nightmare, buffering or not. IMHO, the much better API is this one:
hPutBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
You have these in GHC.IO at the moment. I believe they don't have proper non-blocking behaviour on Windows, though. These are also rather more difficult to implement than the single-byte versions, so I can't see nhc98 or Hugs implementing them any time soon. Cheers, Simon
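For reference, chunked I/O with the plain (blocking) buffer primitives hGetBuf and hPutBuf that System.IO already exports looks roughly like the sketch below; the copyHandle name and the 64 KB chunk size are illustrative choices, not an existing API.

import System.IO (Handle, hGetBuf, hPutBuf)
import Foreign.Ptr (Ptr)
import Foreign.Marshal.Alloc (allocaBytes)
import Data.Word (Word8)

-- Copy everything from one handle to another in fixed-size chunks, so no
-- per-byte traffic ever happens on the Haskell side.
copyHandle :: Handle -> Handle -> IO ()
copyHandle hIn hOut = allocaBytes chunk loop
  where
    chunk = 64 * 1024
    loop :: Ptr Word8 -> IO ()
    loop buf = do
      got <- hGetBuf hIn buf chunk   -- returns 0 at end of file
      if got == 0
        then return ()
        else hPutBuf hOut buf got >> loop buf

With chunks this large, nearly all of the time is spent in the underlying system calls rather than in Haskell code.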

Simon Marlow writes:
hPutBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
You have these in GHC.IO at the moment.
I know. ;-)
These are also rather more difficult to implement than the single-byte versions, so I can't see nhc98 or Hugs implementing them any time soon.
What is difficult to implement about these functions? Aren't they just wrappers for read(2) and write(2)? Peter
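For concreteness, a literal wrapper over read(2) would be little more than an FFI binding like the sketch below (ssize_t is approximated by CLong here to keep it short, and rawRead is a made-up name). The reply that follows explains why the real hGetBuf*/hPutBuf* functions cannot be quite this thin: they also have to cooperate with the Handle's internal buffer.

{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.Ptr (Ptr)
import Foreign.C.Types (CInt, CSize, CLong)
import Data.Word (Word8)

-- A direct binding to the POSIX read(2) call; 'safe' because it may block.
foreign import ccall safe "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CLong

-- Read at most n bytes from a raw file descriptor into buf; returns the
-- number of bytes actually read, 0 at end of file, or -1 on error.
rawRead :: CInt -> Ptr Word8 -> Int -> IO Int
rawRead fd buf n = do
  r <- c_read fd buf (fromIntegral n)
  return (fromIntegral r)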

On Fri, Apr 22, 2005 at 01:26:14PM +0200, Peter Simons wrote:
Simon Marlow writes:
hPutBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
You have these in GHC.IO at the moment.
I know. ;-)
These are also rather more difficult to implement than the single-byte versions, so I can't see nhc98 or Hugs implementing them any time soon.
What is difficult to implement about these functions? Aren't they just wrappers for read(2) and write(2)?
The handles have internal buffering that they might interact oddly with. Windows might do things differently... In any case, these seem much too low-level for the task at hand, which is just to make portable binary IO _possible_, not necessarily efficient or elegant. And hGet/PutWord8 are fast enough for most situations. We certainly don't want to have to teach a new user about Foreign.Storable just because they want to do a little simple binary IO. A next-generation IO system has been in the works for a while, which will probably provide some portable low-level primitives too. John -- John Meacham - ⑆repetae.net⑆john⑈
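To make the proposal concrete, here is a rough sketch of how the two primitives could be written portably on top of Haskell 98 character I/O; it assumes the Handle is in a binary mode where each Char corresponds to exactly one byte, and the names are taken from the proposal, not from an existing library.

import System.IO (Handle, hGetChar, hPutChar)
import Data.Word (Word8)
import Data.Char (ord, chr)

-- Read a single byte, relying on the Handle delivering one byte per Char.
hGetWord8 :: Handle -> IO Word8
hGetWord8 h = fmap (fromIntegral . ord) (hGetChar h)

-- Write a single byte as a Char in the 0..255 range.
hPutWord8 :: Word8 -> Handle -> IO ()
hPutWord8 w h = hPutChar h (chr (fromIntegral w))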

John Meacham writes:
hPutBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
What is difficult to implement about these functions? Aren't they just wrappers for read(2) and write(2)?
In any case, these seem much too low level for the task at hand, which is to just make portable binary IO _possible_. not necessarily efficient or elegant.
Well, there is a difference between "it is possible" and "it does actually work", and my guess is that any non-trivial application which performs byte-wise I/O will _not_ work in practice. I doubt you could implement Darcs with that API, and you certainly couldn't implement BitTorrent with it (which was the question that spawned this thread).
And hGet/PutWord8 are fast enough for most situations.
Are you certain? Which interesting applications do you know that read and write one byte at a time? I can't speak for "most" situations, but in those situations where I needed binary I/O this API would have been impracticable.
We certainly don't want to have to teach a new user about Foreign.Storable just because they want to do a little simple binary IO.
That's why you write libraries which provide a more comfortable API based on the primitives. You cannot write such a library, however, if your primitives are slow to begin with! Don't get me wrong -- I don't oppose the addition of those functions to the standard libraries. I just don't think they'll be good for much in the real world. Peter
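As an illustration of the kind of friendlier layer described above, the following sketch hides the pointer handling behind an ordinary list-of-bytes interface; hGetBuf is the existing System.IO primitive, while getBytes is a hypothetical convenience name.

import System.IO (Handle, hGetBuf)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Data.Word (Word8)

-- Read up to n bytes, returning however many were actually available,
-- without the caller ever touching Ptr or Storable.
getBytes :: Handle -> Int -> IO [Word8]
getBytes h n =
  allocaBytes n $ \buf -> do
    got <- hGetBuf h buf n
    peekArray got buf

A production version would return something denser than [Word8], but the point stands: the unfriendly parts live in the library, not in user code.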

On Fri, 22 Apr 2005 08:09:27 -0700, Peter Simons wrote:
And hGet/PutWord8 are fast enough for most situations.
Are you certain? Which interesting applications do you know that read and write one byte at a time? I can't speak for "most" situations, but in those situations where I needed binary I/O this API would have been impracticable.
I'm going to chime in agreement with the disagreement. I have one application which uses single-character binary I/O. Because it works on small files, it borders on usefulness. As time goes on, the size of these inputs grows, and now the tool has become nearly worthless. I seem to run into this kind of thing a lot with Haskell. Dominic's ASN library is useful to me, but I'm finding I'll probably just have to use it as a template for new code. It works fine for what it was designed for, parsing keys and such, but I'm looking to use it to represent gigabytes of data. Processing data like that one Word8 at a time isn't practical. Dave

David Brown wrote:
And hGet/PutWord8 are fast enough for most situations.
Are you certain? Which interesting applications do you know that read and write one byte at a time? I can't speak for "most" situations, but in those situations where I needed binary I/O this API would have been impracticable.
I'm going to chime in agreement with the disagreement.
I have one application which uses single-character binary I/O. Because it works on small files, it borders on usefulness. As time goes on, the size of these inputs grows, and now the tool has become nearly worthless.
I seem to run into this kind of thing a lot with Haskell. Dominic's ASN library is useful to me, but I'm finding I'll probably just have to use it as a template for new code. It works fine for what it was designed for, parsing keys and such, but I'm looking to use it to represent gigabytes of data. Processing data like that one Word8 at a time isn't practical.
Personally, I doubt that Haskell will ever be practical for processing very large amounts of data (e.g. larger than your system's RAM). When processing large amounts of data, rule #1 is to do as little as possible to most of the data; don't even read it into memory if you can avoid it (e.g. create an index and use lseek() as much as possible). You certainly don't want to be boxing/unboxing gigabytes. -- Glynn Clements
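A minimal sketch of that index-plus-seek approach, assuming fixed-size records so that an offset can be computed directly; the 128-byte record size and the readRecord name are made up for illustration.

import System.IO (withBinaryFile, IOMode(ReadMode), hSeek, SeekMode(AbsoluteSeek), hGetBuf)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Data.Word (Word8)

recordSize :: Int
recordSize = 128   -- hypothetical fixed record size

-- Fetch the i-th record without reading anything before it.
readRecord :: FilePath -> Integer -> IO [Word8]
readRecord path i =
  withBinaryFile path ReadMode $ \h -> do
    hSeek h AbsoluteSeek (i * fromIntegral recordSize)
    allocaBytes recordSize $ \buf -> do
      got <- hGetBuf h buf recordSize
      peekArray got buf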

Glynn Clements wrote:
Personally, I doubt that Haskell will ever be practical for processing very large amounts of data (e.g. larger than your system's RAM).
When processing large amounts of data, rule #1 is to do as little as possible to most of the data; don't even read it into memory if you can avoid it (e.g. create an index and use lseek() as much as possible). You certainly don't want to be boxing/unboxing gigabytes.
I'd just like to note that I have found Haskell fine for processing large datasets, if used in the old batch-processing model (which is easy to code, as you can use lazy lists to stream the data). Also, in the future, as 64-bit becomes more widely used, an mmap into an MArray could be an efficient way to deal with large datasets. Keean.
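As a small example of that streaming style, lazy I/O lets a file be consumed as an ordinary list in a single pass, so memory use stays roughly constant regardless of file size; counting lines is just a stand-in task here.

-- Count the lines of a (possibly huge) file without holding it in memory.
countLines :: FilePath -> IO Int
countLines path = do
  s <- readFile path             -- lazy: the file is read on demand
  return (length (lines s))      -- one pass, consumed as it is produced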

On Fri, 22 Apr 2005 13:56:20 -0700, Glynn Clements wrote:
Personally, I doubt that Haskell will ever be practical for processing very large amounts of data (e.g. larger than your system's RAM).
Well, I haven't had any problems implementing the kind of processing I've been doing. I wrote a utility to scan through a directory structure and compute SHA1 hashes of all of the files. When implemented smartly, it spends about 97% of its time in the SHA1 hash computation (a C library), so it doesn't really have much more overhead than one written in a less pleasant language. Written with lazy lists and a Haskell SHA1, it is hundreds of times slower. I don't think Haskell will be a barrier to writing something like backup software, as long as it is possible to hand the data around in chunks. Dave
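A sketch of the chunk-at-a-time loop being described: fill a reusable buffer from the handle and hand each block to an update action, which here stands in for a call into a C hashing library; the foldChunks name and the 64 KB block size are illustrative only.

import System.IO (Handle, hGetBuf)
import Foreign.Ptr (Ptr)
import Foreign.Marshal.Alloc (allocaBytes)
import Data.Word (Word8)

-- Feed the contents of a handle, block by block, to an arbitrary consumer
-- (e.g. the update step of a hash bound via the FFI).
foldChunks :: Handle -> (Ptr Word8 -> Int -> IO ()) -> IO ()
foldChunks h update = allocaBytes chunkSize loop
  where
    chunkSize = 64 * 1024
    loop buf = do
      got <- hGetBuf h buf chunkSize   -- 0 means end of file
      if got == 0
        then return ()
        else update buf got >> loop buf

The buffer is allocated once and reused, so the per-chunk cost is essentially one hGetBuf call plus whatever the consumer does with the block.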

Peter Simons wrote:
hPutBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
You have these in GHC.IO at the moment.
I know. ;-)
These are also rather more difficult to implement than the single-byte versions, so I can't see nhc98 or Hugs implementing them any time soon.
What is difficult to implement about these functions? Aren't they just wrappers for read(2) and write(2)?
I would expect them to be essentially wrappers for fread() and fwrite() (i.e. the stream's buffering is used), except with a single byte-count argument instead of separate element-size and element-count arguments. -- Glynn Clements

Simon Marlow writes:
hPutBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
I believe they don't have proper non-blocking behaviour on Windows, though.
What exactly would happen if I tried to use those on Windows? Would they block until they had read/written exactly 'Int' bytes? Or is there some other, more subtle difference? Peter
Participants (6):
- David Brown
- Glynn Clements
- John Meacham
- Keean Schupke
- Peter Simons
- Simon Marlow