Re: [Haskell-cafe] Re: Hugs vs GHC (again) was: Re: Some random newbie questions

Ben Rudiak-Gould
If you're reading from a random-access file, there's no way [select] can tell you when the file data is buffered, because it doesn't know which part of the file you plan to read. The OS may try to guess for readahead purposes, but select()'s behavior can't depend on that guess.
The Glibc documentation says, "select determines if there is data available (more precisely, if a call to read(2) will not block)." I think this is reasonably precise. The OS does know where you are going to read (at the file pointer), and if you seek() or pread() instead, well, that is not a call to read(2) and may change everything. Thus the question is, does select() reliably tell if read() would block or does it check for something else? Is the documentation wrong (on some platforms)?
This is another example of why the world would be better off with the file/stream model. Have I convinced anyone?
Uhm, well, methinks open() and read() on files quite accurately model a stream already...
Udo.

Udo Stenzel wrote:
The Glibc documentation says, "select determines if there is data available (more precisely, if a call to read(2) will not block)." I think this is reasonably precise. The OS does know where you are going to read (at the file pointer), and if you seek() or pread() instead, well, that is not a call to read(2) and may change everything.
Thus the question is, does select() reliably tell if read() would block or does it check for something else? Is the documentation wrong (on some platforms)?
Having read around, I have found that select does return readable for all file IO on a block device... I wonder if ghc could use non-blocking mode (files opened with the O_NONBLOCK flag)? In which case you just do the read, and it returns immediately with the current contents of the buffer (up to the size in the read argument)... The scheduler could allow one chance at reading, then give the other Haskell threads a go whilst more data comes in.
Keean.
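The observation that select reports regular files as readable can be checked with a small experiment. The following is only a sketch, assuming a POSIX system (on Windows, select works on sockets only); the helper name `is_readable_now` and the temporary file are illustrative and not part of the thread.

```python
import os
import select
import tempfile


def is_readable_now(fd):
    """Return True if select() reports fd as readable with a zero timeout."""
    readable, _, _ = select.select([fd], [], [], 0)
    return fd in readable


# Regular file: poll it once at the start and once at end-of-file.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    fd = os.open(tmp.name, os.O_RDONLY)
    try:
        file_ready = is_readable_now(fd)
        os.lseek(fd, 0, os.SEEK_END)
        file_ready_at_eof = is_readable_now(fd)
    finally:
        os.close(fd)

# Pipe: poll it while empty, then again after one byte has been written.
r, w = os.pipe()
try:
    pipe_ready_empty = is_readable_now(r)
    os.write(w, b"x")
    pipe_ready_after_write = is_readable_now(r)
finally:
    os.close(r)
    os.close(w)

print(file_ready, file_ready_at_eof, pipe_ready_empty, pipe_ready_after_write)
```

On a typical Linux system the regular file is reported readable in both positions, while the pipe is reported readable only once data is waiting, so select carries no information about whether the file's data is actually in memory.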

On Fri, 2005-01-21 at 16:26 +0000, Keean Schupke wrote:
Udo Stenzel wrote:
Thus the question is, does select() reliably tell if read() would block or does it check for something else? Is the documentation wrong (on some platforms)?
Having read around I have found that select does return readable for all file IO on a block device...
I wonder if ghc could use non-blocking mode (files opened with the O_NONBLOCK flag)? In which case you just do the read, and it returns immediately with the current contents of the buffer (up to the size in the read argument)... The scheduler could allow one chance at reading, then give the other Haskell threads a go whilst more data comes in.
In 6.2 (and in 6.4 without the threaded RTS) this is what ghc already does. It opens all files in non-blocking mode. When a Haskell thread does a read and that read would have blocked, the thread is suspended and another thread is scheduled. The Haskell thread that did the read only becomes runnable again when select() indicates that there is more data available.
However, as people have said, disk files are considered never to block, so all this scheduling stuff will only happen for Haskell threads that are reading from pipes or sockets.
I think you're still confused about what non-blocking means under the traditional Unix interpretation. It does not guarantee that all you have to wait for is for some disk buffer to be copied into your address space (i.e. hardly any time at all). It may mean waiting for as long as it takes to schedule an IO read, seek, perform the DMA transfer and then copy the data to your address space. (Writing, on the other hand, is pretty instantaneous so long as there is kernel IO buffer space available.)
Duncan
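The difference between pipes and disk files under O_NONBLOCK can be illustrated with a short sketch. It assumes a POSIX system and Python 3.5 or later; the helper `try_read` is hypothetical and this is not how the GHC runtime is implemented, only a demonstration of the OS behaviour described above.

```python
import os
import tempfile


def try_read(fd, n):
    """Read up to n bytes; return None if the read would have blocked."""
    try:
        return os.read(fd, n)
    except BlockingIOError:  # EAGAIN / EWOULDBLOCK
        return None


# Pipe in non-blocking mode: an empty pipe makes read() fail immediately,
# which is the point where a scheduler could run another thread.
r, w = os.pipe()
os.set_blocking(r, False)
try:
    empty_pipe_result = try_read(r, 16)
    os.write(w, b"data")
    pipe_result = try_read(r, 16)
finally:
    os.close(r)
    os.close(w)

# Regular file opened with O_NONBLOCK: the flag is accepted, but read()
# never reports "would block"; it simply waits for the disk if it has to.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"on disk")
    tmp.flush()
    fd = os.open(tmp.name, os.O_RDONLY | os.O_NONBLOCK)
    try:
        file_result = try_read(fd, 16)
    finally:
        os.close(fd)

print(empty_pipe_result, pipe_result, file_result)
```

The empty pipe yields the "would block" signal a scheduler can act on, whereas the read from the regular file just returns its data, with any disk latency spent inside the call.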
participants (3)
- Duncan Coutts
- Keean Schupke
- Udo Stenzel