
On Friday 01 September 2006 16:46, Duncan Coutts wrote:
On Fri, 2006-09-01 at 16:28 -0400, Robert Dockins wrote:
On Friday 01 September 2006 15:19, Tamas K Papp wrote:
Hi,
I am newbie, reading the Gentle Introduction. Chapter 7 (Input/Output) says
Pragmatically, it may seem that getContents must immediately read an entire file or channel, resulting in poor space and time performance under certain conditions. However, this is not the case. The key point is that getContents returns a "lazy" (i.e. non-strict) list of characters (recall that strings are just lists of characters in Haskell), whose elements are read "by demand" just like any other list. An implementation can be expected to implement this demand-driven behavior by reading one character at a time from the file as they are required by the computation.
So what happens if I do
    contents <- hGetContents handle
    putStr (take 5 contents)    -- assume that the implementation
                                -- only reads a few chars
    -- delete the file in some way
    putStr (take 500 contents)  -- but the file is not there now
If an IO function is lazy, doesn't that break sequentiality? Sorry if the question is stupid.
This is not a stupid question at all, and it highlights the main problem with lazy IO. The solution is, in essence, "don't do that, because Bad Things will happen". It's pretty unsatisfactory, but there it is. For this reason, lazy IO is widely regarded as somewhat dangerous (or even as an outright misfeature, by a few).
If you are going to be doing simple pipe-style IO (i.e., read some data sequentially, manipulate it, spit out the output), lazy IO is very convenient, and it makes putting together quick scripts very easy. However, if you're doing something more advanced, you'd probably do best to stay away from lazy IO.
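For instance, a complete upper-casing filter in that pipe style is a one-liner around interact, which feeds standard input to a pure function lazily and writes the result out as it is demanded (a minimal sketch):

    import Data.Char (toUpper)

    -- Stream stdin through a pure function; only a buffer's worth of
    -- the input is ever in memory at one time.
    main :: IO ()
    main = interact (map toUpper)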
Since working on Data.ByteString.Lazy I'm now even more of a pro-lazy-IO zealot than I was before ;-)
In practice I expect that most programs that do file IO strictly don't handle the file disappearing out from under them very well either.
That's probably true, except for especially robust applications where such a thing is a regular (or at least expected) event.
At best they probably throw an exception and let something else clean up. The same can be done with lazy IO, though it requires using imprecise exceptions, which some people grumble about. So I would contend that lazy IO is actually applicable in rather a wider range of circumstances than you might think. :-)
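Something along those lines might look like this (a sketch against today's Control.Exception API; "input.txt" is just a stand-in). Forcing the lazy string inside try turns an I/O failure that fires mid-stream into an ordinary exception value we can inspect:

    import Control.Exception (SomeException, evaluate, try)
    import System.IO

    main :: IO ()
    main = do
      h <- openFile "input.txt" ReadMode    -- hypothetical input file
      contents <- hGetContents h
      -- Demand the whole string here; any failure raised during the
      -- demand-driven reads surfaces as an (imprecise) exception.
      result <- try (evaluate (length contents))
      case (result :: Either SomeException Int) of
        Left e  -> putStrLn ("read failed: " ++ show e)
        Right n -> putStrLn ("read " ++ show n ++ " characters")
      hClose h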
Perhaps I should be more clear. When I said "advanced" above I meant "any use whereby you treat a file as random-access, read/write storage, or do any kind of directory manipulation (including deleting and/or renaming files)". Lazy I/O (as it currently stands) doesn't play very nicely with those use cases. I agree generally with the idea that lazy I/O is good. The problem is that it is a "leaky abstraction"; details that should ideally be completely hidden are exposed to the user. Unfortunately, the leaks aren't likely to get plugged without pretty tight operating system support, which I suspect won't be happening anytime soon.
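For those use cases, one way to sidestep the leak is to force the whole file before touching the file system again. A minimal sketch of that idiom (newer versions of base ship System.IO.readFile' with roughly this behaviour):

    import System.IO

    -- Strict variant of readFile: demanding the length forces every
    -- character to be read before withFile closes the handle, so a
    -- later delete or rename cannot truncate the string we hold.
    readFileStrict :: FilePath -> IO String
    readFileStrict path = withFile path ReadMode $ \h -> do
      s <- hGetContents h
      length s `seq` return s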
Note also that with lazy IO we can write really short programs that are blindingly quick. Lazy IO allows us to save a copy through the Handle buffer.
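For instance, counting the lines of standard input with lazy ByteStrings streams the input in chunks and never builds the whole thing in memory (a sketch against the current Data.ByteString.Lazy.Char8 API):

    import qualified Data.ByteString.Lazy.Char8 as L

    -- Chunks are read on demand and can be garbage-collected as soon
    -- as they have been counted, so memory use stays constant.
    main :: IO ()
    main = print . L.count '\n' =<< L.getContents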
BTW, in the above case the "bad thing that will happen" is that contents will be truncated. As I said, I think it's better to throw an exception, which is what Data.ByteString.Lazy.hGetContents does.
Well, AFAIK, the behavior is officially undefined, which is my real beef. I agree that it _should_ throw an exception.
Duncan
-- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG