
On Friday 01 September 2006 16:46, Duncan Coutts wrote:
On Fri, 2006-09-01 at 16:28 -0400, Robert Dockins wrote:
On Friday 01 September 2006 15:19, Tamas K Papp wrote:
Hi,
I am newbie, reading the Gentle Introduction. Chapter 7 (Input/Output) says
Pragmatically, it may seem that getContents must immediately read an entire file or channel, resulting in poor space and time performance under certain conditions. However, this is not the case. The key point is that getContents returns a "lazy" (i.e. non-strict) list of characters (recall that strings are just lists of characters in Haskell), whose elements are read "by demand" just like any other list. An implementation can be expected to implement this demand-driven behavior by reading one character at a time from the file as they are required by the computation.
So what happens if I do
    contents <- hGetContents handle
    putStr (take 5 contents)    -- assume that the implementation
                                -- only reads a few chars
    -- delete the file in some way
    putStr (take 500 contents)  -- but the file is not there now
If an IO function is lazy, doesn't that break sequentiality? Sorry if the question is stupid.
This is not a stupid question at all, and it highlights the main problem with lazy IO. The solution is, in essence, "don't do that, because Bad Things will happen". It's pretty unsatisfactory, but there it is. For this reason, lazy IO is widely regarded as somewhat dangerous (or even as an outright misfeature, by a few).
If you are going to be doing simple pipe-style IO (i.e., read some data sequentially, manipulate it, spit out the output), lazy IO is very convenient, and it makes putting together quick scripts very easy. However, if you're doing something more advanced, you'd probably do best to stay away from lazy IO.
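For instance, a complete upper-casing filter in that pipe style is a one-liner around interact, which feeds standard input to a pure function lazily and writes the result out as it is demanded (a minimal sketch):

    import Data.Char (toUpper)

    -- Stream stdin through a pure function; only a buffer's worth of
    -- the input is ever in memory at one time.
    main :: IO ()
    main = interact (map toUpper)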
Since working on Data.ByteString.Lazy I'm now even more of a pro-lazy-IO zealot than I was before ;-)
In practice I expect that most programs that do file IO strictly don't handle the file disappearing out from under them very well either.
That's probably true, except for especially robust applications where such a thing is a regular (or at least expected) event.
At best they probably throw an exception and let something else clean up. The same can be done with lazy IO, though it requires using imprecise exceptions, which some people grumble about. So I would contend that lazy IO is actually applicable in rather a wider range of circumstances than you might think. :-)
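Something along those lines might look like this (a sketch against today's Control.Exception API; "input.txt" is just a stand-in). Forcing the lazy string inside try turns an I/O failure that fires mid-stream into an ordinary exception value we can inspect:

    import Control.Exception (SomeException, evaluate, try)
    import System.IO

    main :: IO ()
    main = do
      h <- openFile "input.txt" ReadMode    -- hypothetical input file
      contents <- hGetContents h
      -- Demand the whole string here; any failure raised during the
      -- demand-driven reads surfaces as an (imprecise) exception.
      result <- try (evaluate (length contents))
      case (result :: Either SomeException Int) of
        Left e  -> putStrLn ("read failed: " ++ show e)
        Right n -> putStrLn ("read " ++ show n ++ " characters")
      hClose h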
Perhaps I should be more clear. When I said "advanced" above I meant "any use whereby you treat a file as random-access, read/write storage, or do any kind of directory manipulation (including deleting and/or renaming files)". Lazy I/O (as it currently stands) doesn't play very nicely with those use cases. I agree generally with the idea that lazy I/O is good. The problem is that it is a "leaky abstraction"; details that should ideally be completely hidden are exposed to the user. Unfortunately, the leaks aren't likely to get plugged without pretty tight operating system support, which I suspect won't be happening anytime soon.
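For those use cases, one way to sidestep the leak is to force the whole file before touching the file system again. A minimal sketch of that idiom (newer versions of base ship System.IO.readFile' with roughly this behaviour):

    import System.IO

    -- Strict variant of readFile: demanding the length forces every
    -- character to be read before withFile closes the handle, so a
    -- later delete or rename cannot truncate the string we hold.
    readFileStrict :: FilePath -> IO String
    readFileStrict path = withFile path ReadMode $ \h -> do
      s <- hGetContents h
      length s `seq` return s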
Note also that with lazy IO we can write really short programs that are blindingly quick. Lazy IO allows us to save a copy through the Handle buffer.
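For instance, counting the lines of standard input with lazy ByteStrings streams the input in chunks and never builds the whole thing in memory (a sketch against the current Data.ByteString.Lazy.Char8 API):

    import qualified Data.ByteString.Lazy.Char8 as L

    -- Chunks are read on demand and can be garbage-collected as soon
    -- as they have been counted, so memory use stays constant.
    main :: IO ()
    main = print . L.count '\n' =<< L.getContents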
BTW, in the above case the "bad thing that will happen" is that contents will be truncated. As I said, I think it's better to throw an exception, which is what Data.ByteString.Lazy.hGetContents does.
Well, AFAIK, the behavior is officially undefined, which is my real beef. I agree that it _should_ throw an exception.
Duncan
-- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG