Re: [Haskell] Lazy IO breaks purity

Lennart Augustsson wrote:
I don't see any breaking of referential transparency in your code. Every time you do an IO operation the result is basically non-deterministic, since you are talking to the outside world. You're assuming that IO has some kind of semantics that Haskell makes no promises about.
I'm not saying that this isn't a problem, because it is. But it doesn't break referential transparency, it just makes the semantics of IO even more complicated.
(I don't have a formal proof that unsafeInterleaveIO cannot break RT, but I've not seen an example where it does yet.)
So the argument is something like: we can think of the result of a call to unsafeInterleaveIO as having been chosen at the time we called unsafeInterleaveIO, rather than when its result is actually evaluated. This is on dodgy ground, IMO: either you admit that the IO monad contains an Oracle, or you admit it can time-travel. I don't believe in either of those things :-)

Cheers, Simon
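For concreteness, a minimal sketch of the behaviour this argument is about, assuming GHC's System.IO.Unsafe.unsafeInterleaveIO and an IORef as the observable state: the value bound by unsafeInterleaveIO is determined by whatever effects have run by the time it is demanded, not at the point in the IO sequence where it was bound.

import Data.IORef
import System.IO.Unsafe (unsafeInterleaveIO)

main :: IO ()
main = do
  ref <- newIORef "early"
  -- the read is deferred: it runs when v is demanded, not here
  v <- unsafeInterleaveIO (readIORef ref)
  writeIORef ref "late"
  putStrLn v   -- prints "late": the value was not fixed at the bind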
-- Lennart
On Thu, Mar 5, 2009 at 2:12 AM, wrote:

We demonstrate how lazy IO breaks referential transparency. A pure function of the type Int->Int->Int gives different integers depending on the order of evaluation of its arguments. Our Haskell98 code uses nothing but the standard input. We conclude that extolling the purity of Haskell and advertising lazy IO is inconsistent.
Henning Thielemann wrote on Haskell-Cafe:
Say I have a chain of functions: read a file, ... write that to a file, all of these functions are written in a lazy way, which is currently considered good style

Lazy IO should not be considered good style. One of the common definitions of purity is that pure expressions should evaluate to the same results regardless of evaluation order, or that equals can be substituted for equals. If an expression of the type Int evaluates to 1, we should be able to replace every occurrence of the expression with 1 without changing the results and other observables.
The expression (read s), where s is the result of the (hGetContents h1) action, has a pure type, e.g., Int. Yet its evaluation does more than just producing an integer: it may also close the file associated with the handle h1. Closing the file has an observable effect: the file descriptor, which is a scarce resource, is freed. Some OSes lock the open file, or prevent operations such as renaming and deletion on the open file. A file system containing an open file cannot be unmounted. Suppose I use a Haskell application such as an editor to process data from a removable drive. I mount the drive and tell the application the file name. The (still running) application finishes with the file and displays the result. And yet I can't unplug the removable drive, because it turns out that the final result was produced without needing to read all the data from the file, and so the file was never closed.
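A minimal sketch of that scenario, assuming an ordinary file opened with openFile (the helper name firstLine is hypothetical): only the first line is ever demanded, so the lazy read never reaches end-of-file, and the descriptor is released only when the handle is eventually finalised.

import System.IO

firstLine :: FilePath -> IO String
firstLine path = do
  h <- openFile path ReadMode
  s <- hGetContents h      -- h becomes semi-closed; the OS descriptor stays open
  return (head (lines s))  -- a thunk; the rest of the file may never be read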
Some people say: the programmer should have used seq. That is an admission of guilt! An expression (read s)::Int that evaluates to 1 is equal to 1. So one should be able to substitute (read s) with 1. If the result of evaluating 1 turns out not to be needed for the final outcome, then not evaluating (read s) should not affect the outcome. And yet it does. One uses seq to force evaluation of an expression even if its result may be unused. Thus an expression of a pure type has a *side effect*.
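This is roughly the code that advice produces (a sketch; readFirstInt is a hypothetical name): an expression of pure type is forced with seq solely so that its hidden read happens before the handle is closed.

import System.IO

readFirstInt :: FilePath -> IO Int
readFirstInt path = do
  h <- openFile path ReadMode
  s <- hGetContents h
  let n = read (head (words s)) :: Int
  -- without the seq, hClose could run before the lazy read, truncating s
  n `seq` hClose h
  return n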
The advice about using seq reminds me of the advice given to C programmers that they should not free a malloc'ed pointer twice, dereference a null pointer, or write past the boundary of an array. If lazy IO is considered good style given the proper use of seq, then C should be considered safe given the proper use of pointers and arrays.
Side effects like closing a file are observable in the external world. A program may observe these effects, in the IO monad. One can argue that there are no guarantees at all about what happens in the IO monad. Can side effects of lazy IO be observable in _pure_ Haskell code, in functions with pure types? Yes, they can. The examples are depressingly easy to write, once one realizes that reading does have side effects, that POSIX provides for file descriptor aliasing, and that sharing becomes observable with side effects. Here is a simple Haskell98 program.
{- Haskell98! -}

module Main where

import System.IO
import System.Posix.IO (fdToHandle, stdInput)

-- f1 and f2 are both pure functions, with the pure type.
-- Both compute the result of the subtraction e1 - e2.
-- The only difference between them is the sequence of
-- evaluating their arguments, e1 `seq` e2 vs. e2 `seq` e1.
-- For really pure functions, that difference should not be observable.

f1, f2 :: Int -> Int -> Int
f1 e1 e2 = e1 `seq` e2 `seq` e1 - e2
f2 e1 e2 = e2 `seq` e1 `seq` e1 - e2

read_int s = read . head . words $ s

main = do
  let h1 = stdin
  h2 <- fdToHandle stdInput
  s1 <- hGetContents h1
  s2 <- hGetContents h2
  print $ f1 (read_int s1) (read_int s2)
  -- print $ f2 (read_int s1) (read_int s2)
One can compile it with GHC (I used 6.8.2, with and without -O2) and run it like this:

~> /tmp/a.out
1 2
-1

That is, we run the program and type 1 and 2 as the input. The program prints the result, -1. If we comment out the line "print $ f1 (read_int s1) (read_int s2)" and uncomment the last line, the transcript looks like this:

~> /tmp/a.out
1 2
1
Running the code with Hugs exhibits the same behavior. Thus a pure function (-) of the pure type gives different results depending on the order of evaluating its arguments.
Is 1 = -1?

You're assuming that IO operations have semantics.
From the Haskell program's point of view, and when reasoning about Haskell programs (not their interaction with the world), you should assume that every IO operation returns a random result. The way Oleg's program behaves does not break RT under the random-result assumption.
What could break RT is if you could do this:
f :: Int -> Int <- someIOoperation
let x = f 0; y = f 0
and end up with x and y not being equal.
It's (of course) easy to construct someIOoperation that has this behaviour (using FFI), but I don't think you can construct it just using the normal IO operations and unsafeInterleaveIO.
But as I said, I think this is a problem anyway, because IO without semantics is rather useless.
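For concreteness, a sketch of the kind of someIOoperation meant here, using unsafePerformIO rather than the FFI (so it still relies on an unsafe primitive, which is consistent with the claim that ordinary IO operations plus unsafeInterleaveIO cannot do it):

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- an IO action returning a "function" that consults hidden mutable state
someIOoperation :: IO (Int -> Int)
someIOoperation = do
  ref <- newIORef 0
  return $ \n -> unsafePerformIO $ do
    modifyIORef ref (+ 1)
    k <- readIORef ref
    return (n + k)

main :: IO ()
main = do
  f <- someIOoperation
  let x = f 0
      y = f 0
  -- absent common-subexpression elimination, this can print (1,2): x /= y
  print (x, y)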
-- Lennart
On Thu, Mar 5, 2009 at 1:08 PM, Simon Marlow wrote:
Lennart Augustsson wrote:
I don't see any breaking of referential transparency in your code. Every time you do an IO operation the result is basically non-deterministic, since you are talking to the outside world. You're assuming that IO has some kind of semantics that Haskell makes no promises about.
I'm not saying that this isn't a problem, because it is. But it doesn't break referential transparency, it just makes the semantics of IO even more complicated.
(I don't have a formal proof that unsafeInterleaveIO cannot break RT, but I've not seen an example where it does yet.)
So the argument is something like: we can think of the result of a call to unsafeInterleaveIO as having been chosen at the time we called unsafeInterleaveIO, rather than when its result is actually evaluated. This is on dodgy ground, IMO: either you admit that the IO monad contains an Oracle, or you admit it can time-travel. I don't believe in either of those things :-)
Cheers, Simon
-- Lennart
On Thu, Mar 5, 2009 at 2:12 AM, wrote:

We demonstrate how lazy IO breaks referential transparency. A pure function of the type Int->Int->Int gives different integers depending on the order of evaluation of its arguments. Our Haskell98 code uses nothing but the standard input. We conclude that extolling the purity of Haskell and advertising lazy IO is inconsistent.
Henning Thielemann wrote on Haskell-Cafe:
Say I have a chain of functions: read a file, ... write that to a file, all of these functions are written in a lazy way, which is currently considered good style
Lazy IO should not be considered good style. One of the common definitions of purity is that pure expressions should evaluate to the same results regardless of evaluation order, or that equals can be substituted for equals. If an expression of the type Int evaluates to 1, we should be able to replace every occurrence of the expression with 1 without changing the results and other observables.
The expression (read s), where s is the result of the (hGetContents h1) action, has a pure type, e.g., Int. Yet its evaluation does more than just producing an integer: it may also close the file associated with the handle h1. Closing the file has an observable effect: the file descriptor, which is a scarce resource, is freed. Some OSes lock the open file, or prevent operations such as renaming and deletion on the open file. A file system containing an open file cannot be unmounted. Suppose I use a Haskell application such as an editor to process data from a removable drive. I mount the drive and tell the application the file name. The (still running) application finishes with the file and displays the result. And yet I can't unplug the removable drive, because it turns out that the final result was produced without needing to read all the data from the file, and so the file was never closed.
Some people say: the programmer should have used seq. That is an admission of guilt! An expression (read s)::Int that evaluates to 1 is equal to 1. So one should be able to substitute (read s) with 1. If the result of evaluating 1 turns out not to be needed for the final outcome, then not evaluating (read s) should not affect the outcome. And yet it does. One uses seq to force evaluation of an expression even if its result may be unused. Thus an expression of a pure type has a *side effect*.
The advice about using seq reminds me of the advice given to C programmers that they should not free a malloc'ed pointer twice, dereference a null pointer, or write past the boundary of an array. If lazy IO is considered good style given the proper use of seq, then C should be considered safe given the proper use of pointers and arrays.
Side effects like closing a file are observable in the external world. A program may observe these effects, in the IO monad. One can argue that there are no guarantees at all about what happens in the IO monad. Can side effects of lazy IO be observable in _pure_ Haskell code, in functions with pure types? Yes, they can. The examples are depressingly easy to write, once one realizes that reading does have side effects, that POSIX provides for file descriptor aliasing, and that sharing becomes observable with side effects. Here is a simple Haskell98 program.
{- Haskell98! -}

module Main where

import System.IO
import System.Posix.IO (fdToHandle, stdInput)

-- f1 and f2 are both pure functions, with the pure type.
-- Both compute the result of the subtraction e1 - e2.
-- The only difference between them is the sequence of
-- evaluating their arguments, e1 `seq` e2 vs. e2 `seq` e1.
-- For really pure functions, that difference should not be observable.

f1, f2 :: Int -> Int -> Int
f1 e1 e2 = e1 `seq` e2 `seq` e1 - e2
f2 e1 e2 = e2 `seq` e1 `seq` e1 - e2

read_int s = read . head . words $ s

main = do
  let h1 = stdin
  h2 <- fdToHandle stdInput
  s1 <- hGetContents h1
  s2 <- hGetContents h2
  print $ f1 (read_int s1) (read_int s2)
  -- print $ f2 (read_int s1) (read_int s2)
One can compile it with GHC (I used 6.8.2, with and without -O2) and run it like this:

~> /tmp/a.out
1 2
-1

That is, we run the program and type 1 and 2 as the input. The program prints the result, -1. If we comment out the line "print $ f1 (read_int s1) (read_int s2)" and uncomment the last line, the transcript looks like this:

~> /tmp/a.out
1 2
1
Running the code with Hugs exhibits the same behavior. Thus a pure function (-) of the pure type gives different results depending on the order of evaluating its arguments.
Is 1 = -1?

On Thu, 2009-03-05 at 13:08 +0000, Simon Marlow wrote:
Lennart Augustsson wrote:
I don't see any breaking of referential transparency in your code. Every time you do an IO operation the result is basically non-deterministic, since you are talking to the outside world. You're assuming that IO has some kind of semantics that Haskell makes no promises about.
I'm not saying that this isn't a problem, because it is. But it doesn't break referential transparency, it just makes the semantics of IO even more complicated.
(I don't have a formal proof that unsafeInterleaveIO cannot break RT, but I've not seen an example where it does yet.)
So the argument is something like: we can think of the result of a call to unsafeInterleaveIO as having been chosen at the time we called unsafeInterleaveIO, rather than when its result is actually evaluated. This is on dodgy ground, IMO: either you admit that the IO monad contains an Oracle, or you admit it can time-travel. I don't believe in either of those things :-)
That's the charm of denotational semantics --- you're outside the laws of physics.

jcc

On Thu, Mar 5, 2009 at 7:08 AM, Simon Marlow wrote:
So the argument is something like: we can think of the result of a call to unsafeInterleaveIO as having been chosen at the time we called unsafeInterleaveIO, rather than when its result is actually evaluated. This is on dodgy ground, IMO: either you admit that the IO monad contains an Oracle, or you admit it can time-travel. I don't believe in either of those things :-)
Surely there's a quantum mechanical metaphor waiting to happen here.

getCat :: IO Cat

If "getCat" appears in a program text, does it denote or not? Or both? If it does, is the cat alive or dead? (Apologies to Schrödinger: http://en.wikipedia.org/wiki/Schr%C3%B6dinger%27s_cat)

-gregg

On 2009 Mar 5, at 8:08, Simon Marlow wrote:
So the argument is something like: we can think of the result of a call to unsafeInterleaveIO as having been chosen at the time we called unsafeInterleaveIO, rather than when its result is actually evaluated. This is on dodgy ground, IMO: either you admit that the IO monad contains an Oracle, or you admit it can time-travel. I don't believe in either of those things :-)
...hasn't sigfpe demonstrated "time travel" using comonads?

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

On Thu, 2009-03-05 at 13:08 +0000, Simon Marlow wrote:
Lennart Augustsson wrote:
I don't see any breaking of referential transparency in your code. Every time you do an IO operation the result is basically non-deterministic, since you are talking to the outside world. You're assuming that IO has some kind of semantics that Haskell makes no promises about.
I'm not saying that this isn't a problem, because it is. But it doesn't break referential transparency, it just makes the semantics of IO even more complicated.
(I don't have a formal proof that unsafeInterleaveIO cannot break RT, but I've not seen an example where it does yet.)
So the argument is something like: we can think of the result of a call to unsafeInterleaveIO as having been chosen at the time we called unsafeInterleaveIO, rather than when its result is actually evaluated.
But surely that is the mistake. With IO, bound values depend solely on the effects that happened previously (no time travel). With ordinary sequential IO, that's easy to understand and explain, because the binding and the effects happen at the same time/place.

The extension with unsafeInterleaveIO is not to pretend that the value is chosen at the point it is bound. The value depends on the effects, not on the binding. The extension is that we separate the binding point from the effects. The effects can be arbitrarily interleaved with other effects. The value we get does potentially depend on all those prior effects. The arbitrary interleaving of effects is obviously weaker than them happening now, in sequence, before the next action.

This works well when the deferred side effects do not interfere with other side effects, and is non-deterministic when they do interfere, as in Oleg's example. It doesn't break referential transparency though: thanks to memoisation we only observe one final value.

The standard libs, being sensible, mostly only use unsafeInterleaveIO in the cases where we expect little interference. Note how Oleg has to go out of his way to construct the example, circumventing the semi-closed Handle state which was designed to stop people from shooting themselves in the foot.

There used to be a real example where unsafeInterleaveIO did break referential transparency. That was due to concurrency and lack of locking: http://hackage.haskell.org/trac/ghc/ticket/986 When the bug was around, there was a *single* pure value that, if used by two different threads, could be observed to be different. There's nothing similar happening here.

Duncan
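A small sketch of the memoisation point, with an IORef counter standing in for the file-reading effect: however many times the lazily bound value is used, the deferred action runs at most once, so every use observes the same value.

import Data.IORef
import System.IO.Unsafe (unsafeInterleaveIO)

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  -- the counting action is deferred, but it runs at most once
  v <- unsafeInterleaveIO $ do
         modifyIORef counter (+ 1)
         readIORef counter
  print (v + v)                -- forces v once; prints 2
  print =<< readIORef counter  -- prints 1: the result was memoised with the effect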
participants (6):

- Brandon S. Allbery KF8NH
- Duncan Coutts
- Gregg Reynolds
- Jonathan Cast
- Lennart Augustsson
- Simon Marlow