unsafeInterleaveST (and IO) is really unsafe [was: meaning of "referential transparency"]

One may read this message as proving True === False without resorting to IO. In other words, referential transparency, or the substitution of equals for equals, may fail even in expressions of type Bool. This message is intended as an indirect stab at lazy IO. Unfortunately, Lazy IO and even the raw unsafeInterleaveIO, on which it is based, are sometimes recommended, without warnings that usually accompany unsafePerformIO. http://www.haskell.org/pipermail/haskell-cafe/2013-March/107027.html UnsafePerformIO is known to be unsafe, breaking equational reasoning; unsafeInterleaveIO gets a free pass because any computation with it has to be embedded in the IO context in order to be evaluated -- and we can expect anything from IO. But unsafeInterleaveIO has essentially the same code as unsafeInterleaveST: compare unsafeInterleaveST from GHC/ST.lhs with unsafeDupableInterleaveIO from GHC/IO.hs keeping in mind that IO and ST have the same representation, as described in GHC/IO.hs. And unsafeInterleaveST is really unsafe -- not just mildly or somewhat or vaguely unsafe. In breaks equational reasoning, in pure contexts. Here is the evidence. I hope most people believe that the equality on Booleans is symmetric. I mean the function (==) :: Bool -> Bool -> Bool True == True = True False == False = True _ == _ = False or its Prelude implementation. The equality x == y to y == x holds even if either x or y (or both) are undefined. And yet there exists a context that distinguishes x == y from y ==x. That is, there exists bad_ctx :: ((Bool,Bool) -> Bool) -> Bool such that *R> bad_ctx $ \(x,y) -> x == y True *R> bad_ctx $ \(x,y) -> y == x False The function unsafeInterleaveST ought to bear the same stigma as does unsafePerformIO. After all, both masquerade side-effecting computations as pure. Hopefully even lazy IO will get recommended less, and with more caveats. (Lazy IO may be superficially convenient -- so are the absence of typing discipline and the presence of unrestricted mutation, as many people in Python/Ruby/Scheme etc worlds would like to argue.) The complete code: module R where import Control.Monad.ST.Lazy (runST) import Control.Monad.ST.Lazy.Unsafe (unsafeInterleaveST) import Data.STRef.Lazy bad_ctx :: ((Bool,Bool) -> Bool) -> Bool bad_ctx body = body $ runST (do r <- newSTRef False x <- unsafeInterleaveST (writeSTRef r True >> return True) y <- readSTRef r return (x,y)) t1 = bad_ctx $ \(x,y) -> x == y t2 = bad_ctx $ \(x,y) -> y == x

On 10/04/2013, at 2:45 PM,
import Control.Monad.ST.Lazy (runST) import Control.Monad.ST.Lazy.Unsafe (unsafeInterleaveST) import Data.STRef.Lazy
bad_ctx :: ((Bool,Bool) -> Bool) -> Bool bad_ctx body = body $ runST (do r <- newSTRef False x <- unsafeInterleaveST (writeSTRef r True >> return True) y <- readSTRef r return (x,y))
t1 = bad_ctx $ \(x,y) -> x == y -- True t2 = bad_ctx $ \(x,y) -> y == x -- False
If I remember correctly, one of the Griswold systems on the path between SNOBOL and Icon had a special feature for looking below the language level, called "The Window into Hell". Derek Lowe has a list of "Things I Won't Work With". http://pipeline.corante.com/archives/things_i_wont_work_with/ unsafeInterleaveST has just joined my "Things I Won't Work With" list. But since it is new to me, I don't understand what it does or *how* it breaks this code. Does it involve side effects being reordered in some weird way? I think there is a big difference between this and lazy I/O. unsafeInterleaveST *sounds* dangerous. Lazy I/O *sounds* safe. And most of the alternatives (like conduits) hurt my head, so it is really *really* tempting to stay with lazy I/O and think I'm doing something safe.

On Thu, 11 Apr 2013 12:49:40 +1200 "Richard A. O'Keefe"
And most of the alternatives (like conduits) hurt my head
I've understood conduits when I've read the awesome pipes tutorial. http://hackage.haskell.org/packages/archive/pipes/3.2.0/doc/html/Control-Pro...

On Thu, Apr 11, 2013 at 12:49:40PM +1200, Richard A. O'Keefe wrote:
On 10/04/2013, at 2:45 PM,
wrote: ... unsafeInterleaveST is really unsafe ... import Control.Monad.ST.Lazy (runST) import Control.Monad.ST.Lazy.Unsafe (unsafeInterleaveST) import Data.STRef.Lazy
bad_ctx :: ((Bool,Bool) -> Bool) -> Bool bad_ctx body = body $ runST (do r <- newSTRef False x <- unsafeInterleaveST (writeSTRef r True >> return True) y <- readSTRef r return (x,y))
t1 = bad_ctx $ \(x,y) -> x == y -- True t2 = bad_ctx $ \(x,y) -> y == x -- False
[...]
I don't understand what it does or *how* it breaks this code. Does it involve side effects being reordered in some weird way?
As I understand it, unsafeInterleaveST defers the computation of x, so * if x is forced before y, then "writeSTRef r True" is run before "readSTRef r", thus the latter yields "True" * if y is forced before x, then "writeSTRef r True" is run after "readSTRef r", thus the latter yields "False" Tom

Lazy I/O *sounds* safe. And most of the alternatives (like conduits) hurt my head, so it is really *really* tempting to stay with lazy I/O and think I'm doing something safe.
Well, conduit was created for the sake of a web framework. I think all web frameworks, in whatever language, are quite complex, with a steep learning curve. As to alternatives -- this is may be the issue of familiarity or the availability of a nice library of combinators. Here is the example from my FLOPS talk: count the number of words "the" in a file. Lazy IO: run_countTHEL fname = readFile fname >>= print . length . filter (=="the") . words Iteratee IO: run_countTHEI fname = print =<< fileL fname >$< wordsL >$< filterL (=="the") >$< count_i The same structure of computation and the same size (and the same incrementality). But there is even a simple way (when it applies): generators. All languages that tried generators so far (starying from CLU and Icon) have used them to great success.
Derek Lowe has a list of "Things I Won't Work With". http://pipeline.corante.com/archives/things_i_wont_work_with/ This is a really fun site indeed.

On Fri, Apr 12, 2013 at 1:44 AM,
As to alternatives -- this is may be the issue of familiarity or the availability of a nice library of combinators.
It is certainly not just a matter of familiarity, nor availability. Rather, it's a matter of the number of names that are required in a working set. Any Haskell programmer, regardless of whether they use lazy I/O, will already know the meanings of (.), length, and filter. On the other hand, (>$<), count_i, and filterL are new names that must be learned from yet another library -- and much harder than learned, also kept in a mental working set of fluency. This ends up being a rather strong argument for lazy I/O. Not that the code is shorter, but that it (surprisingly) unifies ideas that would otherwise have required separate vocabulary. I'm not saying it's a sufficient argument, just that it's a much stronger one than familiarity, and that it's untrue that some better library might achieve the same thing without the negative consequences. (If you're curious, I do believe that it often is a sufficient argument in certain environments; I just don't think that's the kind of question that gets resolved in mailing list threads.) -- Chris Smith

On 04/10/2013 04:45 AM, oleg@okmij.org wrote:
...
And yet there exists a context that distinguishes x == y from y ==x. That is, there exists bad_ctx :: ((Bool,Bool) -> Bool) -> Bool such that
*R> bad_ctx $ \(x,y) -> x == y True *R> bad_ctx $ \(x,y) -> y == x False
I am not sure that the two statements are equivalent. Above you say that the context distinguishes x == y from y == x and below you say that it distinguishes them in one possible run.
The function unsafeInterleaveST ought to bear the same stigma as does unsafePerformIO. After all, both masquerade side-effecting computations as pure.
Potentially side-effecting computations. There are non-side-effecting uses of unsafePerformIO and unsafeInterleaveST, but verifying this is outside the reach of the type checker. If they have observable side-effects, I'd say the type system has been broken and it does not make sense to have a defined language semantics for those cases.
Hopefully even lazy IO will get recommended less, and with more caveats. (Lazy IO may be superficially convenient -- so are the absence of typing discipline and the presence of unrestricted mutation, as many people in Python/Ruby/Scheme etc worlds would like to argue.)
In essence, lazy IO provides unsafe constructs that are not named accordingly. (But IO is problematic in any case, partly because it depends on an ideal program being run on a real machine which is based on a less general model of computation.)
The complete code:
module R where
import Control.Monad.ST.Lazy (runST) import Control.Monad.ST.Lazy.Unsafe (unsafeInterleaveST) import Data.STRef.Lazy
bad_ctx :: ((Bool,Bool) -> Bool) -> Bool bad_ctx body = body $ runST (do r <- newSTRef False x <- unsafeInterleaveST (writeSTRef r True >> return True) y <- readSTRef r return (x,y))
t1 = bad_ctx $ \(x,y) -> x == y t2 = bad_ctx $ \(x,y) -> y == x
I think this context cannot be used to reliably distinguish x == y and y == x. Rather, the outcomes would be arbitrary/implementation defined/undefined in both cases.
participants (6)
-
Chris Smith
-
kudah
-
oleg@okmij.org
-
Richard A. O'Keefe
-
Timon Gehr
-
Tom Ellis