IO and State (was Re: [Haskell] Re: Global Variables and IO initializers)

Hello, I just wanted to point out that the suggested idea is not quite correct (well, that has to be qualified a bit; see below). Krasimir Angelov wrote:
--- Ben Rudiak-Gould wrote:
This is solved by merging the IO and ST monads, something that ought to be done anyway:
  type IO        = ST RealWorld
  type IORef a   = Ref RealWorld a
  type STRef s a = Ref s a
It is not (should not be?) the case that IO = ST RealWorld, as IO is not a state monad as we understand it. In a state monad, the changes to the state are all in the program, i.e. one can always point to the part of the program that modified the state. On the other hand, the "state" of the RealWorld can change on its own, without anything in the program affecting it. I guess this is similar to "volatile" state in C. For example, one might expect the following rule in a state monad:

  do x <- readSTRef r
     y <- readSTRef r
     f x y
  =
  do x <- readSTRef r
     f x x

But this is not true for the IO monad, as for example reading a file twice does not guarantee that you will get the same result, even if no part of the program ever wrote to the file.

Now the above law already doesn't hold when all GHC extensions are used, as when concurrency is present we cannot assume that nobody modified the state concurrently. As a result all pointers in GHC probably behave as if they were "volatile", which is not very nice.

I think that inherently there are two concepts here, and I see no point in mixing them (even though they are already a bit mixed up, because of stToIO). The one is the usual sequential state that we know and love (and I think we use that a lot of the time). In the sequential state the above optimization is one of the laws specifying the monad. The other concept is that of "volatile" or concurrent state, and then the law doesn't hold. The two monads have a very similar interface (reading and writing pointers, and creating new pointers) but the laws that the operations satisfy are different.

-Iavor
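A minimal sketch of the file-reading point (added here for illustration; it is not from the original message, and the file name "shared.txt" is an assumed stand-in for any file some other process may write to):

  main :: IO ()
  main = do
    a <- readFile "shared.txt"
    length a `seq` return ()   -- force the first (lazy) read to finish
    b <- readFile "shared.txt"
    print (a == b)             -- can print False even though this
                               -- program never writes shared.txt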

At 10:38 08/11/04 -0800, Iavor S. Diatchki wrote:
It is not (should not be?) the case that IO = ST RealWorld, as IO is not a state monad as we understand it. In a state monad, the changes to the state are all in the program, i.e. one can always point to the part of the program that modified the state. On the other hand, the "state" of the RealWorld can change on its own, without anything in the program affecting it. I guess this is similar to "volatile" state in C. For example, one might expect the following rule in a state monad:
  do x <- readSTRef r
     y <- readSTRef r
     f x y
  =
  do x <- readSTRef r
     f x x
But this is not true for the IO monad, as for example reading a file twice does not guarantee that you will get the same result, even if no part of the program ever wrote to the file.
Now the above law already doesn't hold when all GHC extensions are used, as when concurrency is present we cannot assume that nobody modified the state concurrently. As a result all pointers in GHC probably behave as if they were "volatile", which is not very nice.
Eek! I find this most worrisome. And I'm not sure that I agree.

I thought that part of the reason for having a monad was that it was threaded in a useful way through the path of a *single* computation (expression evaluation). If concurrent activities can change that, then I sense that they're breaking something quite fundamental in the way Haskell should work. E.g. in a sequence like:

  v :: SomeMonad
  v = do { e1 ; e2 ; e3 }

Then I think that exactly the monad created by e1 is passed to e2, and the result of e2 passed to e3, without any visible external interference under any circumstance. Concurrency, as I understand it should apply to Haskell, would allow different elements of that computation to be performed in different threads/processes, but the overall result of the computation should not be changeable. Put another way, the graph reduction model for evaluating a Haskell program should not change, just the mechanics of which actual processes (or processors) perform the reduction steps.

Or am I really overlooking something here?

#g
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact
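For comparison, here is a minimal sketch (not from the original mail) of the "threading" picture described above, using the ordinary State monad, where the law really is unbreakable:

  import Control.Monad.State

  v :: State Int Int
  v = do modify (+ 1)   -- e1: the state produced here ...
         modify (* 2)   -- e2: ... is exactly the state consumed here
         get            -- e3

  -- runState v 0 == (2, 2); the answer is a pure function of the
  -- initial state, with no room for outside interference.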

Hello, Concurrency in Haskell (technically GHC) is much like concurrency in other languages, in that you fork off threads and do stuff with them, etc. I think you are perhaps thinking of another kind of concurrency, where different redexes in a term are evaluated simultaneously? Here is an example to illustrate what I was talking about (the point about stToIO making that one law unsafe):
  import Control.Monad.ST
  import Data.STRef
  import Control.Concurrent
  import Control.Monad (when)

  reader :: STRef r Int -> ST r ()
  reader r = do x <- readSTRef r
                y <- readSTRef r
                when (x == y) (reader r)

  writer :: Int -> STRef r Int -> ST r ()
  writer n r = do writeSTRef r n
                  writer (n+1) r

  main :: IO ()
  main = do r <- stToIO (newSTRef 0)
            forkIO (stToIO (writer 1 r))
            stToIO (reader r)
In GHC the whole program stops when the main thread exits. So if the law I was talking about holds, this program should never terminate, as it will forever loop in 'reader'. However since there is a concurrent thread running that can modify the state, if 'reader' is interrupted between the two readSTRefs we will get different values for 'x' and 'y' and 'reader' will stop. I tried that in GHC and it stops after a little while, so the law does not hold.

What is interesting is that I think the law does hold if we execute an ST computation using runST, so at least in principle GHC could perform this optimization in such situations. I think the law holds then because no reference can escape to concurrent threads: if one did, its region parameter would become "RealWorld", and so the computation could not be runSTed.

-Iavor
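To make the runST point concrete, a minimal sketch (not from the original mail): the reference below is created and consumed entirely inside the runST, so its region variable stays abstract, no concurrent thread can reach it, and merging the two reads would be sound.

  import Control.Monad.ST
  import Data.STRef

  twoReads :: Int
  twoReads = runST (do r <- newSTRef (0 :: Int)
                       x <- readSTRef r
                       y <- readSTRef r   -- provably equal to x here
                       return (x + y))

  -- Handing r to another thread would force its region parameter to
  -- RealWorld, and the computation could then no longer be runSTed.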

At 10:35 10/11/04 -0800, Iavor S. Diatchki wrote:
Hello, Concurrency in Haskell (technically GHC) is much like concurrency in other languages, in that you fork off threads and do stuff with them, etc. I think you are perhaps thinking of another kind of concurrency, where different redexes in a term are evaluated simultaneously?
Indeed, I was. I guess this begs the question of why one wants concurrency. I can, offhand, see two reasons:

1. Performance, especially in a multiprocessor environment. In this case, simultaneous evaluation of redexes is the goal, without any visible change in program semantics. (I see this as a possible way of achieving very high degrees of concurrency in an application, as individual concurrent activities can be very lightweight.)

2. Coordination with multiple external processes, or between distinct computations. In this case, I think all the concurrent activities would need to be (explicitly) in an IO monad.

I'm not sufficiently familiar with the specific interfaces used in your example to comment in detail, but when you say:
In GHC the whole program stops when the main thread exits. So if the law I was talking about holds, this program should never terminate, as it will forever loop in 'reader'. However since there is a concurrent thread running that can modify the state, if 'reader' is interrupted between the two readSTRefs we will get different values for 'x' and 'y' and 'reader' will stop. I tried that in GHC and it stops after a little while, so the law does not hold.
I think there may be a problem with the interaction between forking semantics and (non-)strictness in Haskell. Essentially, (reader r) is a non-terminating computation. If the main program requires the value of (reader r) to be fully evaluated, then I think the value of the program itself must be undefined (_|_). But if it never actually needs the value, it should be whatever the rest of the program does return.

If the state value is to be shared between the two branches of a fork, then I think it MUST be embedded in IO for the expected language semantics to be properly honoured. IO, as I understand it, is *the* mechanism provided by Haskell that allows evaluation of an expression to depend on some activity that occurs outside that evaluation.

#g
--
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact
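A sketch (not from the original thread) of Iavor's reader/writer pair with the shared state explicitly embedded in IO, as the last point suggests -- IORefs replace the stToIO-wrapped STRefs, and the behaviour is the same:

  import Data.IORef
  import Control.Concurrent
  import Control.Monad (when)

  readerIO :: IORef Int -> IO ()
  readerIO r = do x <- readIORef r
                  y <- readIORef r
                  when (x == y) (readerIO r)

  writerIO :: Int -> IORef Int -> IO ()
  writerIO n r = do writeIORef r n
                    writerIO (n + 1) r

  main :: IO ()
  main = do r <- newIORef 0
            forkIO (writerIO 1 r)   -- the shared state lives in IO
            readerIO r              -- may stop, exactly as before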

Iavor S. Diatchki wrote:
In GHC the whole program stops when the main thread exits. So if the law I was talking about holds, this program should never terminate, as it will forever loop in 'reader'. However since there is a concurrent thread running that can modify the state, if 'reader' is interrupted between the two readSTRefs we will get different values for 'x' and 'y' and 'reader' will stop. I tried that in GHC and it stops after a little while, so the law does not hold.
I would say that the law holds in one direction and not the other. It's safe to replace

  do x <- readSTRef r
     y <- readSTRef r

with

  do x <- readSTRef r
     let y = x

but not the other way around. The semantics for concurrency don't guarantee that a task switch will /never/ happen between two calls of readIORef, but they don't guarantee that a task switch will /ever/ happen, either, so relying on your sample application terminating is non-portable. Therefore optimizing it in such a way that it never terminates is safe.

If it's important to distinguish ST actions which can be treated as IO from others which can't, you can introduce a type class

  class PrivateState s where {}

and a new function

  newPureSTRef :: forall a s. PrivateState s => a -> ST s (Ref s a)
  newPureSTRef = newRef

There don't need to be any instances of PrivateState; the only important thing is that RealWorld isn't an instance. Any code which uses a Ref created by newPureSTRef is restricted to the PrivateState space and therefore can't be used as part of an IO action. runST would be given the more general type

  runST :: forall a. (forall s. PrivateState s => ST s a) -> a

which would work for pure ST and ordinary (IO-embeddable) ST actions. stToIO would retain its current type, and so would not work with pure ST actions. Your equivalence would apply in both directions to any action of type (forall s. PrivateState s => ST s a).

I don't think this careful approach is necessary in practice, but I hope the fact that it can be done [*] makes the merging of ST and IO look more reasonable.

-- Ben

[*] Except how would you prevent the user from declaring an instance for PrivateState RealWorld? Oh well. It can be done /in principle/. :-)
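The Ref type above belongs to the proposed merged monad, but the shape of the idea can be sketched against today's STRef; everything below is a hypothetical transplant, not an existing API:

  import Control.Monad.ST
  import Data.STRef

  -- No instances anywhere, and in particular none for RealWorld:
  class PrivateState s

  -- References made here live only in PrivateState regions, so they
  -- can never be touched from IO (whose region is RealWorld):
  newPureSTRef :: PrivateState s => a -> ST s (STRef s a)
  newPureSTRef = newSTRef

  -- The generalized runST would have to be a new primitive; it cannot
  -- be built from today's runST, whose callback gets a region with no
  -- PrivateState instance:
  --   runPureST :: (forall s. PrivateState s => ST s a) -> a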

Hello,

Ben Rudiak-Gould wrote:
... I would say that the law holds in one direction and not the other. It's safe to replace
  do x <- readSTRef r
     y <- readSTRef r
with
  do x <- readSTRef r
     let y = x
but not the other way around.
How can things be equal the one way and not the other? I must be misunderstanding what you are saying... As my little program showed, these two pieces of code are not the same, so they cannot be arbitrarily replaced anywhere in the program. In the previous email I was trying to explain that I think they are the same as long as they occur in the context of runST. Of course there are many other contexts where the two are equivalent... The important thing is that there is a context that can tell them apart, and so they are not the same.
The semantics for concurrency don't guarantee that a task switch will /never/ happen between two calls of readIORef, but they don't guarantee that a task switch will /ever/ happen, either, so relying on your sample application terminating is non-portable. Therefore optimizing it in such a way that it never terminates is safe.
My example had nothing to do with non-termination. It illustrated that with the one piece of code, a boolean value in the program will always be True, while with the other it could become False (this is when the program stopped, but you could of course do other things too). Thus, depending on which piece of code you use, your program could have a completely different behaviour, and so the law I was talking about does not hold.

What concurrency semantics are you referring to? GHC provides preemptive concurrency, Hugs provides non-preemptive concurrency. The two are quite different. I was talking about GHC, as I think this is the kind of concurrency people often use. Cooperative concurrency is also useful sometimes, but it is quite a different beast.

-Iavor
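To illustrate the difference between the two: under non-preemptive (cooperative) scheduling a thread runs until it volunteers a switch, so a writer like the one in the earlier example would need an explicit yield before the reader could ever observe a change. A hypothetical sketch:

  import Control.Concurrent (yield)
  import Data.IORef

  cooperativeWriter :: Int -> IORef Int -> IO ()
  cooperativeWriter n r = do
    writeIORef r n
    yield   -- explicit scheduling point; a cooperative scheduler
            -- would never switch away without one
    cooperativeWriter (n + 1) r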

Iavor S. Diatchki wrote:
Ben Rudiak-Gould wrote:
I would say that the law holds in one direction and not the other. [...]
How can things be equal the one way and not the other?
Saying that two things are equal means (in this context) that either can be replaced by the other without changing the semantics. In other words, the first can be replaced by the second without changing the semantics, and the second can be replaced by the first without changing the semantics. One of those replacements is valid and the other invalid (I argue). Of course it follows that the two things are not equal -- i.e. I agree with you. :-)
The semantics for concurrency don't guarantee that a task switch will /never/ happen between two calls of readIORef, but they don't guarantee that a task switch will /ever/ happen, either, so relying on your sample application terminating is non-portable. Therefore optimizing it in such a way that it never terminates is safe.
My example had nothing to do with non-termination.
Nor did my response. I could change the ending to: "[...] so relying on (x == y) ever being False is non-portable. Therefore optimizing it in such a way that it is never False is safe."
It illustrated that with the one piece of code, a boolean value in the program will be always True, while with the other it could become False [...]
Right, in the first piece of code the expression is guaranteed to be True, while in the second there are no guarantees -- the value could be True or False each time. Therefore it's safe to transform the second expression into the first during optimization, because "always True" is a special case of "anything at all". -- Ben

Hello, I see now what you meant; thanks for the explanation. I find the argument a bit disturbing, as it seems to imply that it is OK for the compiler to produce code without any context switches at all (after all, none of the context switches are guaranteed to happen :-). I guess this is not true when MVars and other synchronization primitives are taken into account, e.g. the semantics could say that a context switch will definitely occur when a thread blocks on an empty MVar.

In any case, the above comments further emphasize the original point of my post, namely that the IO monad is not a state monad as we know it: in a state monad there is a close relationship between reads and writes to the state, as only the program modifies the state. In the IO monad, the two are much more unrelated, as the real world can change on its own, and thus there are no guarantees that there is any relationship between reads and writes. A simplistic example to illustrate my point might be that reading a file twice does not guarantee that we get the same contents, or writing to a file and then reading it does not guarantee that we get what we wrote.

-Iavor

Ben Rudiak-Gould wrote:
... Saying that two things are equal means (in this context) that either can be replaced by the other without changing the semantics. In other words, the first can be replaced by the second without changing the semantics, and the second can be replaced by the first without changing the semantics. One of those replacements is valid and the other invalid (I argue). Of course it follows that the two things are not equal -- i.e. I agree with you. :-)

... Nor did my response. I could change the ending to: "[...] so relying on (x == y) ever being False is non-portable. Therefore optimizing it in such a way that it is never False is safe."

... Right, in the first piece of code the expression is guaranteed to be True, while in the second there are no guarantees -- the value could be True or False each time. Therefore it's safe to transform the second expression into the first during optimization, because "always True" is a special case of "anything at all".
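On the MVar point above: a minimal sketch (not from the original mail) of why blocking on an empty MVar guarantees a context switch -- the main thread simply cannot proceed until the forked thread has run:

  import Control.Concurrent

  main :: IO ()
  main = do
    box <- newEmptyMVar
    forkIO (putMVar box "from the other thread")
    msg <- takeMVar box   -- blocks here until the child thread runs
    putStrLn msg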

"Iavor S. Diatchki"
I find the argument a bit disturbing, as it seems to imply that it is OK for the compiler to produce code without any context switches at all
Note that in this case, if the main program doesn't explicitly block on MVars, I/O, or timeouts, then finalizers will never be run, and finalized objects would be kept in memory despite garbage collection. So such an implementation would not be able to run certain programs which create lots of finalized objects, even if almost all of them become dead soon after creation.

--
 __("<  Marcin Kowalczyk
 \__/    qrczak@knm.org.pl
  ^^     http://qrnik.knm.org.pl/~qrczak/

Marcin 'Qrczak' Kowalczyk wrote:
"Iavor S. Diatchki"
writes: I find the argument a bit disturbing, as it seems to imply that it is OK for the compiler to produce code without any context switches at all
Note that in this case, if the main program doesn't explicitly block on MVars, I/O, or timeouts, then finalizers will never be run, and finalized objects would be kept in memory despite garbage collection. So such an implementation would not be able to run certain programs which create lots of finalized objects, even if almost all of them become dead soon after creation.
This is little different from the situation with GC-based finalization in other languages (cf Java). This is why finalizers are generally viewed as a backstop to clean up resources which were accidentally left un-freed, rather than a universal solution to cleanup. -Jan-Willem Maessen