RE: [Haskell-cafe] One-shot? (was: Global variables and stuff)

On 09 November 2004 11:54, Graham Klyne wrote:
I've not been following the Global variables debate too closely, it seeming to have something of a religious wars flavour, but I noticed that an example being proposed was how to achieve a "one shot" execution. Here's something I did when working on modifications to the HaXML parser:
[[ -- Memoization of withSocketsDo to prevent multiple calls. -- (cf. http://tangentsoft.net/wskfaq/articles/lame-list.html) socketsInitialized :: Bool socketsInitialized = unsafePerformIO ( withSocketsDo ( return True ) ) ]]
Does it work as I think it does? ARe there any problems I'm overlooking?
You should add {-# NOINLINE socketsInitialized #-} to be on the safe side, but in practice GHC won't inline it anyway. The pragma would be good documentation though. Cheers, Simon

I have written a small library for supporting one-shot without using unsfePerformIO... The library uses SYSV semaphores under linux to make sure the functional argument of "once" is only ever run once. It uses the ProcessID as the key for the semaphore, so will even enforce the once-only property accross different Haskell threads. Some semaphore functions are also exported, allowing other constraints to be used (for example, once only over multiple processes by using a constant ID rather than the processID. I have attached the source for the library incase anyone is interested. If people think it is useful I could put it up on a website (let me know). Also attached is an example, which can be compiled with: ghc -o test NamedSem.hs Test.hs -package posix Keean.

What about the following? It does use unsafePerformIO, but only to wrap newMVar in this specific case. once :: Typeable a => IO a -> IO a once m = let {-# NOINLINE r #-} r = unsafePerformIO (newMVar Nothing) in do y <- takeMVar r x <- case y of Nothing -> m Just x -> return x putMVar r (Just x) return x The "Typeable" constraint forces the return value to be monomorphic, which prevents the following from happening (the first line doesn't type check under the constraint):
let ref = once (newIORef []) :t ref ref :: forall a. IO (IORef [a]) ref >>= flip writeIORef "foo" ref >>= readIORef >>= (\(x::[Bool]) -> print x) [Illegal instruction
Additionally, I'd like to repeat the point that "once" (whether
defined my way or Keean's) is
not just a consequence of module initialization; it can actually
replace it in most cases!
For example:
myRef :: IO (IORef Char)
myRef = once (newIORef 'a')
readMyRef :: IO Char
readMyRef = myRef >>= readIORef
writeMyRef :: Char -> IO ()
writeMyRef c = myRef >>= flip writeIORef c
A library interface might consist of readMyRef and writeMyRef, while hiding
myRef itself from the user. However, what happens in IO stays in the
IO monad; myRef is an action, so the IORef is not initialized until
the first time
that one of read/writeMyRef is called. Indeed, any action wrapped by once
will only be run in the context of the IO monad. IMO, this is the primary
advantage of a function like once over the proposal for top-level
x <- someAction
where the exact time someAction is evaluated is unspecified.
Are there any applications of module initialization for which once
does not suffice?
-Judah
On Wed, 10 Nov 2004 17:11:31 +0000, Keean Schupke
I have written a small library for supporting one-shot without using unsfePerformIO... The library uses SYSV semaphores under linux to make sure the functional argument of "once" is only ever run once. It uses the ProcessID as the key for the semaphore, so will even enforce the once-only property accross different Haskell threads. Some semaphore functions are also exported, allowing other constraints to be used (for example, once only over multiple processes by using a constant ID rather than the processID.
I have attached the source for the library incase anyone is interested. If people think it is useful I could put it up on a website (let me know). Also attached is an example, which can be compiled with:
ghc -o test NamedSem.hs Test.hs -package posix
Keean.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

OK, I'll play again.. On Wednesday 10 Nov 2004 4:39 pm, Judah Jacobson wrote:
What about the following? It does use unsafePerformIO, but only to wrap newMVar in this specific case.
once :: Typeable a => IO a -> IO a once m = let {-# NOINLINE r #-} r = unsafePerformIO (newMVar Nothing) in do y <- takeMVar r x <- case y of Nothing -> m Just x -> return x putMVar r (Just x) return x
My initial feeling is that this kind of swizzles the problem around a bit and leaves us right back where we started. Pretty much any use of unsafePerformIO is unsound, though whether or not this has any bad consequences probably depends a lot on the context it's used what transformations and optimisations the compiler implements. But I'd really like to avoid using it at all if possible. Unless I'm missing something, once is still unsafe (see below..)
Additionally, I'd like to repeat the point that "once" (whether defined my way or Keean's) is not just a consequence of module initialization; it can actually replace it in most cases!
Hmm, must of missed that..
For example:
myRef :: IO (IORef Char) myRef = once (newIORef 'a')
readMyRef :: IO Char readMyRef = myRef >>= readIORef
writeMyRef :: Char -> IO () writeMyRef c = myRef >>= flip writeIORef c
Suppose I had.. myOtherRef :: IO (IORef Char) myOtherRef = once (newIORef 'a') There's nothing to stop the compiler doing CSE and producing, in effect.. commonRef :: IO (IORef Char) commonRef = once (newIORef 'a') .. followed by substitution of all occurrences of myRef and myOtherRef with commonRef. I think this would break your programs.
A library interface might consist of readMyRef and writeMyRef, while hiding myRef itself from the user. However, what happens in IO stays in the IO monad; myRef is an action, so the IORef is not initialized until the first time that one of read/writeMyRef is called. Indeed, any action wrapped by once will only be run in the context of the IO monad. IMO, this is the primary advantage of a function like once over the proposal for top-level x <- someAction where the exact time someAction is evaluated is unspecified.
AFAICS this is only true if the argument to once is something like (newIORef 'a), in which case the same is also true of the x <- newIORef 'a' solution. Also my own pet solution (SafeIO monad) and Koen's CIO monad are intended to make it impossible to use constructions whose initial value depends on when construction occurs. The intention of the (x <- constructor) proposal is more than just syntactic sugar for (x = unsafePerformIO constructor). I think something like this really is necessary for a proper solution, though not everybody agrees what that should be at the moment (and some don't seem agree that there's a problem in the first place :-) Hope I haven't missed anything. (I'm sure you'll let me know if you think I'm being stupid :-) Regards -- Adrian Hey

Adrian Hey wrote:
Suppose I had..
myOtherRef :: IO (IORef Char) myOtherRef = once (newIORef 'a')
There's nothing to stop the compiler doing CSE and producing, in effect..
commonRef :: IO (IORef Char) commonRef = once (newIORef 'a')
.. followed by substitution of all occurrences of myRef and myOtherRef with commonRef. I think this would break your programs.
Well thats certainly not the case with the solution I posted. There are no unsafe operations at all (Well the FFI imports are marked 'unsafe' but that means it is unsafe for them to call back into Haskell)... The solution I posted uses no unsafePerformIO, and the type of action allowed is limited to "IO ()". In other word all results must be returned as side effects, and the computation either happens (if its the first time) or doesn't happen... Also the semaphore method is very flexible, you can run things exactly twice, or you can reset the semaphore and allow the next init to happen (sort of closing the library and opening it again)... Also once-ness can be restricted accross many domais, thread, process, process-group, multiple sequential executions, once and for all (may only run once on any machine)... Of course in this case it ties up one semaphore indefiniely - but hey its a single bit, and there are 65536x32 semaphores available, so its not really going to cause a problem. Posix(1b) semaphores with string names would be even better (and have a slightly neater interface) but linux does not seem to support sem_open, so I have had to use SYSV semaphores. Keean.

Hi, Here's another completely safe (and simpler way) to limit a computation to only happen once: once' :: IO () -> IO () once' f = do k <- getProcessID a <- getEnv (showString "MyApp.Main" $ show k) case a of Just _ -> return () _ -> do f setEnv (showString "MyApp.Main" $ show k) "" False Actually both this and the semaphore example show that there is probably an alternative to allowing top-level '<-' type definitions - and that would be to have named-MVars in Haskell. It would be quite easy to code these as a small C library - and then FFI import the bindings to Haskell. I don't know whether from a 'purists' point of view whether this represents anything better than module-initialisations - but it does remove the diamond-inheritance style problem you get (if B imports A and C imports A and Main imports B and C, does A's init get run twice? are there two copies of the variables, each initialised once, or one copy that gets initialised twice?). But either way the idea could be implemented without requiring changes to the language spec or the compilers, and would just be a library which you could use. Keean.

Keean, As far as I can tell, both your solutions to the "one-shot" problem require that: (a) the expression to be one-shotted is in the IO monad. That seems reasonable, since why else does one care (semantically speaking)? (b) they depend on the host operating system platform (semaphores, process id, environment variables) rather than pure Haskell language features. Wouldn't it be easier to simply define "once" as a common Haskell library function? #g -- At 23:36 10/11/04 +0000, Keean Schupke wrote:
Hi, Here's another completely safe (and simpler way) to limit a computation to only happen once:
once' :: IO () -> IO () once' f = do k <- getProcessID a <- getEnv (showString "MyApp.Main" $ show k) case a of Just _ -> return () _ -> do f setEnv (showString "MyApp.Main" $ show k) "" False
Actually both this and the semaphore example show that there is probably an alternative to allowing top-level '<-' type definitions - and that would be to have named-MVars in Haskell. It would be quite easy to code these as a small C library - and then FFI import the bindings to Haskell. I don't know whether from a 'purists' point of view whether this represents anything better than module-initialisations - but it does remove the diamond-inheritance style problem you get (if B imports A and C imports A and Main imports B and C, does A's init get run twice? are there two copies of the variables, each initialised once, or one copy that gets initialised twice?). But either way the idea could be implemented without requiring changes to the language spec or the compilers, and would just be a library which you could use.
Keean.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

Graham Klyne wrote:
Keean,
As far as I can tell, both your solutions to the "one-shot" problem require that:
(a) the expression to be one-shotted is in the IO monad. That seems reasonable, since why else does one care (semantically speaking)?
(b) they depend on the host operating system platform (semaphores, process id, environment variables) rather than pure Haskell language features.
Wouldn't it be easier to simply define "once" as a common Haskell library function?
#g --
Erm, it is a library function (I provided the NamedSem library as an attachment)... Are you suggesting it would be nice to be able to do this without talking to the OS? Remember a process is an operating system level identity... The boundries of a process are controlled by the OS not the language - therefore I think it is entirely appropriate to use it for this. Also see my proposal for NamedMVars, This would move the concept of this solution into a purely Haskell space, and would not need to communicate with the OS. The NamedMVar library could be implemented with a bit of C and Haskell in a library module - without changes to the language spec and the compiler. I guess I was relly looking for comments on the general technique to detemine if it is worth my while writing this (NamedMVar) library... Keean.

At 11:31 11/11/04 +0000, Keean Schupke wrote:
Wouldn't it be easier to simply define "once" as a common Haskell library function?
Erm, it is a library function (I provided the NamedSem library as an attachment)... Are you suggesting it would be nice to be able to do this without talking to the OS?
OK, I didn't sufficiently emphasize *common* library function. Maybe I should have said "standard". I'm not suggesting that the calling program should not use the operating system, but that it should be insulated from the details of said OS. Not all OSes have process IDs or named semaphores. This discussion has shown many ways to implement "once", and which is best may depend upon the underlying OS. #g ------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

Graham Klyne wrote:
Wouldn't it be easier to simply define "once" as a common Haskell library function?
Depends on the type and the expected semantics. As Adrian Hey already pointed out, (once :: IO a -> IO a) with the obvious semantics is never going to be safe, because it's just not the case that x = once (newIORef ()) y = x has the same intended meaning as x = once (newIORef ()) y = once (newIORef ()) No amount of compiler-specific magic is going to fix this. On the other hand, these are perfectly safe: once' :: IO a -> IO (IO a) oncePerString :: String -> IO a -> IO a oncePerType :: Typeable a => IO a -> IO a once' seems virtually useless unless you have top-level <-, but the other two don't need it. I'm not sure which would be preferable. I lean toward oncePerString as more flexible and predictable, though it requires a certain discipline on the part of its users. In any case there would need to be support for different scopes: perProcess :: String -> IO a -> IO a perThread :: String -> IO a -> IO a perMachine :: String -> IO a -> IO a I suppose you could add perType :: Typeable a => IO a -> IO a with the stipulation that types in different processes are distinct (which seems like the only safe assumption). -- Ben

At 12:27 11/11/04 +0000, Ben Rudiak-Gould wrote:
Graham Klyne wrote:
Wouldn't it be easier to simply define "once" as a common Haskell library function?
Depends on the type and the expected semantics. As Adrian Hey already pointed out, (once :: IO a -> IO a) with the obvious semantics is never going to be safe, because it's just not the case that
x = once (newIORef ()) y = x
has the same intended meaning as
x = once (newIORef ()) y = once (newIORef ())
No amount of compiler-specific magic is going to fix this.
Ah, yes, I take the point now. Isn't this generally the case for any value in the IO monad? (Brushing a murky area of equivalence; the same IO computation used twice may yield different results, so I'm not clear to what extent it is meaningful to say that any IO value is the same as any other, including itself, in any observable sense.) #g --
On the other hand, these are perfectly safe:
once' :: IO a -> IO (IO a) oncePerString :: String -> IO a -> IO a oncePerType :: Typeable a => IO a -> IO a
once' seems virtually useless unless you have top-level <-, but the other two don't need it. I'm not sure which would be preferable. I lean toward oncePerString as more flexible and predictable, though it requires a certain discipline on the part of its users.
In any case there would need to be support for different scopes:
perProcess :: String -> IO a -> IO a perThread :: String -> IO a -> IO a perMachine :: String -> IO a -> IO a
I suppose you could add
perType :: Typeable a => IO a -> IO a
with the stipulation that types in different processes are distinct (which seems like the only safe assumption).
-- Ben
------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

Graham Klyne wrote:
At 12:27 11/11/04 +0000, Ben Rudiak-Gould wrote: [..]
going to be safe, because it's just not the case that
x = once (newIORef ()) y = x
has the same intended meaning as
x = once (newIORef ()) y = once (newIORef ())
No amount of compiler-specific magic is going to fix this.
Ah, yes, I take the point now.
Isn't this generally the case for any value in the IO monad? (Brushing a murky area of equivalence; the same IO computation used twice may yield different results, so I'm not clear to what extent it is meaningful to say that any IO value is the same as any other, including itself, in any observable sense.)
No. "getChar" is always "the IO operation that reads a character from stdin". You can always substitute one instance of "getChar" for another; you can even say "foo = getChar" and substitute "foo" for every occurrence of "getChar". A value of type IO a is a *computation*; its result may change, but the computation itself cannot. --KW 8-)

At 16:07 11/11/04 +0000, Keith Wansbrough wrote:
Graham Klyne wrote:
At 12:27 11/11/04 +0000, Ben Rudiak-Gould wrote: [..]
going to be safe, because it's just not the case that
x = once (newIORef ()) y = x
has the same intended meaning as
x = once (newIORef ()) y = once (newIORef ())
No amount of compiler-specific magic is going to fix this.
Ah, yes, I take the point now.
Isn't this generally the case for any value in the IO monad? (Brushing a murky area of equivalence; the same IO computation used twice may yield different results, so I'm not clear to what extent it is meaningful to say that any IO value is the same as any other, including itself, in any observable sense.)
No. "getChar" is always "the IO operation that reads a character from stdin". You can always substitute one instance of "getChar" for another; you can even say "foo = getChar" and substitute "foo" for every occurrence of "getChar". A value of type IO a is a *computation*; its result may change, but the computation itself cannot.
So you say (and I do agree). But how can I *observe* that they are the same? #g -- ------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

So you say (and I do agree). But how can I *observe* that they are the same?
Well, not with a single program run (obviously). But it is the case that for any program P and input sequence X (i.e., keys pressed): running P with X and running {foo = getChar; P'} with X (where P' is P with all occurrences of getChar replaced by foo) will behave identically. --KW 8-)

On Thu, 11 Nov 2004 12:27:17 +0000, Ben Rudiak-Gould
On the other hand, these are perfectly safe:
once' :: IO a -> IO (IO a) oncePerString :: String -> IO a -> IO a oncePerType :: Typeable a => IO a -> IO a
once' seems virtually useless unless you have top-level <-, but the other two don't need it. I'm not sure which would be preferable. I lean toward oncePerString as more flexible and predictable, though it requires a certain discipline on the part of its users.
Reflecting on the matter, I don't think that oncePerString is type-safe. For example, it allows us to create the following: ref :: IO (IORef a) ref = oncePerString "foo" (newIORef undefined) Here's an example in which we subvert the type system (and probably crash the program) by writing a String and reading an Int from the same IORef: do ref >>= writeIORef ("foo") (x :: Int) <- ref >>= readIORef print x This is similar to the reason for ML's value monomorphism restriction. In contrast, oncePerType preserves monomorphism nicely, since all instances of Typeable are monomorphic. Thoughts? Am I missing something? -Judah

I think you are right... The only safe operation I can see for a one-time init is type IO (). All results have to be returned via side effects. Hence with my named-MVar proposal the first execution of the init function initialises certain named-MVars, and subsequent executions do nothing at all. The functions in the library would use the names-MVars directly. Therefore the once function in the NamedSem library is: once :: IO () -> IO () Keean. Judah Jacobson wrote:
On Thu, 11 Nov 2004 12:27:17 +0000, Ben Rudiak-Gould
wrote: On the other hand, these are perfectly safe:
once' :: IO a -> IO (IO a) oncePerString :: String -> IO a -> IO a oncePerType :: Typeable a => IO a -> IO a
once' seems virtually useless unless you have top-level <-, but the other two don't need it. I'm not sure which would be preferable. I lean toward oncePerString as more flexible and predictable, though it requires a certain discipline on the part of its users.
Reflecting on the matter, I don't think that oncePerString is type-safe. For example, it allows us to create the following:
ref :: IO (IORef a) ref = oncePerString "foo" (newIORef undefined)
Here's an example in which we subvert the type system (and probably crash the program) by writing a String and reading an Int from the same IORef:
do ref >>= writeIORef ("foo") (x :: Int) <- ref >>= readIORef print x
This is similar to the reason for ML's value monomorphism restriction. In contrast, oncePerType preserves monomorphism nicely, since all instances of Typeable are monomorphic.
Thoughts? Am I missing something? -Judah _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Thu, 11 Nov 2004 20:14:24 +0000, Keean Schupke
I think you are right... The only safe operation I can see for a one-time init is type IO (). All results have to be returned via side effects. Hence with my named-MVar proposal the first execution of the init function initialises certain named-MVars, and subsequent executions do nothing at all. The functions in the library would use the names-MVars directly. Therefore the once function in the NamedSem library is:
once :: IO () -> IO ()
Keean.
Actually, I don't see anything wrong on the face of it with oncePerType :: Typeable a => IO a -> IO a since the only instances of Typeable are monomorphic. Indeed, the implementation seems pretty straightforward: store the results of already-run computations as Dynamic values in a global dictionary, keyed by TypeRep. -Judah

Actually, I don't see anything wrong on the face of it with oncePerType :: Typeable a => IO a -> IO a since the only instances of Typeable are monomorphic. Indeed, the implementation seems pretty straightforward: store the results of already-run computations as Dynamic values in a global dictionary, keyed by TypeRep.
What happens if more than one 'once' function returns the same type? I guess you could wrap the types in a unique constructor. Infact you could use the HList library to produce a type indexed list with a unqiue type constraint that can be enforced at compile time. The version with the String key also seems interesting, as the key could be the name of the function being memoised, which has to be unique. Template-Haskell could be used to generate the reified string from the function name, it would end up like (in the new TH syntax): a <- $(once 'function) a b c Keean.

On Thursday 11 Nov 2004 12:27 pm, Ben Rudiak-Gould wrote:
On the other hand, these are perfectly safe:
once' :: IO a -> IO (IO a) oncePerString :: String -> IO a -> IO a oncePerType :: Typeable a => IO a -> IO a
once' seems virtually useless unless you have top-level <-, but the other two don't need it. I'm not sure which would be preferable. I lean toward oncePerString as more flexible and predictable, though it requires a certain discipline on the part of its users.
Having taken a bit of time to look at this, I have to say that IMO saying they are "perfectly safe" is over stating things a bit :-) The only one that is perfectly safe is the first, but as you say, is useless (at least for the purposes under discussion) without the top-level <- extension. AFAICS the other two are unsound hacks. So it seems to me that either the top-level <- extension (in one form or another) really is necessary, or that top level TWI's are unnecessary. The latter is probably true, in a strict technical sense. But I can't see a way to live without them and keep modularity. In any case, I don't think there's any reason to force programmers "wear the hair shirt" in this respect (other than dogma and superstitious fear about the evils of "global variables" :-) Regards -- Adrian Hey

Adrian Hey wrote:
The latter is probably true, in a strict technical sense. But I can't see a way to live without them and keep modularity. In any case, I don't think there's any reason to force programmers "wear the hair shirt" in this respect (other than dogma and superstitious fear about the evils of "global variables" :-)
I agree about not wearing a hair-shirt, but how do you propose to solve the multiple import problem: B imports A, C imports A, D imports B & C. Now the top level inits (a <- computation) in A, do they happen once, defining the same init A.a, do they happen twice, perhaps initialising B.A.a and C.A.a, do they happen twice, meaning that 'a' may have different values depending on whether it is accessed from B or C? for example: module A where a <- newChan 0 module B where import A b <- do {writeChan A.a 7;return ()} module C import A c <- do {writeChan A.a 6;return ()} module D import A import B d <- readChan A.a does this mean the same as: module D' import B import A d <- readChan A.a Should values really depend on the order of includes? Even if you limit things to just newChan in top level '<-' you still don't know if A.a in B the same A.a in C. Perhaps it is enough to say A.a only exists once no matter how many times it is directly or indirectly imported? Keean.

Should values really depend on the order of includes? Even if you limit things to just newChan in top level '<-' you still don't know if A.a in B the same A.a in C. Perhaps it is enough to say A.a only exists once no matter how many times it is directly or indirectly imported?
This strikes me as the only sane thing to do. Are there any reasons you might want C.A.a to be different than B.A.a? In addition, perhaps we should require that modules using TWIs not have cicular dependancies. Then all init actions can be topo sorted by dependencies. Would those restrictions solve the problems that have been floating aroud?

This still has a problem. Lets say B implements some useful function that relies on A. Now also C implements some different useful function and also relies on A in its implementation. If there is only one A.a then using both B and C features in the same code will potentally break both B and C. The only thing that makes sense in this case is that the A imported by B is distinct from the A imported by C. The problem as far as I can see it is that you can create sensible examples that require the behaviour to be one way or the other. Perhaps one solution is to allow top level '<-' but to not allow them to be exported? Keean. Robert Dockins wrote:
Should values really depend on the order of includes? Even if you limit things to just newChan in top level '<-' you still don't know if A.a in B the same A.a in C. Perhaps it is enough to say A.a only exists once no matter how many times it is directly or indirectly imported?
This strikes me as the only sane thing to do. Are there any reasons you might want C.A.a to be different than B.A.a?
In addition, perhaps we should require that modules using TWIs not have cicular dependancies. Then all init actions can be topo sorted by dependencies.
Would those restrictions solve the problems that have been floating aroud?
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Friday 12 Nov 2004 3:20 pm, Keean Schupke wrote:
Adrian Hey wrote:
The latter is probably true, in a strict technical sense. But I can't see a way to live without them and keep modularity. In any case, I don't think there's any reason to force programmers "wear the hair shirt" in this respect (other than dogma and superstitious fear about the evils of "global variables" :-)
I agree about not wearing a hair-shirt, but how do you propose to solve the multiple import problem: B imports A, C imports A, D imports B & C. Now the top level inits (a <- computation) in A, do they happen once, defining the same init A.a, do they happen twice, perhaps initialising B.A.a and C.A.a, do they happen twice, meaning that 'a' may have different values depending on whether it is accessed from B or C? for example:
I'm not sure I understand what problem you think there is. Are the inits you're talking about module inits? If so, I don't think there's a problem, for several reasons. The idea under discussion is that a top level (x <- newThing) should be lazy, (no action at all occurs until value of x is demanded). IOW, it's exactly the same as the current unsafePerformIO hack, but not unsafe because the compiler knows the semantics. So there is no implied "module initialisation".
module A where a <- newChan 0
module B where import A b <- do {writeChan A.a 7;return ()}
module C import A c <- do {writeChan A.a 6;return ()}
module D import A import B
d <- readChan A.a
does this mean the same as:
module D' import B import A
d <- readChan A.a
Should values really depend on the order of includes?
No, and they don't. Firstly, since the values of neither b or c are demanded, no writes will occur. Secondly, ordering is not dependent on import ordering, it's depencency ordering only. Assuming laziness, actions are performed as and when the corresponding TWI's are required (perhaps never). Thirdly, ordering doesn't matter anyway. The point of the restricted monad proposal (SafeIO or whatever), was to address precisely this ordering issue. With SafeIO ordering doesn't matter because you cannot perform any IO. All you can do is create IORefs (MVars etc..). You can't read or write them, nor can you do any other IO. So assuming we're using this monad for <- bindings, you wouldn't be able to use readChan or writeChan in any case. Regards -- Adrian Hey

I'm not sure I understand what problem you think there is. Are the inits you're talking about module inits? If so, I don't think there's a problem, for several reasons.
The idea under discussion is that a top level (x <- newThing) should be lazy, (no action at all occurs until value of x is demanded). IOW, it's exactly the same as the current unsafePerformIO hack, but not unsafe because the compiler knows the semantics. So there is no implied "module initialisation"
Okay - I can see that with lazy semantics this might not be a problem... What happens with the second problem: That where module B uses A internally and C uses A internally, then I write a new module that tries to use B & C together... This potentially breaks B & C. I think you need the extra restriction that the top level '<-' bindings must not be exported. So where does that leave us. Top level inits are safe (I think) iff: - They are lazy (the definition only happens when required) - They contain only a subset of IO actions - namely those concerned with name creation within Haskell that don't actually do any IO. - They are not exportable from the module that contains them. I think that covers it... have I forgotten anything? Keean.

On Saturday 13 Nov 2004 9:15 am, Keean Schupke wrote:
I'm not sure I understand what problem you think there is. Are the inits you're talking about module inits? If so, I don't think there's a problem, for several reasons.
The idea under discussion is that a top level (x <- newThing) should be lazy, (no action at all occurs until value of x is demanded). IOW, it's exactly the same as the current unsafePerformIO hack, but not unsafe because the compiler knows the semantics. So there is no implied "module initialisation"
Okay - I can see that with lazy semantics this might not be a problem... What happens with the second problem: That where module B uses A internally and C uses A internally, then I write a new module that tries to use B & C together... This potentially breaks B & C. I think you need the extra restriction that the top level '<-' bindings must not be exported. So where does that leave us.
Top level inits are safe (I think) iff: - They are lazy (the definition only happens when required) - They contain only a subset of IO actions - namely those concerned with name creation within Haskell that don't actually do any IO. - They are not exportable from the module that contains them.
I think that covers it... have I forgotten anything?
One of us has :-) Not sure who though. I thought I'd covered the second problem you're alluding to already. But if you think there's still a problem you'd better elaborate a little more. Certainly I see no reason why top level TWI's cannot be exported from a module. We don't have this constraint with the unsafePerformIO hack. For instance, if I had userInit <- oneShot realInit is there any reason why userInit can't be safely exported and used in many different modules? The whole idea was that it should be. Regards -- Adrian Hey

Well lets say: userInit <- oneShot realInit where realInit defines an MVar used for state storage that is used in module A to implement an accumulator. Now module B does some maths using the accumulator, and module C does some maths using the accumulator. If Main uses functions defined in both B and C then they will both be trying to use the _same_ MVar to store their state in - which will result in the wrong answer. The following is a contrived example, If arith and geom were in the same module, this would be an error on the programmers part. But consider if A were in the standard libraries, and B and C were two orthogonal extensions by different authors, do we really want the situation where they break each other. Note: this does not apply to declarations like (i=4) as this is true for all time. The problem is essentially that the declaration in the example is mutable. If mutable-declarations are not exportable, you can reasonably say it is the module authors job to make sure all uses of the MVar are consistent. module A mVarA <- newMVar 1 acc :: Int -> IO () acc i = writeMVar mVarA (readMVar mVarA + i) val :: IO Int val = readMVar mVarA module B import A arith :: IO [Int] arith = do i <- val acc (7+val) j <- arith return (i:j) module C import A geom :: IO [Int] geom = do i <- val acc (7*val) j <- geom return (i:j) module D import B import C main = do a <- arith g <- geom putStrLn $ show (take 100 a) putStrLn $ show (take 100 g) Keean Adrian Hey wrote:
On Saturday 13 Nov 2004 9:15 am, Keean Schupke wrote:
I'm not sure I understand what problem you think there is. Are the inits you're talking about module inits? If so, I don't think there's a problem, for several reasons.
The idea under discussion is that a top level (x <- newThing) should be lazy, (no action at all occurs until value of x is demanded). IOW, it's exactly the same as the current unsafePerformIO hack, but not unsafe because the compiler knows the semantics. So there is no implied "module initialisation"
Okay - I can see that with lazy semantics this might not be a problem... What happens with the second problem: That where module B uses A internally and C uses A internally, then I write a new module that tries to use B & C together... This potentially breaks B & C. I think you need the extra restriction that the top level '<-' bindings must not be exported. So where does that leave us.
Top level inits are safe (I think) iff: - They are lazy (the definition only happens when required) - They contain only a subset of IO actions - namely those concerned with name creation within Haskell that don't actually do any IO. - They are not exportable from the module that contains them.
I think that covers it... have I forgotten anything?
One of us has :-) Not sure who though.
I thought I'd covered the second problem you're alluding to already. But if you think there's still a problem you'd better elaborate a little more. Certainly I see no reason why top level TWI's cannot be exported from a module. We don't have this constraint with the unsafePerformIO hack.
For instance, if I had
userInit <- oneShot realInit
is there any reason why userInit can't be safely exported and used in many different modules? The whole idea was that it should be.
Regards -- Adrian Hey
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Actually, I Think I'm wrong - I think its not even safe if you cannot export the '<-' def. If any functions which use it are exported you are in the same situation. I cannot say the kind of code in the example I gave is good, can you? Infact the availability of these top level IO actions seems to completely change the feel of the language... Keean. Keean Schupke wrote:
Well lets say:
userInit <- oneShot realInit
where realInit defines an MVar used for state storage that is used in module A to implement an accumulator. Now module B does some maths using the accumulator, and module C does some maths using the accumulator. If Main uses functions defined in both B and C then they will both be trying to use the _same_ MVar to store their state in - which will result in the wrong answer. The following is a contrived example, If arith and geom were in the same module, this would be an error on the programmers part. But consider if A were in the standard libraries, and B and C were two orthogonal extensions by different authors, do we really want the situation where they break each other. Note: this does not apply to declarations like (i=4) as this is true for all time. The problem is essentially that the declaration in the example is mutable. If mutable-declarations are not exportable, you can reasonably say it is the module authors job to make sure all uses of the MVar are consistent.
module A mVarA <- newMVar 1
acc :: Int -> IO () acc i = writeMVar mVarA (readMVar mVarA + i)
val :: IO Int val = readMVar mVarA
module B import A
arith :: IO [Int] arith = do i <- val acc (7+val) j <- arith return (i:j)
module C import A
geom :: IO [Int] geom = do i <- val acc (7*val) j <- geom return (i:j)
module D import B import C
main = do a <- arith g <- geom putStrLn $ show (take 100 a) putStrLn $ show (take 100 g)
Keean
Adrian Hey wrote:
On Saturday 13 Nov 2004 9:15 am, Keean Schupke wrote:
I'm not sure I understand what problem you think there is. Are the inits you're talking about module inits? If so, I don't think there's a problem, for several reasons.
The idea under discussion is that a top level (x <- newThing) should be lazy, (no action at all occurs until value of x is demanded). IOW, it's exactly the same as the current unsafePerformIO hack, but not unsafe because the compiler knows the semantics. So there is no implied "module initialisation"
Okay - I can see that with lazy semantics this might not be a problem... What happens with the second problem: That where module B uses A internally and C uses A internally, then I write a new module that tries to use B & C together... This potentially breaks B & C. I think you need the extra restriction that the top level '<-' bindings must not be exported. So where does that leave us.
Top level inits are safe (I think) iff: - They are lazy (the definition only happens when required) - They contain only a subset of IO actions - namely those concerned with name creation within Haskell that don't actually do any IO. - They are not exportable from the module that contains them.
I think that covers it... have I forgotten anything?
One of us has :-) Not sure who though.
I thought I'd covered the second problem you're alluding to already. But if you think there's still a problem you'd better elaborate a little more. Certainly I see no reason why top level TWI's cannot be exported from a module. We don't have this constraint with the unsafePerformIO hack.
For instance, if I had
userInit <- oneShot realInit
is there any reason why userInit can't be safely exported and used in many different modules? The whole idea was that it should be.
Regards -- Adrian Hey
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Saturday 13 Nov 2004 10:39 am, Keean Schupke wrote:
Actually, I Think I'm wrong - I think its not even safe if you cannot export the '<-' def. If any functions which use it are exported you are in the same situation. I cannot say the kind of code in the example I gave is good, can you? Infact the availability of these top level IO actions seems to completely change the feel of the language...
I've looked at your example, and the behaviour you describe is exactly what would be expected and what it is intended. That's the whole point of things having identity. The reason they have identity is because they are mutable and all users of a particular TWI are indeed mutating the same thing, just as all users of stdout are writing to the same file. The point is that if the shared TWI is something like an IORef this is (of course) extremely dangerous because anybody can write anything they like to it at any time. But that is not how this should be used. The module exporting one or more TWIs typically will not be exporting raw IORefs. It will be exporting a well designed stateful api which access IORefs etc via closures. It's the responsibility of the author of the exporting module to organise the code so that it delevers (on whatever promisies it's making) to all clients, and clients should not rely on anything that isn't being promised. So it seems to me that the only thing that's wrong here is your expectations (I.E. that a module should assume it has exclusive access to whatever state the TWI's it imports mutate). This is not so. If it wants it's own private TWI (a mutable queue say) it should not be importing another modules queue (not that any good design should be exporting such a thing anyway), it should be importing a newQueue constructor and making it's own queue (either at the top level or via normal IO monadic operations).. myQueue <- newQueue But there's no magic here. All IO actions have potentially unknown state dependencies and mutating effects, that's why they're in the IO monad. All the top level <- extension does is enable the user to extend the initial world state (as seen by main) with user defined state, but it doesn't fundamentally change nature or hazards of programming via the IO monad, for better or worse. Regards -- Adrian Hey

On Fri, 12 Nov 2004 14:53:33 +0000, Adrian Hey
On Thursday 11 Nov 2004 12:27 pm, Ben Rudiak-Gould wrote:
On the other hand, these are perfectly safe:
once' :: IO a -> IO (IO a) oncePerString :: String -> IO a -> IO a oncePerType :: Typeable a => IO a -> IO a
once' seems virtually useless unless you have top-level <-, but the other two don't need it. I'm not sure which would be preferable. I lean toward oncePerString as more flexible and predictable, though it requires a certain discipline on the part of its users.
Having taken a bit of time to look at this, I have to say that IMO saying they are "perfectly safe" is over stating things a bit :-)
How is oncePerType in particular unsound? I've given a quick example implementation below. It's a referentially transparent function (no use of unsafePerformIO except to implement an internal global hashtable), it's type-safe, and I imagine that the discipline involved is no worse than that of dynamic exceptions, for example. I'm not necessarily suggesting that this solves the discussion, but it could be good enough to replace unsafePerformIO in many situations. Incidentally, a similar idea was suggested by George Russell, but not really followed up on: http://www.haskell.org/pipermail/haskell/2004-June/014104.html (This was perhaps the first message in the current months-long discussion?) -Judah --------------- module OnceType(oncePerType) where import Data.Dynamic import Data.Hashtable as HT import Data.Int(Int32) import GHC.IOBase (unsafePerformIO) type Dict = HT.HashTable TypeRep Dynamic oncePerType :: Typeable a => IO a -> IO a oncePerType (action :: IO a) = do let rep = typeOf (undefined :: a) l <- HT.lookup globalDict rep case l of Nothing -> do -- run the action x <- action HT.insert globalDict (typeOf x) (toDyn x) return x Just dyn -> case fromDynamic dyn of -- since we store values according to their TypeRep, -- fromDynamic should never fail. Just x -> return x {-# NOINLINE globalDict #-} globalDict :: Dict globalDict = unsafePerformIO $ HT.new (==) hashTypeRep -- this could be implemented better using the internals of Data.Typeable hashTypeRep :: TypeRep -> Int32 hashTypeRep = hashString . show

On Friday 12 Nov 2004 5:42 pm, Judah Jacobson wrote:
On Fri, 12 Nov 2004 14:53:33 +0000, Adrian Hey
wrote: On Thursday 11 Nov 2004 12:27 pm, Ben Rudiak-Gould wrote:
On the other hand, these are perfectly safe:
once' :: IO a -> IO (IO a) oncePerString :: String -> IO a -> IO a oncePerType :: Typeable a => IO a -> IO a
once' seems virtually useless unless you have top-level <-, but the other two don't need it. I'm not sure which would be preferable. I lean toward oncePerString as more flexible and predictable, though it requires a certain discipline on the part of its users.
Having taken a bit of time to look at this, I have to say that IMO saying they are "perfectly safe" is over stating things a bit :-)
How is oncePerType in particular unsound?
Actually, I think I'd better retract that statement as looking at your definition closely I can't see obvious transformation that a compiler could do that would change the observable behaviour of a program. Reliance on current dynamics (which is it self a cheap and cheerful hack) worries me though. It certainly seems reasonable to eliminate your use of unsafePerformIO as the source of any unsoundness because this is exactly the sort of thing you'd do with top level <- bindings if they existed (and if they can't be given sound semantics I guess we'd better forget the whole idea :-). To be honest, all the alternatives that have been put forward have looked like extrordinarily complex hacks to me. I dislike having to use unsafePerformIO, but for one reason or another all the suggested alternatives come with so many strings attached (to get them to work properly) that I dislike them even more. A case of the cure(s) being worse than the disease :-( The problem just doesn't seem to be solvable at the library level. It's something that's just plain missing from the language. Regards -- Adrian Hey
participants (9)
-
Adrian Hey
-
Ben Rudiak-Gould
-
Graham Klyne
-
Graham Klyne
-
Judah Jacobson
-
Keean Schupke
-
Keith Wansbrough
-
Robert Dockins
-
Simon Marlow