Current situation regarding global IORefs

Hi - I've run into the global mutable state problem described in http://www.haskell.org/hawiki/GlobalMutableState Since the page was last edited in March last year, I'm wondering if there have been any developments or further thoughts on how to safely create top level IORefs since they are absolutely essential for the library I'm writing. For my library, which implements a GUI, I have a Manager module which keeps track of which control currently has the keyboard focus etc, and I don't want to have to pass round the state of the manager to every control since this would be monstrously inconvenient and a total waste of space/time, so at the moment I'm reduced to: module Manager where keyboard :: IORef (Maybe Control) {-# NOINLINE keyboard #-} keyboard = unsafePerformIO $ newIORef Nothing The problem is that I don't know if this is guaranteed to be completely safe for all Haskell compilers or even for all future versions of ghc (?) Thanks, Brian.

On Apr 21, 2006, at 9:56 AM, Brian Hulley wrote:
Hi - I've run into the global mutable state problem described in http:// www.haskell.org/hawiki/GlobalMutableState Since the page was last edited in March last year, I'm wondering if there have been any developments or further thoughts on how to safely create top level IORefs since they are absolutely essential for the library I'm writing.
For my library, which implements a GUI, I have a Manager module which keeps track of which control currently has the keyboard focus etc, and I don't want to have to pass round the state of the manager to every control since this would be monstrously inconvenient and a total waste of space/time, so at the moment I'm reduced to:
module Manager where keyboard :: IORef (Maybe Control) {-# NOINLINE keyboard #-} keyboard = unsafePerformIO $ newIORef Nothing
The problem is that I don't know if this is guaranteed to be completely safe for all Haskell compilers or even for all future versions of ghc (?)
RE: the technique itself, you should also compile the module with - fno-cse. RE: the design, Isn't that bit of state local to a dialog/window/ control group or something? I understand that top level state is a problem in general that needs some sort of solution, but I'm not sure it's the right hammer here.... As far as I know, the only recent developments in this area are a rumor from the Simons that they are working on some sort of thread- local state which (under some sets of design decisions) can fill the needs of top level state. If you press them, they might be willing to give some details about this. Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Robert Dockins wrote:
On Apr 21, 2006, at 9:56 AM, Brian Hulley wrote:
Hi - I've run into the global mutable state problem described in http:// www.haskell.org/hawiki/GlobalMutableState Since the page was last edited in March last year, I'm wondering if there have been any developments or further thoughts on how to safely create top level IORefs since they are absolutely essential for the library I'm writing.
For my library, which implements a GUI, I have a Manager module which keeps track of which control currently has the keyboard focus etc, and I don't want to have to pass round the state of the manager to every control since this would be monstrously inconvenient and a total waste of space/time, so at the moment I'm reduced to:
module Manager where keyboard :: IORef (Maybe Control) {-# NOINLINE keyboard #-} keyboard = unsafePerformIO $ newIORef Nothing
The problem is that I don't know if this is guaranteed to be completely safe for all Haskell compilers or even for all future versions of ghc (?)
RE: the technique itself, you should also compile the module with - fno-cse.
Thanks
RE: the design, Isn't that bit of state local to a dialog/window/ control group or something? I understand that top level state is a problem in general that needs some sort of solution, but I'm not sure it's the right hammer here....
There is only one GUI for the application and only one control in it can have the keyboard focus so it seems natural to use global state here, but I suppose I could also look into using a state monad. The advantage (perhaps also disadvantage ;-) ) of global state is that it allows me to easily convert all my old C++ singleton classes to Haskell modules...
As far as I know, the only recent developments in this area are a rumor from the Simons that they are working on some sort of thread- local state which (under some sets of design decisions) can fill the needs of top level state. If you press them, they might be willing to give some details about this.
I was kind of hoping that there would just be a safe, simple way to create a top level monomorphic IORef without having to use a pragma etc. Thanks, Brian.

On Apr 21, 2006, at 10:34 AM, Brian Hulley wrote:
Robert Dockins wrote:
On Apr 21, 2006, at 9:56 AM, Brian Hulley wrote:
Hi - I've run into the global mutable state problem described in http:// www.haskell.org/hawiki/GlobalMutableState Since the page was last edited in March last year, I'm wondering if there have been any developments or further thoughts on how to safely create top level IORefs since they are absolutely essential for the library I'm writing.
For my library, which implements a GUI, I have a Manager module which keeps track of which control currently has the keyboard focus etc, and I don't want to have to pass round the state of the manager to every control since this would be monstrously inconvenient and a total waste of space/time, so at the moment I'm reduced to:
module Manager where keyboard :: IORef (Maybe Control) {-# NOINLINE keyboard #-} keyboard = unsafePerformIO $ newIORef Nothing
The problem is that I don't know if this is guaranteed to be completely safe for all Haskell compilers or even for all future versions of ghc (?)
RE: the technique itself, you should also compile the module with - fno-cse.
Thanks
RE: the design, Isn't that bit of state local to a dialog/window/ control group or something? I understand that top level state is a problem in general that needs some sort of solution, but I'm not sure it's the right hammer here....
There is only one GUI for the application and only one control in it can have the keyboard focus so it seems natural to use global state here
I'd suggest you consider not making those assumptions... they are the kinds of assumptions that can make later code reuse and maintenance more difficult than it should be. (Obviously, if code reuse/ maintenance is a low priority then it doesn't matter).
, but I suppose I could also look into using a state monad. The advantage (perhaps also disadvantage ;-) ) of global state is that it allows me to easily convert all my old C++ singleton classes to Haskell modules...
<ramble type="somewhat coherent"> Ahhh... the singleton pattern. There is a debate among OO theorists about whether the singleton pattern is actually a good idea. I tend to side with those who say that it is Just Wrong. The reality is that "singletons" are only unique within some scope (OS process, VM, sandbox, whatever). "Global" state is similar; it is always bounded by _something_. I think its always better to make the boundaries explicit and aligned with the problem domain rather than implicit, because the implicit boundaries sometimes/often don't do what you want. As soon as you have an even slightly unusual execution environment, your assumptions can be violated (eg, within Java application containers *shudder*). I have to imagine using, eg, HS plugins with modules containing top-level state could cause all sorts of havoc. </ramble>
As far as I know, the only recent developments in this area are a rumor from the Simons that they are working on some sort of thread- local state which (under some sets of design decisions) can fill the needs of top level state. If you press them, they might be willing to give some details about this.
I was kind of hoping that there would just be a safe, simple way to create a top level monomorphic IORef without having to use a pragma etc.
I don't think that exists currently.
Thanks, Brian.
Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Robert Dockins wrote:
On Apr 21, 2006, at 10:34 AM, Brian Hulley wrote:
Robert Dockins wrote:
On Apr 21, 2006, at 9:56 AM, Brian Hulley wrote:
Hi - I've run into the global mutable state problem described in http:// [snip]
There is only one GUI for the application and only one control in it can have the keyboard focus so it seems natural to use global state here
I'd suggest you consider not making those assumptions... they are the kinds of assumptions that can make later code reuse and maintenance more difficult than it should be. (Obviously, if code reuse/ maintenance is a low priority then it doesn't matter).
, but I suppose I could also look into using a state monad. The advantage (perhaps also disadvantage ;-) ) of global state is that it allows me to easily convert all my old C++ singleton classes to Haskell modules...
<ramble type="somewhat coherent"> Ahhh... the singleton pattern. There is a debate among OO theorists about whether the singleton pattern is actually a good idea. I tend to side with those who say that it is Just Wrong. [snip]
Thanks for the comments. I've now changed everything so that controls use a ManagerM monad which wraps up the state instead of using the IO monad so there are no longer any global variables. It wasn't as difficult as I had thought and as you say it makes everything much more scalable, although at the expense of having to use liftIO in various places. I've defined my state monad by: data MState = MState {keyboard:: !Maybe Control} -- etc - other state here also type ManagerM a = StateT MState IO a and everything works ok. However if I try to use a newtype instead of a type (to completely hide the representation) eg newtype ManagerM a = ManagerM (StateT MState IO a) deriving (Monad, MonadIO, MonadState) it won't compile. Does this mean it is not possible to wrap combined monads in a newtype? I notice that the examples in tutorials I've looked at tend to always just use type instead of newtype. Another point is that I'm not sure what is the "proper" way to represent the state itself ie should each component of the state be a separate IORef to avoid having to copy the whole state each time or is it better practice to just use an immutable record as I've done above? Thanks, Brian.

On Apr 21, 2006, at 1:27 PM, Brian Hulley wrote:
Robert Dockins wrote:
On Apr 21, 2006, at 10:34 AM, Brian Hulley wrote:
Robert Dockins wrote:
On Apr 21, 2006, at 9:56 AM, Brian Hulley wrote:
Hi - I've run into the global mutable state problem described in http:// [snip]
There is only one GUI for the application and only one control in it can have the keyboard focus so it seems natural to use global state here
I'd suggest you consider not making those assumptions... they are the kinds of assumptions that can make later code reuse and maintenance more difficult than it should be. (Obviously, if code reuse/ maintenance is a low priority then it doesn't matter).
, but I suppose I could also look into using a state monad. The advantage (perhaps also disadvantage ;-) ) of global state is that it allows me to easily convert all my old C++ singleton classes to Haskell modules...
<ramble type="somewhat coherent"> Ahhh... the singleton pattern. There is a debate among OO theorists about whether the singleton pattern is actually a good idea. I tend to side with those who say that it is Just Wrong. [snip]
Thanks for the comments. I've now changed everything so that controls use a ManagerM monad which wraps up the state instead of using the IO monad so there are no longer any global variables. It wasn't as difficult as I had thought and as you say it makes everything much more scalable, although at the expense of having to use liftIO in various places.
This is true, and mildly irritating. One additional (very unfortunate) point is that higher-order IO monad combinators will not work on your monad, eg, the ones in Control.Exception. I hope H' will generalize the types to (use MonadIO) these combinators to make this sort of thing easier, because I think this is a great way to structure programs. *makes mental note to create a ticket for this* Sometimes I also think it would be nice if all the standard lib functions with IO types would instead take arbitrary MonadIO types, so you could avoid having to write down liftIO all the time....
I've defined my state monad by:
data MState = MState {keyboard:: !Maybe Control} -- etc - other state here also type ManagerM a = StateT MState IO a
and everything works ok. However if I try to use a newtype instead of a type (to completely hide the representation) eg
newtype ManagerM a = ManagerM (StateT MState IO a) deriving (Monad, MonadIO, MonadState)
it won't compile.
Are you compiling with -fglasgow-exts? You're relying on generalized newtype deriving, which is a GHC extension. http://www.haskell.org/ghc/docs/latest/html/users_guide/type- extensions.html#newtype-deriving If that's not it, what's the error you are getting?
Does this mean it is not possible to wrap combined monads in a newtype? I notice that the examples in tutorials I've looked at tend to always just use type instead of newtype.
I usually use a newtype myself; but then I usually roll my own monads instead of using monad transformers (not a value judgement, just habit).
Another point is that I'm not sure what is the "proper" way to represent the state itself ie should each component of the state be a separate IORef to avoid having to copy the whole state each time or is it better practice to just use an immutable record as I've done above?
I usually use immutable records as you have done; it somehow "feels better". Unfortunately, going this way exposes you to the clunkiness of Haskell's record system. If all your record components are declared with a bang, you may be able to coerce the compiler to unbox the record (-funbox-strict-fields, I think), which would prevent copying altogether. Immutable records are also a little nicer to the garbage collector. However, I've never actually tried to measure the performance difference. If you're going to use a record of IORefs, you should probably go with ReaderT instead.
Thanks, Brian.
Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Robert Dockins wrote:
Are you compiling with -fglasgow-exts? You're relying on generalized newtype deriving, which is a GHC extension.
http://www.haskell.org/ghc/docs/latest/html/users_guide/type- extensions.html#newtype-deriving
If that's not it, what's the error you are getting?
'MonadState does not have arity 1' (I see now from the docs and from Cale's clause below that the instance declaration would be instance MonadState MState ManagerM where ... hence the need for (MonadState MState) to be written in the deriving clause) Cale Gibbard wrote:
try deriving (Monad, MonadIO, MonadState MState) -- I find that newtype deriving doesn't like guessing at other class parameters, even with the fundeps there.
Thanks. The other compilation error was caused by the fact that newtype deriving doesn't work in a hs-boot file. I had Control and ManagerM defined in different modules, but when I just merged these into one module, and used Cale's deriving clause, everything now works.. (Perhaps it doesn't really matter about type/newtype in this case anyway since MState is hidden)
[suggestions about ReaderT]
I think I'll stick with an immutable state record for the moment since there does not seem to be a clear advantage one way or the other, and AFAIK ghc 6.4.2 at the moment does not use a write barrier for IORefs so every intergenerational garbage collection follows every IORef in existence which could slow things down for a large GUI with individual IORefs for each state component. Robert Dockins wrote:
Sometimes I also think it would be nice if all the standard lib functions with IO types would instead take arbitrary MonadIO types, so you could avoid having to write down liftIO all the time....
Thanks for the suggestion - it is certainly a lot better to write liftIO inside my FFI wrappers than each time I use these functions elsewhere. Thanks, Brian.

Brian Hulley wrote:
Robert Dockins wrote:
Sometimes I also think it would be nice if all the standard lib functions with IO types would instead take arbitrary MonadIO types, so you could avoid having to write down liftIO all the time....
Thanks for the suggestion - it is certainly a lot better to write liftIO inside my FFI wrappers than each time I use these functions elsewhere.
[rearranged] One additional (very unfortunate) point is that higher-order IO monad combinators will not work on your monad, eg, the ones in Control.Exception. I hope H'
I see now what you mean. This really kills any attempt to make my own api use MonadIO since things like bracket_ etc are fundamental. Thus the use of IO in Control.Exception has a kind of viral property that destroys all hope of eradicating liftIO from user code... Regards, Brian.

Robert Dockins wrote:
On Apr 21, 2006, at 10:34 AM, Brian Hulley wrote:
Robert Dockins wrote:
On Apr 21, 2006, at 9:56 AM, Brian Hulley wrote:
Hi - I've run into the global mutable state problem described in http:// [snip]
There is only one GUI for the application and only one control in it can have the keyboard focus so it seems natural to use global state here
I'd suggest you consider not making those assumptions... they are the kinds of assumptions that can make later code reuse and maintenance more difficult than it should be. (Obviously, if code reuse/ maintenance is a low priority then it doesn't matter).
, but I suppose I could also look into using a state monad. The advantage (perhaps also disadvantage ;-) ) of global state is that it allows me to easily convert all my old C++ singleton classes to Haskell modules...
<ramble type="somewhat coherent"> Ahhh... the singleton pattern. There is a debate among OO theorists about whether the singleton pattern is actually a good idea. I tend to side with those who say that it is Just Wrong. [snip]
Thanks for the comments. I've now changed everything so that controls use a ManagerM monad which wraps up the state instead of using the IO monad so there are no longer any global variables. It wasn't as difficult as I had thought and as you say it makes everything much more scalable, although at the expense of having to use liftIO in various places.
I've defined my state monad by:
data MState = MState {keyboard:: !Maybe Control} -- etc - other state here also type ManagerM a = StateT MState IO a
and everything works ok. However if I try to use a newtype instead of a type (to completely hide the representation) eg
newtype ManagerM a = ManagerM (StateT MState IO a) deriving (Monad, MonadIO, MonadState)
it won't compile. Does this mean it is not possible to wrap combined monads in a newtype? I notice that the examples in tutorials I've looked at tend to always just use type instead of newtype.
On 21/04/06, Brian Hulley
Another point is that I'm not sure what is the "proper" way to represent the state itself ie should each component of the state be a separate IORef to avoid having to copy the whole state each time or is it better practice to just use an immutable record as I've done above?
If you were to use IORefs/MVars, it would likely be enough to use ReaderT instead of StateT, since you likely wouldn't be replacing your mutable cells, just their contents. Both routes are okay -- note that Haskell data structure nodes are usually just bunches of pointers to values (or more properly, code which returns values) anyway, so you should only be copying a few pointers when updates are made. (In a similar way to how replacing the head of a list is constant time and space, and not linear.) - Cale

Robert Dockins wrote:
<ramble type="somewhat coherent"> Ahhh... the singleton pattern. There is a debate among OO theorists about whether the singleton pattern is actually a good idea. I tend to side with those who say that it is Just Wrong. The reality is that "singletons" are only unique within some scope (OS process, VM, sandbox, whatever). "Global" state is similar; it is always bounded by _something_. I think its always better to make the boundaries explicit and aligned with the problem domain rather than implicit, because the implicit boundaries sometimes/often don't do what you want. As soon as you have an even slightly unusual execution environment, your assumptions can be violated (eg, within Java application containers *shudder*). I have to imagine using, eg, HS plugins with modules containing top-level state could cause all sorts of havoc. </ramble>
I couldn't have rambled it better myself. :) I think global mutable variables should be regarded with utmost suspicion. There are very few situations where they are the right solution. -- Lennart

Lennart Augustsson wrote:
I think global mutable variables should be regarded with utmost suspicion. There are very few situations where they are the right solution.
Well IMO even the use of the term "global mutable variable" causes muddled thinking on this and I wish people would stop it. There's no reason to regard top level "things with identity" (call them "TWI"s or "objects" or whatever) with any more suspicion than top level IO actions themselves. One thing I never could fathom about the position of the nay-sayers in this debate is what exactly is it that they object to? Is it existence of top level "TWI"s and of IO operations that reference them *in principle*? Or are they content with their existence but just don't want to allow people to use Haskell to define them? If it's the former then we should be purging the IO libraries of all such horrors, though I can't see much remaining of them (well anything actually). But I guess an IO incapable language might still have some niche uses. If it's the latter then we are advocating a language which cannot be used to implement many demonstrably useful IO modules and libraries, *by design*. If so, the claim that Haskell is a general purpose programming language seems quite bogus and should be removed from the haskell.org home page IMO. Regards -- Adrian Hey

On 24/04/06, Adrian Hey
Lennart Augustsson wrote:
I think global mutable variables should be regarded with utmost suspicion. There are very few situations where they are the right solution.
Well IMO even the use of the term "global mutable variable" causes muddled thinking on this and I wish people would stop it. There's no reason to regard top level "things with identity" (call them "TWI"s or "objects" or whatever) with any more suspicion than top level IO actions themselves.
What do you mean by "top level IO action"? If you mean something like 'getLine', then there is a huge difference. If you mean actions which execute automatically in any program which imports the given module, then I'd contend that Haskell doesn't really even have those (apart from possibly 'main', if you count that at all). In the latter case, you might have disconnected components which all must run, but since they do IO, are potentially noncommutative. With the ability to mark actions as automatically executing without some way to control the order of that execution, program behaviour is unpredictable in general. At least, I wouldn't want the language to allow that. However, putting an explicit ordering on their execution is essentially equivalent to defining a new IO action which executes each of them in a given sequence, obviating the need for such a feature in the first place. While only allowing the definition of top-level IORefs (i.e. not unrestricted IO) wouldn't cause quite as much harm, it's still questionable as to whether it's ever actually necessary. One can get computation-global IORefs easily using something along the lines of ReaderT IORef IO. By newtyping the monad, one could exert even more control over how the IORef was used. One problem with real top-level IORefs is that it leaves no way for the module user to reset things in general. As a library designer, you might think that the users of your library will only ever want to initialise things once, but this is being a little bit overly forceful -- what if the program has reloaded its configuration while running and wants to restart the functionality your module provides? If you haven't explicitly provided a way to reset the IORefs, there's no way for the module user to reliably do so. There are plenty of other nice ways to enforce things such as particular actions only running once and so on. For one, you could use a new monad, which would give extremely fine control over what actions were permitted, but even with just IO, there are plenty of things one can do -- just not globally. We can relatively easily create a context which provides an IO action that will only run once in that context, and return the same value as the first time without repeating the execution thereafter: singletonRegion :: IO a -> (IO a -> IO b) -> IO b singletonRegion action region = do r <- newIORef Nothing let action' = do t <- readIORef r case t of Nothing -> do v <- action writeIORef r (Just v) return v Just v -> return v region action' test1 = singletonRegion (putStrLn "Hello") $ \greet -> do greet -- only this one will actually have any effect. greet greet With a custom monad, this sort of setup would be built into the "run" function which transforms the action back into IO, and so would be even more transparent, while still not creating irrevocable program-global state.
One thing I never could fathom about the position of the nay-sayers in this debate is what exactly is it that they object to? Is it existence of top level "TWI"s and of IO operations that reference them *in principle*? Or are they content with their existence but just don't want to allow people to use Haskell to define them?
If it's the former then we should be purging the IO libraries of all such horrors, though I can't see much remaining of them (well anything actually). But I guess an IO incapable language might still have some niche uses.
If it's the latter then we are advocating a language which cannot be used to implement many demonstrably useful IO modules and libraries, *by design*. If so, the claim that Haskell is a general purpose programming language seems quite bogus and should be removed from the haskell.org home page IMO.
This argument sounds quite a lot like the ones used by those arguing against the removal of GOTO from programming languages. Sometimes, by explicitly not providing some feature, you're also making code in general easier to comprehend (especially when that feature has invisible global effects on programs, where you'd need to look carefully at the source of all modules involved in order to determine what was actually happening in the presence of its use). It can also prevent certain classes of design problems from ever cropping up. The more state which one has to reconstruct in order to test an arbitrary bit of code in a program, the harder it is to debug and understand it, which is one major reason that pure code is favoured over IO code in the first place. With top-level IORefs, you essentially relinquish control over who gets access to a bit of statefulness which you've created. Everything within the module might read or write it, and if it's exported, then any module which imports yours might influence it as well. Now, you might make the claim that IO computations can read and write files on disk, and one could essentially (ab)use those, at least for storable types, instead of top level IORefs if one were only crazy enough to do so. So if this sort of state is already present in the form of the local filesystem, why not have global IORefs too? An example of the difference is that people don't have the tendency to abuse file IO as an alternative to passing something as a parameter (barring the sort of code which ends up on thedailywtf.com). State is dangerous. It is the cause of a lot of misunderstandings about program behaviour and as a result, the cause of a lot of bugs. It should not be taken lightly. Most Haskell programmers I know, myself included, have a tendency to be extremely picky about what parts of the program have access to a piece of state, and for how long that state can affect the computation. You might see how proposing a feature which allows for unlimited globally-accessible-from-IO state variables which last forever goes against this culture very much. I would not want to have to read, let alone debug, a program which made heavy use of more than a very small number of top level IORefs. (Not to mention that, if reasonable to do so, one of the first things I'd probably do is to translate the code such that the IORefs are no longer top-level, just for the sake of my own understanding.) It's not that anyone thinks it would break the world to have top level IORefs, it's just that it would make thinking about any program which possibly used them a good deal harder, and with any new feature, one has to ask "what happens if people actually start to use this?" - Cale

Adrian Hey wrote: ..... I was going to respond, but Cale very eloquently said most of what I was thinking. Let me just add one thing. Sometimes you hear the argument "I need a global IORef here because it's to track the use of my single screen" (or keyboard, or elevator, or some some other gizmo in th real world). I think such decisions are just generally poor design, and it should not be done in any language. The number of physical resources that a program can control should never be assumed to be one; things change. -- Lennart

Lennart Augustsson wrote:
I was going to respond, but Cale very eloquently said most of what I was thinking.
I don't think eloquent is the word I would use, but I'm certainly glad you didn't feel the need to repeat all that. It'd be really nice if just for once the "global mutable state is evil" folk could give unsafePerformIO hack users the benefit of the doubt, not start out from the presumption that we are a bunch of lazy incompetent retards who are badly in need of education in the rudiments of the IO monad and (allegedly) good programming practice in general. These arguments about the language needing to protect users from doing dangerous things (specifically creating top level mutable state) are quite wrong headed. Concurrency can be dangerous too, so should we lose it? Top level mutable state can be a way of *gaining safety*, not losing it. What really frustrates me about all this is that AFAIK there are no significant technical or theoretical reasons why we can't get this safety (without resort to the unsafePerformIO hack). The only serious obstacle seems political, with this very strange but apparently widespread dogma about so called "global variables" being inherently evil, regardless of the circumstances or purpose for which they are used. With regard to Roberts post, I don't want too say much other than Robert is the first person to provide an answer to my question. I hope I'm not mis-representing his views, but I believe Robert objects to the existence of IO libraries that could not be implemented in Haskell (in principle). I.E. those that implicitly reference top level mutable state. I might be missing something, but AFAICS just about all the current IO libraries fall into this category. To take a specific example, look at the socket API. None of the functions there take any kind of OS or network sub-system state handle as an explicit argument. So it seems to me that either the implementation is entirely stateless, all the way down to peeking and poking the registers of Ethernet MACs, DMA controllers and wotnot (unlikely), or it's sneakily accessing top level mutable state in some extremely devious and non-transparent manner (heaven forbid). Or put another way, would it be possible to implement the socket API, exactly as it currently is, entirely in Haskell, starting with nothing but hardware? I don't believe it is possible, but perhaps somebody can show me I'm wrong.
Let me just add one thing. Sometimes you hear the argument "I need a global IORef here because it's to track the use of my single screen" (or keyboard, or elevator, or some some other gizmo in th real world).
No, this is not the justification for the creation of top level TWI's. This is the justification for not requiring that the API that mutates a particular top level TWI state takes that state as an explicit argument. There's no point if there is (and can be) only one of them. This is why you don't have to pass an OS state handle to every IO function that interacts with "the" OS (note singular). But even if there are two or more, you still need some mechanism to ensure that you have precisely 1:1 correspondance between physical devices and device state TWI's and/or device driver threads. This more or less prevents a robust API allowing unconstrained creation of new device state TWI's. It's far safer and simpler to provide top level TWIs (state handles) as *abstract data types* (not IORefs!). This is no different from (or less safe than) having stdout appear at the top level. Even if there is an unknown (at compile time) number of such devices and instead they are discovered somehow at boot time, you still need to maintain a some kind of finite pool of these device states, which is itself necessarily a unique TWI. IME, the approach you take to these kinds of problems can vary depending what you know or don't know for certain about the system your working with. But you always end up using top level mutable state somewhere along the way. I can only assume folk who insist it's unnecessary (or worse) have never actually tried implementing an IO sub-system from the ground up, starting with nothing but bare hardware.
I think such decisions are just generally poor design, and it should not be done in any language. The number of physical resources that a program can control should never be assumed to be one; things change.
So is it reasonable to assume that there is only one OS? Perhaps it would be best to let individual Haskell users decide for themselves what assumptions are or are not reasonable in the context of their work. Regards -- Adrian Hey

Not to fuel the flame war, I will limit myself to two comments. Adrian Hey wrote:
Or put another way, would it be possible to implement the socket API, exactly as it currently is, entirely in Haskell, starting with nothing but hardware? I don't believe it is possible, but perhaps somebody can show me I'm wrong. If I get to implement the IO monad, sure. :)
IME, the approach you take to these kinds of problems can vary depending what you know or don't know for certain about the system your working with. But you always end up using top level mutable state somewhere along the way. I can only assume folk who insist it's unnecessary (or worse) have never actually tried implementing an IO sub-system from the ground up, starting with nothing but bare hardware. I've written about 50000 lines of USB devices drivers for *BSD (in C). They work from the bare metal and up. They contain no global mutable state (except for variables that define debugging levels, because you need to access these from the in-kernel debugger).
-- Lennart

Lennart Augustsson wrote:
Adrian Hey wrote:
Or put another way, would it be possible to implement the socket API, exactly as it currently is, entirely in Haskell, starting with nothing but hardware? I don't believe it is possible, but perhaps somebody can show me I'm wrong. If I get to implement the IO monad, sure. :)
Perhaps you could give some indication of what you have in mind? AFAICS the problem, if it is a problem, is a semantic problem that afflicts all IO monad implementations. What would be different about your implementation?
IME, the approach you take to these kinds of problems can vary depending what you know or don't know for certain about the system your working with. But you always end up using top level mutable state somewhere along the way. I can only assume folk who insist it's unnecessary (or worse) have never actually tried implementing an IO sub-system from the ground up, starting with nothing but bare hardware. I've written about 50000 lines of USB devices drivers for *BSD (in C). They work from the bare metal and up. They contain no global mutable state (except for variables that define debugging levels, because you need to access these from the in-kernel debugger).
Yes, I was aware of this. But with all due respect, if you're running with an OS already present (not what I would call bare metal :-), your device driver alone is not the entire IO sub-system. I'm not sure what you mean by it having no mutable state. Do you mean no top level mutable state, or do you mean there is no mutable state whatsoever associated with a particular USB port(device..whatever)? I assume you mean the former, and that the mutable device state itself is passed as a parameter to your code, some how. If so I think this is fine and good practice (even my noddy device driver on the wiki does this). But it seems to me when you consider the system as whole, the real problem is the creation, management and aquisition of the device states (state handles) by software that uses particular devices. I'll try to keep this brief, maybe you or someone else can explain the flaw in my logic, if there is one.. Consider a bit of code in the IO monad that needs a device handle to perform some IO action with the corresponding device. How does it acquire this handle? I can think of 3 ways.. 1- It receives it as an argument. No problem, except when I look at the Haskell libraries most of them don't seem to be taking such arguments (main takes none whatsoever). 2- It creates it locally with some kind of newDeviceState constructor. Well there might be some particularly privileged bit of code somewhere in the system that is allowed to do this, but clearly this isn't an option for most code. 3- It gets from some other data structure which is accessible at the top level. IOW a "global variable". This doesn't seem to be an option either in a language that has been designed to prevent the creation of such things. Unless I'm missing something, there seems to be a very serious problem somewhere. If so, it should be fixed IMO, one way or another. Now as JM mentioned, the reason nobody seems to notice this problem is that in reality folk *are* relying on "global variables" somewhere in the system. But they're buried so deeply in unsafePerformIO hacked Haskell libraries, C libraries and OS internals that nobody notices. They're just accepted as being part of "the world". So when people say things like "I have never ever had to use a global variable", I'm inclined to believe them. But I also say that the only reason they can claim to be without "sin" is because somebody else has already sinned on their behalf to provide them with the IO libraries they use. Regards -- Adrian Hey

Adrian Hey wrote:
I've written about 50000 lines of USB devices drivers for *BSD (in C). They work from the bare metal and up. They contain no global mutable state (except for variables that define debugging levels, because you need to access these from the in-kernel debugger).
Yes, I was aware of this. But with all due respect, if you're running with an OS already present (not what I would call bare metal :-), your device driver alone is not the entire IO sub-system. I'm not sure what you mean by it having no mutable state. Do you mean no top level mutable state, or do you mean there is no mutable state whatsoever associated with a particular USB port(device..whatever)?
I said "no *global* mutable state", by which I meant no global variables. Of course there's mutable state. This is C after all. :) (And it would there in Haskell too.) And yes, somewhere there's some global mutable state in the OS. I've never claimed that it should be totally forbidden. Various circumstances forces it upon us. What I've been claiming is that it should be avoided where possible. Which is almost always. -- Lennart

Lennart Augustsson wrote:
And yes, somewhere there's some global mutable state in the OS. I've never claimed that it should be totally forbidden. Various circumstances forces it upon us. What I've been claiming is that it should be avoided where possible. Which is almost always.
Thus there seems to be agreement that whereas this should be avoided where possible, it is still needed in some cases. Therefore the question arises as to how to safely incorporate this into the language. As everyone knows, use of unsafePerformIO could break the type system without a programmer knowing it, but I think I am right in saying that if only monomorphic refs were allowed at the top level, type safety would be ensured (so there would be no need for the truly horrible value restriction that infects the whole of SML for example) Therefore I propose a new keyword to define monomorphic top level IORefs, something like: augment ref = monomorphicvalue -- just a plain value not a monadic computation where "augment" refers to the augmentation of the RealWorld state by the state of ref as in John's suggestion to use "augmented IO" in preference to "global mutable state" The use of a plain value to initialize the ref rather than a monadic computation would ensure that there would be no problems with trying to work out which order to initialize top level refs that are dependent on values of refs in other modules since there could be no dependencies. Regards, Brian.

Brian Hulley wrote:
The use of a plain value to initialize the ref rather than a monadic computation would ensure that there would be no problems with trying to work out which order to initialize top level refs that are dependent on values of refs in other modules since there could be no dependencies.
I meant to say that while there *could* be dependencies I don't think they would be any different from the dependencies that can exist at the moment with one value depending on another which in turn is defined in terms of the first either within a module or between mutually recursive modules. Also, to qualify my suggestion, I am not 100% sure that it would be powerful enough for all uses - perhaps monadic computations are needed in some cases to init global refs? Regards, Brian.

Brian Hulley wrote:
The use of a plain value to initialize the ref rather than a monadic computation would ensure that there would be no problems with trying to work out which order to initialize top level refs that are dependent on values of refs in other modules since there could be no dependencies.
I'm not sure what problem you see with the ACIO monad proposal. It was designed to prevent the kind of ordering dependency problems which (I think) you're refering to. You can't read or write IORefs/MVars or do any other "real IO" operation from ACIO. All you can do is create them (and more complex data structures based on them). So they could be evaluated at compile time, in principle (AFAICS etc..). I think it should also be possible to fork threads from ACIO, provided you arrange that they're initially blocked on an empty MVar or something. If it turns out that that isn't enough to properly initialise them (like you need to do real IO), then you just can't have them as top level identifiers. But you can still use something like the "oneShot" function to implement "get" actions at the top level (in the IO monad). Regards -- Adrian Hey

Adrian Hey wrote:
Brian Hulley wrote:
The use of a plain value to initialize the ref rather than a monadic computation would ensure that there would be no problems with trying to work out which order to initialize top level refs that are dependent on values of refs in other modules since there could be no dependencies.
I'm not sure what problem you see with the ACIO monad proposal. It was designed to prevent the kind of ordering dependency problems which (I think) you're refering to. You can't read or write IORefs/MVars or do any other "real IO" operation from ACIO. All you can do is create them (and more complex data structures based on them). So they could be evaluated at compile time, in principle (AFAICS etc..). I think it should also be possible to fork threads from ACIO, provided you arrange that they're initially blocked on an empty MVar or something.
If it turns out that that isn't enough to properly initialise them (like you need to do real IO), then you just can't have them as top level identifiers. But you can still use something like the "oneShot" function to implement "get" actions at the top level (in the IO monad).
I was thinking that there would be a problem with polymorphic refs but now I think I understand that there would not be a problem, seeing the description of ACIO in http://www.haskell.org//pipermail/haskell-cafe/2004-November/007664.html where it is clear that the ACIO actions are just prepended to main. So I'll retract my proposal in favour of ACIO :-) Regards, Brian.

On Sat, 29 Apr 2006, Lennart Augustsson wrote:
And yes, somewhere there's some global mutable state in the OS. I've never claimed that it should be totally forbidden. Various circumstances forces it upon us. What I've been claiming is that it should be avoided where possible. Which is almost always.
The strict object-capability style of programming makes this a strict
rule to prevent capabilities from leaking.
http://www.erights.org/talks/thesis/
Tony.
--
f.a.n.finch

On Thu, Apr 27, 2006 at 11:09:58AM +0100, Adrian Hey wrote:
What really frustrates me about all this is that AFAIK there are no significant technical or theoretical reasons why we can't get this safety (without resort to the unsafePerformIO hack). The only serious obstacle seems political, with this very strange but apparently widespread dogma about so called "global variables" being inherently evil, regardless of the circumstances or purpose for which they are used.
indeed. perhaps we just need to come up with a more functional name than 'global variables'. like 'universal monad' or 'world transformer', maybe 'augmented IO'. :) I am tired of having to apologize for using them, they really are the right solution to a number of practical problems when writing real-world code. Every haskell programer depends on them whethre they know it or not, they are just abstracted away in libraries, but the bottom line is that someone needs to be able to write those libraries. my ForeignData proposal (on the wiki) fills a hole in the haskell FFI that can somewhat mitigate the problem, but only for Storable strict values. I have thought a safe 'StorableRef' would be a useful data type, that uses peeks and pokes internally for very fast mutable state, but provides a safe interface. There is also the 'dependingOn' primitive which is in jhc and I think will be in ghc (?) that lets you do global variables without having to turn off cse. dependingOn :: a -> b -> a dependingOn = ... where it has the same meaning as const, but introduces an artificial dependence of its return value on its second argument. so you can do data Var1 = Var1 {-# NOINLINE myVar #-} myVar :: IORef Int myVar = unsafePerformIO $ newIORef (0 `dependingOn` Var1) and now myVar cannot be cse'd with anything else because Var1 will not match anything of another type. still not ideal though. (I have been finding all sorts of uses for 'dependingOn' when it comes to various tasks)
Let me just add one thing. Sometimes you hear the argument "I need a global IORef here because it's to track the use of my single screen" (or keyboard, or elevator, or some some other gizmo in th real world).
heh. the canonical strawman tactic. :)
No, this is not the justification for the creation of top level TWI's. This is the justification for not requiring that the API that mutates a particular top level TWI state takes that state as an explicit argument. There's no point if there is (and can be) only one of them. This is why you don't have to pass an OS state handle to every IO function that interacts with "the" OS (note singular).
it is not just about convinience, for instance my 'Atom' module in jhc and ginsu depends on the fact you cannot pass in different states for correctness. its API is purly funcitonal, but it needs to use global state internally. Haskell has great tools for abstraction, global state would be another one. What it boils down to is that it helps you write more correct code in certain cases and there is no reasonable work-around. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham wrote:
On Thu, Apr 27, 2006 at 11:09:58AM +0100, Adrian Hey wrote:
What really frustrates me about all this is that AFAIK there are no significant technical or theoretical reasons why we can't get this safety (without resort to the unsafePerformIO hack). The only serious obstacle seems political, with this very strange but apparently widespread dogma about so called "global variables" being inherently evil, regardless of the circumstances or purpose for which they are used.
indeed. perhaps we just need to come up with a more functional name than 'global variables'. like 'universal monad' or 'world transformer', maybe 'augmented IO'. :)
I like "augmented IO", because this makes it clear that there is absolutely no difference between the existing IO monad which keeps track of global RealWorld state, and an augmented IO monad which keeps track of RealWorld + state of IORefs needed internally by different components of a software system. At the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as newUnique which are "allowed" to make use of global state and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc. Regards, Brian.

On Thu, Apr 27, 2006 at 09:53:35PM +0100, Brian Hulley wrote:
At the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as newUnique which are "allowed" to make use of global state and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc.
In fact, you reminded me of the so obvious it is easy to forget example of global state that every haskell programer uses since day one. CAFs. as in fibs = 0 : 1 : zipWith (+) fibs (tail fibs) fibs is mutable global state that is updated with its value when it is evaluated. not sure how I missed the obvious example. John -- John Meacham - ⑆repetae.net⑆john⑈

On Apr 24, 2006, at 2:42 AM, Adrian Hey wrote:
Lennart Augustsson wrote:
I think global mutable variables should be regarded with utmost suspicion. There are very few situations where they are the right solution.
Well IMO even the use of the term "global mutable variable" causes muddled thinking on this and I wish people would stop it. There's no reason to regard top level "things with identity" (call them "TWI"s or "objects" or whatever) with any more suspicion than top level IO actions themselves.
Sure there is. TWI's are just the object-oriented singleton pattern warmed over, and the singleton pattern is much maligned in some circles (for good reason, IMO).
One thing I never could fathom about the position of the nay-sayers in this debate is what exactly is it that they object to? Is it existence of top level "TWI"s and of IO operations that reference them *in principle*? Or are they content with their existence but just don't want to allow people to use Haskell to define them?
The former, in my case. As I stated in an earlier message, the problem is primarily one of defining the dynamic scope of the thing. If you look back in the archives, you'll notice I proposed a thread- local state mechanism because I felt it placed the scope boundary in an appropriate place, where it could be manipulated and reasoned about by programmers, and where it has a reasonable semantic interpretation. Presumably, runtime models will have to deal somehow with the notion of a thread of execution (even if just to say there is only ever one) and will thus fix the dynamic scope of thread local state. I additionally think that thread-local state (and similar mechanisms) can be abused to create difficult-to-maintain and buggy code, but that's a somewhat separate issue.
If it's the former then we should be purging the IO libraries of all such horrors, though I can't see much remaining of them (well anything actually). But I guess an IO incapable language might still have some niche uses.
Argument by straw-man: there are important differences between regular IO actions and TWI's, AKA singletons. The former is referentially transparent, while the latter is referentially opaque, for starters. There's also the scoping issue: the properties of the IO monad bound the dynamic scope of regular IO actions, but not so for singletons.
If it's the latter then we are advocating a language which cannot be used to implement many demonstrably useful IO modules and libraries, *by design*. If so, the claim that Haskell is a general purpose programming language seems quite bogus and should be removed from the haskell.org home page IMO.
You presuppose that a language which "cannot be used to implement many demonstrably useful IO modules and libraries" is not general purpose. I claim that is silly. If we take that argument to its logical conclusion, then we should throw out static typing, referential transparency and most of the things that make Haskell what it is. I think that what a programming language keeps you from doing is as least as important as what it lets you do. If you don't believe that at least to some extent, then your first exposure to Haskell probably made your head explode. That said, I also disagree with the premise. I don't know of any stateful library designs that can't be pretty straightforwardly refactored using explicit initialization and ReaderT-over-IO monads. Furthermore, I believe that a library so structured is actually _more_ useful than the library with implicit state.
Regards -- Adrian Hey
Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG
participants (7)
-
Adrian Hey
-
Brian Hulley
-
Cale Gibbard
-
John Meacham
-
Lennart Augustsson
-
Robert Dockins
-
Tony Finch