Re: [Haskell] Re: Global Variables and IO initializers

[posted to haskell-cafe per SLPJ's request] Hi Adrian,
I can assure you that for the intended applications of oneShot it is vital that realInit is executed once at most, but the user must [..] So please, no more handwaving arguments about this kind of thing being unnecessary, bad programming style, or whatever..
Please show me a concrete alternative in real Haskell code, other
I'm mystified as to why you are insisting others provide real examples when you are not.
Can you give one concrete example of an "intended application of oneShot", so that we can either propose a concrete Haskell implementation of it, or agree that global variables really are necessary.
Hoping to increase the light / heat ratio in this discussion...
Cheers, --KW 8-)

On Monday 08 Nov 2004 3:57 pm, Keith Wansbrough wrote:
[posted to haskell-cafe per SLPJ's request]
Hi Adrian,
I can assure you that for the intended applications of oneShot it is vital that realInit is executed once at most, but the user must
[..]
So please, no more handwaving arguments about this kind of thing being unnecessary, bad programming style, or whatever..
Please show me a concrete alternative in real Haskell code, other
I'm mystified as to why you are insisting others provide real examples when you are not.
Maybe you should read the whole thread. AFAIK I am the only person who has provided a concrete example of anything, and I did so in direct response to a request to do so from Keean, IIRC. Unfortunately my own requests for counter examples showing that there are safer (easier, more elegant or whatever) solutions have been ignored (not that I'm in the least bit surprised by this). Instead all I get is repeated denial of the reality of this problem.
The problem is simple enough to restate for anyone who's interested. "Provide a simple reliable mechanism to ensure that in a given program run one particular top level IO operation cannot be executed more than once."
Can you give one concrete example of an "intended application of oneShot", so that we can either propose a concrete Haskell implementation of it, or agree that global variables really are necessary.
Any C library which requires an explicit initialisation call before anything in that library can be used (common enough IME). Accidental re-initialisation (e.g. by two independent modules/libraries) will destroy any state currently being used by the library's existing "clients".
The need to do this may or may not indicate "bad design" on the part of the library author. But so what? It just happens to be a fact that must be dealt with from Haskell (in a safe manner preferably).
Regards -- Adrian Hey
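For readers following along: the thread does not reproduce the oneShot code itself. Below is a minimal sketch of the GHC idiom usually meant by a "top-level mutable variable". The names initDone and guardedInit are made up for illustration, and the combination of unsafePerformIO with a NOINLINE pragma is exactly the trick the rest of the thread argues about, so read this as an assumption about the general approach, not as Adrian's actual code.

```haskell
import Control.Concurrent.MVar
import System.IO.Unsafe (unsafePerformIO)

-- Top-level flag recording whether initialisation has already run.
-- NOINLINE asks GHC not to duplicate the unsafePerformIO call, so the
-- program sees a single MVar. (Hypothetical name, not from the thread.)
{-# NOINLINE initDone #-}
initDone :: MVar Bool
initDone = unsafePerformIO (newMVar False)

-- Run the given initialiser only if no earlier call has completed it.
-- Holding the MVar also serialises concurrent callers.
guardedInit :: IO () -> IO ()
guardedInit realInit = modifyMVar_ initDone $ \done ->
  if done
    then pure True                 -- already initialised: do nothing
    else realInit >> pure True     -- first call: run it and set the flag
```

With this in scope, calling guardedInit (putStrLn "okay") several times from main should print "okay" only on the first call. Note that the flag is top level but need not be exported, which is the "top level, not global" distinction made later in the thread.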

Adrian Hey writes:
The problem is simple enough to restate for anyone who's interested. "Provide a simple reliable mechanism to ensure that in a given program run one particular top level IO operation cannot be executed more than once."
Can you give one concrete example of an "intended application of oneShot", so that we can either propose a concrete Haskell implementation of it, or agree that global variables really are necessary.
Any C library which requires an explicit initialisation call before anything in that library can be used (common enough IME). Accidental re-initialisation (e.g. by two independent modules/libraries) will destroy any state currently being used by the library's existing "clients".
Great, thanks, that's just what I was hoping for - I now see the problem you're trying to address. --KW 8-)

As a purely practical matter, it seems like the easiest solution (to this particular use case) is to write a small wrapper initializer in C which is idempotent, then use FFI to call the wrapper, rather than calling the initialization directly. This is easy enough to do with a static local variable:
    void doInit(void)
    {
        /* Set once the real initializer has run. Not synchronized:
           assumes doInit is never called from two threads at once. */
        static int doneInit = 0;
        if (!doneInit) {
            reallyInit();
            doneInit = 1;
        }
    }
Then your Haskell libs can call doInit any number of times, and reallyInit will be called at most once. Since you're committed to FFI anyway (calling a C lib is the premise), the wrapper seems a small price to pay.
For Haskell-only code, something else would be nice.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Adrian Hey wrote:
The problem is simple enough to restate for anyone who's interested. "Provide a simple reliable mechanism to ensure that in a given program run one particular top level IO operation cannot be executed more than once."
No language can guarantee this - all I have to do is run 2 copies of the executable at once... or even sequentially! Keean.

On Monday 08 Nov 2004 6:48 pm, Keean Schupke wrote:
Adrian Hey wrote:
The problem is simple enough to restate for anyone who's interested. "Provide a simple reliable mechanism to ensure that in a given program run one particular top level IO operation cannot be executed more than once."
No language can guarantee this - all I have to do is run 2 copies of the executable at once... or even sequentially!
Read what I wrote :-) Regards -- Adrian Hey

Yes, I didn't read your specification accurately... However, I would argue such a constraint on the problem domain is artificial, as operating systems exist. At the end of the day it is the job of the OS to manage such one-shot hardware inits, not application code. (As the OS is the only thing that can manage resources across multiple programs)...
What did you think of the code example given where the one-shot nature is provided by a 'C' wrapper around the FFI function? I think this is the best solution...
Keean.

On Monday 08 Nov 2004 9:53 pm, Keean Schupke wrote:
What did you think of the code example given where the one-shot nature is provided by a 'C' wrapper around the FFI function. I think this is the best solution...
As a pragmatic solution to this (and only this) particular problem it's OK. But let's not pretend the real problem has gone away (or just doesn't exist) as a result of this. There are many reasons why people might need top-level TWIs. You asked for a simple example and I provided one, that's all.
Also note that Robert's solution still requires the use of a top level mutable variable. I take it the position of those who object to such things is not.. "Top level mutable variables are a very very bad thing and should never ever be used (Errm..well unless they're really necessary, in which case you should use C)." Now that would be strange. I would call that "militant denial".
As a side note (not specifically directed at you) I would also like folk to take note that the mutable variable used in Robert's solution is top level, but is NOT global. As I have observed in an earlier post, the thread title chosen by the OP is a rather unfortunate choice of words IMO. I wish people would stop talking about "global variables". Nobody is advocating the use of global mutable variables. I sure hope I'm not going to have to repeat this (yet again!).
Actually, I know I'm not going to have to repeat this yet again because I'm going to make this my last post on this thread.
Regards -- Adrian Hey

I take it the position of those who object to such things is not.. "Top level mutable variables are a very very bad thing and should never ever be used (Errm..well unless they're really necessary, in which case you should use C)."
more like: if you have two parts of your codebase, one of which easily accommodates such variables and is already flooded with them, the other not, then it may be a good idea to put such variables in the first part. but that doesn't mean that there aren't common programming problems that are currently inconvenient in Haskell and could be supported better, without the need for unsafePerformIO.
As I have observed in an earlier post, the thread title chosen by the OP is a rather unfortunate choice of words IMO. I wish people would stop talking about "global variables". Nobody is advocating the use of global mutable variables.
perhaps you aren't, and some other posters in this thread aren't, but it is one of the most common uses of unsafePerformIO, and it is one of the subjects of this thread (and the ones before).
then again, perhaps you're only thinking you're not talking about global mutable variables (the emphasis being more on mutable than on global). if you look back at your own oneShot example, you might find that the local MVar putting and taking isn't doing much at all, and the magic lies in the use of unsafePerformIO to share the result of the IO action. so you could move the unsafePerformIO into your oneShot (if you're certain to inspect the result of initialisation, you could avoid the strict application $!):
    Prelude System.IO.Unsafe> let realInit = putStrLn "okay"
    Prelude System.IO.Unsafe> let {oneShot :: IO a -> IO a; oneShot io = return $! unsafePerformIO io}
    Prelude System.IO.Unsafe> let userInit = oneShot realInit
    Prelude System.IO.Unsafe> userInit >>= print
    okay
    ()
    Prelude System.IO.Unsafe> userInit >>= print
    ()
    Prelude System.IO.Unsafe> userInit >>= print
    ()
    Prelude System.IO.Unsafe>
in other words, the core of your example is the variable userInit that is modified exactly once. but modified it is, even though userInit is a Haskell variable, no MVar or other inherently modifiable thing. depending on what realInit does (and being in IO a, that could be a lot), that may or may not be observable. and as others pointed out, reasoning about programs involving unsafePerformIO involves contextual equivalences, no longer replacing equals in all contexts, so hoping for referential transparency in the general case might be a bit optimistic (that's why one has to disable compiler optimisations based on this property, after all).
your specific case seems slightly less problematic since userInit is itself of type IO (), but in combination with the monad laws, one might still run into trouble -- every use of unsafePerformIO indicates a proof obligation that such trouble will not arise (or rather: under what constraints it won't).
Actually, I know I'm not going to have to repeat this yet again because I'm going to make this my last post on this thread.
as i said in my other post (waiting for moderator approval), there are many people on this thread, and i'm not sure they are all talking about the same thing. perhaps a good step forward would be for each concrete proposal to go into a separate thread (beginning with a summary of the use pattern to be covered and the concrete extension proposal claiming to do the job), and then to see whether there is any consensus for any of them. cheers, claus

as i said in my other post (waiting for moderator approval), there are many people on this thread, and i'm not sure they are all talking about the same thing. perhaps a good step forward would be for each concrete proposal to go into a separate thread (beginning with a summary of the use pattern to be covered and the concrete extension proposal claiming to do the job), and then to see whether there is any consensus for any of them.
Well, my suggestion for one-shot routines would be to implement a simple Haskell library supporting named semaphores, and named channels. These resources need to be managed by the OS, so on unix the obvious way to implement them is to use unix-domain sockets for the channels, but there might be a more efficient way.
    oneTimeInit = do
        s <- testAndSetSemaphore "myUniqueString"
        if s
            then -- already run
            else -- not run yet
Keean.

The problem is simple enough to restate for anyone who's interested. "Provide a simple reliable mechanism to ensure that in a given program run one particular top level IO operation cannot be executed more than once."
No language can guarantee this - all I have to do is run 2 copies of the executable at once... or even sequentially!
Read what I wrote :-)
oh well, if you insist on overspecification:
    Loading package base ... linking ... done.
    Prelude> let once io = getContents >> io
    Prelude> let init = once $ putStrLn "okay"
    Prelude> init
    okay
    Prelude> init
    *** Exception: <stdin>: hGetContents: illegal operation (handle is closed)
    Prelude> init
    *** Exception: <stdin>: hGetContents: illegal operation (handle is closed)
    Prelude>
this method may have some unexpected side-effects, but you don't mind that, do you?-)
unsafePerformIO is a wonderful extension hook for making Haskell implementations do things they wouldn't normally do without having to write such an implementation from scratch. the problem is that those extended Haskell implementations may then do things they wouldn't normally do.. for instance, you shouldn't bet your life on your method working with all future Haskell implementations - you'll have to check with every new release whether your use of the extension hook is still compatible with whatever other progress has been made (e.g., a distributed implementation may decide to start with copies of the code on each node, etc.). today you can say {-# please don't mess with this #-}, or if your compiler is a bit more eager, you may have to say {-# please, please don't mess with this #-}, and it may just work most of the time as long as everybody remembers that there are these user-defined extensions hanging around that will break in horrible ways if we forget that we've left the domain of pure functional programming.
i thought the point of this thread was to look for a way to take one particular use pattern of unsafePerformIO that is deemed to be safe, and to devise a proper language extension that captures exactly this use pattern in such a way that no unsafe constructs need be involved anymore. iirc, this use pattern started out as being global variables, then became IO initialisers, then IO initialisers per module, then commutative monads, then merging of IO and ST, then run-once code, ..
you won't be able to capture all uses of unsafePerformIO unless you recreate it, which is exactly what you don't want - it is there already, and you want to find ways of not having to use it.
your example is still useful because it describes a situation at the borderline between the functional and IO worlds where one is tempted to use global variables. as has been pointed out, the reason in this particular case is that one might want to do something in Haskell-land that should perhaps be done in the outside world, because the whole point of the exercise is to make something behave as a functional object when it is not. now one could argue that things should be converted to a functional point of view before importing them into Haskell, or one could argue that as much as possible should be done on the Haskell side, even if that means compromising the language a little or balancing the library author over an abyss. both arguments have their merit.
afaik, the main problem that people try to solve with global variable tricks is not executing code (you could call an init action in main), but having to distribute the results of running that code. as others have pointed out, that is similar to the situation with stdin/etc - you want to open the channels *and* make the resulting handles available everywhere. now, if every module by default had a stdinitMVar, you could do your initialisation in main and put the results into Main.stdinitMVar. and if you wanted to forward the information to an imported module, you could put the info into Module.stdinitMVar. and if you wanted per-module initialisation, you'd use Main.init to call Module.init (name init just a convention), which would put its results into its very own Module.stdinitMVar. problem solved.
problem solved? i'm not so sure about that, for the same reasons global variables/registers/etc. have been considered evil by many of those who reinvented them. and shouldn't multiple instances of modules be possible, each with its own stdinitMVar? but some of the proposals that have been circulating in this thread are even worse, as they include an arbitrary number of user-defined and -named initialisation variables, and arbitrary numbers of initialisation actions, to be called in some underspecified form and sequence, making them hard to predict and find for those having to maintain such code.
there are actually at least *two* problems you need to solve: one is providing for those few cases where global-variable-like things are too convenient to consider anything else. that's actually fairly easy. the other is to make sure that the cases in which people consider using your mechanism are as limited as possible, for otherwise people will use them for everything (like the IO monad). that temptation is there because such things are too convenient at the start to be worried about the terrible inconveniences that appear later. it is this second problem people have failed to solve so far in every variation of the scheme. which is why so many in this thread have been burned by someone who abused one of those wonderfully convenient mechanisms.
examples from non-functional languages have been mentioned. another is Erlang, where (by convention) processes are instances of modules which (by convention) tend to have init functions, and each process has a process dictionary (a collection of process-local variables). that feature used to be very popular, but its use is now heavily discouraged (although they have the additional difficulty of not distinguishing between IO and non-IO code..): http://www.erlang.se/doc/programming_rules.shtml#REF18861
hth, claus
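The "call an init action in main, then distribute the results" option mentioned above can be sketched without unsafePerformIO. This is only an illustration under assumptions: mkOnce is a hypothetical helper, not something proposed in the thread. It has to be created once (for example in main), and the action it returns then has to be handed to every piece of code that needs the result, which is precisely the inconvenience being described.

```haskell
import Control.Concurrent.MVar

-- Wrap an initialiser so that it runs at most once and its result is
-- cached. The wrapped action must be passed around explicitly.
mkOnce :: IO a -> IO (IO a)
mkOnce realInit = do
  cache <- newMVar Nothing
  pure $ modifyMVar cache $ \st ->
    case st of
      Just r  -> pure (st, r)      -- already initialised: reuse the result
      Nothing -> do
        r <- realInit              -- first call: run the initialiser
        pure (Just r, r)
```

A typical use would be getLib <- mkOnce realInit at the start of main, followed by passing getLib as an argument to the modules that need it. Nothing here is top level, so two independent calls to mkOnce give two independent one-shot actions.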

Any C library which requires an explicit initialisation call before anything in that library can be used (common enough IME). Accidental re-initialisation (e.g. by two independent modules/libraries) will destroy any state currently being used by the library's existing "clients".
The need to do this may or may not indicate "bad design" on the part of the library author. But so what? It just happens to be a fact that must be dealt with from Haskell (in a safe manner preferably).
You are right, the C library that works like this is "bad design"... any library should really be reentrant, and preferably state free. An example of a well designed C library is the ODBC database connection library, where all the state is stored in opaque handles returned to the user.
For 'broken' libraries that cannot support multiple simultaneous contexts, it would be better to use the 'C' FFI based solution suggested by another poster. Ideally you would want to find a library with a better interface - If you tell me the library you wish to use I may be able to suggest a better alternative.
Keean.
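The handle-based design praised here can be illustrated with a small sketch. Everything in it is hypothetical (LibHandle, openLib and nextId are invented names, and the "library" is just a counter); it only shows the shape of an interface in which all state lives in a value returned to the caller.

```haskell
import Control.Concurrent.MVar

-- Hypothetical library whose only state is a counter. In a real module
-- the LibHandle constructor would not be exported, keeping it opaque.
newtype LibHandle = LibHandle (MVar Int)

-- Each call creates a fresh, independent context. Nothing is shared,
-- so a second "initialisation" cannot clobber an existing client.
openLib :: IO LibHandle
openLib = LibHandle <$> newMVar 0

-- Operations take the handle explicitly and touch only its state.
nextId :: LibHandle -> IO Int
nextId (LibHandle v) = modifyMVar v $ \n -> pure (n + 1, n + 1)
```

Two clients that each call openLib get separate counters, so the once-only requirement never arises. The cost is that every operation needs the handle as an argument, which is the "distributing the results" burden discussed elsewhere in the thread.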

Just to add a small point... you can see how the 'bad' single context design affects the code that uses it. Because C allows global variables it is possible to write libraries that require once-and-only-once initialisation. In Haskell (without global variables) it is impossible (or at least extremely hard) to write such libraries. Haskell libraries tend to allow multiple concurrent independent threads of access.
Allowing global vars into Haskell would make it easy for coders moving to Haskell from C to carry on coding in a bad style. It seems correcting the problem outside of Haskell and in C is the right approach - as it does not involve making these 'bad' things easier to do in Haskell.
Keean.

On Mon, 8 Nov 2004, Keean Schupke wrote:
For 'broken' libraries that cannot support multiple simultaneous contexts, it would be better to use the 'C' FFI based solution suggested by another poster. Ideally you would want to find a library with a better interface - If you tell me the library you wish to use I may be able to suggest a better alternative.
Really? That also interests me. I'm using FFTW and PLPlot (but not with Haskell); both use internal state and thus must be considered ill designed. Do you know of better alternatives?

Henning Thielemann writes:
On Mon, 8 Nov 2004, Keean Schupke wrote:
If you tell me the library you wish to use I may be able to suggest a better alternative.
I'm using FFTW and PLPlot (but not with Haskell); both use internal state and thus must be considered ill designed. Do you know of better alternatives?
I'm no expert on this, being exposed to FFTW for a couple of hours, but isn't its internal state encapsulated into the 'plan', which is suitable as a handle? -- Feri.

On Tue, 9 Nov 2004, Ferenc Wagner wrote:
Henning Thielemann writes:
On Mon, 8 Nov 2004, Keean Schupke wrote:
If you tell me the library you wish to use I may be able to suggest a better alternative.
I'm using FFTW and PLPlot (but not with Haskell); both use internal state and thus must be considered ill designed. Do you know of better alternatives?
I'm no expert on this, being exposed to FFTW for a couple of hours, but isn't its internal state encapsulated into the 'plan', which is suitable as a handle?
In addition to plans, it stores some "wisdom" which is handled globally. http://www.fftw.org/fftw3_doc/Thread-safety.html#Thread-safety :-(
participants (7): Adrian Hey, Claus Reinke, Ferenc Wagner, Henning Thielemann, Keean Schupke, Keith Wansbrough, Robert Dockins