
On Thu, Dec 02, 2004 at 09:08:21AM +0000, Keean Schupke wrote:
Ben Rudiak-Gould wrote:
Just a small comment on the Wiki page... it says
"Several real-life examples of pure haskell code which needs fast global variables to either be implemented efficiently or statically guarantee their invariants are given in http://www.haskell.org//pipermail/haskell/2004-November/014929.html"
The first example is that of randomIO - not implementable in Haskell; however, the function randoms :: RandomGen g => g -> [a] is (and is probably more idiomatic Haskell anyway).
Yes. There are lots of ways to do things without global variables; that was never in doubt. However, randomIO is part of the Haskell standard. Why is it not (efficiently) implementable in Haskell? There is no particular reason it should not be: it should optimize to about 5 instructions that run the linear congruential algorithm on a static location in memory.
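For concreteness, here is a minimal sketch of the two styles being contrasted: a pure, parameterized generator and a randomIO-like action backed by one static cell. The names lcgStep, lcgStream, globalSeed and nextRandomIO, the seed 42, and the choice of Knuth's MMIX constants are assumptions for illustration only; this is not how the standard Random library is written, and the global cell uses the unsafePerformIO idiom that exists today, not the proposed top-level initializers.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import Data.Word (Word64)
import System.IO.Unsafe (unsafePerformIO)

-- One step of a linear congruential generator (Knuth's MMIX constants).
-- Word64 arithmetic wraps around, which gives "mod 2^64" for free.
lcgStep :: Word64 -> Word64
lcgStep x = 6364136223846793005 * x + 1442695040888963407

-- Pure, parameterized version: the caller owns and passes the seed.
lcgStream :: Word64 -> [Word64]
lcgStream seed = iterate lcgStep (lcgStep seed)

-- Hypothetical global seed cell, created with the usual unsafePerformIO
-- idiom. NOINLINE keeps the compiler from duplicating the cell.
{-# NOINLINE globalSeed #-}
globalSeed :: IORef Word64
globalSeed = unsafePerformIO (newIORef 42)

-- randomIO-like action: advance the single global cell, return the new value.
nextRandomIO :: IO Word64
nextRandomIO = atomicModifyIORef' globalSeed (\x -> let y = lcgStep x in (y, y))
```

Successive calls to nextRandomIO yield the same sequence as lcgStream 42; the only difference is who holds the state.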
The second example, "Unique", can be implemented:

    import Data.IORef

    getUniqueSupply = do
        a <- newIORef 0
        return (nextUnique a)
      where
        nextUnique n = do
            x <- readIORef n
            writeIORef n (x+1)
            return x
Which should be just as fast as the global version, and all you do is pass the 'unique' supply around... you can even generate a lazy list of uniques which can be used outside the IO monad. Again, the "disadvantage" is that you can have multiple unique supplies and you could use the "wrong" one... (which is an advantage in my opinion, as it increases flexibility and reuse of the code).
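To make the trade-off concrete, here is a minimal sketch of the supply-passing version. The name newUniqueSupply and the use of atomicModifyIORef' (instead of a separate read and write) are illustrative assumptions, not code from the thread.

```haskell
import Data.IORef (newIORef, atomicModifyIORef')

-- Create a fresh supply. The result is an action that hands out the next
-- number from this supply's own private counter.
newUniqueSupply :: IO (IO Int)
newUniqueSupply = do
  ref <- newIORef (0 :: Int)
  -- Store x + 1 back into the cell and return the old value x.
  return (atomicModifyIORef' ref (\x -> (x + 1, x)))
```

With two supplies s1 and s2 created this way, drawing twice from s1 and once from s2 yields 0, 1 and 0: values are unique only per supply, which is exactly the property the two sides of this thread weigh differently.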
Yes, this would be as fast as the global version*, but it implements something else. The entire point of Data.Unique is that one can consider the unique supply as part of the world, just as you consider the filesystem, the screen, the network, various OS routines, etc. as part of the world. This should be implementable efficiently. After all, you can store the counter in a file in /tmp, or just create a stub C file to do it, so it is obviously not a bad thing to allow; it is already allowed. It just needs to be possible to do it efficiently, or people will resort to unsafe hacks like unsafePerformIO, which is a serious impediment to aggressive compiler optimizations and a plague on the mathematical semantics of the intermediate language.
The same applies to the AtomHash: it can be implemented just as efficiently without globals... The only difference appears to be the supposed ability of globals to stop the programmer using an alternate Hash... but of course there is nothing stopping the programmer using the wrong global at all! (In other words, it seems just as easy to access the wrong top-level name as to pass the wrong parameter.)
No, because then it would not typecheck. The whole point of Atom.hs is that the only way to generate values of type 'Atom' is to go through the single unique hash table. Hence the static guarantee that there is always an isomorphism between everything of type 'Atom' and everything of type 'String' in the system. This is only made possible by the module's ability to hide access to routines which could be used to break the invariant (such as the raw global hash). This is obviously a very important invariant!

Let us please not confuse the many philosophical issues against global variables in design, which I wholeheartedly agree with, with what the global variables proposal is meant to achieve. It is for use at the very lowest level of the libraries, i.e. not to be used by the average person. They are for Atom tables, memoization and anti-memoization. I also want to move some of the runtime stable/weak pointer infrastructure out of being magic implemented by the runtime and into Haskell itself; this requires the global hash of stable pointers to be implementable directly.

GHC itself is getting rid of global variables AS SEEN BY THE PROGRAMMER, but many libraries still NEED them inside to do their clever memoization tricks and fast strings, which are required to make GHC usable at all. Really, you should not be opposed to them unless you are also opposed to the FFI. At some level, deep inside the libraries, this functionality is needed, just like the FFI. It is even needed to implement the type-indexed execution context proposals.

Exposing the fact that there is global state will still be a bad idea; good programmers will hide its usage behind pure interfaces, just as is done now with unsafePerformIO or uses of the ST monad. A module which provides observable global state, but does not let you parameterize over it, is bad form. For example, randomIO has implicit global state, but you can use the parameterized versions such as randoms.
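As a rough illustration of the Atom invariant, here is a minimal sketch of such a module. It is not the actual Atom.hs: the representation, the association list standing in for the hash table, and the name atomTable are all assumptions, and the global table uses today's unsafePerformIO idiom, not the proposed top-level initializers.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafePerformIO)

-- In a real module the export list would be (Atom, toAtom, fromAtom):
-- the Atom constructor and atomTable stay hidden, so the only way to
-- obtain an Atom is through toAtom and the single table.
data Atom = Atom !Int String

-- Equality compares interned ids only, never the strings themselves.
instance Eq Atom where
  Atom i _ == Atom j _ = i == j

-- Hypothetical global table; an association list stands in for a hash table.
{-# NOINLINE atomTable #-}
atomTable :: IORef [(String, Atom)]
atomTable = unsafePerformIO (newIORef [])

-- Intern a string: reuse the existing Atom or allocate the next id.
-- Observably this behaves like 'id' on strings with a cheaper (==).
toAtom :: String -> Atom
toAtom s = unsafePerformIO (atomicModifyIORef' atomTable intern)
  where
    intern tbl = case lookup s tbl of
      Just a  -> (tbl, a)
      Nothing -> let a = Atom (length tbl) s in ((s, a) : tbl, a)

fromAtom :: Atom -> String
fromAtom (Atom _ s) = s
```

Because callers never see the table or the constructor, equal strings always map to equal atoms and distinct strings to distinct atoms, and no caller can tell that any state is involved.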
Unlike Random, Atom.hs DOES NOT HAVE IMPLICIT GLOBAL STATE. A perfectly acceptable implementation would be toAtom = fromAtom = id. This is why Atom does not need to be parameterized over its global state: the fact that it is used is completely abstracted away because it is an implementation detail. There is a real _fundamental difference_ here; please try to understand it before rebutting.

I really dislike this argument because I find myself having to vocally disagree with things I actually fully agree with in their proper context :). They are needed to support the various tricks necessary to create the standard Haskell libraries and large programs that need to do real work and are pushing the system to the limits, like ghc, ginsu and darcs.

At their base level, top-level initializers are about providing a very low-level tool which gives a generic and semantically correct way to implement both the necessary performance-improving measures and higher-level libraries such as George Russell's, which provide a much nicer, higher-level interface for general users.

        John

* Actually, under the mdo proposal, Data.Unique can even be implemented faster, with some easy compiler transformations to lift static heap-allocated cells to the bss. This is not important for the discussion, but I would be interested in implementing them if the mdo proposal is ever implemented.

--
John Meacham - ⑆repetae.net⑆john⑈