Re: [Haskell] Top Level <-

I have a feeling this is going to be a very long thread so I'm trying to go to Haskell-Cafe again (without mucking it up again).
Derek Elkins wrote:
Haskell should be moving -toward- a capability-like model, not away from it.
Could you show how to implement Data.Random or Data.Unique using such a model, or how any (preferably all) of the use cases identified can be implemented? Like what about implementing the socket API starting with nothing but primitives to peek/poke Ethernet MAC and DMA controller registers?
Why should Haskell be moving -toward- a capability-like model, and why do top level <- declarations take us away from it?
Regards
-- Adrian Hey

Making a network stack from peek and poke is easy in a well structured OS.
The boot loader (or whatever) hands you the capability (call it
something else if you want) to do raw hardware access, and you build
from there. If you look at well structured OSs like NetBSD, this is
pretty much how they work. No hardware drivers use global variables.
-- Lennart

Lennart Augustsson wrote:
Making a network stack from peek and poke is easy in a well structured OS. The boot loader (or whatever) hands you the capability (call it something else if you want) to do raw hardware access, and you build from there. If you look at well structured OSs like NetBSD, this is pretty much how they work. No hardware drivers use global variables.
So? We all know this is possible outside Haskell. But I don't want to rely on mysterious black box OS's to "hand me the capability" any more than I want to rely on mysterious extant but unimplementable libs like Data.Unique. Most real world computing infrastructure uses no OS at all. How could I use Haskell to implement such systems?

Also (to misquote Linus Torvalds) could you or anyone else who agrees with you please SHOW ME THE CODE in *Haskell*! If scripture is all that's on offer I'm just not going to take any of you seriously.

Frankly I'm tired of the patronising lectures that always accompany these threads. It'd be good if someone who "knows" global variables are evil could put their code where their mouth is for a change. Fixing up the base libs to eliminate the dozen or so uses of the "unsafePerformIO hack" might be a good place to start. I'll even let you change the API of these libs if you must, provided you can give a sensible explanation why the revised API is better, safer, more convenient or whatever.

Regards
-- Adrian Hey
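For concreteness, the "unsafePerformIO hack" referred to here is the usual idiom for creating a top-level mutable cell. A minimal sketch follows; the module and names are purely illustrative, and the NOINLINE pragma (together with compiling with -fno-cse) is what stops GHC from duplicating the cell:

module GlobalCounter (nextValue) where

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- The top-level mutable cell, created exactly once.  NOINLINE (and
-- -fno-cse) keeps the compiler from inlining the unsafePerformIO call
-- and accidentally creating more than one cell.
counter :: IORef Integer
counter = unsafePerformIO (newIORef 0)
{-# NOINLINE counter #-}

-- Hand out successive values from the single shared counter.
nextValue :: IO Integer
nextValue = atomicModifyIORef counter (\n -> (n + 1, n))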

I told you where to look at code. It's C code, mind you, but written
in a decent way.
No well written device driver ever accesses memory or IO ports
directly, doing so would seriously hamper portability.
Instead you use an abstraction layer to access the hardware, and the
driver gets passed a "bus" (whatever that might be) access token (akin
to a capability).
I know you're not going to be convinced, so I won't even try. :)
-- Lennart

Lennart Augustsson wrote:
I told you where to look at code. It's C code, mind you, but written in a decent way. No well written device driver ever accesses memory or IO ports directly, doing so would seriously hamper portability.
Well something must be accessing both. Dunno what you mean by "directly". I take it you must mean that the driver does not make use of global variables or "baked in" port addresses in its source code.
Instead you use an abstraction layer to access the hardware, and the driver gets passed a "bus" (whatever that might be) access token (akin to a capability).
I know you're not going to be convinced, so I won't even try. :)
I have actually spent the last 20 years or so writing both non-hosted and hosted device drivers for a few OS's. I'm perfectly convinced about the truth of what you say, but not at all convinced about the relevance.

It's a red herring IMO as you've introduced a very complex and mysterious black box that itself cannot be implemented without making use of "global variables". You can find them easily enough in the Linux kernel source. I'm sure they'll be there in NetBSD too (never looked though).

Regards
-- Adrian Hey

I've also written quite a few hosted and non-hosted device drivers (in C).
None of them have any global variables.

Lennart Augustsson wrote:
I've also written quite a few hosted and non-hosted device drivers (in C). None of them have any global variables.
The point is to be able to properly model, understand and if necessary implement *entire systems* without using "global variables" (allegedly). You can always define sub-system boundaries in such a way that you can claim that this/that or the other sub-system, device driver or whatever does not use "global variables" if you put the global variables somewhere else and pass a reference to the sub-system concerned. We could play that game with Data.Unique, for example. Regards -- Adrian Hey

BTW, I'm not contradicting that the use of global variables can be
necessary when interfacing with legacy code, I just don't think it's
the right design when doing something new.
-- Lennart

Lennart Augustsson wrote:
BTW, I'm not contradicting that the use of global variables can be necessary when interfacing with legacy code, I just don't think it's the right design when doing something new.
AFAICS the use of top level mutable state in the base libs has nothing at all to do with interfacing with legacy code, it's a semantic necessity and there's no legacy code involved. If you want to dispute that then please show some real Haskell code that does as good a job or better without it (or point me to the relevant legacy code that makes it necessary).

Regards
-- Adrian Hey

On Wed, Aug 27, 2008 at 02:23:04AM +0100, Lennart Augustsson wrote:
BTW, I'm not contradicting that the use of global variables can be necessary when interfacing with legacy code, I just don't think it's the right design when doing something new.
As with all design decisions, it is sometimes the right thing and sometimes the wrong one. And sometimes the most expedient (which, occasionally, is a perfectly valid driving force behind a certain bit of coding). However, I am fully convinced it is necessary. You don't even have to look further than Haskell 98 to find a use in the Random module, and Data.Unique _depends_ on the state being global for correctness.

Now, _interfaces_ that depend on global state are a completely different matter. To quote what I wrote in the jhc manual:

"Note, top level global variables can be indicative of design issues. In general, they should only be used when necessary to interface with an external library, opaque uses inside a library where the shared state can not be externally observed, or inside your Main program as design dictates."

However, note the weasel words. Those are in there on purpose; every design calls for different solutions. To blanketly say certain constructs are just wrong to the point of disallowing them in the language, especially when they are common practice at the moment, just doesn't seem right.

John
-- John Meacham - ⑆repetae.net⑆john⑈
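To make the Data.Unique point concrete, the conventional implementation is essentially one shared counter behind an MVar, created with exactly the top-level-state idiom under discussion. A simplified sketch (the real module differs in details):

module Unique (Unique, hashUnique, newUnique) where

import Control.Concurrent.MVar
import System.IO.Unsafe (unsafePerformIO)

newtype Unique = Unique Integer deriving (Eq, Ord)

-- The single shared supply.  If two of these could exist, two calls to
-- newUnique could hand out equal values, silently breaking the module's
-- only guarantee.
uniqSource :: MVar Integer
uniqSource = unsafePerformIO (newMVar 0)
{-# NOINLINE uniqSource #-}

newUnique :: IO Unique
newUnique = modifyMVar uniqSource $ \n ->
  let n' = n + 1 in return (n', Unique n')

hashUnique :: Unique -> Int
hashUnique (Unique i) = fromInteger i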

John Meacham wrote:
As with all design decisions, it is sometimes the right thing and sometimes the wrong one. And sometimes the most expedient. (which, occasionally, is a perfectly valid driving force behind a certain bit of coding). However, I am fully convinced it is necessary. You don't even have to look further than Haskell 98 to find a use in the Random module, and Data.Unique _depends_ on the state being global for correctness.
..and of course there's stdin, stdout. That takes some explaining. Even with the proposed ACIO and top level <- bindings I still couldn't implement a lib that exported a top level nonStdout handle. It'd have to be a getNonStdout IO action.

Regarding the necessity of "global variables", despite what I've been saying it is of course possible to implement entire systems (programs/processes or whatever main corresponds to) without them, if you don't mind explicitly creating all those micro states immediately on entry to main and passing the references around. But this is highly unmodular, inconvenient, unsafe (because you must expose and allow potentially unconstrained use of newWhateverMicroState constructors) and a general maintenance nightmare. Definitely not the way to go IMO.

So it would be more accurate to say that IMO it's impossible to implement many sane and inherently safe IO lib APIs without using "global variables". But people who prefer insane and inherently unsafe APIs could live without them quite happily I guess :-)

Regards
-- Adrian Hey

On Wed, 2008-08-27 at 11:53 +0100, Adrian Hey wrote:
John Meacham wrote:
As with all design decisions, it is sometimes the right thing and sometimes the wrong one. And sometimes the most expedient. (which, occasionally, is a perfectly valid driving force behind a certain bit of coding). However, I am fully convinced it is necessary. You don't even have to look further than Haskell 98 to find a use in the Random module, and Data.Unique _depends_ on the state being global for correctness.
..and of course there's stdin, stdout. That takes some explaining.
Not really. If you don't have buffered IO, then you just say

stdin = 0
stdout = 1
stderr = 2

If you need buffered IO, you just change your IO monad* to look like:

newtype NewIO alpha = NewIO (ReaderT (Map Fd Buffer) OldIO alpha)

Of course, if you do this, you can't go mixing IO with unique values with RNG with mutable state with everything else under the sun anymore. You might actually have to declare exactly what effects you need when you give your function's type, now. Clearly, a horror we must avoid at all costs.

jcc

* I wonder why that name was chosen? The design doesn't seem to have anything to do with IO, it's more of a `we have this in C so we want it in Haskell too' monad.
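Spelled out, that sketch might compile as something like the following (everything here is invented for illustration: IO stands in for `OldIO', an IORef of String stands in for a real buffer, and the descriptor names are made up):

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module BufferedIO where

import Control.Monad.IO.Class (MonadIO)
import Control.Monad.Reader
import Data.IORef
import qualified Data.Map as Map

type Fd     = Int
type Buffer = IORef String   -- stand-in for a real I/O buffer

-- IO extended with an explicit environment of per-descriptor buffers,
-- instead of buffers hidden in global state inside the Handle machinery.
newtype NewIO a = NewIO (ReaderT (Map.Map Fd Buffer) IO a)
  deriving (Functor, Applicative, Monad, MonadIO)

stdinFd, stdoutFd, stderrFd :: Fd
stdinFd  = 0
stdoutFd = 1
stderrFd = 2

-- Allocate fresh buffers for the standard descriptors and run the action.
runNewIO :: NewIO a -> IO a
runNewIO (NewIO m) = do
  bufs <- mapM (\fd -> (,) fd <$> newIORef "") [stdinFd, stdoutFd, stderrFd]
  runReaderT m (Map.fromList bufs)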

Hello Jonathan, Wednesday, August 27, 2008, 8:12:42 PM, you wrote:
* I wonder why that name was chosen? The design doesn't seem to have anything to do with IO, it's more of a `we have this in C so we want it in Haskell too' monad.
s/it/exchange with external world, i.e., essentially, I/O/ -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Thu, 2008-08-28 at 00:53 +0400, Bulat Ziganshin wrote:
Hello Jonathan,
Wednesday, August 27, 2008, 8:12:42 PM, you wrote:
* I wonder why that name was chosen? The design doesn't seem to have anything to do with IO, it's more of a `we have this in C so we want it in Haskell too' monad.
s/it/exchange with external world, i.e., essentially, I/O/
Bald assertion. I don't buy it. What do IORefs have to do with exchange with the external world? jcc

Hello Jonathan, Thursday, August 28, 2008, 12:57:19 AM, you wrote:
s/it/exchange with external world, i.e., essentially, I/O/
Bald assertion. I don't buy it. What do IORefs have to do with exchange with the external world?
IORefs got their names because they are implemented in IO monad :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Thu, 2008-08-28 at 01:09 +0400, Bulat Ziganshin wrote:
Hello Jonathan,
Thursday, August 28, 2008, 12:57:19 AM, you wrote:
s/it/exchange with external world, i.e., essentially, I/O/
Bald assertion. I don't buy it. What do IORefs have to do with exchange with the external world?
IORefs got their names because they are implemented in IO monad :)
But why are they implemented in the IO monad? jcc

Hello Jonathan, Thursday, August 28, 2008, 1:11:35 AM, you wrote:
IORefs got their names because they are implemented in IO monad :)
But why are they implemented in the IO monad?
because they need its features. but i/o may be implemented w/o IORefs. so the sequence was: searching for a way to make i/o in Haskell (you can read about this in "History of Haskell"), finding monads as a general approach to implementing side-effects, naming one of the monads used to interact with the external world as IO, and adding "variables" support to the list of operations of this monad
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Thu, 2008-08-28 at 01:23 +0400, Bulat Ziganshin wrote:
Hello Jonathan,
Thursday, August 28, 2008, 1:11:35 AM, you wrote:
IORefs got their names because they are implemented in IO monad :)
But why are they implemented in the IO monad?
because they need its features. but i/o may be implemented w/o IORefs. so the sequence was: searching for a way to make i/o in Haskell (you can read about this in "History of Haskell"), finding monads as a general approach to implementing side-effects,
Note that this was true even in Haskell 1.0: the Haskell report included example code with definitions of (>>=) and return in the context of I/O, but without the explicit recognition of the connection to monads.
naming one of the monads used to interact with the external world as IO, and adding "variables" support to the list of operations of this monad
Do you really think I'm unaware of the history? I know what happened; what I don't understand is why, looking at the matter from a language-design perspective circa 2008, IORefs would go in the same monad as Handles. What is the theoretical justification for this, independent of GHC's current implementation of these and their requisite monad(s)?
jcc

On Wed, 2008-08-27 at 14:50 -0700, Jonathan Cast wrote:
On Thu, 2008-08-28 at 01:23 +0400, Bulat Ziganshin wrote:
Hello Jonathan,
Thursday, August 28, 2008, 1:11:35 AM, you wrote:
IORefs got their names because they are implemented in IO monad :)
But why are they implemented in the IO monad?
because they need its features.
Oh, and by the way: what `features' of IO do IORefs use? Other than the fact that IO can implement anything in C, of course. jcc

IMO, there's no justification for having IORefs etc in the IO monad.
They should be in a separate monad. There could then be an operation
to lift that monad to the IO monad, if you so wish.
But wait, we already have that, it's the ST monad! (So there is no
justification.)
-- Lennart

On Wed, 2008-08-27 at 23:00 +0100, Lennart Augustsson wrote:
IMO, there's no justification for having IORefs etc in the IO monad. They should be in a separate monad. There could then be an operation to lift that monad to the IO monad, if you so wish. But wait, we already have that, it's the ST monad! (So there is no justification.)
Right. We'd have ST (which has the advantage that, if you don't have any free variables of STRef type, your code can be used in a pure context), together with a monad homomorphism

stToProgram :: ST RealWorld alpha -> Program alpha

We'd also have IO (stripped of IORef, Random, Unique, and other such irrelevant ugliness), together with a monad homomorphism

ioToProgram :: IO alpha -> Program alpha

Then, the top-level type rule would be

Main.main :: Program ()

We'd flame people for using the Program monad outside of the Main module or monad homomorphisms like ioToProgram or stToProgram. Then, using the functional programming language research of the last 20 years instead of ignoring it for historical reasons, we'd get a free monad homomorphism

ioAndStToProgram :: Coproduct IO (ST RealWorld) alpha -> Program alpha

which would let you use both in the same program. It doesn't dispense with the need for top-level Program (yet), but it's a step in the right direction.

jcc
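A minimal rendering of that split in code, for concreteness (a sketch only: Program here necessarily bottoms out in IO, runMain is an invented name for whatever hook the runtime would call, and the Coproduct homomorphism is left out):

module Program (Program, ioToProgram, stToProgram, runMain) where

import Control.Monad.ST (RealWorld, ST, stToIO)

-- The only monad Main.main would be allowed to have.  Exported
-- abstractly, so the rest of the program can't run it directly.
newtype Program a = Program { unProgram :: IO a }

instance Functor Program where
  fmap f (Program m) = Program (fmap f m)

instance Applicative Program where
  pure = Program . pure
  Program f <*> Program x = Program (f <*> x)

instance Monad Program where
  Program m >>= k = Program (m >>= unProgram . k)

-- Monad homomorphisms from the two effect vocabularies.
ioToProgram :: IO a -> Program a
ioToProgram = Program

stToProgram :: ST RealWorld a -> Program a
stToProgram = Program . stToIO

-- What the runtime would call to run Main.main; nothing else should.
runMain :: Program a -> IO a
runMain = unProgram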

On Wednesday, 27 August 2008 22:57, Jonathan Cast wrote:
On Thu, 2008-08-28 at 00:53 +0400, Bulat Ziganshin wrote:
Hello Jonathan,
Wednesday, August 27, 2008, 8:12:42 PM, you wrote:
* I wonder why that name was chosen? The design doesn't seem to have anything to do with IO, it's more of a `we have this in C so we want it in Haskell too' monad.
s/it/exchange with external world, i.e., essentially, I/O/
Bald assertion. I don't buy it. What do IORefs have to do with exchange with the external world?
jcc
Well, you wrote <snip>
change your IO monad* <snip> * I wonder why that name was chosen? The design doesn't seem to have anything to do with IO, it's more of a `we have this in C so we want it in Haskell too' monad.
I believe, Bulat took it to include getLine, readFile, writeFile et al.

On Wed, 2008-08-27 at 23:20 +0200, Daniel Fischer wrote:
I believe, Bulat took it to include getLine, readFile, writeFile et al.
These things existed in Haskell 1.0, before IORefs ever did. jcc

On Wed, 27 Aug 2008, Jonathan Cast wrote:
* I wonder why that name was chosen? The design doesn't seem to have anything to do with IO, it's more of a `we have this in C so we want it in Haskell too' monad.
The 'C' in ACIO says that it commutes with any operation in the IO monad. Without that property you can't safely implement it in a program where the top-level has type IO a. http://www.haskell.org/pipermail/haskell-cafe/2004-November/007664.html Ganesh

On 2008 Aug 27, at 12:12, Jonathan Cast wrote:
* I wonder why that name was chosen? The design doesn't seem to have anything to do with IO, it's more of a `we have this in C so we want it in Haskell too' monad.
As I understand it, "IO" means "anything not encompassed by equationally-reasoned visible program state". This includes randomness (the IO-based aspect of which requires process or OS state). This also encompasses mutable state (e.g. IORefs), since mutability doesn't fit with equational reasoning. So the name is perhaps poorly chosen, because it only encompasses the most common visible application. (And IORefs particularly so, since they're only so named by analogy with STRefs.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Jonathan Cast wrote:
On Wed, 2008-08-27 at 11:53 +0100, Adrian Hey wrote:
..and of course there's stdin, stdout. That takes some explaining.
Not really. If you don't have buffered IO, then you just say
stdin = 0
stdout = 1
stderr = 2
nonStdout = 42? I'm afraid I have no idea what your point is :-( I tried it anyway and it doesn't seem to work, but that ain't so surprising as Handles aren't Nums.

What needs explaining IMO is that we appear to have "global" Handles exported at the top level from System.IO, but no way for users to write their own modules that do the same for nonStdout, or even to implement getNonStdout. I think that's pretty weird and inconsistent. But perhaps you could show me how to do it with some real Haskell :-)
If you need buffered IO, you just change your IO monad* to look like:
newtype NewIO alpha = NewIO (ReaderT (Map Fd Buffer) OldIO alpha)
Of course, if you do this, you can't go mixing IO with unique values with RNG with mutable state with everything else under the sun anymore. You might actually have to declare exactly what effects you need when you give your function's type, now. Clearly, a horror we must avoid at all costs.
Indeed. If anyone thinks that's the way to go maybe Clean would be of some interest. IMHO Clean's treatment of IO and concurrency is just about the worst thing in an otherwise pretty decent language :-(

Regards
-- Adrian Hey

On Wed, 2008-08-27 at 02:35 -0700, John Meacham wrote: [cut]
However, note the weasel words. Those are in there on purpose, every design calls for different solutions. To blanketly say certain constructs are just wrong to the point of disallowing them in the language, especially when they are common practice at the moment, just doesn't seem right.
How can a Haskell user say this? And this is indeed exactly what capability languages do. In fact, this is what almost every language does (for one thing in common practice or another.)

On Wed, Aug 27, 2008 at 12:17:46PM -0500, Derek Elkins wrote:
On Wed, 2008-08-27 at 02:35 -0700, John Meacham wrote:
However, note the weasel words. Those are in there on purpose, every design calls for different solutions. To blanketly say certain constructs are just wrong to the point of disallowing them in the language, especially when they are common practice at the moment, just doesn't seem right.
How can a Haskell user say this? And this is indeed exactly what capability languages do. In fact, this is what almost every language does (for one thing in common practice or another.)
But as a system designer I *need* things like peek/poke/ACIO etc. I am the one implementing 'Random' or 'Data.Unique'. If I have to resort to C code to do such things, that makes Haskell unsuitable for a wide variety of systems programming tasks (including implementing the Haskell 98 libraries). I know it is certainly possible to do a lot of things in a capability-based system, but you don't always want or have the ability to use such a system.

I am not sure why it was thought NetBSD didn't use global state. A simple grep of the kernel headers for '^extern' turns up hundreds of them. Even pure capability-based systems such as EROS need it for their implementation. What such systems strive for is no or reduced state in their interface, which is a very different thing and also something that I strive for in Haskell code (and is fairly easy to achieve, actually).

John
-- John Meacham - ⑆repetae.net⑆john⑈

I didn't say NetBSD doesn't use global variables, I said the device
driver model doesn't use global variables.
And quite a few things that used to be in global variables have been
moved into allocated variables and are being passed around instead.
That's simply a better way to structure the code.
I don't think global variables should be banned, I just think
they should be severely discouraged.

On Thu, Aug 28, 2008 at 12:15:10AM +0100, Lennart Augustsson wrote:
I didn't say NetBSD doesn't use global variables, I said the device driver model doesn't use global variables. And quite a few things that used to be in global variables have been moved into allocated variables and are being passed around instead. That's simply a better way to structure the code.
Indeed. I have experimented with single address space operating systems where it is pretty much the only way to do things at the user level. But I still want to be able to implement my kernel in haskell. :)
I don't think global variables should be banned, I just think they should be severely discouraged.
Oh, I certainly agree with that, especially among new programmers. I think ACIO is a particularly elegant way to provide them in Haskell for when they are needed. Every time I can avoid resorting to C code for a task without sacrificing performance, aesthetics, or correctness, it is a good day.

John
-- John Meacham - ⑆repetae.net⑆john⑈

I'm certain you can write a kernel in Haskell where the only global
variables are those that hardware interfacing forces you to use.

Hello Lennart, Thursday, August 28, 2008, 12:00:41 PM, you wrote:
I'm certain you can write a kernel in Haskell where the only use of global variables is those that hardware interfacing forces you to use.
moreover, you can write it on a Turing machine - the question is just how comfortable it will be :)

having the experience of writing a medium-size "real-world" app, i have to say that global vars make life much easier - it's a pain to pass all the refs across all the function boundaries. OTOH they are a headache - the only parts of a program that can't be instantiated many times in concurrent threads are those using global vars.

so, for me global vars are a must - sometimes you just don't have the time budget to make an aesthetic design, and anyway adding a lot of vars to every function you write doesn't look very beautiful. OTOH it's better to limit their usage, but this decision should be made by the user, not enforced by the language. in practice, ghc already provides a way to make global vars; we are only talking about making this feature simpler and more standard.

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Lennart Augustsson wrote:
I don't think global variables should be banned, I just think they should be severely discouraged.
If you're saying a language should not provide a sound way to do this (as I believe you are), then AFAICT for all practical purposes you *are* saying you think global variables should be banned. Where are we going to be if the unsafePerformIO hack ever becomes *really* unsafe? and..
I'm certain you can write a kernel in Haskell where the only global variables are those that hardware interfacing forces you to use.
But what you haven't explained is why this is even desirable? I don't doubt it's true in an academic sense if you don't mind sacrificing safety and modularity. Why wasn't this done in the (presumably) much simpler case of the Haskell base libs? No hardware constraints there.

There are plenty of situations where it makes no semantic sense to allow 2 or more of some "thing". A list of all active processes for example. Why would I ever want 2 or more lists of all active processes? I think I'd just be setting myself up for trouble and heartache by even allowing such a possibility.

Now I could get the safety I need by wrapping all this stuff up in my own custom augmented IO monad right at the start of main. But this solution still lacks modularity. The top level <- bindings are just a modular and extensible way to achieve the same thing AFAICS (augmenting "real world" state with my own custom state).

Regards
-- Adrian Hey

Adrian Hey wrote:
There are plenty of situations where it makes no semantic sense to allow 2 or more of some "thing". A list of all active processes for example.
"all" referring to what scope? perhaps there occurs a situation with several process (thread) pools, severals cores etc. See also "singleton considered harmful", there are similar arguments: http://www.oreillynet.com/cs/user/view/cs_msg/23417 and also Section 13.3 "Global Data" in McConnell: Code Complete (2nd ed.) has a nice discussion. J.W.

Johannes Waldmann wrote:
Adrian Hey wrote:
There are plenty of situations where it makes no semantic sense to allow 2 or more of some "thing". A list of all active processes for example.
"all" referring to what scope? perhaps there occurs a situation with several process (thread) pools, severals cores etc.
Seeing as we're talking about an OS kernel I guess the scope would be all processes active on the (possibly virtual) machine being managed by the OS. But it really doesn't matter what the scope is. "All" is the key word here.
See also "singleton considered harmful", there are similar arguments: http://www.oreillynet.com/cs/user/view/cs_msg/23417
Following the arguments made against the singleton pattern over the years leads me to conclude there are 2 distinct camps.

Applications programmers, who consider it bad because it's a way of making "global variables", and we all know how bad they are, right? Typically these folk appear to have no clue about how the underlying IO library, "framework" and OS infrastructure they are dependent on *actually works*.

System programmers, who recognise the need for singletons but regard being forced to use the singleton pattern hack as a language design defect.

The situation seems similar with us. The unsafePerformIO hack is just terrible (especially for a language like Haskell), but why is it being used so often? Is it incompetence of library writers or a language design defect?

Regards
-- Adrian Hey

On Thu, 2008-08-28 at 10:00 +0100, Adrian Hey wrote:
Lennart Augustsson wrote:
I don't think global variables should be banned, I just think they should be severely discouraged.
If you're saying a language should not provide a sound way to do this (as I believe you are), then AFAICT for all practical purposes you *are* saying you think global variables should be banned.
Where are we going to be if the unsafePerformIO hack ever becomes *really* unsafe?
and..
I'm certain you can write a kernel in Haskell where the only global variables are those that hardware interfacing forces you to use.
But what you haven't explained is why this is even desirable? I don't doubt it's true in an academic sense if you don't mind sacrificing safety
What `safety' is being sacrificed?
and modularity.
What modularity? jcc

Jonathan Cast wrote:
What `safety' is being sacrificed?
What modularity?
As I've pointed out several times already you can find simple examples in the standard Haskell libs. So far nobody has accepted my challenge to re-implement any of these "competently" (i.e. avoiding the use of global variables).

Why don't you try it with Data.Unique and find out :-)

Regards
-- Adrian Hey

On Thursday 28 August 2008 12:26:27 pm Adrian Hey wrote:
As I've pointed out several times already you can find simple examples in the standard Haskell libs. So far nobody has accepted my challenge to re-implement any of these "competently" (i.e. avoiding the use of global variables).
Why don't you try it with Data.Unique and find out :-)
Here's a first pass:

-- snip --

{-# LANGUAGE Rank2Types, GeneralizedNewtypeDeriving #-}

module Unique where

import Control.Monad.Reader
import Control.Monad.Trans
import Control.Concurrent.MVar

-- Give Uniques a phantom region parameter, so that you can't accidentally
-- compare Uniques from two different uniqueness sources.
newtype Unique r = Unique Integer deriving Eq

newtype U r a = U { unU :: ReaderT (MVar Integer) IO a }
    deriving (Functor, Monad, MonadIO)

-- Higher rank type for region consistency
runU :: (forall r. U r a) -> IO a
runU m = newMVar 0 >>= runReaderT (unU m)

newUnique :: U r (Unique r)
newUnique = U (do source <- ask
                  val <- lift $ takeMVar source
                  let next = val + 1
                  lift $ putMVar source next
                  return $ Unique next)

-- hashUnique omitted

-- snip --

It's possible that multiple unique sources can exist in a program with this implementation, but because of the region parameter, the fact that a Unique may not be "globally" unique shouldn't be a problem. If your whole program needs arbitrary access to unique values, then I suppose something like:

main = runU realMain

realMain :: U r ()
realMain = ...

is in order.

Insert standard complaints about this implementation requiring liftIO all over the place if you actually want to do other I/O stuff inside the U monad. You could also make a version that extracts to STM, or even a pure version if you don't need unique values across multiple threads.

-- Dan
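For illustration, a small usage example of the code above (note that newer compilers would also want Applicative in the deriving list):

demo :: IO Bool
demo = runU (do a <- newUnique
                b <- newUnique
                return (a == b))   -- False: both come from the same supply

-- A Unique can't escape one runU call and be compared against a Unique
-- from another: the phantom r parameters won't unify, so it won't typecheck.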

On Thu, Aug 28, 2008 at 01:17:29PM -0400, Dan Doel wrote:
Here's a first pass:
-- snip --
{-# LANGUAGE Rank2Types, GeneralizedNewtypeDeriving #-}
module Unique where
If you want this to actually provide any guarantees, of course, you'll have to provide an export list. David

On Thursday 28 August 2008 2:28:35 pm David Roundy wrote:
If you want this to actually provide any guarantees, of course, you'll have to provide an export list.
Yes, quite right. I didn't spend a lot of time on it. I believe U and unU would need to be hidden to prevent people from doing bad things.

Another problem I thought of after the fact is that if you need to extend the IO monad in any other similar way, you're out of luck. However, I think you can modify things to something like:

newtype UT r m a = UT { unUT :: ReaderT (MVar Integer) m a }
  ...

runUT :: MonadIO m => (forall r. UT r m a) -> m a
runUT m = liftIO (newMVar 0) >>= runReaderT (unUT m)
  ...

Or if you want to get really fancy, maybe you could invent an entire new sectioned, composable IO monad like Datatypes a la Carte. But that's a fair amount more work.

Cheers.
-- Dan
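Filling in the ellipses, one way that sketch might be completed (still only illustrative; as before, UT and unUT would be hidden behind the export list):

{-# LANGUAGE GeneralizedNewtypeDeriving, RankNTypes #-}
module UniqueT (Unique, UT, runUT, newUnique) where

import Control.Concurrent.MVar
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Reader

newtype Unique r = Unique Integer deriving Eq

-- The supply lives in an environment threaded by ReaderT, so the
-- transformer can sit on top of any MonadIO stack rather than only IO.
newtype UT r m a = UT { unUT :: ReaderT (MVar Integer) m a }
  deriving (Functor, Applicative, Monad, MonadIO)

runUT :: MonadIO m => (forall r. UT r m a) -> m a
runUT m = liftIO (newMVar 0) >>= runReaderT (unUT m)

newUnique :: MonadIO m => UT r m (Unique r)
newUnique = UT $ do
  source <- ask
  liftIO $ modifyMVar source $ \n ->
    let n' = n + 1 in return (n', Unique n')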

Dan Doel wrote:
Insert standard complaints about this implementation requiring liftIO all over the place if you actually want to do other I/O stuff inside the U monad.
Well that wouldn't be my main complaint :-)

Thanks for taking the time to do this Dan. I think the safety requirement has been met, but I think it fails on the improved API. The main complaint would be what I see as loss of modularity, in that somehow what should be a small irrelevant detail of the implementation of some obscure module somewhere has propagated its way all the way up to main. This is something it seems to have in common with all other attempts I've seen to solve the "global variable" problem without actually using a.. you know what :-) It doesn't matter whether it's explicit state handle args, withWhateverDo wrappers, novel monads or what. They all have this effect.

To me this seems completely at odds with what I thought was generally accepted wisdom of how to write good maintainable, modular software. Namely hiding as much implementation detail as possible and keeping APIs as simple and stable as they can be. I don't know if I'm alone in that view nowadays.

I'm also not sure I understand why so many people seem to feel that stateful effects must be "accounted for" somehow in the args and/or types of the effecting function. Like if I had

getThing :: IO Thing

as an FFI binding, nobody would give it a moment's thought. They'd see from its type that it had some mysterious world state dependent/effecting behaviour, but would be quite happy to just accept that they didn't really need to worry about all that magic... instead they'd accept that it "just works". Why then, if I want to implement precisely the same thing in Haskell (using a "global variable"), does it suddenly become so important for this stateful magic to be accounted for? Like the presence of that "global variable" must be made so very painfully apparent in main (and everywhere else on the dependency path too I guess). In short, I just don't get it :-)

Purists aren't going to like it, but I think folk *will* be using "real" global variables in I/O libs for the foreseeable future. Seems a shame that they'll have to do this with the unsafePerformIO hack though :-(

Regards
-- Adrian Hey

On Sun, 31 Aug 2008, Adrian Hey wrote:
Thanks for taking the time to do this Dan. I think the safety requirement has been met, but I think it fails on the improved API. The main complaint would be what I see as loss of modularity, in that somehow what should be a small irrelevant detail of the implementation of some obscure module somewhere has propagated its way all the way up to main.
That's the key point, as I see it - they aren't "irrelevant details of the implementation", they are requirements the implementation places on its context in order for that implementation to be correct. So they should be communicated appropriately.
To me this seems completely at odds with what I thought was generally accepted wisdom of how to write good maintainable, modular software. Namely hiding as much implementation detail as possible and keeping APIs as simple and stable as they can be. I don't know if I'm alone in that view nowadays.
It's no problem to hide implementation detail, but I don't think you should hide the *requirement* of the implementation that it has constraints on how it is called, namely that it requires once-only initialisation or whatever.
Purists aren't going to like it, but I think folk *will* be using "real" global variables in I/O libs for the foreseeable future. Seems a shame that they'll have to do this with the unsafePerformIO hack though :-(
From a "purist" point of view, it's a shame that they choose to do it at all :-)
Ganesh

I don't think anyone has claimed that any interface can be implemented
without globals.
Of course some can't (just pick an interface that is the specification
of a global variable).
What I (and others) claim is that such interfaces are bad. Using a
global variable makes an assumption that there's only ever going to be
one of something, and that's just an inflexible assumption to make.
You think global variables are essential, I think they are a sign of
bad design. So we have different opinions and neither one of us is
going to convince the other.
I think a lot of things related to the IO monad in Haskell are bad
design, influenced by imperative thinking.
For instance, I think the main function should have a type like
main :: (IOMonad io) => io a
where IOMonad contains some basic functionality like calling C.
Then you could do things like implement runInSandboxIO, which traces all C calls.
-- Lennart
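A rough illustration of that suggestion (the class, its primitives, and all names here are invented; a real IOMonad would presumably expose foreign calls rather than putStrLn):

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module SandboxIO where

-- An overloaded "main" monad: programs are written against the class,
-- and the interpretation of the primitives is chosen at the top level.
class Monad io => IOMonad io where
  putLine  :: String -> io ()
  getALine :: io String

instance IOMonad IO where
  putLine  = putStrLn
  getALine = getLine

-- A tracing interpreter: every primitive logs itself before running.
newtype SandboxIO a = SandboxIO { runInSandboxIO :: IO a }
  deriving (Functor, Applicative, Monad)

instance IOMonad SandboxIO where
  putLine s = SandboxIO (putStrLn ("[trace] putLine " ++ show s) >> putStrLn s)
  getALine  = SandboxIO (putStrLn "[trace] getALine" >> getLine)

appMain :: IOMonad io => io ()
appMain = putLine "hello"

-- The same program can be run traced or untraced.
main :: IO ()
main = runInSandboxIO appMain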

Lennart Augustsson wrote:
I don't think anyone has claimed that any interface can be implemented without globals. Of course some can't (just pick an interface that is the specification of a global variable).
I said in the original challenge even I'd let you (anyone) change the interface if you could provide a sensible explanation of why the new interface was better, safer, more convenient or whatever.
What I (and others) claim is that such interfaces are bad. Using a global variable makes an assumption that there's only ever going to be one of something,
It's not an assumption, any more than I always want 1*N to yield N is an assumption. It's a fundamental property I absolutely want to guarantee. By far the simplest way to do this is simply not to expose a newWhatever constructor in my API. If I expose anything it should be Whatever itself or getWhatever, neither of which is possible if Whatever contains MVars, Chans and the like. What's more, there seems to be no good *semantic* reason why this should not be possible. The only objections seem dogmatic to me.
and that's just an inflexible assumption to make.
You think global variables are essential, I think they are a sign of bad design. So we have different opinions and neither one of us is going to convince the other.
You might stand some chance of convincing me by showing a better design :-) Dan seems to have had a reasonable go at one of them. I'm not sure it passes the improved interface test, but I'll think about it. But there are quite a few left. There's the Hughes paper too of course, using implicit parameters (a highly dubious language feature IMO).

But even if someone does produce an entirely unsafePerformIO-hack-free set of standard libs, I have to ask why jump through all these hoops? There's no semantic difficulty with the proposed language extension, and it should be very simple to implement (John seems to have done it already).

Regards
-- Adrian Hey

Ganesh Sittampalam wrote:
On Thu, 28 Aug 2008, Adrian Hey wrote:
implicit parameters (a highly dubious language feature IMO).
How can you say that with a straight face at the same time as advocating global variables? :-)
Quite easily, what's the problem? IORefs, Chans etc are perfectly ordinary values. Why should they not exist at the top level? The "global variable" lives in "the world", not the IORef. The IORef is just a reference, no different from filepaths in principle (and if having them at the top level is also evil then making this so easy and not screaming about it seems a little inconsistent to me). Regards -- Adrian Hey

On Thu, 2008-08-28 at 20:28 +0100, Adrian Hey wrote:
Lennart Augustsson wrote:
I don't think anyone has claimed that any interface can be implemented without globals. Of course some can't (just pick an interface that is the specification of a global variable).
I said in the original challenge even I'd let you (anyone) change the interface if you could provide a sensible explanation of why the new interface was better, safer, more convenient or whatever.
What I (and others) claim is that such interfaces are bad. Using a global variable makes an assumption that there's only ever going to be one of something,
It's not an assumption, any more than I always want 1*N to yield N is an assumption.
It's a fundamental property I absolutely want to guarantee. By far the simplest way to do this is simply not to expose a newWhatever constructor in my API. If I expose anything it should be Whatever itself or getWhatever, neither of which is possible if Whatever contains MVars, Chans and the like.
This has been answered repeatedly, at least implicitly. Unless you insist that getWhatever should live in the IO monad and have no functional arguments (why?), there is no reason why this should be impossible.
What's more, there seems to be no good *semantic* reason why this should not be possible. The only objections seem dogmatic to me.
I haven't seen you give a non-dogmatic reason for wanting global variables yet, either.
and that's just an inflexible assumption to make.
You think global variables are essential, I think they are a sign of bad design. So we have different opinions and neither one of us is going to convince the other.
You might stand some chance of convincing me by showing a better design :-)
Dan seems to have had a reasonable go at 1 of them. I'm not sure passes the improved interface test but I'll think about it. But there are quite a few left.
There are even more implemented in languages such as ML, Lisp, Perl, etc. I think habit and the fact that globals sort of work in Haskell are the major drivers of their use in the existing standard library. See my position as an attempt to drive back those battalions of darkness :)
There's the Hughes paper too of course, using implicit parameters (a highly dubious language feature IMO).
But even if someone does produce an entirely unsafePerformIO hack free set of standard libs, I have to ask why jump through all these hoops?
To improve the APIs available? You're advocating an extension to a *purely functional programming language*. That's an awfully weird hoop you've already jumped through, there. Some of us think it logically extends to condemning global variables; I haven't seen you give a reason for disagreeing.
There's no semantic difficulty with the proposed language extension,
Although I've noticed it's grossly under-powered compared to what's needed to implement stdin the way you want to. jcc

Jonathan Cast wrote:
This has been answered repeatedly, at least implicitly. Unless you insist that getWhatever should live in the IO monad and have no functional arguments (why?), there is no reason why this should be impossible.
What's more, there seems to be no good *semantic* reason why this should not be possible. The only objections seem dogmatic to me.
I haven't seen you give a non-dogmatic reason for wanting global variables yet, either.
You consider real examples from real *standard* libs that we're all using (and presumably not written by clueless hackers such as myself) to be dogmatic? I would call that pragmatic myself. These are the standard libs after all. Shouldn't we expect them to be the perfect examples of how to do things right?
But even if someone does produce an entirely unsafePerformIO hack free set of standard libs, I have to ask why jump through all these hoops?
To improve the APIs available?
There's nothing wrong with the APIs as they are as far as I am concerned. It's their implementation that's the problem.
You're advocating an extension to a *purely functional programming language*.
So? What's being proposed doesn't compromise referential transparency (at least no more that the IO monad already does, as some might argue).
There's no semantic difficulty with the proposed language extension,
Although I've noticed it's grossly under-powered compared to what's needed to implement stdin the way you want to.
Can't recall expressing any opinion about how stdin should be implemented so I don't know what you're on about here. Regards -- Adrian Hey
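For concreteness, the "unsafePerformIO hack" referred to throughout this thread usually looks something like the following minimal sketch (the module and names here are illustrative, not taken from any particular library):

module Counter (nextValue) where

import Data.IORef (IORef, newIORef, atomicModifyIORef)
import System.IO.Unsafe (unsafePerformIO)

-- The NOINLINE pragma (and, with GHC, -fno-cse) is what keeps the compiler
-- from duplicating or sharing this binding and thereby creating more than
-- one IORef.
{-# NOINLINE counterRef #-}
counterRef :: IORef Integer
counterRef = unsafePerformIO (newIORef 0)

nextValue :: IO Integer
nextValue = atomicModifyIORef counterRef (\n -> (n + 1, n))

The point of contention is not whether code like this works in practice, but that nothing in the language itself guarantees that it does.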

On Thu, 2008-08-28 at 22:24 +0100, Adrian Hey wrote:
Jonathan Cast wrote:
This has been answered repeatedly, at least implicitly. Unless you insist that getWhatever should live in the IO monad and have no functional arguments (why?), there is no reason why this should be impossible.
What's more, there seems to be no good *semantic* reason why this should not be possible. The only objections seem dogmatic to me.
I haven't seen you give a non-dogmatic reason for wanting global variables yet, either.
You consider real examples from real *standard* libs that we're all using (and presumably not written by clueless hackers such as myself) to be dogmatic?
Yeah. Same as if the examples were APIs from ML, or Lisp. The neat thing about Haskell is *precisely* that the ML I/O system has an API that is illegal in Haskell. I see no reason, in principle, why the Haskell standard libraries shouldn't contain APIs that should be illegal in new-and-improved Future Haskell.
I would call that pragmatic myself. These are the standard libs after all. Shouldn't we expect them to be the perfect examples of how to do things rite?
But even if someone does produce an entirely unsafePerformIO hack free set of standard libs, I have to ask why jump through all these hoops?
To improve the APIs available?
There's nothing wrong with the APIs as they are as far as I am concerned.
Right. That's exactly what we're arguing about. We maintain they are inferior. You haven't really given any defense of them at all, other than their existence. I consider that a rather weak argument.
It's their implemenation that's the problem.
You're advocating an extension to a *purely functional programming language*.
So? What's being proposed doesn't compromise referential transparency (at least no more that the IO monad already does, as some might argue).
What's a referential transparency? I just want a language completely specified by its denotational semantics, in the obvious fashion (e.g., (->) maps to an exponential in a real category). If IO compromises that (heck, who am I kidding, *since* IO compromises that), I'm arguing we get rid of whatever features it has that are questionable.
There's no semantic difficulty with the proposed language extension,
Although I've noticed it's grossly under-powered compared to what's needed to implement stdin the way you want to.
Can't recall expressing any opinion about how stdin should be implemented so I don't know what your on about here.
Well, you didn't like *my* implementation. You seem to be rather keen on the current implementation GHC happens to use, where stdin contains internally a mutable buffer. You also seem to be rather insistent that, whatever mechanism GHC uses to get that mutable buffer, be useable to create *any other top-level handle* I want. Now, I happen to know that the only top-level handles that can be established without issuing an open system call are stdin stdout stderr (unless you're happy to have your global nonStdErr start its life attached to an unopened FD). I really don't see what your point is, unless you want to be able to `call open at the top level'. On Thu, 2008-08-28 at 22:01 +0100, Adrian Hey wrote:
Ganesh Sittampalam wrote:
On Thu, 28 Aug 2008, Adrian Hey wrote:
implicit parameters (a highly dubious language feature IMO).
How can you say that with a straight face at the same time as advocating global variables? :-)
Quite easily, what's the problem? IORefs, Chans etc are perfectly ordinary values. Why should they not exist at the top level?
They aren't the denotations of any closed Haskell expressions. Scarcely `perfectly ordinary'.
The "global variable" lives in "the world", not the IORef. The IORef is just a reference, no different from filepaths in principle
You didn't like my implementation of stdout along those lines, though...
(and if having them at the top level is also evil then making this so easy and not screaming about it seems a little inconsistent to me).
Good point. We don't just pretend that filepaths make sense in themselves, independently of context. (Or would you like to tell me what the contents of ~/src/globalscript-0.0.1/Language/GlobalScript/Syntax.lhs are?) In fact, this mailing list is dedicated to a language that has the radical idea that you should have to use a whole different type (built using this scary category-theoretical concept called a `monad') just to associate filepaths with file contents. jcc

On Thu, 2008-08-28 at 14:45 -0700, Jonathan Cast wrote:
On Thu, 2008-08-28 at 22:24 +0100, Adrian Hey wrote:
Jonathan Cast wrote:
This has been answered repeatedly, at least implicitly. Unless you insist that getWhatever should live in the IO monad and have no functional arguments (why?), there is no reason why this should be impossible.
What's more, there seems to be no good *semantic* reason why this should not be possible. The only objections seem dogmatic to me.
I haven't seen you give a non-dogmatic reason for wanting global variables yet, either.
You consider real examples from real *standard* libs that we're all using (and presumably not written by clueless hackers such as myself) to be dogmatic?
Yeah. Same as if the examples were APIs from ML, or Lisp. The neat thing about Haskell is *precisely* that the ML I/O system has an API that is illegal in Haskell. I see no reason, in principle, why the Haskell standard libraries shouldn't contain APIs that should be illegal in new-and-improved Future Haskell.
I would call that pragmatic myself. These are the standard libs after all. Shouldn't we expect them to be the perfect examples of how to do things rite?
But even if someone does produce an entirely unsafePerformIO hack free set of standard libs, I have to ask why jump through all these hoops?
To improve the APIs available?
There's nothing wrong with the APIs as they are as far as I am concerned.
Right. That's exactly what we're arguing about. We maintain they are inferior. You haven't really given any defense of them at all, other than their existence. I consider that a rather weak argument.
It's their implemenation that's the problem.
You're advocating an extension to a *purely functional programming language*.
So? What's being proposed doesn't compromise referential transparency (at least no more that the IO monad already does, as some might argue).
What's a referential transparency? I just want a language completely specified by its denotational semantics, in the obvious fashion (e.g., (->) maps to an exponential in a real category).
If IO compromises that (heck, who am I kidding, *since* IO compromises that), I'm arguing we get rid of whatever features it has that are questionable.
There's no semantic difficulty with the proposed language extension,
Although I've noticed it's grossly under-powered compared to what's needed to implement stdin the way you want to.
Can't recall expressing any opinion about how stdin should be implemented so I don't know what your on about here.
Well, you didn't like *my* implementation. You seem to be rather keen on the current implementation GHC happens to use, where stdin contains internally a mutable buffer. You also seem to be rather insistent that, whatever mechanism GHC uses to get that mutable buffer, be useable to create *any other top-level handle* I want. Now, I happen to know that the only top-level handles that can be established without issuing an open system call are
stdin stdout stderr
(unless you're happy to have your global nonStdErr start its life attached to an unopened FD). I really don't see what your point is, unless you want to be able to `call open at the top level'.
On Thu, 2008-08-28 at 22:01 +0100, Adrian Hey wrote:
Ganesh Sittampalam wrote:
On Thu, 28 Aug 2008, Adrian Hey wrote:
implicit parameters (a highly dubious language feature IMO).
How can you say that with a straight face at the same time as advocating global variables? :-)
Quite easily, what's the problem? IORefs, Chans etc are perfectly ordinary values. Why should they not exist at the top level?
They aren't the denotations of any closed Haskell expressions. Scarcely `perfectly ordinary'.
The "global variable" lives in "the world", not the IORef. The IORef is just a reference, no different from filepaths in principle
You didn't like my implementation of stdout along those lines, though...
(and if having them at the top level is also evil then making this so easy and not screaming about it seems a little inconsistent to me).
Good point. We don't just pretend that filepaths make sense in themselves, independently of context. (Or would you like to tell me what the contents of ~/src/globalscript-0.0.1/Language/GlobalScript/Syntax.lhs are?) In fact, this mailing list is dedicated to a language that has the radical idea that you should have to use a whole different type (built using this scary category-theoretical concept called a `monad') just to associate filepaths with file contents.
All true. But nevertheless reading it over I think I need to step away from this and cool down a little. jcc

On 2008-08-28 14:45 -0700 (Thu), Jonathan Cast wrote:
Now, I happen to know that the only top-level handles that can be established without issuing an open system call are
stdin stdout stderr
(unless you're happy to have your global nonStdErr start its life attached to an unopened FD).
I've not thought through exactly how this might relate to your argument,
but certainly, though there might or might not be Haskell Handles for
other file descriptors, they can start out open without calling open.
Compile this simple program:
#import
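A minimal Haskell sketch of the same point: a file descriptor other than 0, 1 or 2 can be inherited already open and wrapped in a Handle without any call to open. The choice of FD 3, and the assumption that the parent process actually passed it open (e.g. via a shell redirection such as 3<somefile), are illustrative only:

import System.IO (hGetLine, hPutStrLn, stderr)
import System.Posix.IO (fdToHandle)
import System.Posix.Types (Fd(..))

main :: IO ()
main = do
  -- FD 3 is assumed to have been left open for us by the parent process.
  h <- fdToHandle (Fd 3)
  line <- hGetLine h
  hPutStrLn stderr line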

On Thu, 28 Aug 2008, Adrian Hey wrote:
There's no semantic difficulty with the proposed language extension,
How does it behave in the presence of dynamic loading? What about remote procedure calls? Also what if I want a thread-local variable? It seems like an extension like this should also support that, and perhaps other scopes as Duncan suggested; why is the process scope special? Ganesh

Ganesh Sittampalam wrote:
On Thu, 28 Aug 2008, Adrian Hey wrote:
There's no semantic difficulty with the proposed language extension,
How does it behave in the presence of dynamic loading?
To answer this you need to be precise about the semantics of what is being dynamically loaded. But this is too complex an issue for me to get into right now. Actually as far as things like hs-plugins are concerned, I'd always meant to figure out one day what exactly a "plugin" is, semantically. But as I've never had cause to use them, I never got round to it. Like is it a value, or does it have state and identity or what?
What about remote procedure calls?
Dunno, what problem do you anticipate?
Also what if I want a thread-local variable?
Well actually I would say that threads are a bad concurrency model so I'm not keen on thread local state at all. Mainly because I'd like to get rid of threads, but also a few other doubts even if we keep threads. Yes, I'm no big fan of the IO monad (or any other monad in fact) and IORefs and all that (all smacks of putting a purely functional veneer on good ol' fashioned procedural programming to me). But we are where we are and this isn't going to change any time soon. Just trying to fix what seem like obvious problems with Haskell's current IO without doing anything too radical and unproven. (I.E. Just making existing practice *safe*, at least in the sense that the compiler ain't gonna fcuk it up with INLINING or CSE and everyone understands what is and isn't safe in ACIO) Regards -- Adrian Hey
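One rough way to picture the ACIO restriction being appealed to here is as a wrapper around IO that only exposes allocation-like operations: actions that commute with all other IO and can be dropped if their results are unused. The following is a hedged sketch only; the type name and the set of operations are chosen for illustration rather than taken from any concrete proposal text or library:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module ACIO (ACIO, runACIO, acioNewIORef, acioNewEmptyMVar, acioNewChan) where

import Data.IORef (IORef, newIORef)
import Control.Concurrent.MVar (MVar, newEmptyMVar)
import Control.Concurrent.Chan (Chan, newChan)

-- Allocating a fresh mutable cell commutes with everything else and is
-- harmless to discard; reading, writing and real-world IO are simply not
-- provided.
newtype ACIO a = ACIO { runACIO :: IO a }
  deriving (Functor, Applicative, Monad)

acioNewIORef :: a -> ACIO (IORef a)
acioNewIORef x = ACIO (newIORef x)

acioNewEmptyMVar :: ACIO (MVar a)
acioNewEmptyMVar = ACIO newEmptyMVar

acioNewChan :: ACIO (Chan a)
acioNewChan = ACIO newChan

Under the proposal, only actions expressible in such a restricted type would be allowed on the right-hand side of a top level <- binding.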

On Thu, 28 Aug 2008, Adrian Hey wrote:
Ganesh Sittampalam wrote:
On Thu, 28 Aug 2008, Adrian Hey wrote:
There's no semantic difficulty with the proposed language extension,
How does it behave in the presence of dynamic loading?
To answer this you need to be precise about the semantics of what is being dynamically loaded. But this is too complex an issue for me to get in to right now.
If you want to standardise a language feature, you have to explain its behaviour properly. This is one part of the necessary explanation. To be concrete about scenarios I was considering, what happens if:
- the same process loads two copies of the GHC RTS as part of two completely independent libraries? For added complications, imagine that one of the libraries uses a different implementation instead (e.g. Hugs)
- one Haskell program loads several different plugins in a way that allows Haskell values to pass across the plugin boundary
How do these scenarios work with use cases for <- like (a) Data.Unique and (b) preventing multiple instantiation of a sub-library?
Actually as far as things like hs-plugins are concerned, I'd always meant to figure out one day what exactly a "plugin" is, semantically. But as I've never had cause to use them, I never got round to it. Like is it a value, or does it have state and identity or what?
Personally I think of them as values. I'm not sure what your questions about state and identity mean. If you don't have global variables, then state doesn't matter.
What about remote procedure calls?
Dunno, what problem do you anticipate?
Will Data.Unique still work properly if a value is sent across a RPC interface?
Also what if I want a thread-local variable?
Well actually I would say that threads are bad concurrency model so I'm not keen on thread local state at all. Mainly because I'd like to get rid of threads, but also a few other doubts even if we keep threads.
Even if you don't like them, people still use them.
(I.E. Just making existing practice *safe*, at least in the sense that the compiler ain't gonna fcuk it up with INLINING or CSE and every one understands what is and isn't safe in ACIO)
Creating new language features means defining their semantics rather more clearly than just "no inlining or cse", IMO. Ganesh

C++ faced this very issue by saying that with global data, uniqueness of initialization is guaranteed but order of evaluation is not. Assuming that the global data are merely thunk wrappers over some common data source, this means that at minimum, there can be no data dependencies between plugins where the order of evaluation matters. Another C++ comparison is with a virtual base class, where A::B::D and A::C::D are supposed to be equal, irrespective of whether it was B or C that first called the constructor. In this case, some witness (a vtable) is necessary to ensure that this happens correctly. Dan Ganesh Sittampalam wrote:
On Thu, 28 Aug 2008, Adrian Hey wrote:
Ganesh Sittampalam wrote:
On Thu, 28 Aug 2008, Adrian Hey wrote:
There's no semantic difficulty with the proposed language extension,
How does it behave in the presence of dynamic loading?
To answer this you need to be precise about the semantics of what is being dynamically loaded. But this is too complex an issue for me to get in to right now.
If you want to standardise a language feature, you have to explain its behaviour properly. This is one part of the necessary explanation.
To be concrete about scenarios I was considering, what happens if:
- the same process loads two copies of the GHC RTS as part of two completely independent libraries? For added complications, imagine that one of the libraries uses a different implementation instead (e.g. Hugs)
- one Haskell program loads several different plugins in a way that allows Haskell values to pass across the plugin boundary
How do these scenarios work with use cases for <- like (a) Data.Unique and (b) preventing multiple instantiation of a sub-library?
Actually as far as things like hs-plugins are concerned, I'd always meant to figure out one day what exactly a "plugin" is, semantically. But as I've never had cause to use them, I never got round to it. Like is it a value, or does it have state and identity or what?
Personally I think of them as values. I'm not sure what your questions about state and identity mean. If you don't have global variables, then state doesn't matter.
What about remote procedure calls?
Dunno, what problem do you anticipate?
Will Data.Unique still work properly if a value is sent across a RPC interface?
Also what if I want a thread-local variable?
Well actually I would say that threads are bad concurrency model so I'm not keen on thread local state at all. Mainly because I'd like to get rid of threads, but also a few other doubts even if we keep threads.
Even if you don't like them, people still use them.
(I.E. Just making existing practice *safe*, at least in the sense that the compiler ain't gonna fcuk it up with INLINING or CSE and every one understands what is and isn't safe in ACIO)
Creating new language features means defining their semantics rather more clearly than just "no inlining or cse", IMO.
Ganesh

On Fri, Aug 29, 2008 at 4:33 PM, Dan Weston
C++ faced this very issue by saying that with global data, uniqueness of initialization is guaranteed but order of evaluation is not.
In C++ circles, this is referred to as the "static initialization order fiasco", and it is a frequent cause of crashes that are very difficult to debug. http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.12 I think it would be fair to say that C++ pushed this problem off to every user of the language. I haven't seen a coherent description of what the semantics of top-level "<-" should be, but avoidance of widespread swearing would be at the top of my list of requirements.

I actually was more interested in the problems with the "obvious fix" for this, namely the "construct on first use" idiom:

int A(int a) { static int aa = a; return aa; }
int B()      { return A(3); }
int C()      { return A(7); }
int D()      { if (today() == "Tuesday") B(); else C(); return A(0); }

What is the value of D? Notice that this is never a problem with pure functions. The problem is that today() makes this an IO monad, and the swearing starts again.

Dan

Bryan O'Sullivan wrote:
On Fri, Aug 29, 2008 at 4:33 PM, Dan Weston
wrote: C++ faced this very issue by saying that with global data, uniqueness of initialization is guaranteed but order of evaluation is not.
In C++ circles, this is referred to as the "static initialization order fiasco", and it is a frequent cause of crashes that are very difficult to debug.
http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.12
I think it would be fair to say that C++ pushed this problem off to every user of the language. I haven't seen a coherent description of what the semantics of top-level "<-" should be, but avoidance of widespread swearing would be at the top of my list of requirements.

Bryan O'Sullivan wrote:
I haven't seen a coherent description of what the semantics of top-level "<-" should be, but avoidance of widespread swearing would be at the top of my list of requirements.
Don't the ACIO monad properties satisfy you? Anyway, as I pointed out in my last post, if this is a problem with top level <- ACIO monad bindings it's still going to be a problem (probably much worse) with unsafePerformIO hack IO monad bindings. This problem isn't just going to go away, no matter how long it's ignored :-) Regards -- Adrian Hey
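To make the comparison concrete, here is a hedged sketch of a Data.Unique-like module written against the proposed top level <- binding. The binding syntax is the proposal's, not standard Haskell, and this is not the shipped library code; the names simply mirror the Data.Unique API:

module UniqueSketch (Unique, newUnique, hashUnique) where

import Control.Concurrent.MVar (MVar, newMVar, modifyMVar)

-- Deliberately no Show/Read instances: a deserialised value could not be
-- guaranteed unique.
newtype Unique = Unique Integer deriving (Eq, Ord)

uniqueSource :: MVar Integer
uniqueSource <- newMVar 0   -- proposed top level binding syntax; not standard Haskell

newUnique :: IO Unique
newUnique = modifyMVar uniqueSource (\n -> let n' = n + 1 in return (n', Unique n'))

hashUnique :: Unique -> Int
hashUnique (Unique n) = fromInteger n

The unsafePerformIO version of the same module differs only in how uniqueSource is obtained; the observable behaviour is intended to be identical, which is exactly the point made above about the problem not disappearing by being ignored.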

On Fri, Aug 29, 2008 at 04:33:50PM -0700, Dan Weston wrote:
C++ faced this very issue by saying that with global data, uniqueness of initialization is guaranteed but order of evaluation is not. Assuming that the global data are merely thunk wrappers over some common data source, this means that at minimum, there can be no data dependencies between plugins where the order of evaluation matters.
Fortunately, we can do a whole lot better with Haskell: the type system guarantees that order of evaluation is irrelevant :) no need to specify anything about implementations. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham wrote: On Fri, Aug 29, 2008 at 04:33:50PM -0700, Dan Weston wrote:
C++ faced this very issue by saying that with global data, uniqueness of initialization is guaranteed but order of evaluation is not. Assuming that the global data are merely thunk wrappers over some common data source, this means that at minimum, there can be no data dependencies between plugins where the order of evaluation matters.
Fortunately, we can do a whole lot better with haskell, the type system guarentees that order of evaluation is irrelevant :) no need to specify anything about implementations.
Can't you write two recursive modules with <- that depend on each other, so that there's no valid initialisation order? Contrived example follows:

module Module1 where

glob1 :: IORef Int
glob1 <- mod2 >>= newIORef

mod1 :: IO Int
mod1 = readIORef glob1

module Module2 where

glob2 :: IORef Int
glob2 <- mod1 >>= newIORef

mod2 :: IO Int
mod2 = readIORef glob2

It might need some strictness annotations to actually cause non-termination at initialisation rather than just make the results of mod1 and mod2 be _|_. I think those initialisers do satisfy ACIO, though I'm not certain - from the point of view of dataflow, you can certainly remove them both together if the rest of the code doesn't use mod1 or mod2, and likewise they commute with any other IO operations. But on the other hand there's no way to actually put them in an order that doesn't cause non-termination. Cheers, Ganesh

Sittampalam, Ganesh wrote:
Can't you write two recursive modules with <- that depend on each other, so that there's no valid initialisation order?
Contrived example follows:
module Module1 where

glob1 :: IORef Int
glob1 <- mod2 >>= newIORef

mod1 :: IO Int
mod1 = readIORef glob1

module Module2 where

glob2 :: IORef Int
glob2 <- mod1 >>= newIORef

mod2 :: IO Int
mod2 = readIORef glob2
Immediately breaking my promise to shut up... This is illegal because you're only allowed to use ACIO in top level <- bindings, and readIORef isn't (and clearly could not be) ACIO. Regards -- Adrian Hey

Contrived example follows:
module Module1 (mod1) where

import Module2

glob1 :: IORef Int
glob1 <- mod2 >>= newIORef

mod1 :: IO Int
mod1 = readIORef glob1

module Module2 (mod2) where

import Module1

glob2 :: IORef Int
glob2 <- mod1 >>= newIORef

mod2 :: IO Int
mod2 = readIORef glob2
This is illegal because you're only allowed to use ACIO in top level <- bindings and readIORef isn't (and clearly could not be) ACIO.
(made a couple of changes to quoted example; added import statements and explicit export lists)

Even though I never call writeIORef on glob1 or glob2, and can change the example as above so that they aren't exported, making it impossible to ever do so?

As an alternative, consider

module Module1 (mod1) where

import Module2

glob1 :: Int
glob1 <- return $! mod2

mod1 :: Int
mod1 = glob1

module Module2 (mod2) where

import Module1

glob2 :: Int
glob2 <- return $! mod1

mod2 :: Int
mod2 = glob2

Even more artificial, of course. Arguably both of these cases are not ACIO simply because of the non-termination effects, but it's not obvious to me how you tell just by looking at either one's code together with the declared API of the other. Is anything strict automatically forbidden by ACIO?

Cheers, Ganesh

On Tue, Sep 02, 2008 at 10:10:31AM +0100, Sittampalam, Ganesh wrote:
Contrived example follows:
module Module1 (mod1) where

import Module2

glob1 :: IORef Int
glob1 <- mod2 >>= newIORef

mod1 :: IO Int
mod1 = readIORef glob1

module Module2 (mod2) where

import Module1

glob2 :: IORef Int
glob2 <- mod1 >>= newIORef

mod2 :: IO Int
mod2 = readIORef glob2
This is illegal because you're only allowed to use ACIO in top level <- bindings and readIORef isn't (and clearly could not be) ACIO.
(made a couple of changes to quoted example; added import statements and explicit export lists)
Even though I never call writeIORef on glob1 or glob2, and can change the example as above so we don't export them, so it's impossible to ever do so?
As an alternative, consider
module Module1 (mod1) where

import Module2

glob1 :: Int
glob1 <- return $! mod2

mod1 :: Int
mod1 = glob1

module Module2 (mod2) where

import Module1

glob2 :: Int
glob2 <- return $! mod1

mod2 :: Int
mod2 = glob2
Even more artificial, of course.
Arguably both of these cases are not ACIO simply because of the non-termination effects, but it's not obvious to me how you tell just by looking at either one's code together with the declared API of the other. Is anything strict automatically forbidden by ACIO?
Isn't this just a pure infinite loop? Why is it a problem that ACIO allows you the flexibility that's present in any pure code? David

David Roundy wrote:
On Tue, Sep 02, 2008 at 10:10:31AM +0100, Sittampalam, Ganesh wrote:
Arguably both of these cases are not ACIO simply because of the non-termination effects, but it's not obvious to me how you tell just
by looking at either one's code together with the declared API of the
other. Is anything strict automatically forbidden by ACIO?
Isn't this just a pure infinite loop? Why is it a problem that ACIO allows you the flexibility that's present in any pure code?
ACIO promises that you can remove anything unused without changing the behaviour. The same problem doesn't arise in pure code because you can't write top-level strict bindings. The GHC extension to have strict bindings (bang patterns) is explicitly disallowed at top-level: http://www.haskell.org/ghc/docs/latest/html/users_guide/bang-patterns.html Ganesh

Ganesh Sittampalam wrote:
Will Data.Unique still work properly if a value is sent across a RPC interface?
A value of type Unique you mean? This isn't possible. Data.Unique has been designed so it cannot be Shown/Read or otherwise serialised/deserialised (for obvious reasons I guess).
Also what if I want a thread-local variable?
Well actually I would say that threads are bad concurrency model so I'm not keen on thread local state at all. Mainly because I'd like to get rid of threads, but also a few other doubts even if we keep threads.
Even if you don't like them, people still use them.
AFAICS this is irrelevant for the present discussion as Haskell doesn't support thread local variable thingies. If it ever does, being precise about that is someone else's problem. For the time being the scope of IORefs/MVars/Chans is (and should remain) whatever process is described by main (whether or not they appear at top level).
(I.E. Just making existing practice *safe*, at least in the sense that the compiler ain't gonna fcuk it up with INLINING or CSE and every one understands what is and isn't safe in ACIO)
Creating new language features means defining their semantics rather more clearly than just "no inlining or cse", IMO.
I wouldn't even know how to go about that to the satisfaction of purists. But "global variables" *are* being used whether or not the top level <- bindings are implemented. They're in the standard libraries! So if this stuff matters someone had better figure it out :-) Regards -- Adrian Hey

On Sat, 30 Aug 2008, Adrian Hey wrote:
Ganesh Sittampalam wrote:
Will Data.Unique still work properly if a value is sent across a RPC interface?
A value of type Unique you mean? This isn't possible. Data.Unique has been designed so cannot be Shown/Read or otherwise serialised/deserialised (for obvious reasons I guess).
How do the implementers of Data.Unique know that they mustn't let them be serialised/deserialised? What stops the same rule from applying to Data.Random?
Also what if I want a thread-local variable?
Well actually I would say that threads are bad concurrency model so I'm not keen on thread local state at all. Mainly because I'd like to get rid of threads, but also a few other doubts even if we keep threads.
Even if you don't like them, people still use them.
AFAICS this is irrelvant for the present discussions as Haskell doesn't support thread local variable thingies. If it ever does being precise about that is someone elses problem.
The fact that your proposal isn't general enough to handle them is a mark against it; standardised language features should be widely applicable, and as orthogonal as possible to other considerations.
For the time being the scope of IORefs/MVars/Chans is (and should remain) whatever process is described by main (whether or not they appear at top level).
And if main isn't the entry point? This comes back to my questions about dynamic loading.
(I.E. Just making existing practice *safe*, at least in the sense that the compiler ain't gonna fcuk it up with INLINING or CSE and every one understands what is and isn't safe in ACIO)
Creating new language features means defining their semantics rather more clearly than just "no inlining or cse", IMO.
I wouldn't even know how to go about that to the satisfaction of purists. But "global variables" *are* being used whether or not the top level <- bindings are implemented. They're in the standard libraries!
So if this stuff matters someone had better figure it out :-)
It's a hack that isn't robust in many situations. We should find better ways to do it, not standardise it. Cheers, Ganesh

Ganesh Sittampalam wrote:
How do the implementers of Data.Unique know that they musn't let them be serialised/deserialised?
Because if you could take a String and convert it to a Unique there would be no guarantee that result was *unique*.
What stops the same rule from applying to Data.Random?
Well the only data type defined by this is StdGen, which is a Read/Show instance. I guess there's no semantic problem with that (can't think of one off hand myself).
Also what if I want a thread-local variable?
Well actually I would say that threads are bad concurrency model so I'm not keen on thread local state at all. Mainly because I'd like to get rid of threads, but also a few other doubts even if we keep threads.
Even if you don't like them, people still use them.
AFAICS this is irrelvant for the present discussions as Haskell doesn't support thread local variable thingies. If it ever does being precise about that is someone elses problem.
The fact that your proposal isn't general enough to handle them is a mark against it; standardised language features should be widely applicable, and as orthogonal as possible to other considerations.
I think the whole thread local state thing is a complete red herring. I've never seen a convincing use case for it and I suspect the only reason these two issues have become linked is that some folk are so convinced that "global variables are evil", they mistakenly think thread local variables must be less evil (because they are "less global"). Anyway, if you understand the reasons why all the real world libraries that do currently use "global variables" do this, it's not hard to see why they don't want this to be thread local (it would break all the safety properties they're trying to ensure). So whatever problem thread local variables might solve, it isn't this one.
For the time being the scope of IORefs/MVars/Chans is (and should remain) whatever process is described by main (whether or not they appear at top level).
And if main isn't the entry point? This comes back to my questions about dynamic loading.
Well you're talking about some non-standard Haskell, so with this and other non-standard stuff (like plugins etc) I guess the answer is it's up to whoever's doing this to make sure they do it right. I can't comment further as I don't know what it is they're trying to do, but AFAICS it's not a language design issue at present. If plugins breaks, it's down to plugins to fix itself, at least until such time as a suitable formal theory of plugins has been developed so it can become standard Haskell :-)
(I.E. Just making existing practice *safe*, at least in the sense that the compiler ain't gonna fcuk it up with INLINING or CSE and every one understands what is and isn't safe in ACIO)
Creating new language features means defining their semantics rather more clearly than just "no inlining or cse", IMO.
I wouldn't even know how to go about that to the satisfaction of purists. But "global variables" *are* being used whether or not the top level <- bindings are implemented. They're in the standard libraries!
So if this stuff matters someone had better figure it out :-)
It's a hack that isn't robust in many situations. We should find better ways to do it, not standardise it.
Nobody's talking about standardising the current hack. This is the whole point of the top level <- proposal, which JM seems to think is sound enough for incorporation into JHC (correctly IMO). Nobody's found fault with it, other than the usual global variables are evil mantra :-) Regards -- Adrian Hey

On Sat, 30 Aug 2008, Adrian Hey wrote:
Ganesh Sittampalam wrote:
How do the implementers of Data.Unique know that they musn't let them be serialised/deserialised?
Because if you could take a String and convert it to a Unique there would be no guarantee that result was *unique*.
Well, yes, but if I implemented a library in standard Haskell it would always be safely serialisable/deserialisable (I think). So the global variables hack somehow destroys that property - how do I work out why it does in some cases but not others?
I think the whole thread local state thing is a complete red herring.
I've never seen a convincing use case for it and I suspect the only
Well, I've never seen a convincing use case for global variables :-)
reason these to issues have become linked is that some folk are so convinced that "global variables are evil", they mistakenly think thread local variables must be less evil (because they are "less global").
I don't think they're less evil, just that you might want them for the same sorts of reasons you might want global variables.
If plugins breaks is down to plugins to fix itself, at least until such time as a suitable formal theory of plugins has been developed so it can become standard Haskell :-)
Dynamic loading and plugins work fine with standard Haskell now, because nothing in standard Haskell breaks them. The <- proposal might well break them, which is a significant downside for it. In general, the smaller the "world" that the Haskell standard lives in, the less it can interfere with other concerns. <- massively increases that world, by introducing the concept of a process scope.
It's a hack that isn't robust in many situations. We should find better ways to do it, not standardise it.
Nobody's talking about standardising the current hack. This the whole point of the top level <- proposal,
It just amounts to giving the current hack some nicer syntax and stating some rules under which it can be used. Those rules aren't actually strong enough to provide a guarantee of process level scope.
which JM seems to think is sound enough for incorporation into JHC (correctly IMO). Nobody's found fault with it, other than the usual global variables are evil mantra :-)
Several people have found faults with it, you've just ignored or dismissed them. No doubt from your perspective the faults are irrelevant or untrue, but that's not my perspective. Ganesh

Ganesh Sittampalam wrote:
On Sat, 30 Aug 2008, Adrian Hey wrote:
Because if you could take a String and convert it to a Unique there would be no guarantee that result was *unique*.
Well, yes, but if I implemented a library in standard Haskell it would always be safely serialisable/deserialisable (I think). So the global variables hack somehow destroys that property - how do I work out why it does in some cases but not others?
This has nothing to do with the use of global variables. If you have a set of values that are guaranteed to be distinct ("unique") and you add another random/arbitrary value to that set you have no way of knowing that it is different from any current member (other than searching the entire set, assuming it's available).
Well, I've never seen a convincing use case for global variables :-)
Well apart from all the libs that couldn't be implemented without them...
reason these to issues have become linked is that some folk are so convinced that "global variables are evil", they mistakenly think thread local variables must be less evil (because they are "less global").
I don't think they're less evil, just that you might want them for the same sorts of reasons you might want global variables.
"Global variables" are needed to ensure important safety properties, but the only reason I've seen people give for thread local variables is that explicit state threading is just so tiresome and ugly. Well that may be (wouldn't disagree), but I'm not aware of any library that simply couldn't be implemented without them.
If plugins breaks is down to plugins to fix itself, at least until such time as a suitable formal theory of plugins has been developed so it can become standard Haskell :-)
Dynamic loading and plugins work fine with standard Haskell now, because nothing in standard Haskell breaks them. The <- proposal might well break them, which is a significant downside for it.
I don't see how, but if so <- bindings are not the cause of the brokenness. They'd still be broken using the unsafePerformIO hack.
In general, the smaller the "world" that the Haskell standard lives in, the less it can interfere with other concerns. <- massively increases that world, by introducing the concept of a process scope.
All IORefs, MVars and Chans scope across the entire process defined by main. Or at least they *should*; if they don't then something is already badly wrong somewhere. This has nothing to do with whether or not they appear at top level. This is what an IORef/MVar/whatever is defined to be.
It's a hack that isn't robust in many situations. We should find better ways to do it, not standardise it.
Nobody's talking about standardising the current hack. This the whole point of the top level <- proposal,
It just amounts to giving the current hack some nicer syntax and stating some rules under which it can be used.
No, the unsafePerformIO hack is a hack because it's *unsound*. The compiler doesn't know how to translate this into code that does what the programmer intended. Fortunately ghc at least does have a couple of flags that give the intended result (we hope). The new binding syntax is nicer, but its real purpose is to leave the compiler no "wriggle room" when interpreting the programmer's intent. But then again, I'm sure that some will be adamant that any way of making "global variables" is a hack. But they'll still be happy to go on using file IO, sockets etc regardless, blissfully unaware of the hacks they are dependent on :-)
Those rules aren't actually strong enough to provide a guarantee of process level scope.
The rules for <- bindings shouldn't have to guarantee this. This should be guaranteed by newMVar returning a new *MVar*, wherever it's used (for example).
which JM seems to think is sound enough for incorporation into JHC (correctly IMO). Nobody's found fault with it, other than the usual global variables are evil mantra :-)
Several people have found faults with it, you've just ignored or dismissed them. No doubt from your perspective the faults are irrelevant or untrue, but that's not my perspective.
I mean semantic faults, as in the proposal just doesn't do what it promises for some subtle reason. If you consider not giving you thread local variables a fault I guess you're entitled to that view, but this was never the intent of the proposal in the first place (that's not what people are trying to do when they use the unsafePerformIO hack). Regards -- Adrian Hey

Adrian Hey wrote:
"Global variables" are needed to ensure important safety properties, but the only reasons I've seen people give for thread local variables is that explicit state threading is just so tiresome and ugly. Well that may be (wouldn't disagree), but I'm not aware of any library that simply couldn't be implemented without them.
I thought I ought to say a bit more about my unkind and hasty words re. thread local variables. This is discussed from time to time and there's a wiki page here summarising proposals... http://www.haskell.org/haskellwiki/Thread-local_storage One thing that worries me is that nobody seems to know what problem thread local storage is solving, hence the diversity of proposals. I'm also struggling to see why we need it, but I don't have any passionate objections to it either. Unfortunately for those of us that want a solution to the "global variables" problem, the two issues seem to have been linked as being part of the same problem, so while there's all this uncertainty about what thread local variables are actually going to be used for and what they should look like, the (IMO) much simpler "global variables" problem/solution is in limbo. This has been going on 4 or 5 years now IIRC. But the "global variables" problem is really much simpler. All we want is something that does exactly what the unsafePerformIO hack currently does (assuming flag/pragma hackery does the trick), but does it reliably. (IMO, YMMV..) Regards -- Adrian Hey

On Sat, 30 Aug 2008, Adrian Hey wrote:
Ganesh Sittampalam wrote:
Well, yes, but if I implemented a library in standard Haskell it would always be safely serialisable/deserialisable (I think). So the global variables hack somehow destroys that property - how do I work out why it does in some cases but not others?
This has nothing to do with the use of global variables. If you have a set of values that are guaranteed to be distinct ("unique") and you add another random/arbitrary value to that set you have no way of knowing that it is different from any current member (other than searching the entire set, assuming it's available).
OK, never mind about this. I was thinking that referential transparency was violated by remoting, but since unique values can only be constructed in IO, I think I was wrong.
Well, I've never seen a convincing use case for global variables :-)
Well apart from all the libs that couldn't be implemented without them...
They can't be implemented with an interface that is completely oblivious to the fact that the libraries require some state.
Dynamic loading and plugins work fine with standard Haskell now, because nothing in standard Haskell breaks them. The <- proposal might well break them, which is a significant downside for it.
I don't see how, but if so <- bindings are not the cause of the brokeness. They'd still be broken using the unsafePerformIO hack.
Which places the unsafePerformIO hack at fault, seeing as it's unsafe and a hack and all :-) If <- was standard then it'd be up to everyone else to work round its limitations.
In general, the smaller the "world" that the Haskell standard lives in, the less it can interfere with other concerns. <- massively increases that world, by introducing the concept of a process scope.
All IORefs,MVars,Chans scope across the entire process defined by main. Or at least they *should*, if they don't then something is already badly wrong somewhere. This has nothing to do with whether or not they appear at top level. This is what an IORef/MVar whatever is defined to be.
Their scope is where they can be used, and this is something we can explicitly track by inspecting the program text. If they are just used in one part of the program, their scope is limited to that part of the program.
But then again, I'm sure that some that will be adamant that any way of making "global variables" is a hack. But they'll still be happy to go on using file IO, sockets etc regardless, blissfully unaware of the hacks they are dependent on :-)
I'm not sure of precisely what you mean here, but stdin, stdout and stderr are things provided by the OS to a process. That's what defines them as having process scope, not something the Haskell language or RTS does.
Those rules aren't actually strong enough to provide a guarantee of process level scope.
The rules for <- bindings shouldn't have to guarantee this. This should be guaranteed by newMVar returning a new *MVar*, wherever it's used (for example).
The issue is whether the <- is run multiple times in a single process or not, rather than how the thing it calls behaves.
I mean semantic faults, as in the proposal just doesn't do what it promises for some subtle reason.
It doesn't provide "once-only" semantics across an entire process in cases involving dynamic loading or two Haskell libraries together with RTS separately linked into the same C program. I don't know whether you intend that it does promise that or not, but it seems to be necessary for many of the applications that are used to justify it.
If you consider not giving you thread local variables a fault I guess you're entitled to that view, but this was never the intent of the proposal in the first place (that's not what people are trying to do when they use the unsafePerformIO hack).
The thread-local variables point was a relatively minor issue for me compared to the dynamic loading and related issues. Cheers, Ganesh

On 2008 Aug 31, at 10:20, Ganesh Sittampalam wrote:
On Sat, 30 Aug 2008, Adrian Hey wrote:
But then again, I'm sure that some that will be adamant that any way of making "global variables" is a hack. But they'll still be happy to go on using file IO, sockets etc regardless, blissfully unaware of the hacks they are dependent on :-)
I'm not sure of precisely what you mean here, but stdin, stdout and stderr are things provided by the OS to a process. That's what defines them as having process scope, not something the Haskell language or RTS does.
But their representations in Haskell must have the same scope and are therefore de facto global variables. -- brandon s. allbery KF8NH

On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:20, Ganesh Sittampalam wrote:
I'm not sure of precisely what you mean here, but stdin, stdout and stderr are things provided by the OS to a process. That's what defines them as having process scope, not something the Haskell language or RTS does.
But their representations in Haskell must have the same scope and are therefore de facto global variables.
Yep, but this is not Haskell providing a way to make global variables, it is just providing an interface to ones that already exist. The point is that the RTS can't provide (process-scope) global variables of its own invention, because it can't guarantee to be running at the top-level of a process, which it needs to be in order to control their construction. Cheers, Ganesh

On 2008 Aug 31, at 10:29, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:20, Ganesh Sittampalam wrote:
I'm not sure of precisely what you mean here, but stdin, stdout and stderr are things provided by the OS to a process. That's what defines them as having process scope, not something the Haskell language or RTS does.
But their representations in Haskell must have the same scope and are therefore de facto global variables.
Yep, but this is not Haskell providing a way to make global variables, it is just providing an interface to ones that already exist. The point is that the RTS can't provide (process-scope)
But that is done the same way as providing general global variables, so you can't get away from it. -- brandon s. allbery KF8NH

On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:29, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:20, Ganesh Sittampalam wrote:
I'm not sure of precisely what you mean here, but stdin, stdout and stderr are things provided by the OS to a process. That's what defines them as having process scope, not something the Haskell language or RTS does.
But their representations in Haskell must have the same scope and are therefore de facto global variables.
Yep, but this is not Haskell providing a way to make global variables, it is just providing an interface to ones that already exist. The point is that the RTS can't provide (process-scope)
But that is done the same way as providing general global variables, so you can't get away from it.
I don't follow what you mean. stdin, stdout and stderr are just file descriptors 0, 1 and 2, aren't they? You can create them as many times as you want using that information without causing any confusion or conflict. Whereas the <- proposal has a "once-only" requirement. Ganesh

On 2008 Aug 31, at 10:34, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:29, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:20, Ganesh Sittampalam wrote:
I'm not sure of precisely what you mean here, but stdin, stdout and stderr are things provided by the OS to a process. That's what defines them as having process scope, not something the Haskell language or RTS does. But their representations in Haskell must have the same scope and are therefore de facto global variables. Yep, but this is not Haskell providing a way to make global variables, it is just providing an interface to ones that already exist. The point is that the RTS can't provide (process-scope)
But that is done the same way as providing general global variables, so you can't get away from it.
I don't follow what you mean. stdin, stdout and stderr are just file descriptors 0, 1 and 2, aren't they? You can create them as many times as you want with using that information without causing any confusion or conflict. Whereas the <- proposal has a "once-only" requirement.
The convention is to provide buffered versions to improve the performance of file I/O. These buffered filehandles must be created once per runtime instance (and ideally once per process so multiple runtimes don't find themselves overwriting each others' output). -- brandon s. allbery KF8NH
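In Haskell terms, the kind of thing that has to happen exactly once per runtime is roughly the following; this is a simplified sketch using the unsafePerformIO idiom discussed earlier, not the actual base library code, and the name myStdout is made up:

import System.IO (Handle, BufferMode(LineBuffering), hSetBuffering)
import System.IO.Unsafe (unsafePerformIO)
import System.Posix.IO (fdToHandle)
import System.Posix.Types (Fd(..))

-- A buffered handle wrapping file descriptor 1.  If two runtimes in the
-- same process each build their own copy, their buffers can flush onto the
-- underlying descriptor in an interleaved, corrupted order.
{-# NOINLINE myStdout #-}
myStdout :: Handle
myStdout = unsafePerformIO $ do
  h <- fdToHandle (Fd 1)
  hSetBuffering h LineBuffering
  return h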

On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:34, Ganesh Sittampalam wrote:
I don't follow what you mean. stdin, stdout and stderr are just file descriptors 0, 1 and 2, aren't they? You can create them as many times as you want with using that information without causing any confusion or conflict. Whereas the <- proposal has a "once-only" requirement.
The convention is to provide buffered versions to improve the performance of file I/O. These buffered filehandles must be created once per runtime instance (and ideally once per process so multiple runtimes don't find themselves overwriting each others' output).
In that case it seems that any library that might be used from a runtime that isn't the top-level of a process should avoid doing IO to those handles, for fear of producing output corruption? Ganesh

On 2008 Aug 31, at 10:44, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:34, Ganesh Sittampalam wrote:
I don't follow what you mean. stdin, stdout and stderr are just file descriptors 0, 1 and 2, aren't they? You can create them as many times as you want with using that information without causing any confusion or conflict. Whereas the <- proposal has a "once- only" requirement.
The convention is to provide buffered versions to improve the performance of file I/O. These buffered filehandles must be created once per runtime instance (and ideally once per process so multiple runtimes don't find themselves overwriting each others' output).
In that case it seems that any library that might be used from a runtime that isn't the top-level of a process should avoid doing IO to those handles, for fear of producing output corruption?
You handle it the same way you handle I/O with concurrency: either one of the runtimes is "privileged" to the extent that it owns the filehandles and other runtimes must make an inter-runtime call to use them, or the filehandle structures include locking and are shared across runtimes. Both of these are used in Haskell (see most GUI libraries for the former, and the implementation of Handles for the latter). -- brandon s. allbery KF8NH

On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:44, Ganesh Sittampalam wrote:
In that case it seems that any library that might be used from a runtime that isn't the top-level of a process should avoid doing IO to those handles, for fear of producing output corruption?
You handle it the same way you handle I/O with concurrency: either one of the runtimes is "privileged" to the extent that it owns the filehandles and other runtimes must make an inter-runtime call to use them, or the filehandle structures include locking and are shared across runtimes. Both of these are used in Haskell (see most GUI libraries for the former, and the implementation of Handles for the latter).
Where do the filehandle structures live in the latter case? Ganesh

On 2008 Aug 31, at 11:20, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 10:44, Ganesh Sittampalam wrote:
In that case it seems that any library that might be used from a runtime that isn't the top-level of a process should avoid doing IO to those handles, for fear of producing output corruption?
You handle it the same way you handle I/O with concurrency: either one of the runtimes is "privileged" to the extent that it owns the filehandles and other runtimes must make an inter-runtime call to use them, or the filehandle structures include locking and are shared across runtimes. Both of these are used in Haskell (see most GUI libraries for the former, and the implementation of Handles for the latter).
Where do the filehandle structures live in the latter case?
The place you clearly think so little of that you need to ask: process-global (or process-local depending on how you think about it) storage. And everything in that storage must have locking. (And this requirement makes it similar in some ways to other non-directly-accessible process state such as (say) the process id.) -- brandon s. allbery KF8NH

On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 11:20, Ganesh Sittampalam wrote:
Where do the filehandle structures live in the latter case?
The place you clearly think so little of that you need to ask: process-global (or process-local depending on how you think about it) storage. And everything in that storage must have locking.
I'm sorry if this makes me seem ignorant, but I'd never heard of such a thing before. What is the API for accessing such storage, and/or where can I find its documentation? Cheers, Ganesh

On 2008 Aug 31, at 12:01, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 11:20, Ganesh Sittampalam wrote:
Where do the filehandle structures live in the latter case?
The place you clearly think so little of that you need to ask: process-global (or process-local depending on how you think about it) storage. And everything in that storage must have locking.
You'll have to look at specific implementations. One that I can think of off the top of my head is Perl 5's "ithreads"; there is a distinguished allocation store which is global to all ithreads, and the interpreter instance gives you primitives for locking and mutexing (see "use threads::shared;"). -- Brandon S. Allbery KF8NH

On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 12:01, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 11:20, Ganesh Sittampalam wrote:
Where do the filehandle structures live in the latter case?
The place you clearly think so little of that you need to ask: process-global (or process-local depending on how you think about it) storage. And everything in that storage must have locking.
You'll have to look at specific implementations. One that I can think of off the top of my head is Perl 5's "ithreads"; there is a distinguished allocation store which is global to all ithreads, and the interpreter instance gives you primitives for locking and mutexing (see "use threads::shared;").
From what I can see from the source of this, it seems to rely on a global variable inside the Perl library that contains a pointer to the shared state area (actually a separate Perl interpreter of its own, but that's just an implementation detail). However I may be wrong as I was unable to fully figure out how the BOOT: mechanism works from a simple grep.
I'm afraid I don't see how this generalises to sharing something across an entire process where the things that want to do the sharing are not in or controlled by the same shared library. In particular the filehandle structures required for buffered I/O need to be common to every single piece of code in the process that might want to use them, no matter what language or language implementation that code uses. Ganesh

On 2008 Aug 31, at 13:20, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 12:01, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 11:20, Ganesh Sittampalam wrote:
Where do the filehandle structures live in the latter case?
The place you clearly think so little of that you need to ask: process-global (or process-local depending on how you think about it) storage. And everything in that storage must have locking.
You'll have to look at specific implementations. One that I can think of off the top of my head is Perl 5's "ithreads"; there is a distinguished allocation store which is global to all ithreads, and the interpreter instance gives you primitives for locking and mutexing (see "use threads::shared;").
I'm afraid I don't see how this generalises to sharing something across an entire process where the things that want to do the sharing are not in or controlled by the same shared library. In particular the filehandle structures required for buffered I/O need to be common to every single piece of code in the process that might want to use them, no matter what language or language implementation that code uses.
For that you probably want to look at how ld.so.1 and libc interact to share the malloc pool and the stdin/stdout/stderr, among others. -- Brandon S. Allbery KF8NH

On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 13:20, Ganesh Sittampalam wrote:
I'm afraid I don't see how this generalises to sharing something across an entire process where the things that want to do the sharing are not in or controlled by the same shared library. In particular the filehandle structures required for buffered I/O need to be common to every single piece of code in the process that might want to use them, no matter what language or language implementation that code uses.
For that you probably want to look at how ld.so.1 and libc interact to share the malloc pool and the stdin/stdout/stderr, among others.
If buffered IO is handled by libc rather than by specific language runtimes, then the same mechanism of using global variables inside libc would work fine; but this technique doesn't extend to providing process-scope shared state for library code that might be loaded multiple times with no knowledge of the other instances. Ganesh

On 2008 Sep 1, at 1:33, Ganesh Sittampalam wrote:
On Sun, 31 Aug 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Aug 31, at 13:20, Ganesh Sittampalam wrote:
I'm afraid I don't see how this generalises to sharing something across an entire process where the things that want to do the sharing are not in or controlled by the same shared library. In particular the filehandle structures required for buffered I/O need to be common to every single piece of code in the process that might want to use them, no matter what language or language implementation that code uses.
For that you probably want to look at how ld.so.1 and libc interact to share the malloc pool and the stdin/stdout/stderr, among others.
If buffered IO is handled by libc rather than by specific language runtimes, then the same mechanism of using global variables inside libc would work fine; but this technique doesn't extend to providing process-scope shared state for library code that might be loaded multiple times with no knowledge of the other instances.
True. If you want to allow for that, you must do it yourself. (I've been bitten by this; older ssh built against older heimdal would load two different versions of the crypto libraries, ssh would initialize one, heimdal would use the other (the initialized flag being global, but the buffers different sizes) and presto, core dump.) This is in large part what the discussion is about: making it possible to write such things properly without having to delegate it across the FFI as C code. -- Brandon S. Allbery KF8NH

On 2008 Aug 30, at 6:28, Adrian Hey wrote:
Ganesh Sittampalam wrote:
How do the implementers of Data.Unique know that they mustn't let them be serialised/deserialised?
Because if you could take a String and convert it to a Unique, there would be no guarantee that the result was *unique*.
What stops the same rule from applying to Data.Random?
Well, the only data type defined by this is StdGen, which is a Read/Show instance. I guess there's no semantic problem with that (can't think of one off hand myself).
You *want* to be able to reproduce a given random seed, for simulations and the like. -- Brandon S. Allbery KF8NH

Ganesh Sittampalam wrote:
How do the implementers of Data.Unique know that they mustn't let them be serialised/deserialised? What stops the same rule from applying to Data.Random?
Unique values should be no more deserialisable than IORefs. Is it the functionality of Data.Unique that you object to, or the fact that it's implemented with a global variable? If the former, one could easily build Unique values on top of IORefs, since IORef is in Eq. Thus Data.Unique is no worse than IORefs (ignoring hashability, anyway). If the latter, how do you recommend implementing Data.Unique? Implementing them on IORefs seems ugly. Or should they just be a primitive of the platform, like IORefs themselves? -- Ashley Yakeley
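For concreteness, a minimal sketch of the IORef-based construction suggested above (illustrative names only; this is not how Data.Unique is actually implemented):
import Data.IORef (IORef, newIORef)
-- each call allocates a fresh IORef; two values are equal exactly
-- when they wrap the same IORef, so distinct calls give distinct values
newtype MyUnique = MyUnique (IORef ()) deriving Eq
newMyUnique :: IO MyUnique
newMyUnique = fmap MyUnique (newIORef ())
No global variable is needed, but as noted you lose the Ord instance and hashability that the counter-based Unique provides.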

On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Is it the functionality of Data.Unique that you object to, or the fact that it's implemented with a global variable?
If the former, one could easily build Unique values on top of IORefs, since IORef is in Eq. Thus Data.Unique is no worse than IORefs (ignoring hashability, anyway).
If the latter, how do you recommend implementing Data.Unique? Implementing them on IORefs seems ugly.
This seems fine to me. It's based on something that already does work properly across a process scope, instead of some new language feature that is actually hard to implement across the process scope. Ganesh

Ganesh Sittampalam wrote:
This seems fine to me. It's based on something that already does work properly across a process scope,
But you agree that IORefs define a concept of "process scope"?
instead of some new language feature that is actually hard to implement across the process scope.
If we have a concept of "process scope", how is it hard to implement? -- Ashley Yakeley

On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
This seems fine to me. It's based on something that already does work properly across a process scope,
But you agree that IORefs define a concept of "process scope"?
I'm not sure that they *define* process scope, because it might be safe to use them across multiple processes; it depends on OS-dependent properties. But they exist *at least* at process scope.
instead of some new language feature that is actually hard to implement across the process scope.
If we have a concept of "process scope", how is it hard to implement?
Because memory allocation is already implemented, and not in a Haskell-dependent way. If two completely separate Haskell libraries are present in the same process, linked together by a C program, they don't even know about each other's existence. But they still don't share memory space. Ganesh

Ganesh Sittampalam wrote:
On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
This seems fine to me. It's based on something that already does work properly across a process scope,
But you agree that IORefs define a concept of "process scope"?
I'm not sure that they *define* process scope, because it might be safe to use them across multiple processes; it depends on OS-dependent properties. But they exist *at least* at process scope.
How can one use IORefs across multiple processes? They cannot be serialised.

On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
This seems fine to me. It's based on something that already does work properly across a process scope,
But you agree that IORefs define a concept of "process scope"?
I'm not sure that they *define* process scope, because it might be safe to use them across multiple processes; it depends on OS-dependent properties. But they exist *at least* at process scope.
How can one use IORefs across multiple processes? They cannot be serialised.
Firstly, that's a property of the current implementation, rather than a universal one, IMO. I don't for example see why you couldn't add a newIORef variant that points into shared memory, locking issues aside. Also, the issue is not whether you can *use* them across multiple processes, but whether they are unique across multiple processes. Uniqueness has two possible definitions: aliasing and representational equality. No two IORefs will ever alias, so by that definition they exist at global scope. For representational equality, that exists at least at process scope, and perhaps more. Ganesh

Ganesh Sittampalam wrote:
Firstly, that's a property of the current implementation, rather than a universal one, IMO. I don't for example see why you couldn't add a newIORef variant that points into shared memory, locking issues aside.
OK, so that would be a new Haskell feature. And it's that feature that would be the problem, not top-level <-. It would bring its own garbage collection issues, for instance. Currently, shared memory has to be treated as raw bytes, and IORef values can't be serialised into it.
Also, the issue is not whether you can *use* them across multiple processes, but whether they are unique across multiple processes. Uniqueness has two possible definitions; aliasing, and representational equality. No two IORefs will ever alias, so by that definition they exist at global scope. For representational equality, that exists at least at process scope, and perhaps more.
By global scope, I mean the largest execution scope an IORef created by newIORef can have. Each top-level IORef declaration should create an IORef at most once in this scope. IORefs cannot be serialised, so they cannot be sent over serialised RPC. So let us consider your shared memory possibility. Do you mean simply an IORef of a block of bytes of the shared memory? That would be fine, but that is really a different type than IORef. It still keeps the "global scopes" separate, as IORefs cannot be passed through [Word8]. Or do you mean you could use shared memory to pass across IORefs? This would mean joining the address spaces with no memory protection between them. It would mean joining the garbage collectors somehow. Once you've dealt with that, the issue of making sure that each initialiser runs only once for the new shared space is really only one more issue. -- Ashley Yakeley

On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
Firstly, that's a property of the current implementation, rather than a universal one, IMO. I don't for example see why you couldn't add a newIORef variant that points into shared memory, locking issues aside.
OK, so that would be a new Haskell feature. And it's that feature that would be the problem, not top-level <-. It would bring its own garbage collection issues, for instance.
OK, never mind about that; I agree it's not a very good idea. An IORef shouldn't escape the scope of the RTS/GC that created it.
Also, the issue is not whether you can *use* them across multiple processes, but whether they are unique across multiple processes. Uniqueness has two possible definitions; aliasing, and representational equality. No two IORefs will ever alias, so by that definition they exist at global scope. For representational equality, that exists at least at process scope, and perhaps more.
By global scope, I mean the largest execution scope an IORef created by newIORef can have. Each top-level IORef declaration should create an IORef at most once in this scope.
That's a reasonable definition, if by "execution scope" you mean your previous definition of "where the IORef can be directly used". But it's not process scope; two independent Haskell libraries in the same process can no more share IORefs than two separate Haskell processes. [what I meant by global scope above was "the entire world"] Ganesh

Ganesh Sittampalam wrote:
By global scope, I mean the largest execution scope an IORef created by newIORef can have. Each top-level IORef declaration should create an IORef at most once in this scope.
That's a reasonable definition, if by "execution scope" you mean your previous definition of "where the IORef can be directly used". But it's not process scope; two independent Haskell libraries in the same process can no more share IORefs than two separate Haskell processes.
[what I meant by global scope above was "the entire world"]
OK. Let's call it "top-level scope". Haskell naturally defines such a thing, regardless of processes and processors. Each top-level <- would run at most once in top-level scope. If you had two Haskell runtimes called by C code, each would have its own memory allocator and GC; IORefs, Uniques and thunks cannot be shared between them; and each would have its own top-level scope, even though they're in the same process. -- Ashley Yakeley

On Sat, 30 Aug 2008, Ashley Yakeley wrote:
OK. Let's call it "top-level scope". Haskell naturally defines such a thing, regardless of processes and processors. Each top-level <- would run at most once in top-level scope.
If you had two Haskell runtimes called by C code, each would have its own memory allocator and GC; IORefs, Uniques and thunks cannot be shared between them; and each would have its own top-level scope, even though they're in the same process.
That sounds more feasible - though it does constrain a plugin architecture (in which Haskell code can dynamically load other Haskell code) to cooperate with the loading RTS and not load multiple copies of modules; this might make linking tricky. There's also the problem Duncan Coutts mentioned about loading multiple versions of the same module - what are the semantics of <- in relation to that? Also, it's no use for mediating access to a resource or library that can only be accessed once, right? In fact, even without the problem of two Haskell runtimes in one process this can't work, since some library in another language might also choose to access that resource or library. What applications does this leave beyond Data.Unique and Random? Ganesh

Ganesh Sittampalam wrote:
On Sat, 30 Aug 2008, Ashley Yakeley wrote:
OK. Let's call it "top-level scope". Haskell naturally defines such a thing, regardless of processes and processors. Each top-level <- would run at most once in top-level scope.
If you had two Haskell runtimes called by C code, each would have its own memory allocator and GC; IORefs, Uniques and thunks cannot be shared between them; and each would have its own top-level scope, even though they're in the same process.
That sounds more feasible - though it does constrain a plugin architecture (in which Haskell code can dynamically load other Haskell code) to cooperate with the loading RTS and not load multiple copies of modules; this might make linking tricky.
This is a good idea anyway. It's up to the dynamic loading architecture to get this right.
There's also the problem Duncan Coutts mentioned about loading multiple versions of the same module - what are the semantics of <- in relation to that?
If they are different versions, they ought to be considered different modules with different names. Thus, Unique in base-3.0.2.0 ought to be a different type than Unique in base-4.0. Thus any top-level initialisers ought to be considered different and be run separately. What's the current static behaviour? What happens if I link with packages B & C, which link with different versions of A?
Also, it's no use for mediating access to a resource or library that can only be accessed once, right? In fact, even without the problem of two Haskell runtimes in one process this can't work, since some library in another language might also choose to access that resource or library.
What applications does this leave beyond Data.Unique and Random?
So far we've just looked at declaring top-level IORefs and MVars. By declaring top-level values of type IOWitness, you can generate open witnesses to any type, and thus solve the expression problem. See my open witness library and paper: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/open-witness http://semantic.org/stuff/Open-Witnesses.pdf -- Ashley Yakeley
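For readers who don't want to chase the links, here is roughly how such a witness type can be built today, as a sketch only (the real open-witness package differs in detail, and the unsafeCoerce is justified solely by the invariant that a given Unique is never attached to two different types):
{-# LANGUAGE GADTs #-}
import Data.Unique (Unique, newUnique)
import Unsafe.Coerce (unsafeCoerce)
-- evidence that two types are in fact the same type
data SameType a b where
  ReflSameType :: SameType t t
-- a run-time witness to the type a; two witnesses match only if they
-- came from the same newIOWitness call
newtype IOWitness a = IOWitness Unique
newIOWitness :: IO (IOWitness a)
newIOWitness = fmap IOWitness newUnique
matchWitness :: IOWitness a -> IOWitness b -> Maybe (SameType a b)
matchWitness (IOWitness u1) (IOWitness u2)
  | u1 == u2  = Just (unsafeCoerce ReflSameType)  -- sound only because witnesses cannot be forged
  | otherwise = Nothing
The point of top-level <- here is that newIOWitness could then be run once per declaration at the top level, instead of having to be threaded through IO by the user.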

On Sun, 31 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
That sounds more feasible - though it does constrain a plugin architecture (in which Haskell code can dynamically load other Haskell code) to cooperate with the loading RTS and not load multiple copies of modules; this might make linking tricky.
This is a good idea anyway. It's up to the dynamic loading architecture to get this right.
Well, the question of whether multiple copies of a module are ok is still open, I guess - as you say later, it seems perfectly reasonable for two different versions of Data.Unique to exist, each with their own types and global variables - so why not two copies of the same version, as long as the types aren't convertible? My feeling is that the execution of <- needs to follow the Data.Typeable instances - if the two types are the same according to Data.Typeable, then there must only be one <- executed. So another question following on from that is what happens if there isn't any datatype that is an essential part of the module - with Unique, it's fine for there to be two <-s, as long as the Uniques aren't compared. Does this kind of safety property apply elsewhere? It feels to me that this is something ACIO (or whatever it would be called after being changed) needs to explain.
What applications does this leave beyond Data.Unique and Random?
So far we've just looked at declaring top-level IORefs and MVars.
By declaring top-level values of type IOWitness, you can generate open witnesses to any type, and thus solve the expression problem. See my open witness library and paper: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/open-witness http://semantic.org/stuff/Open-Witnesses.pdf
I'd rather use Data.Typeable for this, and make sure (by whatever mechanism, e.g. compiler-enforced, or just an implicit contract) that the user doesn't break things with dodgy Typeable instances. Cheers, Ganesh

Ganesh Sittampalam wrote:
Well, the question of whether multiple copies of a module are ok is still open, I guess - as you say later, it seems perfectly reasonable for two different versions of Data.Unique to exist, each with their own types and global variables - so why not two copies of the same version, as long as the types aren't convertible? My feeling is that the execution of <- needs to follow the Data.Typeable instances - if the two types are the same according to Data.Typeable, then there must only be one <- executed.
They will be different types if they are in different package versions. Thus they could have different instances of Typeable. But why do we care about Typeable?
So another question following on from that is what happens if there isn't any datatype that is an essential part of the module - with Unique, it's fine for there to be two <-s, as long as the Uniques aren't compared. Does this kind of safety property apply elsewhere? It feels to me that this is something ACIO (or whatever it would be called after being changed) needs to explain.
In the internal implementation of Unique, there must be only one MVar constructed with <- per Unique type, i.e. per package version. This will work correctly, since values of Unique types from different package versions have different types, and thus cannot be compared. Unique values constructed at top level by <- will also be unique and will work correctly.
ua <- newUnique
ub <- newUnique
Here ua == ub will evaluate to False.
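Spelled out in the proposed notation, the Unique internals being described would read roughly as follows. This is a sketch only: no current compiler accepts top-level <-, imports are omitted, and the assumption that newMVar can be used as an ACIO initialiser comes from the proposal itself.
-- NOT valid Haskell today: the first line uses the proposed top-level <- syntax
uniqueSource <- newMVar 0
newtype Unique = Unique Integer deriving (Eq, Ord)
newUnique :: IO Unique
newUnique = modifyMVar uniqueSource (\n -> let n' = n + 1 in return (n', Unique n'))
Only the creation of the MVar is new; newUnique itself is ordinary IO code.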
I'd rather use Data.Typeable for this, and make sure (by whatever mechanism, e.g. compiler-enforced, or just an implicit contract) that the user doesn't break things with dodgy Typeable instances.
You don't think that's rather ugly: a class that needs special "deriving" behaviour? I'd actually like to get rid of all special-case "deriving": it should be for newtypes only. Implicit contract is worse. I really shouldn't be able to write coerce without referring to something marked "unsafe" or "foreign". Have we stopped caring about soundness? In addition, one can only have one Typeable instance per type. By contrast, one can create multiple IOWitness values for the same type. For example, one can very easily create a system of open exceptions for IO, with an IOWitness value for each exception type, witnessing to the data that the exception carries. -- Ashley Yakeley

On Mon, 1 Sep 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
Well, the question of whether multiple copies of a module are ok is still open, I guess - as you say later, it seems perfectly reasonable for two different versions of Data.Unique to exist, each with their own types and global variables - so why not two copies of the same version, as long as the types aren't convertible? My feeling is that the the execution of <- needs to follow the Data.Typeable instances - if the two types are the same according to Data.Typeable, then there must only be one <- executed.
They will be different types if they are in different package versions.
Right, but they might be the same package version, if one is a dynamically loaded bit of code and the other isn't.
Thus they could have different instances of Typeable. But why do we care about Typeable?
Because of the coercion operation that follows from it.
So another question following on from that is what happens if there isn't any datatype that is an essential part of the module - with Unique, it's fine for there to be two <-s, as long as the Uniques aren't compared. Does this kind of safety property apply elsewhere? It feels to me that this is something ACIO (or whatever it would be called after being changed) needs to explain.
In the internal implementation of Unique, there must be only one MVar constructed with <- per Unique type, i.e. per package version. This will work correctly, since values of Unique types from different package versions have different types, and thus cannot be compared.
Unique values constructed at top level by <- will also be unique and will work correctly.
My question was actually about what happens with some different library that needs <-; how do we know whether having two <-s is safe or not?
I'd rather use Data.Typeable for this, and make sure (by whatever mechanism, e.g. compiler-enforced, or just an implicit contract) that the user doesn't break things with dodgy Typeable instances.
You don't think that's rather ugly: a class that needs special "deriving" behaviour? I'd actually like to get rid of all special-case "deriving": it should be for newtypes only.
No, it seems like the right way to do introspection to me, rather than adding some new mechanism for describing a datatype as your paper suggests.
Implicit contract is worse. I really shouldn't be able to write coerce without referring to something marked "unsafe" or "foreign". Have we stopped caring about soundness?
We could arrange for the class member of Typeable to be called "unsafe...".
In addition, one can only have one Typeable instance per type. By contrast, one can create multiple IOWitness values for the same type. For example, one can very easily create a system of open exceptions for IO, with an IOWitness value for each exception type, witnessing to the data that the exception carries.
I don't see what the point of multiple values is, I'm afraid. A single instance of Typeable is fine for doing type equality tests. Cheers, Ganesh
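For what it's worth, that kind of type-equality test needs nothing beyond the existing Data.Typeable API; a small sketch:
import Data.Typeable (Typeable, typeOf, cast)
-- compare the representations of two types for equality
sameType :: (Typeable a, Typeable b) => a -> b -> Bool
sameType x y = typeOf x == typeOf y
-- or recover a value at the other type when the representations match
asType :: (Typeable a, Typeable b) => a -> Maybe b
asType = cast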

Ganesh Sittampalam wrote:
Right, but they might be the same package version, if one is a dynamically loaded bit of code and the other isn't.
OK. It's up to the dynamic loader to deal with this, and make sure that initialisers are not run more than once when it loads the package into the RTS. The scopes and names are all well-defined. How hard is this?
My question was actually about what happens with some different library that needs <-; how do we know whether having two <-s is safe or not?
I don't understand. When is it not safe?
No, it seems like the right way to do introspection to me, rather than adding some new mechanism for describing a datatype as your paper suggests.
Aesthetic arguments are always difficult. The best I can say is, why are some classes blessed with a special language-specified behaviour? It looks like an ugly hack to me. We have a class with a member that may be safely exposed to call, but not safely exposed to define. How is this the right way? By contrast, top-level <- is straightforward to understand. Even the scope issues are not hard. It's safe, it doesn't privilege a class with special and hidden functionality, it doesn't introspect into types, and it allows individual unique values rather than just unique instances per type. And it also allows top-level IORefs and MVars.
We could arrange for the class member of Typeable to be called "unsafe...".
We could, but it's not actually unsafe to call as such. It's only unsafe to implement. And if we're going the implicit contract route, we have to resort to unsafe functions to do type representation. It's not necessary, and seems rather against the spirit of Haskell. Time was when people would insist that unsafePerformIO wasn't Haskell, though perhaps useful for debugging. Now we have all these little unsafe things because people think they're necessary, and there's an implicit contract forced on the user not to be unsafe. But it turns out that they're not necessary.
I don't see what the point of multiple values is, I'm afraid. A single instance of Typeable is fine for doing type equality tests.
Sometimes you want to do witness equality tests rather than type equality tests. For instance, I might have a "foo" exception and a "bar" exception, both of which carry an Int. Rather than create new Foo and Bar types, I can just create a new witness for each. Or if I want, I can create a dictionary of heterogeneous items, with IOWitness values as keys. Then I can do a top-level <- to declare keys in this dictionary. Now I've got OOP objects. -- Ashley Yakeley
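As a sketch of that dictionary idea, reusing the IOWitness, SameType and matchWitness definitions from the witness sketch earlier in the thread (so this fragment is not self-contained), and again with illustrative names only:
{-# LANGUAGE GADTs #-}
-- an entry pairs a witness with a value of the witnessed type
data Entry where
  MkEntry :: IOWitness a -> a -> Entry
type Dict = [Entry]
insertEntry :: IOWitness a -> a -> Dict -> Dict
insertEntry w x d = MkEntry w x : d
lookupEntry :: IOWitness a -> Dict -> Maybe a
lookupEntry _ [] = Nothing
lookupEntry w (MkEntry w' x : rest) =
  case matchWitness w' w of
    Just ReflSameType -> Just x
    Nothing           -> lookupEntry w rest
Keys declared once per module via top-level <- would then behave like field names, which is the sense in which this gives you OOP-style objects.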

On Mon, 1 Sep 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
Right, but they might be the same package version, if one is a dynamically loaded bit of code and the other isn't.
OK. It's up to the dynamic loader to deal with this, and make sure that initialisers are not run more than once when it loads the package into the RTS. The scopes and names are all well-defined. How hard is this?
I have a feeling it might be non-trivial; the dynamically loaded bit of code will need a separate copy of the module in question, since it might be loaded into something where the module is not already present. So it'll have a separate copy of the global variable in a separate location, and the dynamic loader needs to arrange to do something weird, like copying the value produced by the first run of the <- to the second copy instead of running it again.
My question was actually about what happens with some different library that needs <-; how do we know whether having two <-s is safe or not?
I don't understand. When is it not safe?
Well, the safety of <- being run twice in the Data.Unique case is based around the two different Data.Unique types not being compatible. Let's suppose some other module uses a <-, but returns things based on that <- that are some standard type, rather than a type it defines itself. Is module duplication still safe?
No, it seems like the right way to do introspection to me, rather than adding some new mechanism for describing a datatype as your paper suggests.
Aesthetic arguments are always difficult. The best I can say is, why are some classes blessed with a special language-specified behaviour?
Well, let me put it this way; since I don't like <-, and I don't particularly mind Typeable, I wouldn't accept IOWitness as an example of something that requires <- to implement correctly, because I don't see any compelling feature that you can only implement with <-.
We could arrange for the class member of Typeable to be called "unsafe...".
We could, but it's not actually unsafe to call as such. It's only unsafe to implement.
That's fine, it can export a non-class member without the unsafe prefix.
And if we're going the implicit contract route, we have to resort to unsafe functions to do type representation. It's not necessary, and seems rather against the spirit of Haskell.
Time was when people would insist that unsafePerformIO wasn't Haskell, though perhaps useful for debugging. Now we have all these little unsafe things because people think they're necessary, and there's an implicit contract forced on the user not to be unsafe. But it turns out that they're not necessary.
There's some unsafety somewhere in both Typeable and IOWitnesses, and in both cases it can be completely hidden from the user - with Typeable, just don't let the user define the typeOf function at all themselves. I'm not actually sure why it is exposed; is it necessary for some use pattern?
I don't see what the point of multiple values is, I'm afraid. A single instance of Typeable is fine for doing type equality tests.
Sometimes you want to do witness equality tests rather than type equality tests. For instance, I might have a "foo" exception and a "bar" exception, both of which carry an Int. Rather than create new Foo and Bar types, I can just create a new witness for each.
This is precisely what newtype is designed for, IMO. We don't need another mechanism to handle it. Cheers, Ganesh

Ganesh Sittampalam wrote:
I have a feeling it might be non-trivial; the dynamically loaded bit of code will need a separate copy of the module in question, since it might be loaded into something where the module is not already present.
Already the dynamic loader must load the module into the same address space and GC, i.e. the same runtime. So it should be able to make sure only one copy gets loaded. What is the status of dynamic loading in Haskell? What does hs-plugins do currently?
Well, the safety of <- being run twice in the Data.Unique case is based around the two different Data.Unique types not being compatible.
Right. The only code that can construct Unique values is internal to Data.Unique.
Let's suppose some other module uses a <-, but returns things based on that <- that are some standard type, rather than a type it defines itself. Is module duplication still safe?
In this case, duplicate modules of different versions is as safe as different modules. In other words, this situation: mypackage-1.0 that uses <- mypackage-2.0 that uses <- is just as safe as this situation: mypackage-1.0 that uses <- otherpackage-1.0 that uses <- The multiple versions issue doesn't add any problems.
Well, let me put it this way; since I don't like <-, and I don't particularly mind Typeable, I wouldn't accept IOWitness as an example of something that requires <- to implement correctly, because I don't see any compelling feature that you can only implement with <-.
Why don't you like <-? Surely I've addressed all the issues you raise? Multiple package versions do not actually cause any problems. Capabilities would be really nice, but the right approach for that is to create a new execution monad. There is an obligation regarding dynamic loading, but it looks like dynamic loading might need work anyway. Since this is a matter of aesthetics, I imagine it will end with a list of pros and cons.
There's some unsafety somewhere in both Typeable and IOWitnesses, and in both cases it can be completely hidden from the user - with Typeable, just don't let the user define the typeOf function at all themselves.
It's worse than that. If you derive an instance of Typeable for your type, it means everyone else can peer into your constructor functions and other internals. Sure, it's not unsafe, but it sure is ugly.
Sometimes you want to do witness equality tests rather than type equality tests. For instance, I might have a "foo" exception and a "bar" exception, both of which carry an Int. Rather than create new Foo and Bar types, I can just create a new witness for each.
This is precisely what newtype is designed for, IMO. We don't need another mechanism to handle it.
It's not what newtype is designed for. Newtype is designed to create usefully new types. Here, we're only creating different dummy types so that we can have different TypeRep values, which act as witnesses. It's the TypeReps that actually do the work. It would be much cleaner to declare the witnesses directly. -- Ashley Yakeley

On Tue, 2 Sep 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
I have a feeling it might be non-trivial; the dynamically loaded bit of code will need a separate copy of the module in question, since it might be loaded into something where the module is not already present.
Already the dynamic loader must load the module into the same address space and GC, i.e. the same runtime. So it should be able to make sure only one copy gets loaded.
I don't think it's that easy, modules aren't compiled independently of each other, and there are lots of cross-module optimisations and so on.
What is the status of dynamic loading in Haskell? What does hs-plugins do currently?
I don't know for sure, but I think it would load it twice. In any case, what I'm trying to establish below is that it should be a safety property of <- that the entire module (or perhaps mutually recursive groups of them?) can be duplicated safely - with a new name, or as if with a new name - and references to it randomly rewritten to the duplicate, as long as the result still type checks. If that's the case, then it doesn't matter whether hs-plugins loads it twice or not.
Let's suppose some other module uses a <-, but returns things based on that <- that are some standard type, rather than a type it defines itself. Is module duplication still safe?
In this case, duplicate modules of different versions is as safe as different modules. In other words, this situation:
mypackage-1.0 that uses <- mypackage-2.0 that uses <-
is just as safe as this situation:
mypackage-1.0 that uses <- otherpackage-1.0 that uses <-
The multiple versions issue doesn't add any problems.
Agreed - and I further claim that duplicating the entire module itself can't cause any problems.
Well, let me put it this way; since I don't like <-, and I don't particularly mind Typeable, I wouldn't accept IOWitness as an example of something that requires <- to implement correctly, because I don't see any compelling feature that you can only implement with <-.
Why don't you like <-? Surely I've addressed all the issues you raise?
I'm still not happy that the current specification is good enough, although I think this thread is getting closer to something that might work. Even with a good specification for <-, I would rather see the need for "once-only" state reflected in the type of things that have such a need.
There is an obligation regarding dynamic loading, but it looks like dynamic loading might need work anyway.
I think the obligation should be on <-, and the obligation is the duplication rule I proposed above.
Since this is a matter of aesthetics, I imagine it will end with a list of pros and cons.
Agreed.
There's some unsafety somewhere in both Typeable and IOWitnesses, and in both cases it can be completely hidden from the user - with Typeable, just don't let the user define the typeOf function at all themselves.
It's worse than that. If you derive an instance of Typeable for your type, it means everyone else can peer into your constructor functions and other internals. Sure, it's not unsafe, but it sure is ugly.
True. I would argue that this is better solved with a better typeclass hierarchy (e.g. one class to supply a witness-style representation that only supports equality, then the typereps on top of that if you want introspection too).
Sometimes you want to do witness equality tests rather than type equality tests. For instance, I might have a "foo" exception and a "bar" exception, both of which carry an Int. Rather than create new Foo and Bar types, I can just create a new witness for each.
This is precisely what newtype is designed for, IMO. We don't need another mechanism to handle it.
It's not what newtype is designed for. Newtype is designed to create usefully new types. Here, we're only creating different dummy types so that we can have different TypeRep values, which act as witnesses. It's the TypeReps that actually do the work.
newtype is frequently used to create something that you can make a separate set of typeclass instances for. This is no different. You can argue that this use of newtype is wrong, but there's no point in just providing an alternative in one specific case. Ganesh

On Tue, Sep 2, 2008 at 4:19 PM, Ganesh Sittampalam wrote:
On Tue, 2 Sep 2008, Ashley Yakeley wrote:
It's worse than that. If you derive an instance of Typeable for your type, it means everyone else can peer into your constructor functions and other internals. Sure, it's not unsafe, but it sure is ugly.
True. I would argue that this is better solved with a better typeclass hierarchy (e.g. one class to supply a witness-style representation that only supports equality, then the typereps on top of that if you want introspection too).
Isn't that what we have right now? Typeable gives you a TypeRep, which can be compared for equality. All the introspection stuff is in Data. -- Dave Menendez

Ganesh Sittampalam wrote:
In any case, what I'm trying to establish below is that it should be a safety property of <- that the entire module (or perhaps mutually recursive groups of them?) can be duplicated safely - with a new name, or as if with a new name - and references to it randomly rewritten to the duplicate, as long as the result still type checks.
That's not acceptable. This would cause Unique to break, as its MVar would be created twice. It would also mean that individual Unique and IOWitness values created by <- would have different values depending on which bit of code was referencing them. It would render the extension useless as far as I can see. It also introduces odd execution scopes again. In order for <- to work, it must be understood that a given <- initialiser in a given module in a given version of a given package will execute at most once per RTS. But your restriction breaks that. It's worth mentioning that the current Data.Unique is part of the standard base library, while hs-plugins is rather experimental. Currently Data.Unique uses the "NOINLINE unsafePerformIO" hack to create its MVar. If hs-plugins duplicates that MVar, that's a bug in hs-plugins. It's up to a dynamic loader to get initialisation code correct. -- Ashley Yakeley
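For reference, the hack in question looks roughly like this (a sketch; the actual code in base differs in detail):
import Control.Concurrent.MVar (MVar, newMVar, modifyMVar)
import System.IO.Unsafe (unsafePerformIO)
-- a top-level MVar created once; the NOINLINE pragma stops the compiler
-- from inlining (and hence duplicating) the unsafePerformIO call
uniqSource :: MVar Integer
uniqSource = unsafePerformIO (newMVar 0)
{-# NOINLINE uniqSource #-}
newtype Unique = Unique Integer deriving (Eq, Ord)
newUnique :: IO Unique
newUnique = modifyMVar uniqSource (\n -> let n' = n + 1 in return (n', Unique n'))
A dynamic loader that links a second copy of this module gets a second uniqSource, which is exactly the duplication problem being discussed.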

Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
In any case, what I'm trying to establish below is that it should be a safety property of <- that the entire module (or perhaps mutually recursive groups of them?) can be duplicated safely - with a new name, or as if with a new name - and references to it randomly rewritten to
the duplicate, as long as the result still type checks.
That's not acceptable. This would cause Unique to break, as its MVar would be created twice. It would also mean that individual Unique and IOWitness values created by <- would have different values depending on which bit of code was referencing them. It would render the extension useless as far as I can see.
The result wouldn't typecheck if two Unique values that now pointed to the two different modules were compared. Ganesh

Sittampalam, Ganesh wrote:
That's not acceptable. This would cause Unique to break, as its MVar would be created twice. It would also mean that individual Unique and IOWitness values created by <- would have different values depending on which bit of code was referencing them. It would render the extension useless as far as I can see.
The result wouldn't typecheck if two Unique values that now pointed to the two different modules were compared.
I don't understand. If the dynamic loader were to load the same package name and version, and it duplicated the MVar, then Unique values would have the same type and could be compared. -- Ashley Yakeley

Ashley Yakeley wrote:
I don't understand. If the dynamic loader were to load the same package name and version, and it duplicated the MVar, then Unique values would have the same type and could be compared.
I am suggesting that this duplication process, whether conducted by the dynamic loader or something else, should behave as if they did not have the same package name or version. This is certainly a valid transformation for Data.Unique; I am simply saying that it should be a valid transformation on any module. Ganesh

Sittampalam, Ganesh wrote:
I am suggesting that this duplication process, whether conducted by the dynamic loader or something else, should behave as if they did not have the same package name or version.
This is certainly a valid transformation for Data.Unique, I am simply saying that it should be a valid transformation on any module.
So if I dynamically load module M that uses base, I will in fact get a completely new and incompatible version of Maybe, IO, [], Bool, Char etc. in all the type-signatures of M? -- Ashley Yakeley

Ashley Yakeley wrote:
Sittampalam, Ganesh wrote:
I am suggesting that this duplication process, whether conducted by the dynamic loader or something else, should behave as if they did not have the same package name or version.
This is certainly a valid transformation for Data.Unique, I am simply saying that it should be a valid transformation on any module.
So if I dynamically load module M that uses base, I will in fact get a completely new and incompatible version of Maybe, IO, [], Bool, Char etc. in all the type-signatures of M?
I think it treats them as compatible, using the fact that Data.Typeable returns the same type reps (which was why I initially mentioned Data.Typeable in this thread). This is fine for "normal" modules. There's a bit of description in the "Dynamic Typing" section of http://www.cse.unsw.edu.au/~dons/hs-plugins/hs-plugins-Z-H-5.html#node_sec_9 It's clearly the wrong thing to do for Data.Unique and anything else that might use <-; but if there are no such types in the interface of the plugin, then it won't matter. I can't see how to make it safe to pass Data.Unique etc. across a plugin interface without severely restricting the possible implementation strategies for a plugin library and its host. Ganesh

Sittampalam, Ganesh wrote:
I think it treats them as compatible, using the fact that Data.Typeable returns the same type reps (which was why I initially mentioned Data.Typeable in this thread). This is fine for "normal" modules. There's a bit of description in the "Dynamic Typing" section of http://www.cse.unsw.edu.au/~dons/hs-plugins/hs-plugins-Z-H-5.html#node_sec_9
It's clearly the wrong thing to do for Data.Unique and anything else that might use <-; but if there are no such types in the interface of the plugin, then it won't matter. I can't see how to make it safe to pass Data.Unique etc. across a plugin interface without severely restricting the possible implementation strategies for a plugin library and its host.
I think it's bad design for a dynamic loader to load a module more than once anyway. It's a waste of memory, for a start. We already know that hs-plugins won't load a module twice for modules it has already loaded itself (apparently doing so crashes the RTS), and I suspect it doesn't load duplicates at all. -- Ashley Yakeley

Ashley Yakeley wrote:
I think it's bad design for a dynamic loader to load a module more than once anyway.
In compiled code module boundaries don't necessarily exist. So how do you relink the loaded code so that it points to the unique copy of the module?
It's a waste of memory, for a start. We already know that hs-plugins won't for modules it already loaded itself (apparently it crashes the RTS), and I suspect it doesn't at all.
It crashes the RTS of the plugins loader, which is based on ghci, which is built around loading modules independently. I believe there's a separate RTS running at the top level of the program which has no knowledge of the plugin loader. Ganesh

Sittampalam, Ganesh wrote:
In compiled code module boundaries don't necessarily exist. So how do you relink the loaded code so that it points to the unique copy of the module?
hs-plugins loads modules as single .o files, I believe.
It crashes the RTS of the plugins loader, which is based on ghci, which is built around loading modules independently. I believe there's a separate RTS running at the top level of the program which has no knowledge of the plugin loader.
Two RTSs? Are you quite sure? How would GC work? "The loader is a binding to the GHC runtime system's dynamic linker, which does single object loading. GHC also performs the necessary linking of new objects into the running process." http://www.cse.unsw.edu.au/~dons/hs-plugins/hs-plugins-Z-H-2.html#node_sec_4 -- Ashley Yakeley

Ashley Yakeley wrote:
Sittampalam, Ganesh wrote:
In compiled code module boundaries don't necessarily exist. So how do you relink the loaded code so that it points to the unique copy of the module?
hs-plugins loads modules as single .o files, I believe.
Yes, but (a) the loading program doesn't and (b) that's an implementation choice, not a necessity.
Two RTSs? Are you quite sure? How would GC work?
I talked to Don about this and you're right, that doesn't happen. However he also confirmed that it does load modules a second time if they are in the main program as well as the plugin, and it would be difficult to merge the static and dynamic versions of the module. Cheers, Ganesh

Sittampalam, Ganesh wrote:
I talked to Don about this and you're right, that doesn't happen. However he also confirmed that it does load modules a second time if they are in the main program as well as the plugin, and it would be difficult to merge the static and dynamic versions of the module.
Oh dear. To fix this, I suppose the RTS would have to be able to keep track of all static initialisers. But it shouldn't otherwise affect program optimisation. -- Ashley Yakeley

Ashley Yakeley wrote:
Sittampalam, Ganesh wrote:
I talked to Don about this and you're right, that doesn't happen. However he also confirmed that it does load modules a second time if they are in the main program as well as the plugin, and it would be difficult
to merge the static and dynamic versions of the module.
Oh dear. To fix this, I suppose the RTS would have to be able to keep track of all static initialisers. But it shouldn't otherwise affect program optimisation.
What would the RTS actually do? Ganesh

Sittampalam, Ganesh wrote:
Oh dear. To fix this, I suppose the RTS would have to be able to keep track of all static initialisers. But it shouldn't otherwise affect program optimisation.
What would the RTS actually do?
I don't know enough about the RTS to say. I imagine initialisers would have to be marked in object files, so the RTS could link them separately when dynamically loading. The RTS would also keep a list of initialisers in the main program. -- Ashley Yakeley

Ashley Yakeley wrote:
Sittampalam, Ganesh wrote:
Oh dear. To fix this, I suppose the RTS would have to be able to keep track of all static initialisers. But it shouldn't otherwise affect program optimisation.
What would the RTS actually do?
I don't know enough about the RTS to say. I imagine initialisers would
have to be marked in object files, so the RTS could link them separately when dynamically loading. The RTS would also keep a list of initialisers in the main program.
Sounds plausible, although dynamic relocations do slow down linking. Unloading is another interesting problem. Are we allowed to re-run <- if the module that contained it is unloaded and then reloaded? I'm not quite sure what the conditions for allowing a module to be unloaded in general should be, though. Ganesh

Sittampalam, Ganesh wrote:
Sounds plausible, although dynamic relocations do slow down linking.
Unloading is another interesting problem. Are we allowed to re-run <- if the module that contained it is unloaded and then reloaded? I'm not quite sure what the conditions for allowing a module to be unloaded in general should be, though.
Interesting question. I suppose it's allowable if the guarantees attached to the ACIO type imply that it would not be possible to tell the difference. I think this means that all values whose types (including newtypes) belong to the module must be unreachable before unloading. Consider Data.Unique as a separate loadable module. It's loaded, and various Unique values are obtained. But Unique is just a newtype of Integer, and comparison between Uniques doesn't use code from Data.Unique. This might be difficult to track, as once the newtype is boiled away, the code is basically dealing with Integers, not Uniques. I really don't know enough about the RTS to know. The alternative would be to keep all initialised values when the module is unloaded. I'm guessing this is more feasible. -- Ashley Yakeley

Ashley Yakeley wrote:
I really don't know enough about the RTS to know. The alternative would be to keep all initialised values when the module is unloaded. I'm guessing this is more feasible.
Easier, but a guaranteed memory leak. Ganesh

Sittampalam, Ganesh wrote:
Ashley Yakeley wrote:
I really don't know enough about the RTS to know. The alternative would be to keep all initialised values when the module is unloaded. I'm guessing this is more feasible.
Easier, but a guaranteed memory leak.
But it's limited to the initialisers. An IORef holding an Integer isn't much memory, and it only ever gets leaked once. -- Ashley Yakeley

On Fri, 5 Sep 2008, Ashley Yakeley wrote:
Sittampalam, Ganesh wrote:
Ashley Yakeley wrote:
I really don't know enough about the RTS to know. The alternative would be to keep all initialised values when the module is unloaded. I'm guessing this is more feasible.
Easier, but a guaranteed memory leak.
But it's limited to the initialisers. An IORef holding an Integer isn't much memory, and it only ever gets leaked once.
It happens every time you load and unload, surely? Also I thought this was a general discussion with Data.Unique as a concrete example; something else might leak substantially more memory. Your witnesses stuff would leak one Integer per module, wouldn't it? Finally, any memory leak at all can be unacceptable in some contexts. It's certainly not something we should just dismiss as "oh, it's only small". Cheers, Ganesh

Ganesh Sittampalam wrote:
But it's limited to the initialisers. An IORef holding an Integer isn't much memory, and it only ever gets leaked once.
It happens every time you load and unload, surely?
No. An initialiser is only ever run once per run of the RTS.
Also I thought this was a general discussion with Data.Unique as a concrete example; something else might leak substantially more memory. Your witnesses stuff would leak one Integer per module, wouldn't it?
It would leak one Integer per IOWitness initialiser for the run of the RTS.
Finally, any memory leak at all can be unacceptable in some contexts. It's certainly not something we should just dismiss as "oh, it's only small".
Since it's of the order of the number of uniquely identified initialisers, it's arguably not a memory leak so much as a static overhead. The only way to get a continuous leak is to load and unload an endless stream of _different_ modules, each with their own initialisers. -- Ashley Yakeley

On Sat, 6 Sep 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
But it's limited to the initialisers. An IORef holding an Integer isn't much memory, and it only ever gets leaked once.
It happens every time you load and unload, surely?
No. An initialiser is only ever run once per run of the RTS.
Oh, I see. Yes, sorry.
Since it's of the order of the number of uniquely identified initialisers, it's arguably not a memory leak so much as a static overhead. The only way to get a continuous leak is to load and unload an endless stream of _different_ modules, each with their own initialisers.
I would call it a leak if something that is no longer being used cannot be reclaimed. The endless stream of different modules is possible in long-running systems where the code being run evolves or changes over time (e.g. something like lambdabot, which runs user-provided code). Cheers, Ganesh

Ganesh Sittampalam wrote:
I would call it a leak if something that is no longer being used cannot be reclaimed. The endless stream of different modules is possible in long-running systems where the code being run evolves or changes over time (e.g. something like lambdabot, which runs user-provided code).
This might be fixable with an option to the dynamic load function. Let us say a module M has a number of top-level <- of the form "val <- exp". The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M. When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M. It is the static results cache that might leak. Let us have a flag to the dynamic load function, to mark the static results cache of M as "freeable". If the static results cache is freeable, then it will be deleted when M is unloaded (and M is not part of the main program). If you pass True for this flag, your code is unsafe if all of the following hold: * M has static initialisers * M will be loaded again after unloading * Values from M will be stored elsewhere in the program. If you pass False for this flag, your code will continuously leak memory if you continuously load modules * that are all different * that contain static initialisers. There may also have to be some way to specify how to apply the flag to dependencies as well. In general I'm way beyond my knowledge of the RTS, so I may have something Very Wrong here. I don't think hs-plugins implements unloading at all currently. -- Ashley Yakeley
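To make the proposal concrete, the loader interface might look something like the following sketch. The names loadModule, unloadModule and Module are purely illustrative, not an actual hs-plugins API:

    -- hypothetical dynamic-loading interface, for illustration only
    data Module = Module    -- abstract handle; stands in for whatever the loader uses

    -- the Bool says whether M's static results cache may be freed on unload
    loadModule :: FilePath -> Bool -> IO Module
    loadModule _path _freeCacheOnUnload = return Module   -- sketch only

    unloadModule :: Module -> IO ()
    unloadModule _m = return ()                           -- sketch only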

On 2008 Sep 6, at 6:10, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
I would call it a leak if something that is no longer being used cannot be reclaimed. The endless stream of different modules is possible in long-running systems where the code being run evolves or changes over time (e.g. something like lambdabot, which runs user-provided code).
This might be fixable with an option to the dynamic load function.
Let us say a module M has a number of top-level <- of the form
val <- exp
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M.
When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M.
This sounds "reachable" to me, and therefore static overhead and not a leak. Your proposed "freeable" flag is still useful, but I think this is not a problem. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Sat, 6 Sep 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Sep 6, at 6:10, Ashley Yakeley wrote:
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M.
When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M.
This sounds "reachable" to me, and therefore static overhead and not a leak.
You can call it what you like, but it's still unacceptable behaviour, particularly since clients of M will have no way of telling from its API that it will happen. Ganesh

On 2008 Sep 6, at 11:22, Ganesh Sittampalam wrote:
On Sat, 6 Sep 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Sep 6, at 6:10, Ashley Yakeley wrote:
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M. When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M.
This sounds "reachable" to me, and therefore static overhead and not a leak.
You can call it what you like, but it's still unacceptable behaviour, particularly since clients of M will have no way of telling from its API that it will happen.
You want run-once behavior without giving the runtime the ability to tell that it's already been run? -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Sat, 6 Sep 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Sep 6, at 11:22, Ganesh Sittampalam wrote:
On Sat, 6 Sep 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Sep 6, at 6:10, Ashley Yakeley wrote:
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M. When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M.
This sounds "reachable" to me, and therefore static overhead and not a leak.
You can call it what you like, but it's still unacceptable behaviour, particularly since clients of M will have no way of telling from its API that it will happen.
You want run-once behavior without giving the runtime the ability to tell that it's already been run?
I don't want run-once behaviour, other people do. I'm just trying to pin down what it would mean. Ganesh

Ganesh Sittampalam wrote:
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M.
When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M.
This sounds "reachable" to me, and therefore static overhead and not a leak.
You can call it what you like, but it's still unacceptable behaviour, particularly since clients of M will have no way of telling from its API that it will happen.
That what will happen? -- Ashley Yakeley

On 2008 Sep 6, at 18:06, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M.
When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M.
This sounds "reachable" to me, and therefore static overhead and not a leak. You can call it what you like, but it's still unacceptable behaviour, particularly since clients of M will have no way of telling from its API that it will happen.
That what will happen?
I have no idea what Ganesh is complaining about, but if the point of ACIO is one-time initialization, it is pretty much required that something will be allocated to record the fact that a given item has been initialized (otherwise it can't guarantee the initializer runs exactly once, after all), and therefore that is part of the base ACIO API. So what exactly is the issue here? -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Sat, 2008-09-06 at 18:16 -0400, Brandon S. Allbery KF8NH wrote:
I have no idea what Ganesh is complaining about, but if the point of ACIO is one-time initialization, it is pretty much required that something will be allocated to record the fact that a given item has been initialized (otherwise it can't guarantee the initializer runs exactly once, after all), and therefore that is part of the base ACIO API. So what exactly is the issue here?
The issue is this: 1. Results from initialisers cannot be GC'd even if they become otherwise unreachable, because the dynamic loader might re-load the module (and then we'd need those original results). 2. If the dynamic loader loads an endless stream of different modules containing initialisers, memory will thus leak. It's tempting to say "don't do that" to point #2. -- Ashley Yakeley

On 2008 Sep 6, at 18:25, Ashley Yakeley wrote:
1. Results from initialisers cannot be GC'd even if they become otherwise unreachable, because the dynamic loader might re-load the module (and then we'd need those original results).
2. If the dynamic loader loads an endless stream of different modules containing initialisers, memory will thus leak.
I think if the issue is this vs. not being able to guarantee any once-only semantics, I consider the former necessary overhead for proper program behavior. And that, given that there exists extra-program global state that one might want to access, once-only initialization is a necessity. (Whoever it was who jumped off this point to say Haskell should exit the real world misses the point: it's not even all that useful from a theoretical standpoint if it's not allowed to interact with anything outside itself, and stdin/stdout tend to require once-only initialization. You can't really hide from it.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Sat, 6 Sep 2008, Brandon S. Allbery KF8NH wrote:
On 2008 Sep 6, at 18:25, Ashley Yakeley wrote:
2. If the dynamic loader loads an endless stream of different modules containing initialisers, memory will thus leak.
I think if the issue is this vs. not being able to guarantee any once-only semantics, I consider the former necessary overhead for proper program behavior.
Not leaking memory is an important part of proper program behaviour.
And that, given that there exists extra-program global state that one might want to access, once-only initialization is a necessity.
In what cases? In the case of buffered I/O there's no reason (in theory) you couldn't unload libc, do unbuffered I/O for a while, then reload libc and start again. Ganesh

On Sat, 6 Sep 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M.
When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M.
This sounds "reachable" to me, and therefore static overhead and not a leak.
You can call it what you like, but it's still unacceptable behaviour, particularly since clients of M will have no way of telling from its API that it will happen.
That what will happen?
That memory will be used and not ever be reclaimable. Suppose I am writing something that I intend to be used as part of a plug-in that is reloaded in different forms again and again. And I see module K which does something I want, so I use it. It so happens that K uses M, which has a <-. If I knew that using K in my plug-in would cause a memory leak, I would avoid doing so; but the whole point of <- is to avoid making the need for some state visible in the API, so I have no way of knowing. Ganesh

On 2008 Sep 7, at 6:23, Ganesh Sittampalam wrote:
On Sat, 6 Sep 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
The set of ACIO expressions exp is the "static initialisers" of M. The RTS must note when each static initialiser is run, and cache its result val. Let's call this cache of vals the "static results cache" of M. When M is loaded, and a static results cache for M already exists, then it will be used for the vals of M. This sounds "reachable" to me, and therefore static overhead and not a leak. You can call it what you like, but it's still unacceptable behaviour, particularly since clients of M will have no way of telling from its API that it will happen.
That what will happen?
That memory will be used and not ever be reclaimable.
Suppose I am writing something that I intend to be used as part of a plug-in that is reloaded in different forms again and again. And I see module K which does something I want, so I use it. It so happens that K uses M, which has a <-. If I knew that using K in my plug-in would cause a memory leak, I would avoid doing so; but since the whole point of <- is to avoid making the need for some state visible in the API.
False, as it's in ACIO and therefore advertises that it will "leak memory" in the name of correct behavior. Since you consider memory leaks to be worse than correct behavior, you can avoid anything that uses ACIO. (But you might want to go look at that list of modules which do global variable initialization and therefore aren't entirely trustworthy unless something like ACIO exists.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Sun, 7 Sep 2008, Brandon S. Allbery KF8NH wrote:
plug-in that is reloaded in different forms again and again. And I see module K which does something I want, so I use it. It so happens that K uses M, which has a <-. If I knew that using K in my plug-in would cause a memory leak, I would avoid doing so; but since the whole point of <- is to avoid making the need for some state visible in the API.
False, as it's in ACIO and therefore advertises that it will "leak memory" in the name of correct behavior.
I thought ACIO was a restriction on the thing on the right hand side of the <-? How does the module itself advertise its use of this (transitively) to users?
Since you consider memory leaks to be worse than correct behavior,
Not leaking memory is *part* of correct behaviour. If <- is to be created at all, it should be created with restrictions that make it capable of guaranteeing correct behaviour.
(But you might want to go look at that list of modules which do global variable initialization and therefore aren't entirely trustworthy unless something like ACIO exists.)
We should fix them (and their interface) so this doesn't happen, rather than standardising something broken. Ganesh

On 2008 Sep 7, at 12:10, Ganesh Sittampalam wrote:
On Sun, 7 Sep 2008, Brandon S. Allbery KF8NH wrote:
Since you consider memory leaks to be worse than correct behavior,
Not leaking memory is *part* of correct behaviour. If <- is to be created at all, it should be created with restrictions that make it capable of guaranteeing correct behaviour.
(But you might want to go look at that list of modules which do global variable initialization and therefore aren't entirely trustworthy unless something like ACIO exists.)
We should fix them (and their interface) so this doesn't happen, rather than standardising something broken.
And we're right back to "so how do we do this when we aren't allowed to record that it has already been run?" You seem to think we must never ensure that something will only be run once, that any program that does require this is broken. As such, the standard Haskell libraries (including some whose interfaces are H98) are unfixably broken and you'd better start looking elsewhere for your "correct behavior". -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Sun, 7 Sep 2008, Brandon S. Allbery KF8NH wrote:
You seem to think we must never ensure that something will only be run once, that any program that does require this is broken. As such, the standard Haskell libraries (including some whose interfaces are H98) are unfixably broken and you'd better start looking elsewhere for your "correct behavior".
Data.Unique might be unfixably broken, though perhaps some requirement that it not be unloaded while any values of type Unique are still around could solve the problem - though it's hard to see how this could be implemented sanely. But Data.Unique could (a) probably be replaced with something in terms of IORefs and (b) is pretty ugly anyway, since it forces you into IO. I'm sure that for many other examples, re-initialisation would be fine. For example Data.HashTable just uses a global for instrumentation for performance tuning, which could happily be reset if it got unloaded and then reloaded. System.Random could get a new StdGen. I haven't yet had time to go through the entire list that Adrian Hey posted to understand why they are being used, though. I'd also point out that if you unload and load libraries in C, global state will be lost and re-initialised. Ganesh

Ganesh Sittampalam wrote:
Suppose I am writing something that I intend to be used as part of a plug-in that is reloaded in different forms again and again. And I see module K which does something I want, so I use it. It so happens that K uses M, which has a <-. If I knew that using K in my plug-in would cause a memory leak, I would avoid doing so; but since the whole point of <- is to avoid making the need for some state visible in the API.
The results from the <- in M will only be stored once for the life of the RTS, no matter how many times your plug-ins are reloaded. -- Ashley Yakeley

On Sun, 7 Sep 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
Suppose I am writing something that I intend to be used as part of a plug-in that is reloaded in different forms again and again. And I see module K which does something I want, so I use it. It so happens that K uses M, which has a <-. If I knew that using K in my plug-in would cause a memory leak, I would avoid doing so; but since the whole point of <- is to avoid making the need for some state visible in the API.
The results from the <- in M will only be stored once for the life of the RTS, no matter how many times your plug-ins are reloaded.
Sorry, I keep forgetting that. OK, so you can't get an endless stream of leaks unless you use <- yourself, or modules on your system keep getting upgraded to new versions. Ganesh

On Wed, Sep 3, 2008 at 2:53 AM, Ashley Yakeley
It's worth mentioning that the current Data.Unique is part of the standard base library, while hs-plugins is rather experimental. Currently Data.Unique uses the "NOINLINE unsafePerformIO" hack to create its MVar. If hs-plugins duplicates that MVar, that's a bug in hs-plugins. It's up to a dynamic loader to get initialisation code correct.
Data.Unique describes itself as "experimental" and "non-portable". The
Haskell 98 report includes NOINLINE, but also states that environments
are not required to respect it. So hs-plugins wouldn't necessarily be
at fault if it didn't support Data.Unique.
--
Dave Menendez

Dave Menendez wrote:
The Haskell 98 report includes NOINLINE, but also states that environments are not required to respect it. So hs-plugins wouldn't necessarily be at fault if it didn't support Data.Unique.
Also, the definition of NOINLINE in the report doesn't preclude copying both the MVar *and* its use sites, which is what I am proposing should be considered generally safe. Ganesh

Ashley Yakeley wrote:
Currently Data.Unique uses the "NOINLINE unsafePerformIO" hack to create its MVar. If hs-plugins duplicates that MVar, that's a bug in hs-plugins.
Sittampalam, Ganesh wrote:
Also, the definition of NOINLINE in the report doesn't preclude copying both the MVar *and* its use sites,
Right. It would not be a bug in hs-plugins. That is the most urgent problem right now. It is nice to discuss various proposed new language features. That is the way to solve the problem in the long term. But right now - there is no way to do this in Haskell at all. The NOINLINE unsafePerformIO hack doesn't really work. This is currently a major hole in Haskell in my opinion. For the short term - can we *please* get an ONLYONCE pragma that has the correct semantics? Thanks, Yitz

Yitzhak Gale wrote:
Right. It would not be a bug in hs-plugins. That is the most urgent problem right now. [...] For the short term - can we *please* get an ONLYONCE pragma that has the correct semantics?
So the purpose of this pragma would solely be so that you can declare hs-plugins buggy for not respecting it? Or do you have some way to "fix" hs-plugins so that it does do so? (Assuming that my belief about how hs-plugins works is correct, of course) Ganesh

I wrote:
For the short term - can we *please* get an ONLYONCE pragma that has the correct semantics?
Sittampalam, Ganesh wrote:
So the purpose of this pragma would solely be so that you can declare hs-plugins buggy for not respecting it?
No, the hs-plugins problem - whether hypothetical or real - is only a symptom. There is no way to define global variables in Haskell right now. The NOINLINE hack is used, and most often works. But really it's broken and there are no guarantees, because NOINLINE does not have the right semantics. This is demonstrated by your hs-plugins example, but it's a general problem. Until a permanent solution is implemented and deployed in the compilers (if ever), can we please have a pragma that allows the current hack to really work? Thanks, Yitz
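For readers following along, the hack in question is the familiar idiom below. This is a generic sketch of the pattern, not any particular library's source:

    import Control.Concurrent.MVar
    import System.IO.Unsafe (unsafePerformIO)

    {-# NOINLINE counter #-}
    counter :: MVar Integer
    counter = unsafePerformIO (newMVar 0)
    -- correctness relies on this top-level value being evaluated at most once

    nextValue :: IO Integer
    nextValue = modifyMVar counter (\n -> return (n + 1, n))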

(apologies for misspelling your name when quoting you last time) Yitzchak Gale wrote:
For the short term - can we *please* get an ONLYONCE pragma that has
the correct semantics?
Until a permanent solution is implemented and deployed in the compilers (if ever), can we please have a pragma that allows the current hack to really work?
How do you propose that this pragma would be implemented? Ganesh

For the short term - can we *please* get an ONLYONCE pragma that has the correct semantics?
Sittampalam, Ganesh wrote:
How do you propose that this pragma would be implemented?
As far as I know now, in GHC it could currently just be an alias for NOINLINE, but the GHC gurus could say for sure. Except it should require a monomorphic constant - otherwise the guarantee doesn't make sense. And it would have clear comments and documentation that state that it guarantees that the value will be computed at most once. That way, bugs could be filed against it if that ever turns out not to be true. Other applications and libraries that support the pragma - such as other compilers, and hs-plugins - would be required to respect the guarantee, and bugs could be filed against them if they don't. Thanks, Yitz
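So, under the proposal, the same kind of declaration would carry the new pragma instead. This is purely hypothetical syntax for a pragma that no compiler implements today:

    {-# ONLYONCE counter #-}              -- hypothetical pragma
    counter :: MVar Integer               -- must be monomorphic
    counter = unsafePerformIO (newMVar 0)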

Yitzchak Gale wrote
Other applications and libraries that support the pragma - such as other compilers, and hs-plugins - would be required to respect the guarantee, and bugs could be filed against them if they don't.
If hs-plugins were loading object code, how would it even know of the existence of the pragma? Given such knowledge, how would it implement it? Ganesh

I wrote
Other applications and libraries that support the pragma - such as other compilers, and hs-plugins - would be required to respect the guarantee, and bugs could be filed against them if they don't.
Sittampalam, Ganesh wrote:
If hs-plugins were loading object code, how would it even know of the existence of the pragma? Given such knowledge, how would it implement it?
Good point. A compiler pragma is only that, in the end. This is just a hack, we can only do the best we can with it. Regards, Yitz

On Wed, Sep 3, 2008 at 9:30 AM, Yitzchak Gale
I wrote
Other applications and libraries that support the pragma - such as other compilers, and hs-plugins - would be required to respect the guarantee, and bugs could be filed against them if they don't.
Sittampalam, Ganesh wrote:
If hs-plugins were loading object code, how would it even know of the existence of the pragma? Given such knowledge, how would it implement it?
Good point. A compiler pragma is only that, in the end. This is just a hack, we can only do the best we can with it.
How does the FFI handle initialization? Presumably, we can link to
libraries that have internal state. Could someone, in principle, use
the FFI to create a global variable?
--
Dave Menendez
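For what it's worth, the FFI route would look roughly like this. The C function name and the C side itself are assumptions for illustration:

    {-# LANGUAGE ForeignFunctionInterface #-}
    -- assumes a C file along the lines of:
    --   static int counter = 0;
    --   int next_counter(void) { return counter++; }
    import Foreign.C.Types (CInt)

    foreign import ccall unsafe "next_counter" nextCounter :: IO CInt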

David Menendez wrote:
On Wed, Sep 3, 2008 at 2:53 AM, Ashley Yakeley
wrote: It's worth mentioning that the current Data.Unique is part of the standard base library, while hs-plugins is rather experimental. Currently Data.Unique uses the "NOINLINE unsafePerformIO" hack to create its MVar. If hs-plugins duplicates that MVar, that's a bug in hs-plugins. It's up to a dynamic loader to get initialisation code correct.
Data.Unique describes itself as "experimental" and "non-portable". The Haskell 98 report includes NOINLINE, but also states that environments are not required to respect it. So hs-plugins wouldn't necessarily be at fault if it didn't support Data.Unique.
I found this: "To solve this the hs-plugins dynamic loader maintains state storing a list of what modules and packages have been loaded already. If load is called on a module that is already loaded, or dependencies are attempted to load, that have already been loaded, the dynamic loader ignores these extra dependencies. This makes it quite easy to write an application that will allows an arbitrary number of plugins to be loaded." http://www.cse.unsw.edu.au/~dons/hs-plugins/hs-plugins-Z-H-6.html -- Ashley Yakeley

Ashley Yakeley wrote:
"To solve this the hs-plugins dynamic loader maintains state storing a list of what modules and packages have been loaded already. If load is called on a module that is already loaded, or dependencies are attempted to load, that have already been loaded, the dynamic loader ignores these extra dependencies. This makes it quite easy to write an application that will allows an arbitrary number of plugins to be loaded." http://www.cse.unsw.edu.au/~dons/hs-plugins/hs-plugins-Z-H-6.html
My recollection from using it a while ago is that if a module is used in the main program it will still be loaded once more in the plugin loader. This is because the plugin loader is basically an embedded copy of ghci without much knowledge of the host program's RTS. Cheers, Ganesh

Ganesh Sittampalam wrote:
If you want to standardise a language feature, you have to explain its behaviour properly. This is one part of the necessary explanation.
To be concrete about scenarios I was considering, what happens if:
- the same process loads two copies of the GHC RTS as part of two completely independent libraries? For added complications, imagine that one of the libraries uses a different implementation instead (e.g. Hugs)
- one Haskell program loads several different plugins in a way that allows Haskell values to pass across the plugin boundary
How do these scenarios work with use cases for <- like (a) Data.Unique and (b) preventing multiple instantiation of a sub-library?
That's a good question. But before you propose these scenarios, you must establish that they are sane for Haskell as it is today. In particular, would _local_ IORefs work correctly? After all, the memory allocator must be "global" in some sense. Could you be sure that different calls to newIORef returned separate IORefs? Perhaps this is the One True Global Scope: the scope in which refs from newIORef are guaranteed to be separate. It's the scope in which values from newUnique are supposed to be different, and it would also be the scope in which top-level <- would be called at most once. -- Ashley Yakeley

On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
If you want to standardise a language feature, you have to explain its behaviour properly. This is one part of the necessary explanation.
To be concrete about scenarios I was considering, what happens if:
- the same process loads two copies of the GHC RTS as part of two completely independent libraries? For added complications, imagine that one of the libraries uses a different implementation instead (e.g. Hugs)
- one Haskell program loads several different plugins in a way that allows Haskell values to pass across the plugin boundary
How do these scenarios work with use cases for <- like (a) Data.Unique and (b) preventing multiple instantiation of a sub-library?
That's a good question. But before you propose these scenarios, you must establish that they are sane for Haskell as it is today.
In particular, would _local_ IORefs work correctly? After all, the memory allocator must be "global" in some sense. Could you be sure that different calls to newIORef returned separate IORefs?
Yes, I would expect that. Allocation areas propagate downwards from the OS to the top-level of a process and then into dynamically loaded modules if necessary. Any part of this puzzle that fails to keep them separate (in some sense) is just broken.
Perhaps this is the One True Global Scope: the scope in which refs from newIORef are guaranteed to be separate.
Every single call to newIORef, across the whole world, returns a different ref. The "same" one as a previous one can only be returned once the old one has become unused (and GCed).
It's the scope in which values from newUnique are supposed to be different, and it would also be the scope in which top-level <- would be called at most once.
I don't really follow this. Do you mean the minimal such scope, or the maximal such scope? The problem here is not about separate calls to newIORef, it's about how many times an individual <- will be executed. Ganesh

Ganesh Sittampalam wrote:
Every single call to newIORef, across the whole world, returns a different ref.
How do you know? How can you compare them, except in the same Haskell expression?
The "same" one as a previous one can only be returned once the old one has become unused (and GCed).
Perhaps, but internally the IORef is a pointer value, and those pointer values might be the same. From the same perspective, one could say that every single call to newUnique across the whole world returns a different value, but internally they are Integers that might repeat.
It's the scope in which values from newUnique are supposed to be different, and it would also be the scope in which top-level <- would be called at most once.
I don't really follow this. Do you mean the minimal such scope, or the maximal such scope? The problem here is not about separate calls to newIORef, it's about how many times an individual <- will be executed.
Two IO executions are in the same "global scope" if their resulting values can be used in the same expression. Top-level <- declarations must execute at most once in this scope. -- Ashley Yakeley

On Sat, 30 Aug 2008, Ashley Yakeley wrote:
Ganesh Sittampalam wrote:
Every single call to newIORef, across the whole world, returns a different ref.
How do you know? How can you compare them, except in the same Haskell expression?
I can write to one and see if the other changes.
The "same" one as a previous one can only be returned once the old one has become unused (and GCed).
Perhaps, but internally the IORef is a pointer value, and those pointer values might be the same. From the same perspective, one could say that
How can they be the same unless the memory management system is broken? I consider different pointers on different machines or in different virtual address spaces different too; it's the fact that they don't alias that matters.
every single call to newUnique across the whole world returns a different value, but internally they are Integers that might repeat.
The thing about pointers is that they are managed by the standard behaviour of memory allocation. This isn't true of Integers. In fact this point suggests an implementation for Data.Unique that should actually be safe without global variables: just use IORefs for the actual Unique values. IORefs already support Eq, as it happens. That gives you process scope for free, and if you want bigger scopes you can pair that with whatever makes sense, e.g. process ID, MAC address, etc.
Two IO executions are in the same "global scope" if their resulting values can be used in the same expression. Top-level <- declarations must execute at most once in this scope.
This brings us back to the RPC question, and indeed to just passing values to somewhere else via FFI. I think you can work around some of that by talking about ADTs that aren't serialisable (e.g. ban the class Storable), but now we have different global scopes for different kinds of values, so which scope do we use to define <- ? Ganesh
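A minimal sketch of the IORef-based Data.Unique suggested above, relying only on the fact that distinct IORefs never compare equal:

    import Data.IORef

    newtype Unique = Unique (IORef ()) deriving Eq
    -- Eq comes from IORef's own Eq instance; note there is no Ord,
    -- unlike the Integer-based version

    newUnique :: IO Unique
    newUnique = fmap Unique (newIORef ())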

Ganesh Sittampalam wrote:
How can they be the same unless the memory management system is broken? I consider different pointers on different machines or in different virtual address spaces different too; it's the fact that they don't alias that matters.
But the actual pointer value might repeat.
every single call to newUnique across the whole world returns a different value, but internally they are Integers that might repeat.
The thing about pointers is that they are managed by the standard behaviour of memory allocation. This isn't true of Integers.
But it could be. A global variable allows us to do the same thing as the memory allocator, and allocate unique Integers just as the allocator allocates unique pointer values. Now you can say that the same pointer value on different machines is different pointers; equally, you can say the same Integer in Unique on different machines is different Uniques: it's the fact that they don't alias that matters.
In fact this point suggests an implementation for Data.Unique that should actually be safe without global variables: just use IORefs for the actual Unique values. IORefs already support Eq, as it happens. That gives you process scope for free,
Isn't this rather ugly, though? We're using IORefs for something that doesn't involve reading or writing to them. Shouldn't there be a more general mechanism? -- Ashley Yakeley

Ashley Yakeley wrote:
I don't really follow this. Do you mean the minimal such scope, or the maximal such scope? The problem here is not about separate calls to newIORef, it's about how many times an individual <- will be executed.
Two IO executions are in the same "global scope" if their resulting values can be used in the same expression. Top-level <- declarations must execute at most once in this scope.
Better: Two newIORef executions are in the same "global scope" if their resulting refs can be used in the same expression. Top-level <- declarations must execute at most once in this scope. -- Ashley Yakeley

Lennart Augustsson wrote:
I don't think anyone has claimed that any interface can be implemented without globals. Of course some can't (just pick an interface that is the specification of a global variable). What I (and others) claims is that such interfaces are bad. Using a global variable makes an assumption that there's only ever going to be one of something, and that's just an inflexible assumption to make.
Not true; it's an idiom that comes up often enough. The "right" way to do these kinds of things is to provide some sort of context around the calling function. Something like withAcquiredResource $ \handle -> do ... You (and others) are right that this is better than trying to keep global state in the context of the called function. The problem is that it is not always possible. There are situations in which you simply cannot make demands on a prior context. One important example is when refactoring a single component within an existing mature system. Another is when writing a library for general use if such demands on the calling context seem onerous for the service that the library provides (this latter is the situation for Data.Unique, according to many opinions). I find that Haskell's composability properties help it to outshine any other development environment I know. Experience shows that this is eventually true even for IO-related issues - but those generally take a lot more work to discover the right approach. I feel that here we are still working on "tackling the awkward squad". However we work that out, right now we need a working idiom to get out of trouble when this situation comes up. What we have is a hack that is not guaranteed to work. We are abusing the NOINLINE pragma and assuming things about it that are not part of its intended use. We are lucky that it happens to work right now in GHC. So my proposal is that, right now, we make the simple temporary fix of adding an ONLYONCE pragma that does have the proper guaranteed semantics. In the meantime, we can keep tackling the awkward squad. Thanks, Yitz
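For concreteness, the context-passing style mentioned above is usually just a bracket. Here Resource, acquire and release are illustrative stand-ins for whatever is actually being acquired:

    import Control.Exception (bracket)

    data Resource = Resource           -- stands in for the real handle type

    acquire :: IO Resource
    acquire = return Resource          -- sketch only

    release :: Resource -> IO ()
    release _ = return ()              -- sketch only

    withAcquiredResource :: (Resource -> IO a) -> IO a
    withAcquiredResource = bracket acquire release

    -- usage: withAcquiredResource $ \handle -> do ...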

As I said earlier, global variables may be necessary when interfacing
with legacy things (software or hardware).
If Haskell had always taken the pragmatic path of adding what seems
easiest and most in line with imperative practice it would not be the
language it is today. It would be Perl, ML, or Java.
The Haskell philosophy has always been to stick it out until someone
comes up with the right solution to a problem rather than picking some
easy way out. So I'd rather keep global variables being eye sores (as
they are now) to remind us to keep looking for a nice way.
For people who don't like this philosophy there are plenty of other languages.
And this concludes my contributions on this matter. :)
-- Lennart
On Thu, Aug 28, 2008 at 11:06 PM, Yitzchak Gale
The "right" way to do these kinds of things is to provide some sort of context around the calling function. Something like withAcquiredResource $ \handle -> do ... You (and others) are right that this is better than trying to keep global state in the context of the called function.
The problem is that it is not always possible. There are situations in which you simply cannot make demands on a prior context. One important example is when refactoring a single component within an existing mature system. Another is when writing a library for general use if such demands on the calling context seem onerous for the service that the library provides (this latter is the situation for Data.Unique, according to many opinions).

On Thu, 2008-08-28 at 23:48 +0100, Lennart Augustsson wrote:
The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out. So I'd rather keep global variables being eye sores (as they are now) to remind us to keep looking for a nice way. For people who don't like this philosophy there are plenty of other languages.
Talking of which, we really ought to look at an IO typeclass or two (not
just our existing MonadIO) and rework the library ops to use it in
Haskell'. You're not the only one to want it, and if it's not fixed this
time it may never get fixed.
--
Philippa Cowderoy

Philippa Cowderoy wrote:
Talking of which, we really ought to look at an IO typeclass or two (not just our existing MonadIO) and rework the library ops to use it in Haskell'. You're not the only one to want it, and if it's not fixed this time it may never get fixed.
This could allow both the best of both worlds, as we could have a monad that one couldn't create global variables for, and a monad for which one could. -- Ashley Yakeley
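A minimal sketch of the kind of split being suggested; the class and method names here are invented for illustration:

    class Monad m => MonadTeletype m where
      getLineM :: m String
      putLineM :: String -> m ()

    -- IO would be one instance (and the monad in which top-level <-
    -- initialisers, if added, would be allowed); a more restricted
    -- monad could be another instance without permitting them
    instance MonadTeletype IO where
      getLineM = getLine
      putLineM = putStrLn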

Lennart Augustsson wrote:
As I said earlier, global variables may be necessary when interfacing with legacy things (software or hardware).
By "prior context" I didn't mean legacy languages. I meant logically prior - enclosing contexts. It will always be necessary on occasion to refactor code without having any access to the enclosing context. If that refactoring happens to include acquiring an external resource once, using it while our program is running, and releasing it at the end, it is currently an awkward situation for us. We're working on finding a fitting solution to this.
The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out. So I'd rather keep global variables being eye sores (as they are now) to remind us to keep looking for a nice way.
I agree. But the eyesores do need to be guaranteed to work. That is not currently the case. It's easy to fix the eyesores, so I think we should do that now. Regards, Yitz

Lennart Augustsson wrote:
The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out.
I understood from your previous remarks that you regarded this as a non-problem even in C. There's no justification for using them, at least if you have clean slate privileges (no legacy issues). That kind of implies to me that we (or at least you) already have the right solution. What is it and why can't we use it right now in Haskell? (Again assuming we have a clean slate and no legacy issues.) Or can we?
So I'd rather keep global variables being eye sores (as they are now) to remind us to keep looking for a nice way.
Are you looking? I can't even figure out from your posts whether you're prepared to admit that there *is* a problem, other than there being so many people in the world who can't write proper code, in Haskell or C :-) Regards -- Adrian Hey

Lennart Augustsson wrote:
If Haskell had always taken the pragmatic path of adding what seems easiest and most in line with imperative practice it would not be the language it is today. It would be Perl, ML, or Java. The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out.
BTW, unsafePerformIO seems quite pragmatic and easy to me, so let's not get too snobby about this. (Sorry, I couldn't resist.) Regards -- Adrian Hey

On 2008 Aug 28, at 20:45, Adrian Hey wrote:
Lennart Augustsson wrote:
If Haskell had always taken the pragmatic path of adding what seems easiest and most in line with imperative practice it would not be the language it is today. It would be Perl, ML, or Java. The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out.
BTW, unsafePerformIO seems quite pragmatic and easy to me, so let's not get too snobby about this. (Sorry, I couldn't resist.)
It's anything but easy; there are specific rules you need to follow, including use of certain compiler pragmas, to ensure it works properly. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Brandon S. Allbery KF8NH wrote:
On 2008 Aug 28, at 20:45, Adrian Hey wrote:
Lennart Augustsson wrote:
If Haskell had always taken the pragmatic path of adding what seems easiest and most in line with imperative practice it would not be the language it is today. It would be Perl, ML, or Java. The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out.
BTW, unsafePerformIO seems quite pragmatic and easy to me, so let's not get too snobby about this. (Sorry, I couldn't resist.)
It's anything but easy; there are specific rules you need to follow, including use of certain compiler pragmas, to insure it works properly.
Yes, of course. The worst thing about all this is that the single most common use case AFAICS (the one under discussion) isn't even a "safe" use. Just pointing out that this pseudo function is certainly not something one would expect from an organisation as dedicated to the pursuit of perfection as Lennart would have us believe. It's an expedient hack. Not that I wish to seem ungrateful or anything :-) Regards -- Adrian Hey

On 2008 Aug 29, at 4:22, Adrian Hey wrote:
Brandon S. Allbery KF8NH wrote:
On 2008 Aug 28, at 20:45, Adrian Hey wrote:
Lennart Augustsson wrote:
If Haskell had always taken the pragmatic path of adding what seems easiest and most in line with imperative practice it would not be the language it is today. It would be Perl, ML, or Java. The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out.
BTW, unsafePerformIO seems quite pragmatic and easy to me, so let's not get too snobby about this. (Sorry, I couldn't resist.) It's anything but easy; there are specific rules you need to follow, including use of certain compiler pragmas, to insure it works properly.
Yes, of course. The worst thing about all this is that the single most common use case AFAICS (the one under discussion) isn't even a "safe" use. Just pointing out that this pseudo function is certainly not something one would expect from an organisation as dedicated to the persuit of perfection as Lennart would have us believe. It's an expedient hack. Not that I wish to seem ungrateful or anything :-)
...but, as he noted, we *do* that until we find the right way to do it. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Brandon S. Allbery KF8NH wrote:
On 2008 Aug 29, at 4:22, Adrian Hey wrote:
Brandon S. Allbery KF8NH wrote:
On 2008 Aug 28, at 20:45, Adrian Hey wrote:
Lennart Augustsson wrote:
If Haskell had always taken the pragmatic path of adding what seems easiest and most in line with imperative practice it would not be the language it is today. It would be Perl, ML, or Java. The Haskell philosophy has always been to stick it out until someone comes up with the right solution to a problem rather than picking some easy way out.
BTW, unsafePerformIO seems quite pragmatic and easy to me, so let's not get too snobby about this. (Sorry, I couldn't resist.) It's anything but easy; there are specific rules you need to follow, including use of certain compiler pragmas, to insure it works properly.
Yes, of course. The worst thing about all this is that the single most common use case AFAICS (the one under discussion) isn't even a "safe" use. Just pointing out that this pseudo function is certainly not something one would expect from an organisation as dedicated to the persuit of perfection as Lennart would have us believe. It's an expedient hack. Not that I wish to seem ungrateful or anything :-)
...but, as he noted, we *do* that until we find the right way to do it.
So what's the problem with doing it *safely*, at least until someone has found the mythic "right way to do it"? Not that anybody has ever been able to offer any rational explanation of what's *wrong* with the current proposed solution AFAICS. Regards -- Adrian Hey

On 2008-08-28, Yitzchak Gale
However we work that out, right now we need a working idiom to get out of trouble when this situation comes up. What we have is a hack that is not guaranteed to work. We are abusing the NOINLINE pragma and assuming things about it that are not part of its intended use. We are lucky that it happens to work right now in GHC.
So my proposal is that, right now, we make the simple temporary fix of adding an ONLYONCE pragma that does have the proper guaranteed semantics.
In the meantime, we can keep tackling the awkward squad.
What keeps this a "temporary fix"? Even now, industrial user demands keep us from making radical changes to the language and libraries. If we adopt a not entirely satisfactory solution, it's never going away. If we keep the NOINLINE pragma hack, we can claim it was never supported and do away with it. If we don't have a real solution, perhaps in this case we haven't worn the hair shirt long enough? -- Aaron Denney -><-

On Thu, Aug 28, 2008 at 09:00:41AM +0100, Lennart Augustsson wrote:
I'm certain you can write a kernel in Haskell where the only use of global variables is those that hardware interfacing forces you to use.
And hence you need a safe way to use program-scope variables. It is true that there are many, many programs that can be written without them. But those don't concern us: if there are _any_ programs that need them that we wish to write in Haskell, then we need a safe way in Haskell to use them. The truth is, 'process scope' is a useful scope to attach information to. Many operating systems attach various resources to process scope: memory allocation, file descriptors, protection domains. This in and of itself makes it useful for any program on these operating systems to be able to augment that process scope. I mean, why is it okay to use 'process scope' state provided by the operating system or the Haskell runtime, but _not_ be able to express such things in Haskell itself? Let's look at the items provided by plain Haskell 98 which involve entities that are process scope or bigger: getStdGen, setStdGen, getEnv, getArgs, stdin, stdout, stderr, cpuTimePrecision, isEOF, getCurrentDirectory, setCurrentDirectory, system, exitWith, exitFailure, getProgName, getClockTime, and probably others in more subtle ways. Do we really want to say that these are all wrong, or _must_ be provided by C or the operating system, because implementing them in Haskell would somehow be unclean? Why shouldn't we be able to implement the concept of a 'current directory' in Haskell when we are perfectly happy to use the OS-provided one? What if you have an exokernel, where it is expected these things _will_ be implemented in userspace code? Why shouldn't that part of the exokernel be written in Haskell? John -- John Meacham - ⑆repetae.net⑆john⑈
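To make the point concrete with the proposal under discussion: the Haskell 98 global generator could in principle be written in plain Haskell if top-level <- existed. This uses the proposed (hypothetical) syntax, and the fixed seed is only for illustration, since seeding from the clock would not be ACIO:

    import Data.IORef
    import System.Random (StdGen, mkStdGen)

    -- hypothetical top-level <- declaration (ACIO initialiser)
    theStdGen <- newIORef (mkStdGen 42)

    getStdGen :: IO StdGen
    getStdGen = readIORef theStdGen

    setStdGen :: StdGen -> IO ()
    setStdGen = writeIORef theStdGen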

On 2008 Aug 28, at 17:01, John Meacham wrote:
On Thu, Aug 28, 2008 at 09:00:41AM +0100, Lennart Augustsson wrote:
I'm certain you can write a kernel in Haskell where the only use of global variables is those that hardware interfacing forces you to use.
OS provided one? What if you have an exokernel, where it is expected these things _will_ be implemented in the userspace code. why shouldn't that part of the exokernel be written in haskell?
What's stopping it? Just wrap it in a state-carrying monad representing a context. That way you can also keep multiple contexts if necessary (and I think it is often necessary, or at least desirable, with most exokernel clients). -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Thu, Aug 28, 2008 at 07:21:48PM -0400, Brandon S. Allbery KF8NH wrote:
OS provided one? What if you have an exokernel, where it is expected these things _will_ be implemented in the userspace code. why shouldn't that part of the exokernel be written in haskell?
What's stopping it? Just wrap it in a state-carrying monad representing a context. That way you can also keep multiple contexts if necessary (and I think it is often necessary, or at least desirable, with most exokernel clients).
That is exactly what I want to do, with the 'IO' monad. But I would like the IO primitives to be implementable in Haskell _or_ C transparently and efficiently. It should not matter how the primitives are implemented. John -- John Meacham - ⑆repetae.net⑆john⑈

Lennart Augustsson wrote:
No hardware drivers use global variables.
No problem, write your hardware drivers in a different monad. Thus IO is the type for code that can use global variables, and H (or whatever) is the type for code that must not. -- Ashley Yakeley
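A sketch of what such a restricted monad might look like; H, peekReg and pokeReg are illustrative names, and the stub bodies stand in for FFI calls to the real hardware primitives:

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    import Data.Word (Word8, Word32)

    -- H exports no way to embed arbitrary IO, so code written in H cannot
    -- reach global state that IO code may have created
    newtype H a = H (IO a) deriving (Functor, Applicative, Monad)

    peekReg :: Word32 -> H Word8
    peekReg _addr = H (return 0)        -- stub; really a hardware read

    pokeReg :: Word32 -> Word8 -> H ()
    pokeReg _addr _val = H (return ())  -- stub; really a hardware write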

On Tue, 2008-08-26 at 18:34 +0100, Adrian Hey wrote:
I have a feeling this is going to be a very long thread so I'm trying to go to Haskell cafe again (without mucking it up again).
Derek Elkins wrote:
Haskell should be moving -toward- a capability-like model, not away from it.
Could you show how to implement Data.Random or Data.Unique using such a model, or any (preferably all) of the use cases identified can be implemented? Like what about implementing the socket API starting with nothing but primitives to peek/poke ethernet mac and dma controller registers?
Data.Random and Data.Unique are trivial. Already the immutable interfaces are fine. You could easily pass around a mutable object holding the state if you didn't want to be curtailed into a State monad. If you have full access to the DMA controller your language is not even memory safe. This is not a common situation for most developers. I have no trouble requiring people who want to hack OSes to use implementation-specific extensions, as they have to do today in any other language. However, this is only a problem for capabilities (as the capability model requires memory safety), not for a language lacking top-level mutable state. Access to the DMA controller and the Ethernet interface can still be passed in; it doesn't need to be a top-level action. There are entire operating systems built around capability models, so it is certainly possible to do these things.
Why should Haskell should be moving -toward- a capability-like model and why does top level <- declarations take us away from it?
Answering the second question first: mutable global variables are usually -explicitly- disallowed from a capability model. To answer your first question: safety, security, analyzability, encapsulation, locality are all things that Haskell strives for. Personally, I think that every language should be moving in this direction as much as possible, but the Haskell culture, in particular, emphasizes these things. It's notable that O'Haskell and Timber themselves moved toward a capability model.
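For concreteness, the passing-around style described above for something like Data.Unique might look like this sketch (UniqueSupply and the function names are illustrative):

    import Data.IORef

    newtype UniqueSupply = UniqueSupply (IORef Integer)

    newUniqueSupply :: IO UniqueSupply
    newUniqueSupply = fmap UniqueSupply (newIORef 0)

    freshUnique :: UniqueSupply -> IO Integer
    freshUnique (UniqueSupply ref) = atomicModifyIORef ref (\n -> (n + 1, n))

    -- main creates the supply once and passes it, capability-style, to
    -- whatever needs it; nothing is stored at the top level
    main :: IO ()
    main = do
      supply <- newUniqueSupply
      u1 <- freshUnique supply
      u2 <- freshUnique supply
      print (u1, u2)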
participants (21)
- Aaron Denney
- Adrian Hey
- Ashley Yakeley
- Brandon S. Allbery KF8NH
- Bryan O'Sullivan
- Bulat Ziganshin
- Curt Sampson
- Dan Doel
- Dan Weston
- Daniel Fischer
- David Menendez
- David Roundy
- Derek Elkins
- Ganesh Sittampalam
- Johannes Waldmann
- John Meacham
- Jonathan Cast
- Lennart Augustsson
- Philippa Cowderoy
- Sittampalam, Ganesh
- Yitzchak Gale