
Are the values of infinite lists _|_ (bottom)?

In section 1.3, the Haskell 98 report says the following:

  Errors in Haskell are semantically equivalent to _|_. Technically, they are not distinguishable from nontermination, so the language includes no mechanism for detecting or acting upon errors.

Therefore, the value of the following infinity is _|_. Right?

data Nat = Zero | Succ Nat
infinity = Succ infinity

What about infinite lists? For example, is the value of [1 ..] also _|_?

Thanks.

Deokhwan Kim wrote:
Are the values of infinite lists _|_ (bottom)?
Depends on what you mean by "value". If you define "value" to mean "normal form", then yes. If you define "value" to mean "weak head normal form", then no. The former is common in strict programming languages. In nonstrict functional programming, the latter is more useful.
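To make the distinction concrete, here is a small sketch (my illustration, not part of the original exchange): pattern matching forces a value only to weak head normal form, so the outermost constructor of an infinite structure can be observed even though the structure has no normal form.

```haskell
-- WHNF vs. normal form, sketched with an infinite list.
ones :: [Int]
ones = 1 : ones

-- Pattern matching forces only the outermost (:) constructor (WHNF),
-- so this terminates even though 'ones' has no normal form.
headOfOnes :: Int
headOfOnes = case ones of
  (x : _) -> x
  []      -> error "unreachable"

main :: IO ()
main = print headOfOnes  -- prints 1
```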

On Wed, May 10, 2006 at 02:00:20PM +0900, Deokhwan Kim wrote:
Are the values of infinite lists _|_ (bottom)?
In section 1.3, the Haskell 98 report said as follows:
Errors in Haskell are semantically equivalent to _|_. Technically, they are not distinguishable from nontermination, so the language includes no mechanism for detecting or acting upon errors.
type theoreticians talk like that ;-). this paragraph is more about the "error" function than about infinite data structures. it means that whenever you trigger an "error ..." line, you make the program non-terminating, just as if you tried to access the last element of an infinite list. the program still *ends* with an error message (notifying you that it won't *terminate*) because it is nice and it knows you don't like to wait forever. (-:
Therefore, the value of the following infinity is _|_. Right?
data Nat = Zero | Succ Nat
infinity = Succ infinity
same as with lists: as long as you don't get lost at an infinitely distant point in the data structure, you are fine. you can do this:

ghci> case infinity of Succ _ -> "not there yet."; Zero -> "the end."
ghci> take 1000 [1..]

actually, one of the things that i still find most amazing about haskell is that i can express so many algorithms by starting from the declaration of an infinite data structure and then traversing a finite part of it. the King paper on graph algorithms in haskell has some nice examples of this, too.

cheers, matthias

Are the values of infinite lists _|_ (bottom)?
No. _|_ represents total lack of information about the result, whereas in a lazy language like Haskell an infinite list contains a lot of information: you can observe arbitrary parts of such a list, access them, and compute with them.
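For example, each of these observes a different finite part of [1 ..] (an illustrative sketch):

```haskell
main :: IO ()
main = do
  print (take 5 [1 ..])            -- a finite prefix: [1,2,3,4,5]
  print ([1 ..] !! 100000 :: Int)  -- an element arbitrarily far in: 100001
  print (takeWhile (< 4) [1 ..])   -- consumption driven by the elements: [1,2,3]
```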
In section 1.3, the Haskell 98 report said as follows:
Errors in Haskell are semantically equivalent to _|_. Technically, they are not distinguishable from nontermination, so the language includes no mechanism for detecting or acting upon errors.
The formulation in the Haskell report is sloppy to say the least, and clearly misleading, as witnessed by your mail. Nontermination is not precisely the same as _|_. Only certain kinds of nontermination can be modeled by _|_ in a non-strict language. I think one should consider reformulating the paragraph above in future versions of the Haskell report.
Therefore, the value of the following infinity is _|_. Right?
data Nat = Zero | Succ Nat
infinity = Succ infinity
No. Consider the following function:

f Zero     = 0
f (Succ _) = 17

We then have f infinity = f (Succ infinity) = 17, whereas f _|_ = _|_. Thus f distinguishes infinity from _|_, so they cannot be the same.
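This argument can be checked directly (a runnable version of the example above):

```haskell
data Nat = Zero | Succ Nat

infinity :: Nat
infinity = Succ infinity

-- f inspects only the outermost constructor of its argument.
f :: Nat -> Int
f Zero     = 0
f (Succ _) = 17

main :: IO ()
main = print (f infinity)  -- prints 17, whereas f undefined would be _|_
```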
What about infinite lists? For example, is the value of [1 ..] also _|_?
No, see above.

Björn Lisper

On Wed, 2006-05-10 at 23:29 +0900, Deokhwan Kim wrote:
Bjorn Lisper wrote:
precisely the same as _|_. Only certain kinds of nontermination can be modeled by _|_ in a non-strict language.
What kinds of nontermination are modeled by _|_ in Haskell?
let f = f in f 3

length [0..]

Duncan
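Such a computation can be watched failing to produce a value (a GHC-specific sketch of mine, using System.Timeout to cut the computation off rather than waiting forever):

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

main :: IO ()
main = do
  -- length [0..] can never reach even weak head normal form; racing it
  -- against a 0.1-second timer yields Nothing.
  r <- timeout 100000 (evaluate (length [0 :: Integer ..]))
  print r  -- Nothing
```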

Deokhwan Kim:
Bjorn Lisper wrote:
precisely the same as _|_. Only certain kinds of nontermination can be modeled by _|_ in a non-strict language.
What kinds of nontermination are modeled by _|_ in Haskell?
Nonterminating computations that never return anything. For instance:

inf = inf

Björn Lisper

On Wednesday 10 May 2006 12:30 pm, Brian Hulley wrote:
Bjorn Lisper wrote:
Nontermination is not precisely the same as _|_. Only certain kinds of nontermination can be modeled by _|_ in a non-strict language.
What kinds of non-termination are *not* modelled by _|_ in Haskell?
Non-termination that is "doing something". For example, consider:

] ones = 1 : ones

If I try to take its length, I get _|_. So:

] main = print (length ones)

will churn my CPU forever without producing any output. However, if I print each item sequentially:

] main = mapM_ print ones

I'll get a never-ending stream of '1' on my console. This is not the same as bottom, because it's "doing something". Now, obviously this definition is pretty imprecise, but maybe it helps you get the idea.

Now for the corner cases. What about:

] main = sequence_ (repeat (return ()))

? I'd personally say it is _not_ bottom. Even though "return ()" is a completely useless action, I'm inclined to say it's "doing something" in some theoretical sense (mostly because I think of _|_ as being a property of the functional part of Haskell).

Robert Dockins wrote:
On Wednesday 10 May 2006 12:30 pm, Brian Hulley wrote:
Bjorn Lisper wrote:
Nontermination is not precisely the same as _|_. Only certain kinds of nontermination can be modeled by _|_ in a non-strict language.
What kinds of non-termination are *not* modelled by _|_ in Haskell?
Non-termination that is "doing something".
For example consider:
] ones = 1 : ones
If I try to take its length, I get _|_. So:
] main = print (length ones)
Will churn my CPU forever without producing any output.
However, if I print each item sequentially:
] main = mapM_ print ones

I'll get a never-ending stream of '1' on my console. This is not the same as bottom, because it's "doing something".
I can see what you're getting at, but I don't know if I agree with the idea that "doing" should affect whether or not one sees the result of the above computation as bottom. With a hypothetical implementation of

runIO :: IO a -> RealWorld -> (RealWorld, a)

I could write:

] (r',_) = runIO (mapM print ones) realWorld

and this computation, even though some printing would be observable, still evaluates to bottom, because r' will never be bound.
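The world-state-transformer reading can be sketched with a toy model (purely illustrative: World here is just a log of printed lines, nothing like GHC's actual RealWorld):

```haskell
-- A toy world: the log of lines printed so far.
newtype World = World [String] deriving Show

-- A toy IO: a world-state transformer, matching the runIO signature above.
newtype IO' a = IO' (World -> (World, a))

runIO :: IO' a -> World -> (World, a)
runIO (IO' f) = f

-- "Printing" just appends a line to the log.
printLn :: String -> IO' ()
printLn s = IO' (\(World out) -> (World (out ++ [s]), ()))

-- Sequencing threads the world through both actions.
bind :: IO' a -> (a -> IO' b) -> IO' b
bind (IO' m) k = IO' (\w -> let (w', a) = m w
                                IO' m'  = k a
                            in m' w')

main :: IO ()
main = do
  let (w, _) = runIO (printLn "1" `bind` \_ -> printLn "1") (World [])
  print w  -- World ["1","1"]
```

For a finite action the final world is bound; for an action like mapM print ones it never would be, which is the point being made above.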
Now, obviously this definition is pretty imprecise, but maybe it helps you get the idea. Now for the corner cases. What about:
] main = sequence_ (repeat (return ()))
? I'd personally say it is _not_ bottom. Even though "return ()" is a completely useless action, I'm inclined to say it's "doing something" in some theoretical sense (mostly because I think of _|_ as being a property of the functional part of Haskell).
I thought everything in Haskell is purely functional - surely that is the whole point of using monads? :-) I'd have thought that "doing" is simply a projection of the purely functional "being" into the stream of time, and therefore cannot be part of the discourse regarding the nature of bottom...

Regards, Brian.

On Wednesday 10 May 2006 02:49 pm, you wrote:
Robert Dockins wrote:
On Wednesday 10 May 2006 12:30 pm, Brian Hulley wrote:
Bjorn Lisper wrote:
Nontermination is not precisely the same as _|_. Only certain kinds of nontermination can be modeled by _|_ in a non-strict language.
What kinds of non-termination are *not* modelled by _|_ in Haskell?
Non-termination that is "doing something".
For example consider:
] ones = 1 : ones
If I try to take its length, I get _|_. So:
] main = print (length ones)
Will churn my CPU forever without producing any output.
However, if I print each item sequentially:
] main = mapM_ print ones

I'll get a never-ending stream of '1' on my console. This is not the same as bottom, because it's "doing something".
I can see what you're getting at, but I don't know if I agree with the idea that "doing" should affect whether or not one sees the result of the above computation as bottom or not. With a hypothetical implementation of
runIO :: IO a -> RealWorld -> (RealWorld, a)
I could write:
] (r',_) = runIO (mapM print ones) realWorld
and this computation, even though some printing would be observable, still evaluates to bottom, because r' will never be bound.
Humm... how do you define observable? If r' is never bound, how can I observe any intermediate printing?

More generally, if you want the possibility of implementing 'runIO' as a pure function (the world-state-transformer view of Haskell IO), you are forced to make a closed-world assumption. I don't believe that concurrency can be given a nice story in this view; you pretty much have to do something ugly like calculating the result of all possible interleavings (yuck!). And your world is still closed. The world-state-transformer idea is nice as a didactic tool, but I don't think it's the right world-view for serious thinking about Haskell's semantics.
Now, obviously this definition is pretty imprecise, but maybe it helps you get the idea. Now for the corner cases. What about:
] main = sequence_ (repeat (return ()))
? I'd personally say it is _not_ bottom. Even though "return ()" is a completely useless action, I'm inclined to say it's "doing something" in some theoretical sense (mostly because I think of _|_ as being a property of the functional part of Haskell).
I thought everything in Haskell is purely functional - surely that is the whole point of using monads? :-)
Sure. But in that world-view you don't think of the IO actions as "running" at all, so you can't discuss their termination properties. This is more or less what all accounts of Haskell's semantics (at least the ones I've seen) do -- they provide a denotational semantics for the lambda terms and basically ignore the meaning of IO actions.
I'd have thought that "doing" is simply a projection of the purely functional "being" into the stream of time and therefore cannot be part of the discourse regarding the nature of bottom...
My favorite view of Haskell semantics is of a coroutining abstract machine which alternates between evaluating lambda terms to obtain the terms of a process calculus, and then reducing those process calculus terms; some process calculus reduction rules call into the lambda reduction engine to grow more (process calculus) terms to reduce. The observable behavior of the program is defined in terms of the sequence of reductions undertaken by the process calculus engine.

In this view, "bottom" is the denotation of all (lambda) terms which make the lambda portion of the machine fail to terminate and never return control to the process calculus part -- thus no further observations will be generated by the program.
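A toy rendering of that two-level machine (my sketch of the idea, not a real semantics): the process side performs one observable reduction at a time, calling back into ordinary lazy evaluation to uncover the next term.

```haskell
-- Terms of a tiny "process calculus": either halt, or emit a line and
-- continue with a (lazily evaluated) continuation term.
data Proc = Halt | Output String Proc

-- The process engine: each step forces the lambda side just enough to
-- reveal the next constructor, then records the observation.  The step
-- bound lets us observe a finite prefix of an infinite behavior.
run :: Int -> Proc -> [String]
run 0 _            = []
run _ Halt         = []
run n (Output s p) = s : run (n - 1) p

-- An infinite process term, produced by ordinary lazy evaluation:
ones :: Proc
ones = Output "1" ones

main :: IO ()
main = print (run 3 ones)  -- ["1","1","1"]
```

A term whose evaluation never reaches Halt or Output (e.g. loop = loop :: Proc) is the "bottom" of this view: the engine never regains control, so no further observations appear.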
Rob Dockins

Robert Dockins wrote:
On Wednesday 10 May 2006 02:49 pm, you wrote:
I could write:
] (r',_) = runIO (mapM print ones) realWorld
and this computation, even though some printing would be observable, still evaluates to bottom, because r' will never be bound.
Humm... how do you define observable? If r' is never bound, how can I observe any intermediate printing?
I meant observable in the sense that the printing is just like a debug trace of evaluation, and so is irrelevant when using the world-state transformer point of view.
[snip] The world-state-transformer idea is nice as a didactic tool, but I don't think it's the right world-view for serious thinking about Haskell's semantics.
I thought everything in Haskell is purely functional - surely that is the whole point of using monads? :-)
Sure. But in that world-view you don't think of the IO actions as "running" at all, so you can't discuss their termination properties. This is more or less what all accounts of Haskell's semantics (at least the ones I've seen) do -- they provide a denotational semantics for the lambda terms and basically ignore the meaning of IO actions.
I think from the world-state-transformer point of view, all non-terminating computations must be seen as evaluating to _|_, but as you point out, this point of view isn't that useful in practice.
My favorite view of Haskell semantics is of a coroutining abstract machine which alternates between evaluating lambda terms to obtain the terms of a process calculus, and then reducing those process calculus terms; some process calculus reduction rules call into the lambda reduction engine to grow more (process calculus) terms to reduce. The observable behavior of the program is defined in terms of the sequence of reductions undertaken by the process calculus engine.
In this view, "bottom" is the denotation of all (lambda) terms which make the lambda portion of the machine fail to terminate and never return control to the process calculus part -- thus no further observations will be generated by the program.
Thanks for pointing out the process calculi. I found a good page about them at http://en.wikipedia.org/wiki/Process_calculus

It would be good if there were a more detailed description of Haskell semantics along the lines of what you've said above, so that it would be clear that the running of an IO action is something very different from the running of other monads. (Not too mathematical a description, but a tutorial-style description.)

As I understand it, the IO monad shares with other monads the fact that first a composite action is built from primitive actions using >>= and return etc, and this composition is purely functional, even though >>= and return are special built-in functions for the IO monad. (As an aside, I must point out that the word "do" is ultra-confusing in this respect, because "do" does not actually run the action; it is just syntactic sugar for composing actions which might be "done" later.)

Then, in contrast to all other monadic actions, which are "run" by simply evaluating something hidden inside them (eg applying a function hidden inside the monad to some initial state), the running of an IO action involves the quite complicated interaction you've described above, and really does go outside the functional realm.

The problem with the RealWorld view of IO is that it is quite wrong, yet this is the view propounded everywhere, so anyone who thinks there is something complicated about IO assumes it is because they themselves haven't understood something properly, since books/tutorials keep saying how simple and purely functional it all is! ;-) (Just like in primary school, when teachers make out that associativity and commutativity of addition/multiplication are trivial, leading countless would-be mathematicians, who immediately see that these are complicated but think this is their fault alone, to abandon all hope ;-))

Therefore I humbly submit a call for authors of books/tutorials on IO to come clean and admit the fact that IO completely changes the language from being purely functional into a functional + process calculus language :-)

Best regards, Brian.
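The "composing is not doing" point can be seen directly: IO actions are ordinary values that can be built and stored without being performed (an illustrative sketch):

```haskell
-- Composing actions is pure: this list is built without performing anything.
actions :: [IO ()]
actions = [putStrLn "first", putStrLn "second", putStrLn "never performed"]

-- Only the actions we explicitly hand over for execution are "done";
-- the third element of the list is never run.
main :: IO ()
main = sequence_ (take 2 actions)
```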

Brian Hulley wrote:
Therefore I humbly submit a call for authors of books/tutorials on IO to come clean and admit the fact that IO completely changes the language from being purely functional into a functional + process calculus language :-)
As an author of such a book, I'm not willing to do this. Or at least: if we omit concurrency and impure operations such as unsafePerformIO, Haskell is a purely functional, sequential, and deterministic language, whose semantics, including that of IO, can be explained via conventional equational reasoning. I'm sure that it can also be explained via a suitable process calculus, but that is overkill -- such calculi are best used for describing non-deterministic / concurrent languages.

-Paul

Brian Hulley wrote:
Therefore I humbly submit a call for authors of books/tutorials on IO to come clean and admit the fact that IO completely changes the language from being purely functional into a functional + process calculus language :-)
I agree, except for the "completely changes" part: the really nice thing about Haskell is that the functional and interaction parts are clearly separated, with (almost) no ill effects of the latter on reasoning in the former, and with descriptions of interactions being subject to functional evaluation just like any other kind of data structure. (*)

Paul Hudak wrote:
As an author of such a book, I'm not willing to do this. Or at least, if we omit concurrency and impure operations such as unsafePerformIO, Haskell is a purely functional, sequential, and deterministic language, whose semantics, including that of IO, can be explained via conventional equational reasoning.
I'm very surprised to hear you say this, and I certainly cannot agree.

a language that contains elements that are not best expressed as functions is not "purely functional" anymore, even when its design carefully ensures that it is still pure, and mainly functional, and can be reasoned about equationally. the element that falls outside the remit of functions is the interaction with the runtime context (operating system/other processes/users/external world/..).

Haskell's approach to this issue is mostly functional and clearly separates the functional part from the part that is "out of its hands": functionally compute an interaction description, have that interaction performed under outside control, have control returned to functional evaluation with a representation of the interaction result, repeat until done. (an informal recipe like this may be even more suitable for learners than either process calculus rules or claims about being purely functional in principle.)

if you wanted to model that middle part functionally, you'd have to cover all of the external world as well as scheduling. one nice thing about a process calculus style operational semantics is the modular description; you only need to model how Haskell programs fit into the external world, not the external world itself: assuming that world to be modelled in the same style, we need a minuscule amount of process calculus rules to describe the i/o interactions, falling back to functional-only reasoning for the vast majority of the language.
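That recipe is essentially the old stream/dialogue model of early Haskell i/o; here is a toy sketch of it (my illustration, with the "external world" simulated by a pure function so the whole thing runs in one program):

```haskell
-- Requests the program issues, and responses the world sends back.
data Request  = PutLine String | GetLine
data Response = Ok | Line String

-- A program is a function from the (lazy) list of responses to the
-- list of requests it issues: an interaction description.
type Dialogue = [Response] -> [Request]

-- Read one line, echo it back, stop.
echo :: Dialogue
echo resps = GetLine : case resps of
  Line s : _ -> [PutLine ("you said: " ++ s)]
  _          -> []

-- A toy "external world": performs each request against an input script,
-- producing the output lines and the stream of responses.
world :: [String] -> [Request] -> ([String], [Response])
world _        []               = ([], [])
world input    (PutLine s : rs) = let (out, resps) = world input rs
                                  in (s : out, Ok : resps)
world (i : is) (GetLine : rs)   = let (out, resps) = world is rs
                                  in (out, Line i : resps)
world []       (GetLine : _)    = ([], [])

main :: IO ()
main = print (fst output)
  where
    -- Tie the knot: requests depend lazily on responses and vice versa.
    output = world ["hello"] (echo (snd output))
```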
I'm sure that it can also be explained via a suitable process calculus, but that is an overkill -- such calculi are best used for describing non-deterministic / concurrent languages.
using a process calculus framework does not imply that each process has to be non-deterministic / concurrent -- it just makes it easier to show how the "purely functional, sequential and deterministic" evaluation inside a process running Haskell is embedded into, and influenced by, interactions with the rest of the framework.

attempts to ignore that external framework tend to cloud the issues. and as Brian points out, that is more confusing for learners of the language than having to take a tiny bit of process calculus with your mostly functional prescription!-)

cheers, claus

(*) it is rather dated by now, and certainly not up to the demands of pragmatic Haskell programmers today, but I discussed some of the various functional i/o styles in this way in chapter 3 of http://www.cs.kent.ac.uk/people/staff/cr3/publications/phd.html

Hi Claus --

I think that you're asking for a semantics of the entire OS, i.e. the entire outside world, and for that I agree that something other than equational reasoning is needed to reason about it. However, I would argue that that is outside the mandate of a book on Haskell. But maybe that's the point -- i.e. others feel otherwise.

My main point is that if we're reasoning about a single Haskell program (with no impure features), then the entire world, with all its non-determinism internal to it, can be modelled as a black box -- i.e. a function -- that interacts with the single Haskell program in a completely sequential, deterministic manner. And for that, equational reasoning is perfectly adequate.

The original Haskell report in fact had an appendix with a semantics for I/O written as a Haskell program with a single non-deterministic merge operator. The reason for the non-deterministic merge was to account for SEVERAL Haskell programs interacting with the OS, but for a single program it can go away. I claim that such a semantics is still possible for Haskell, and that equational reasoning is sufficient to reason about it.

If you disagree, then please tell me which features in Haskell (a particular I/O command, perhaps?) cannot be modelled as a function. I'm not familiar with your thesis, but I note in the abstract that you "extend an existing, purely functional language with facilities for input/output and modular programming". If those extensions cannot be modelled as pure functions, then I would agree that a process calculus (say) would be the right way to go. But as far as I know, Haskell doesn't have such features.

-Paul

On May 23, 2006, at 9:50 AM, Paul Hudak wrote:
Hi Claus --
I think that you're asking for a semantics of the entire OS, i.e. the entire outside world, and for that I agree that something other than equational reasoning is needed to reason about it. However, I would argue that that is outside the mandate of a book on Haskell. But maybe that's the point -- i.e. others feel otherwise.
My main point is that if we're reasoning about a single Haskell program (with no impure features), then the entire world, with all its non-determinism internal to it, can be modelled as a black box -- i.e. a function -- that interacts with the single Haskell program in a completely sequential, deterministic manner. And for that, equational reasoning is perfectly adequate.
The original Haskell report in fact had an appendix with a semantics for I/O written as a Haskell program with a single non-deterministic merge operator. The reason for the non-deterministic merge was to account for SEVERAL Haskell programs interacting with the OS, but for a single program it can go away. I claim that such a semantics is still possible for Haskell, and that equational reasoning is sufficient to reason about it.
If you disagree, then please tell me which features in Haskell (a particular I/O command, perhaps?) cannot be modelled as a function. I'm not familiar with your thesis, but I note in the abstract that you "extend an existing, purely functional language with facilities for input/output and modular programming". If those extensions cannot be modelled as pure functions, then I would agree that a process calculus (say) would be the right way to go. But as far as i know, Haskell doesn't have such features.
IO.hGetContents? I suspect the result of using hGetContents on a file you have open for writing in the same program can't be modeled as a pure function; at best you can say it depends on the order of evaluation, which isn't defined. Not that it's necessarily a huge blow to your argument (hGetContents is viewed with some suspicion anyway), but it is a Haskell 98 feature.

Things obviously get more complicated in the presence of concurrency. I'm not certain, but I believe some of the memory consistency models being discussed for Haskell' are not expressible using a non-deterministic merge, which basically assumes that all program actions are serializable. http://www.haskell.org//pipermail/haskell-prime/2006-March/001168.html

Robert Dockins wrote:
On May 23, 2006, at 9:50 AM, Paul Hudak wrote:
If you disagree, then please tell me which features in Haskell (a particular I/O command, perhaps?) cannot be modelled as a function.
IO.hGetContents? I suspect the result of using hGetContents on a file you have open for writing in the same program can't be modeled as a pure function; at best you can say it depends on the order of evaluation which isn't defined. Not that it's necessarily a huge blow to your argument (hGetContents is viewed with some suspicion anyway), but it is a Haskell98 feature.
Ah yes, operations involving file handles are a good example, but I think the problem here is that we don't actually have a formal semantics for what's supposed to happen. I recall seeing threads about this in the past, where in particular the interactions of lazy evaluation with opening and closing files was being debated. But, I would argue that in principle we COULD (and probably should) specify a deterministic semantics if someone were willing to sit down and spell it out.
Things obviously get more complicated in the presence of concurrency. I'm not certain, but I believe some of the memory consistency models being discussed for Haskell' are not expressible using a non-deterministic merge, which basically assumes that all program actions are serializable.
http://www.haskell.org//pipermail/haskell-prime/2006-March/001168.html
Yes, I agree, although I specifically excluded concurrency from my argument. -Paul
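The order-of-evaluation sensitivity Rob and Paul discuss can be observed directly. A hedged sketch (the helper names and the demo.txt file are invented for illustration): closing a handle fixes a lazily-read string at whatever point it happens to have been forced, so the two readers below can return different results for the same file.

```haskell
import System.IO
import Control.Exception (evaluate)
import Data.List (isPrefixOf)

-- Force the whole string before closing: we get the full file contents.
readForced :: FilePath -> IO String
readForced path = do
  h <- openFile path ReadMode
  s <- hGetContents h          -- lazy: nothing is read yet
  _ <- evaluate (length s)     -- demand the whole string
  hClose h
  return s

-- Close first: the lazily-read string is truncated at whatever point it
-- happened to have been forced (possibly nowhere at all).
readUnforced :: FilePath -> IO String
readUnforced path = do
  h <- openFile path ReadMode
  s <- hGetContents h
  hClose h
  return s

main :: IO ()
main = do
  writeFile "demo.txt" "hello, world"
  full <- readForced "demo.txt"
  part <- readUnforced "demo.txt"
  putStrLn full
  -- `part` is some prefix of `full`; *which* prefix depends on evaluation
  -- order and buffering, which is exactly the point under discussion.
  print (part `isPrefixOf` full)
```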

Hi Paul,
I think that you're asking for a semantics of the entire OS, i.e. the entire outside world, and for that I agree that something other than equational reasoning is needed to reason about it.
I was about to reply "no, only of the interface between Haskell and the OS", but perhaps that comes to nearly the same in the end? at least, I'd like to see a simplified "Haskell view of the OS" added to the picture, which seems to imply that one needs to take at least one step beyond the "Haskell only" picture, into the domain in which different reasoning tools may be more appropriate. without having to go into all of the OS, such a step should widen the horizon sufficiently to explain the difference between "i/o is purely functional" (wrong, but popular) and "haskell's role in i/o is purely functional, but other roles need to be taken into account to complete the picture" (better).
However, I would argue that that is outside the mandate of a book on Haskell. But maybe that's the point -- i.e. others feel otherwise.
I think I do, and obviously (judging from how often this topic reappears) Haskell learners find the current presentations confusing. if a Haskell book shies away from the topic of i/o, or defers it to some late chapter, beginners get the impression that it is more difficult than it should be while advanced learners are frustrated by the lack of coverage. if a Haskell book shies away from explaining at least the basics of how the interactions with the OS go beyond the functional world, even though the Haskell part of it is still functional, beginners go away with a wrong idea (often repeated as a dogma to other newcomers) and advanced learners struggle with their intuition, which tells them that there must be something else going on. the latter was the case Brian was making, I think.
My main point is that if we're reasoning about a single Haskell program (with no impure features), then the entire world, with all its non-determinism internal to it, can be modelled as a black box -- i.e. a function -- that interacts with the single Haskell program in a completely sequential, deterministic manner. And for that, equational reasoning is perfectly adequate.
I assume you mean the Haskell program interacts deterministically with non-deterministic inputs, rather than that the OS interacts deterministically with the Haskell program.
The original Haskell report in fact had an appendix with a semantics for I/O written as a Haskell program with a single non-deterministic merge operator. The reason for the non-deterministic merge was to account for SEVERAL Haskell programs interacting with the OS, but for a single program it can go away. I claim that such a semantics is still possible for Haskell, and equational reasoning is sufficient to reason about it.
non-deterministic merge is necessary there, and beyond the purely functional domain. and unless you let all your Haskell programs run in the dark, there will always be more than one agent interacting with shared resources: that Haskell program, you, OS daemons, etc.
the merge idea stems from a time when i/o was described in terms of infinite streams and their transformations, which turned out to be rather difficult to compose in practice. the transition to step-wise interactions and their composition is what makes the process-calculus style more natural these days. not to mention that it prepares readers for other adventures in these modern times - in fact, future readers may be more familiar with process interactions from their other interests, and therefore confused by any attempt to avoid these ideas.
apart from that, yes, it might be sufficient to write a simplified OS as a meta interpreter, with access to resources and other agents, and in control of scheduling. that interpreter would evaluate Haskell code and execute its interactions with the OS. as with the old sorting-office/ non-deterministic merge, such an artifact would allow the author to explain how the non-determinism is needed, but outside the Haskell code, and only permitted to influence the Haskell code in strictly controlled ways.
cheers, claus
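The stream model Claus refers to can be sketched in a few lines. This is a toy reconstruction, not the actual report appendix: the Request/Response constructors and the runOS "operating system" below are invented for illustration, and the non-deterministic merge is deliberately left out, since a single program run against canned input needs none.

```haskell
-- Old-style stream I/O: a program is a function from the stream of OS
-- responses to the stream of requests it wants performed.
data Request  = GetLine | PutLine String  deriving Show
data Response = Input String | Done       deriving Show

type Dialogue = [Response] -> [Request]

-- A program that reads one line and echoes it back.
echo :: Dialogue
echo resps =
  GetLine : case resps of
    (Input s : _) -> [PutLine ("you said: " ++ s)]
    _             -> []

-- A pure, deterministic "OS": feed canned input, collect output lines.
-- Note the knot-tying: the responses depend on the requests and vice
-- versa, which only works because of lazy evaluation.
runOS :: [String] -> Dialogue -> [String]
runOS inputs prog = outputs
  where
    requests             = prog responses
    (responses, outputs) = step inputs requests

    -- Answer each request in turn, threading the remaining input.
    step ins (GetLine : rs) = case ins of
      (i:is) -> let (resps, outs) = step is rs in (Input i : resps, outs)
      []     -> ([Done], [])
    step ins (PutLine s : rs) =
      let (resps, outs) = step ins rs in (Done : resps, s : outs)
    step _ [] = ([], [])

main :: IO ()
main = print (runOS ["hi"] echo)
```

Composing such transformers in practice proved awkward, which is part of why the step-wise, monadic presentation won out.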

Points well taken Claus. See further comments below.

Claus Reinke wrote:
Hi Paul,
I think that you're asking for a semantics of the entire OS, i.e. the entire outside world, and for that I agree that something other than equational reasoning is needed to reason about it.
I was about to reply "no, only of the interface between Haskell and the OS", but perhaps that comes to nearly the same in the end? at least, I'd like to see a simplified "Haskell view of the OS" added to the picture, which seems to imply that one needs to take at least one step beyond the "Haskell only" picture, into the domain in which different reasoning tools may be more appropriate. without having to go into all of the OS, such a step should widen the horizon sufficiently to explain the difference between "i/o is purely functional" (wrong, but popular) and "haskell's role in i/o is purely functional, but other roles need to be taken into account to complete the picture" (better).
Yes, I could agree with that point of view.
However, I would argue that that is outside the mandate of a book on Haskell. But maybe that's the point -- i.e. others feel otherwise.
I think I do, and obviously (judging from how often this topic reappears) Haskell learners find the current presentations confusing. if a Haskell book shies away from the topic of i/o, or defers it to some late chapter, beginners get the impression that it is more difficult than it should be while advanced learners are frustrated by the lack of coverage. if a Haskell book shies away from explaining at least the basics of how the interactions with the OS go beyond the functional world, even though the Haskell part of it is still functional, beginners go away with a wrong idea (often repeated as a dogma to other newcomers) and advanced learners struggle with their intuition, which tells them that there must be something else going on. the latter was the case Brian was making, I think.
Well, in defense of such an approach, there are very few (if any?) textbooks in ANY language that take a formal approach to defining what I/O does. But your point is still well taken, in that I think that the EXPECTATIONS for books on FP to do this are higher because (a) many of them take a formal approach to explaining the internal meaning of programs, and thus readers expect the same for I/O, and (b) the model of I/O that they use (say, monads) is sometimes so different from "conventional wisdom" that SOMETHING needs to be said, lest the reader be left in the dark!
My main point is that if we're reasoning about a single Haskell program (with no impure features), then the entire world, with all its non-determinism internal to it, can be modelled as a black box -- i.e. a function -- that interacts with the single Haskell program in a completely sequential, deterministic manner. And for that, equational reasoning is perfectly adequate.
I assume you mean the Haskell program interacts deterministically with non-deterministic inputs, rather than that the OS interacts deterministically with the Haskell program.
Yes.
The original Haskell report in fact had an appendix with a semantics for I/O written as a Haskell program with a single non-deterministic merge operator. The reason for the non-deterministic merge was to account for SEVERAL Haskell programs interacting with the OS, but for a single program it can go away. I claim that such a semantics is still possible for Haskell, and equational reasoning is sufficient to reason about it.
non-deterministic merge is necessary there, and beyond the purely functional domain. and unless you let all your Haskell programs run in the dark, there will always be more than one agent interacting with shared resources: that Haskell program, you, OS daemons, etc.
I think that I'm saying something a little stronger, namely that if you capture in a black box the entire universe EXCEPT for the one Haskell program that you're trying to reason about, then the interactions between that black box and the Haskell program can be explained purely functionally and deterministically. That may be an over-simplified view of things -- but it's at least one line to draw when deciding "how much" about a language's semantics should be explained to the user.
the merge idea stems from a time when i/o was described in terms of infinite streams and their transformations, which turned out to be rather difficult to compose in practice. the transition to step-wise interactions and their composition is what makes the process-calculus style more natural these days. not to mention that it prepares readers for other adventures in these modern times - in fact, future readers may be more familiar with process interactions from their other interests, and therefore confused by any attempt to avoid these ideas.
Interesting. One might rewrite your first two sentences as: "the merge idea stems from a time when i/o was described in terms of infinite streams and their transformations, which turned out to be rather difficult to compose in practice. the transition to step-wise interactions and their composition is what makes MONADS more natural these days." :-) Seriously, I think that it would be a useful exercise to write a meta-interpreter of Haskell I/O, in Haskell. That's basically what the appendix of the Haskell Report was many years ago, but it used a stream model of I/O. And I think that this is consistent with your final point below.
apart from that, yes, it might be sufficient to write a simplified OS as a meta interpreter, with access to resources and other agents, and in control of scheduling. that interpreter would evaluate Haskell code and execute its interactions with the OS. as with the old sorting-office/ non-deterministic merge, such an artifact would allow the author to explain how the non-determinism is needed, but outside the Haskell code, and only permitted to influence the Haskell code in strictly controlled ways.
-Paul
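Paul's proposed exercise might look something like this in miniature. A hedged sketch: the IOSpec type, its constructors, and run are all invented names covering only a two-operation subset of I/O. The point is that the monadic part merely *constructs* a script, and a separate, pure "meta-interpreter" plays the role of the OS.

```haskell
-- An I/O script as a plain data structure.
data IOSpec a
  = Return a
  | GetLine' (String -> IOSpec a)
  | PutLine' String (IOSpec a)

instance Functor IOSpec where
  fmap f (Return a)     = Return (f a)
  fmap f (GetLine' k)   = GetLine' (fmap f . k)
  fmap f (PutLine' s m) = PutLine' s (fmap f m)

instance Applicative IOSpec where
  pure = Return
  Return f     <*> m = fmap f m
  GetLine' k   <*> m = GetLine' (\s -> k s <*> m)
  PutLine' s n <*> m = PutLine' s (n <*> m)

instance Monad IOSpec where
  Return a     >>= f = f a
  GetLine' k   >>= f = GetLine' (\s -> k s >>= f)
  PutLine' s m >>= f = PutLine' s (m >>= f)

putLine :: String -> IOSpec ()
putLine s = PutLine' s (Return ())

getLine' :: IOSpec String
getLine' = GetLine' Return

-- The script is built with ordinary monadic code...
greet :: IOSpec ()
greet = do
  putLine "what is your name?"
  name <- getLine'
  putLine ("hello, " ++ name ++ "!")

-- ...and all interaction happens here, in the pure meta-interpreter,
-- which consumes canned input and collects the output lines.
run :: [String] -> IOSpec a -> ([String], a)
run _   (Return a)     = ([], a)
run ins (GetLine' k)   = case ins of
                           (i:is) -> run is (k i)
                           []     -> run [] (k "")
run ins (PutLine' s m) = let (outs, a) = run ins m in (s : outs, a)

main :: IO ()
main = print (fst (run ["world"] greet))
```

A real version would thread file handles, scheduling, and other agents through `run`; this is where the non-determinism would live, outside the functional code.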

Thanks a lot for your positive reply, Paul. we seem to agree on most points, or at least understand our different preferences, but there is one remaining point I'd like to clear up.
Well, in defense of such an approach, there are very few (if any?) textbooks in ANY language that take a formal approach to defining what I/O does. But your point is still well taken, in that I think that the EXPECTATIONS for books on FP to do this are higher because (a) many of them take a formal approach to explaining the internal meaning of programs, and thus readers expect the same for I/O, and (b) the model of I/O that they use (say, monads) is sometimes so different from "conventional wisdom" that SOMETHING needs to be said, lest the reader be left in the dark!
I would agree with (a), although an informal, unambiguous explanation might also do, and (b) has been the topic of most beginner's i/o questions and answers on these lists, so that sounds right as well. I would add, however, that while i/o in Haskell uses the general monadic interface popular for so many problems, monadic i/o is still different from other uses of monads in Haskell, and that difference tends to get lost when the usual explanations focus on issues of monads in general.
let me try to explain what I mean: the usual way of reasoning about Haskell programs is, as you say, equational, but more than that, the equations are *context-independent*, ie, they equate programs in all (valid) contexts. this also works for monads, and still applies to the monadic aspects of Haskell i/o, but those monadic aspects *only account for construction of i/o-"scripts" and for some structural properties required of those scripts*. in other words, we can use context-free equations to reason about different ways to construct the same script, and we can use the context-free monad laws to establish structural equivalences between syntactically different scripts. but we can not use context-free equations between Haskell programs to reason about the execution of those scripts, because the very essence of i/o (at least how I explained it to myself in that thesis chapter;-) is interaction with the program's runtime context.
so, if we want to reason about these interactions as well, we need to take the context into account. as you say, we could do that by simply enlarging our scope, and define an interpreter for both the Haskell program and the OS, but that quickly gets complex, especially once we start taking multiple interacting programs into account (and how many programs these days aren't interactive in that sense?).
the alternative is to use *context-dependent* equations, but to abstract away as much of the context as possible (Felleisen's evaluation contexts for operational semantics). Haskell's standard context-free reasoning rules then simply fall out as rules that hold "for all contexts" and we are back in our old familiar world, but we can now use the very same reasoning framework to talk about interactions as well, and about why an i/o-script "at the top-level" (result of Main.main) is different from an i/o-script nested anywhere else in the program. the difference is that the two have different contexts: only at the top-level (empty context, or leftmost innermost position in the top-level >>=-tree) does an i/o-script become directly accessible to the i/o-loop or OS or meta-interpreter or whatever we may call it, and with the help of that external entity, and under external control, it can be said to have access to and to interact with resources outside the Haskell program text, inside the runtime context (about which context-free reasoning can't tell us anything).
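claus's point that the monad laws equate scripts context-freely, before anything is executed, can be made concrete. Since IO actions cannot be compared for equality, this sketch substitutes a hand-rolled, comparable Log monad (an invented stand-in): associativity of (>>=) equates two syntactically different scripts as data, with no execution involved.

```haskell
-- A script is just a log of lines plus a result; scripts can be compared.
newtype Log a = Log ([String], a) deriving (Eq, Show)

instance Functor Log where
  fmap f (Log (w, a)) = Log (w, f a)

instance Applicative Log where
  pure a = Log ([], a)
  Log (w1, f) <*> Log (w2, a) = Log (w1 ++ w2, f a)

instance Monad Log where
  Log (w1, a) >>= f = let Log (w2, b) = f a in Log (w1 ++ w2, b)

say :: String -> Log ()
say s = Log ([s], ())

-- Two syntactically different constructions of the "same" script,
-- equated by the associativity law for (>>=) alone.
left, right :: Log ()
left  = (say "a" >>= \_ -> say "b") >>= \_ -> say "c"
right = say "a" >>= \_ -> (say "b" >>= \_ -> say "c")

main :: IO ()
main = print (left == right)
```

What the laws cannot tell us is what happens when such a script reaches the top level and is handed to the runtime context, which is exactly the gap claus describes.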
I think that I'm saying something a little stronger, namely that if you capture in a black box the entire universe EXCEPT for the one Haskell program that you're trying to reason about, then the interactions between that black box and the Haskell program can be explained purely functionally and deterministically. That may be an over-simplified view of things -- but it's at least one line to draw when deciding "how much" about a language's semantics should be explained to the user.
I don't see how the whole of black box and Haskell program could be captured purely functionally and deterministically. even if Stoye limited the necessary extensions to the purely functional world view to just one non-deterministic merge, only in his sorting office, he needed that extension of the purely functional world, and once you have a black box with such a merge in it, you cannot guarantee that the non-determinism won't be visible at the black box interface (in fact, that would be the whole point of introducing it in the first place), so even if you don't know anything else about that black box, you cannot assume that it will be a pure function. which means, as far as I can see, that even if the Haskell program is purely functional, the combined system of Haskell program and black box is not.
you and I may know what you are doing when taking that over-simplified view of things, but I vaguely remember my struggles with the intricacies of the various functional i/o systems, and from those memories I claim that you would not help your readers if you chose not to explain or even to hide those intricacies. you can always tell the beginner that they can, for many cases, ignore those details - they will look into it briefly, then go away happy that they don't need to understand it immediately. but you do need to help the advanced learners by giving them the option to look into those details once their intuition develops far enough to want better answers.
[that is just my view of things, naturally, but I remember going through the library shelves to find books that would suit me, and if I saw any "skimming over" interesting details, I dropped those books faster than any group of populistic authors could publish them; of course, I had to go to the original papers in the end, but I did prefer those books that, instead of hiding advanced material, guided the reader initially around, and eventually into them:-]
:-) Seriously, I think that it would be a useful exercise to write a meta-interpreter of Haskell I/O, in Haskell. That's basically what the appendix of the Haskell Report was many years ago, but it used a stream model of I/O. And I think that this is consistent with your final point below.
I wrote my first substantial Haskell program at the time of about Haskell 1.2, and I agree: that appendix was useful for understanding Haskell's i/o story at the time (request/response streams, if I recall correctly; before monads;-). cheers, claus

I agree with most of your comments Claus. I think that the remaining difference of opinion is just how much of the I/O semantics one might expect to see in a textbook on FP (more specifically, Haskell). My concern is two-fold: First, to cover ALL of I/O would be an enormous undertaking; at best we might expect basic operations to be explained, to get the general principles across. Second, if explaining I/O means introducing things outside of the normal formal semantics of a language, then I would get nervous. By "normal formal semantics" I mean a conventional operational or denotational semantics, say. By things "outside of the semantics" I mean things needed to explain an operating system, like non-determinism, concurrency, etc. If a language already has non-determinism or concurrency then perhaps I wouldn't be as concerned.
I think that you would like the second point, at least, taken care of, somehow. I agree that in an ideal world that would be nice, but the formal semantics of most languages doesn't extend into the OS, so I wouldn't know where to start. And in any case it also might be a huge undertaking. So, in the case of Haskell, my hope is that a suitable description of the I/O monad in terms of equational reasoning would be adequate to get the general principles across, if not the finer details. I wouldn't even object if it had, say, one non-functional feature like a non-deterministic merge, as long as the intuitions were right. (So, maybe I need to do that in the next edition of my book :-)
-Paul

Claus Reinke wrote:
Thanks a lot for your positive reply, Paul. we seem to agree on most points, or at least understand our different preferences, but there is one remaining point I'd like to clear up.
Well, in defense of such an approach, there are very few (if any?) textbooks in ANY language that take a formal approach to defining what I/O does. But your point is still well taken, in that I think that the EXPECTATIONS for books on FP to do this are higher because (a) many of them take a formal approach to explaining the internal meaning of programs, and thus readers expect the same for I/O, and (b) the model of I/O that they use (say, monads) is sometimes so different from "conventional wisdom" that SOMETHING needs to be said, lest the reader be left in the dark!
I would agree with (a), although an informal, unambiguous explanation might also do, and (b) has been the topic of most beginner's i/o questions and answers on these lists, so that sounds right as well. I would add, however, that while i/o in Haskell uses the general monadic interface popular for so many problems, monadic i/o is still different from other uses of monads in Haskell, and that difference tends to get lost when the usual explanations focus on issues of monads in general.
let me try to explain what I mean: the usual way of reasoning about Haskell programs is, as you say, equational, but more than that, the equations are *context-independent*, ie, they equate programs in all (valid) contexts. this also works for monads, and still applies to the monadic aspects of Haskell i/o, but those monadic aspects *only account for construction of i/o-"scripts" and for some structural properties required of those scripts*. in other words, we can use context-free equations to reason about different ways to construct the same script, and we can use the context-free monad laws to establish structural equivalences between syntactically different scripts. but we can not use context-free equations between Haskell programs to reason about the execution of those scripts, because the very essence of i/o (at least how I explained it to myself in that thesis chapter;-) is interaction with the program's runtime context.
so, if we want to reason about these interactions as well, we need to take the context into account. as you say, we could do that by simply enlarging our scope, and define an interpreter for both the Haskell program and the OS, but that quickly gets complex, especially once we start taking multiple interacting programs into account (and how many programs these days aren't interactive in that sense?). the alternative is to use *context-dependent* equations, but to abstract away as much of the context as possible (Felleisen's evaluation contexts for operational semantics). Haskell's standard context-free reasoning rules then simply fall out as rules that hold "for all contexts" and we are back in our old familiar world, but we can now use the very same reasoning framework to talk about interactions as well, and about why an i/o-script "at the top-level" (result of Main.main) is different from an i/o-script nested anywhere else in the program. the difference is that the two have different contexts: only at the top-level (empty context, or leftmost innermost position in the top-level >>=-tree) does an i/o-script become directly accessible to the i/o-loop or OS or meta-interpreter or whatever we may call it, and with the help of that external entity, and under external control, it can be said to have access to and to interact with resources outside the Haskell program text, inside the runtime context (about which context-free reasoning can't tell us anything).
I think that I'm saying something a little stronger, namely that if you capture in a black box the entire universe EXCEPT for the one Haskell program that you're trying to reason about, then the interactions between that black box and the Haskell program can be explained purely functionally and deterministically. That may be an over-simplified view of things -- but it's at least one line to draw when deciding "how much" about a language's semantics should be explained to the user.
I don't see how the whole of black box and Haskell program could be captured purely functionally and deterministically. even if Stoye limited the necessary extensions to the purely functional world view to just one non-deterministic merge, only in his sorting office, he needed that extension of the purely functional world, and once you have a black box with such a merge in it, you cannot guarantee that the non-determinism won't be visible at the black box interface (in fact, that would be the whole point of introducing it in the first place), so even if you don't know anything else about that black box, you cannot assume that it will be a pure function. which means, as far as I can see, that even if the Haskell program is purely functional, the combined system of Haskell program and black box is not.
you and I may know what you are doing when taking that over-simplified view of things, but I vaguely remember my struggles with the intricacies of the various functional i/o systems, and from those memories I claim that you would not help your readers if you chose not to explain or even to hide those intricacies. you can always tell the beginner that they can, for many cases, ignore those details - they will look into it briefly, then go away happy that they don't need to understand it immediately. but you do need to help the advanced learners by giving them the option to look into those details once their intuition develops far enough to want better answers.
[that is just my view of things, naturally, but I remember going through the library shelves to find books that would suit me, and if I saw any "skimming over" interesting details, I dropped those books faster than any group of populistic authors could publish them; of course, I had to go to the original papers in the end, but I did prefer those books that, instead of hiding advanced material, guided the reader initially around, and eventually into them:-]
:-) Seriously, I think that it would be a useful exercise to write a meta-interpreter of Haskell I/O, in Haskell. That's basically what the appendix of the Haskell Report was many years ago, but it used a stream model of I/O. And I think that this is consistent with your final point below.
I wrote my first substantial Haskell program at the time of about Haskell 1.2, and I agree: that appendix was useful for understanding Haskell's i/o story at the time (request/response streams, if I recall correctly; before monads;-).
cheers, claus
--
Professor Paul Hudak
Department of Computer Science, Yale University
P.O. Box 208285, New Haven, CT 06520-8285
Office: (203) 432-1235  FAX: (203) 432-0593
email: paul.hudak@yale.edu  WWW: www.cs.yale.edu/~hudak

Paul Hudak wrote:
Hi Claus --
I think that you're asking for a semantics of the entire OS, i.e. the entire outside world, and for that I agree that something other than equational reasoning is needed to reason about it. However, I would argue that that is outside the mandate of a book on Haskell. But maybe that's the point -- i.e. others feel otherwise.
My main point is that if we're reasoning about a single Haskell program (with no impure features), then the entire world, with all its non-determinism internal to it, can be modelled as a black box -- i.e. a function -- that interacts with the single Haskell program in a completely sequential, deterministic manner. And for that, equational reasoning is perfectly adequate.
I think the problem is that to understand something you need a lot more than just the capability to reason about it. For example, given laws such as:

x * (y + z) = (x * y) + (x * z)
x + (y + z) = (x + y) + z

I can reason that

x * (y + (z + w)) = (x * y + x * z) + x * w

But this does *not* mean that therefore I *understand* it. I think understanding is a much deeper process. I have to grapple with the underlying shape, the gesture, the motion of the symbolic transformation and really *feel* the lawfulness of it as a profound inner life experience.

So to get back to the question of understanding monads, yes I can reason about them equationally using pure functions but to understand Haskell I need to understand how it is situated in my own human experience and my human experience seems to me to be more than just a deterministic sequential function of Unique -> Time -> SenseInput.

Regards, Brian.
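Brian's derivation can of course be checked mechanically. A throwaway sketch that spot-checks the derived equation over a small range of integers (which is, as he says, reasoning rather than understanding):

```haskell
-- The two sides of the derived equation, as functions.
lhs, rhs :: Int -> Int -> Int -> Int -> Int
lhs x y z w = x * (y + (z + w))
rhs x y z w = (x * y + x * z) + x * w

-- Spot-check over a small range; the laws of course hold for all numbers.
check :: Bool
check = and [ lhs x y z w == rhs x y z w
            | x <- r, y <- r, z <- r, w <- r ]
  where r = [-2 .. 2]

main :: IO ()
main = print check
```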

Brian Hulley wrote:
Paul Hudak wrote:
... My main point is that if we're reasoning about a single Haskell program (with no impure features), then the entire world, with all its non-determinism internal to it, can be modelled as a black box -- i.e. a function -- that interacts with the single Haskell program in a completely sequential, deterministic manner. And for that, equational reasoning is perfectly adequate.
I think the problem is that to understand something you need a lot more than just the capability to reason about it. For example, given laws such as:
x * (y + z) = (x * y) + (x * z)
x + (y + z) = (x + y) + z
I can reason that
x * (y + (z + w)) = (x * y + x * z) + x * w
But this does *not* mean that therefore I *understand* it. I think understanding is a much deeper process. I have to grapple with the underlying shape, the gesture, the motion of the symbolic transformation and really *feel* the lawfulness of it as a profound inner life experience.
So to get back to the question of understanding monads, yes I can reason about them equationally using pure functions but to understand Haskell I need to understand how it is situated in my own human experience ...
I agree with all of this. But then you say:
and my human experience seems to me to be more than just a deterministic sequential function of Unique -> Time -> SenseInput.
This seems like a value judgement. Someone else might very much like the functional model of things, and might not like some other model. And one might argue that if the functional view is the same as the view that one is using to reason about the rest of the program, then that is a Good Thing. So I'm not saying that there aren't other approaches (denotational semantics, operational semantics, process calculi, etc.) but they might each have the problem of "understanding how it is situated in my own human experience". -Paul

Paul Hudak wrote:
Brian Hulley wrote:
[snip] So to get back to the question of understanding monads, yes I can reason about them equationally using pure functions but to understand Haskell I need to understand how it is situated in my own human experience ...
I agree with all of this. But then you say:
and my human experience seems to me to be more than just a deterministic sequential function of Unique -> Time -> SenseInput.
This seems like a value judgement. Someone else might very much like the functional model of things, and might not like some other model.
I see what you mean. Certainly I cannot find any justification for rejecting the view that one's human experience could indeed be regarded as such a function, so thanks for correcting me here.
And one might argue that if the functional view is the same as the view that one is using to reason about the rest of the program, then that is a Good Thing.
One issue here is that I think there is often not enough clarity about the two different kinds of functional activity going on, i.e.:

1) The actual construction of the monadic action using >>= and return (and fail)
2) The functional interpretation of the action thus constructed

1) is accomplished by evaluating main; 2) happens secretly afterwards.

My current understanding of this is that lazy evaluation allows you to consider that these two different activities really do happen in the sequence above, the laziness just acting as a helpful optimization that avoids wasting time constructing bits of action that are never going to get executed (i.e. laziness pushes the future back into the present). Laziness otherwise seems to be unnecessary as long as >> is always defined in terms of >>= and never directly by itself, so that the order of function applications in 2) is fully determined by the data flow in the monad, but I haven't yet found a definitive explanation of the exact role laziness plays in the monadic picture.
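The two stages can be made concrete with a toy, pure model (my own sketch, not GHC's actual IO type; Action, ret, bind and run are all names I've invented for illustration). Stage 1 builds a data structure; stage 2 is a separate interpreter that walks it:

```haskell
-- A toy model of the construction/interpretation split.
-- "Output" stands in for a primitive effect.
data Action a = Return a
              | Output String (Action a)

ret :: a -> Action a
ret = Return

bind :: Action a -> (a -> Action b) -> Action b
bind (Return x)   k = k x
bind (Output s m) k = Output s (m `bind` k)

-- Stage 1: evaluating this merely builds an Action Int;
-- no effect has happened yet.
prog :: Action Int
prog = Output "hello" (ret 1) `bind` \n ->
       Output "world" (ret (n + 1))

-- Stage 2: a separate interpreter gives the action its meaning,
-- here purely, by collecting the output strings.
run :: Action a -> ([String], a)
run (Return x)   = ([], x)
run (Output s m) = let (ss, x) = run m in (s : ss, x)
```

Here run prog yields (["hello","world"], 2); in real Haskell the analogue of stage 2 is performed by the runtime system rather than by a Haskell function.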
So I'm not saying that there aren't other approaches (denotational semantics, operational semantics, process calculi, etc.) but they might each have the problem of "understanding how it is situated in my own human experience".
I agree. I think it would be good if some tutorials were also written with these other approaches in mind, so that people who have difficulty understanding monadic IO from the purely functional perspective would be able to explore the subject from multiple perspectives, and thereby hopefully find one that situates it properly within their own experience. Also, someone coming from a non-functional background may initially find it easier to understand monads given a very low-level implementation description, e.g. a simplified virtual machine interpreted by a C switch statement in a loop (I'm thinking of Hassan Ait Kaci's excellent WAM reconstruction tutorial for Prolog), and only later gradually come to understand them from a higher-level point of view. Regards, Brian.

One issue here is that I think there is often not enough clarity about the two different kinds of functional activity going on, i.e.:
1) The actual construction of the monadic action using >>= and return (and fail)
2) The functional interpretation of the action thus constructed
I agree that this distinction is important, and the latter stage is where the context-sensitivity of i/o comes in - unlike all other uses of monads in Haskell, i/o actions are interpreted by an external entity (see my other email).
My current understanding of this is that lazy evaluation allows you to consider that these two different activities really do happen in the sequence above, the laziness just acting as a helpful optimization that avoids wasting time constructing bits of action that are never going to get executed (i.e. laziness pushes the future back into the present).
(the picture is slightly less simple, as 1/2 are repeated, hence construction and interpretation are interleaved, and later construction may depend on the results of earlier interpretation, but I assume you know that;)

you don't actually need laziness. you don't even need >>= to be non-strict in its second parameter, because that second parameter will be a continuation function, and even strict functional languages tend not to evaluate under lambdas (so wrapping expressions in extra lambdas, or eta-expanding functions to make their outer lambda explicit, is a standard technique to delay evaluation in those languages).

the standard papers do not make this very clear, but neither laziness nor static typing is needed to add a monadic i/o system to a purely functional language (it took me a while to figure that out when I had to decide how to add an i/o system to a purely functional language without either feature..). cheers, claus
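The point about continuations can be seen directly: a lambda is already in weak head normal form, so even forcing the second argument of a bind does nothing observable. A minimal sketch (my own example; the error call stands in for an effect that must not run early):

```haskell
-- A continuation whose body must not be evaluated early; the error
-- stands in for an effect firing too soon.
k :: Int -> Int
k = \x -> if x == 0 then error "ran too early" else x + 1

-- Forcing k to WHNF -- what a bind strict in its second argument
-- would do -- stops at the outermost lambda, so nothing happens:
forced :: Int -> Int
forced = k `seq` k
```

Only applying forced to an actual argument evaluates the body, which is why even a strict language can thread i/o through continuations without running the effects prematurely.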

Claus Reinke wrote:
[snip] (the picture is slightly less simple, as 1/2 are repeated, hence construction and interpretation are interleaved, and later construction may depend on the results of earlier interpretation, but I assume you know that;)
This is one of the issues that I was confused about, but I think it's getting clearer day by day :-)
you don't actually need laziness.
(except if you tried to define >> directly instead of using >>= as the primitive because then the second parameter would be an action directly instead of a continuation)
you don't even need >>= to be non-strict in its second parameter, because that second parameter will be a continuation function, and even strict functional languages tend not to evaluate under lambdas (so wrapping expressions in extra lambdas or eta-extending functions to make their outer lambda explicit is a standard technique to delay evaluation in those languages).
Thanks! This is *absolutely* the central problem in my understanding, since I never know whether I am allowed to assume that evaluation does not happen under a lambda. For example, if I wrote a C function:

    int foo(int x, int y) {
        int z = 24 + 67 + 108 + x;
        int w = 200 + y;
        return z + w;
    }

I would be very disappointed if the compiler didn't evaluate 24 + 67 + 108 + 200 at compilation time, whereas with:

    foo x y = let z = 24 + 67 + 108 + x
                  w = 200 + y
              in  z + w

    a = foo 5

if I understand you correctly I should assume that the expression

    (24 + 67 + 108 + 5) + (200 + y)

is left alone until the y argument is supplied to a, rather than a = foo 5 causing a to be bound to \y -> 404 + y
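One way to observe when the body actually runs (a sketch using Debug.Trace; the trace call is mine, added to the original foo for instrumentation):

```haskell
import Debug.Trace (trace)

-- The same foo, with z instrumented so we can see when it is evaluated.
foo :: Int -> Int -> Int
foo x y = let z = trace "z evaluated" (24 + 67 + 108 + x)
              w = 200 + y
          in  z + w

-- Defining the partial application evaluates nothing under the lambda:
a :: Int -> Int
a = foo 5
```

Nothing is printed when a is defined; only applying it, e.g. a 2, emits "z evaluated" (on stderr) and returns 406.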
the standard papers do not make this very clear, but neither lazyness nor static typing are needed to add a monadic i/o system to a purely functional language (it took me a while to figure that out when I had to decide how to add an i/o system to a purely functional language without either feature..).
Regards, Brian. ---------------------------------- "Have you the lion's part written? Pray you, if it be, give it me; for I am slow of study" -- Midsummer Night's Dream

you don't actually need laziness. (except if you tried to define >> directly instead of using >>= as the primitive, because then the second parameter would be an action directly instead of a continuation)
yes, you'd want >> to be non-strict in its second parameter.
Thanks! This is *absolutely* the central problem in my understanding since I never know whether I am allowed to assume that evaluation does not happen under a lambda.
well, assumptions like that are on shaky ground in the presence of optimizing compilers, but you can look at it this way: most functional languages, strict and non-strict, assume evaluation of functions to weak head normal form, stopping at the outermost lambda. formally, they probably ought to evaluate at least a little further, to head normal form (lots of lambdas, then a variable or an application with a variable in function position), but they don't -- (\x->undefined) and (\x->(\y->undefined)) are distinguishable in Haskell.

so, while an optimizing compiler will try to share expensive computations in the body instead of reevaluating them every time a lambda is applied, it is up to the compiler optimization writer to ensure that such evaluation does not produce different effects than not evaluating under lambdas (apart from improved resource utilization).
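The distinguishability of (\x->undefined) and (\x->(\y->undefined)) can be demonstrated directly: applying each to one argument and forcing the result to WHNF raises in the first case but not the second. A sketch (the whnfThrows helper is my own, using Control.Exception):

```haskell
import Control.Exception (SomeException, evaluate, try)

f, g :: Int -> Int -> Int
f = \x -> undefined            -- one lambda: f 1 is already undefined
g = \x -> (\y -> undefined)    -- two lambdas: g 1 is a lambda, i.e. WHNF

-- True if forcing the argument to WHNF raises an exception
-- (approximating _|_ by the exception undefined throws).
whnfThrows :: (Int -> Int) -> IO Bool
whnfThrows h = do
  r <- try (evaluate h)
  case (r :: Either SomeException (Int -> Int)) of
    Left _  -> return True
    Right _ -> return False
```

Here whnfThrows (f 1) gives True while whnfThrows (g 1) gives False, so the two functions are observably different.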
For example, if I wrote a C function:
    int foo(int x, int y) {
        int z = 24 + 67 + 108 + x;
        int w = 200 + y;
        return z + w;
    }
I would be very disappointed if the compiler didn't evaluate 24 + 67 + 108 + 200 at compilation time, whereas with:
    foo x y = let z = 24 + 67 + 108 + x
                  w = 200 + y
              in  z + w
a = foo 5
if I understand you correctly I should assume that the expression
(24 + 67 + 108 + 5) + (200 + y)
is left alone until the y argument is supplied to a, rather than a = foo 5 causing a to be bound to \y -> 404 + y
as it stands, the compiler can't know the type of y, so it can't know whether (+) is even associative. if that is known, such optimization may or may not happen, depending on the implementation. but if such optimizations do take place, the resulting code must not terminate less often or raise more errors than the original version. cheers, claus
participants (9)
- Antti-Juhani Kaijanaho
- Bjorn Lisper
- Brian Hulley
- Claus Reinke
- Deokhwan Kim
- Duncan Coutts
- Matthias Fischmann
- Paul Hudak
- Robert Dockins