Re: [Haskell-cafe] Preventing sharing

-fno-full-laziness fixes the space leak issue in your iterative deepening example.
Yes, and I think it has been mentioned that the flag is a blunt weapon as it affects the whole module...
This isn't a problem with laziness. It's a problem with performing a time optimization which is a space pessimization. In the absence of the "optimization" there is no problem.
How come it isn't a problem with laziness?! Recall that the pure call-by-name calculus is observationally indistinguishable from the call-by-need (i.e., lazy) calculus. The only reason to have laziness is to avoid recomputations of argument computations should an argument be used more than once -- at the cost of taking memory to store the result of the first evaluation. Thus "performing a time optimization which is a space pessimization" is exactly what laziness is all about -- as the article mentioned earlier argued. Laziness isn't an absolute good -- it is a time-space trade-off, which is not always beneficial.
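To make the trade-off concrete, here is a minimal sketch (an illustration of the point, not code from the article):

    -- Some computation whose result is large.
    expensive :: Int -> [Int]
    expensive n = map (* n) [1 .. 1000000]

    -- Call-by-need: the list is built once, and that single copy is
    -- retained until both uses of xs have been evaluated.
    shared :: Int -> Int
    shared n = let xs = expensive n in sum xs + length xs

    -- Call-by-name in effect: the list is built twice, once per use, and
    -- nothing is shared between the two traversals.  (Compiled without
    -- optimisation; GHC's CSE may re-introduce the sharing at -O.)
    recomputed :: Int -> Int
    recomputed n = sum (expensive n) + length (expensive n)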

On Mon, Dec 21, 2015 at 08:55:05PM +0900, Oleg wrote:
-fno-full-laziness fixes the space leak issue in your iterative deepening example. Yes, and I think it has been mentioned that the flag is a blunt weapon as it affects the whole module...
Agreed.
This isn't a problem with laziness. It's a problem with performing a time optimization which is a space pessimization. In the absence of the "optimization" there is no problem.
How come it isn't a problem with laziness?! Recall that the pure call-by-name calculus is observationally indistinguishable from the call-by-need (i.e., lazy) calculus. The only reason to have laziness is to avoid recomputations of argument computations should an argument be used more than once -- at the cost of taking memory to store the result of the first evaluation. Thus "performing a time optimization which is a space pessimization" is exactly what laziness is all about -- as the article mentioned earlier argued. Laziness isn't an absolute good -- it is a time-space trade-off, which is not always beneficial.
I don't agree at all. To my mind you are assigning blame to the wrong thing. The operational semantics of

    f () = let x = <a large lazy structure> in ...

are perfectly clear. x is reallocated each time, and is free to be released between calls to f. It's only when an "optimization" rewrites this to

    x = <a large lazy structure>
    f () = ...

that there is a space leak. Exactly the same applies if the language is strict.

Tom
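To spell out the rewrite in GHC terms -- an illustrative sketch, not code from the thread: the full-laziness pass (on by default at -O) may float a binding that does not mention the lambda's argument out of the lambda, which is exactly the transformation described above.

    f :: () -> Int
    f () = let x = [1 .. 10000000 :: Int]   -- "a large lazy structure"
           in sum x + length x
    -- With -ffull-laziness (implied by -O), GHC may float the binding of x
    -- out of the \() -> ..., in effect producing
    --     x = [1 .. 10000000]
    --     f () = sum x + length x
    -- so one copy of x is shared and retained across calls to f.  Compiling
    -- with -fno-full-laziness keeps the allocation inside f, and x can be
    -- collected between calls.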

Thus "performing a time optimization which is a space pessimization" is exactly what laziness is all about -- as the article mentioned earlier argued. Laziness isn't an absolute good -- it is a time-space trade-off, which is not always beneficial.
I don't agree at all. To my mind you are assigning blame to the wrong thing. The operational semantics of
f () = let x = <a large lazy structure> in ...
are perfectly clear. x is reallocated each time, and is free to be released between calls to f. It's only when an "optimization" rewrites this to
x = <a large lazy structure>
f () = ...
that there is a space leak. Exactly the same applies if the language is strict.
Let us take a step back. The article on my web page noted the great difficulty of writing an AI search program in Haskell because the search tree is evaluated lazily: whenever a node is evaluated, its result is memoized for further use. That is precisely the wrong thing to do for such problems. Again, this problem is precisely one of lazy evaluation (as defined in the book below). The obvious way to avoid memoization was to introduce thunks -- which didn't work. The article then developed a more involved solution. Yes, -fno-full-laziness would have worked just as well. However, the solution in the article works on a case-by-case basis whereas -fno-full-laziness affects the whole compilation unit. It is for that reason that I pointed out the article in this discussion.

Let us turn to the problem of the "optimization" that we are discussing. Does it have anything to do with laziness? Yes, it does. Please see Chap 15 of the excellent book

    http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/

which explains full laziness.

Once I mention the book I must point out Chap 23 of the book (near the end). It should be required reading. The introduction to the section contains the following emphasized statement (emphasized by the author, Simon Peyton Jones):

    A major weakness of functional languages is the difficulty of reasoning
    about their space and time behavior.

The last paragraph of the introduction says ``No good solutions are yet known to most of these problems.'' This is true even today. Sec 23.3 ``Space behavior'' is an excellent collection of the problems with lazy evaluation and full laziness. The book was published in 1987. The problems are still with us.
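One practical note (standard GHC usage, not from the article): the flag can at least be confined to a single module with an OPTIONS_GHC pragma, though within that module it remains all-or-nothing.

    {-# OPTIONS_GHC -fno-full-laziness #-}
    module Search where   -- hypothetical module name
    -- Full laziness (floating let-bindings out of lambdas) is now disabled
    -- for this module only; the rest of the program is compiled as usual.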

On Tue, Dec 22, 2015 at 7:24 PM, Oleg wrote:
difficulty of writing an AI search program in Haskell because the search tree is evaluated lazily: whenever a node is evaluated, its result is memoized for further use. That is precisely the wrong thing to do for such problems. Again, this problem is precisely one of lazy evaluation (as defined in the book below). The obvious way to avoid memoization was to introduce thunks -- which didn't work. The article then developed a more involved solution. Yes, -fno-full-laziness would have worked just as well. However, the solution in the article works on a case-by-case basis whereas -fno-full-laziness affects the whole compilation unit. It is for that reason that I pointed out the article in this discussion.
Dear Oleg:

First you paint with a wide brush that laziness is "precisely the wrong thing to do for such problems." Then you implicitly acknowledge that there are different levels of laziness, namely that non-full-laziness is less lazy than full laziness. Call them laziness Levels 1 and 2, respectively. Finally, you cite your local solution as an improvement over the blunt one of throttling the whole module to mere laziness Level 1.

Therefore, the clever Level 1 localization is an improvement only if laziness Level 2 is useful in other parts of the module, yes? How can laziness be so bad, as the shrillness of your emails conveys, if laziness Level /2/ -- never mind Level 1 -- is actually useful elsewhere in your code?

Notwithstanding your eagerness to warn of the pitfalls of laziness, your true position on laziness is undoubtedly nuanced in a manner that befits your discernment and decades of programming experience. Unfortunately, here you don't express that nuance clearly, and we are left in the dark.

-- Kim-Ee

On Tue, Dec 22, 2015 at 09:24:37PM +0900, Oleg wrote:
Once I mention the book I must point out Chap 23 of the book (near the end). It should be required reading. The introduction to the section contains the following emphasized statement (emphasized by the author, Simon Peyton Jones):
A major weakness of functional languages is the difficulty of reasoning about their space and time behavior.
Let me start by saying that I, too, am very skeptical of the value of lazy-evaluation-by-default. However, regardless of my support for your conclusion, I don't agree with your line of reasoning in this case.
Let us take a step back. The article on my web page noted the great difficulty of writing an AI search program in Haskell because the search tree is evaluated lazily: whenever a node is evaluated, its result is memoized for further use. That is precisely the wrong thing to do for such problems. Again, this problem is precisely one of lazy evaluation (as defined in the book below). The obvious way to avoid memoization was to introduce thunks -- which didn't work. The article then developed a more involved solution. Yes, -fno-full-laziness would have worked just as well. However, the solution in the article works on a case-by-case basis whereas -fno-full-laziness affects the whole compilation unit. It is for that reason that I pointed out the article in this discussion.
Let us turn to the problem of the "optimization" that we are discussing. Does it have anything to do with laziness? Yes, it does. Please see Chap 15 of the excellent book
http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/
which explains full laziness.
In order to demonstrate that the problem you are describing is unique to lazy-by-default languages, you need to explain why it does not occur in strict-by-default languages.

The motivating example in the excellent book is that in the program

    f = g 4
    g x y = y + (sqrt x)
    (f 1) + (f 2)

the (sqrt 4) expression is evaluated twice. Thus the "full laziness transformation" (the name is misleading!) rewrites the program to (essentially)

    f = g 4
    g x = let sqrtx = sqrt x
          in \y -> y + sqrtx
    (f 1) + (f 2)

To prove your point you now need to explain why

1. the full laziness transformation (again: misleading name!) is required in lazy-by-default languages
2. the full laziness transformation (don't be misled by the name) is not required in strict-by-default languages

Personally I can't see why either 1 or 2 is true. Can you help me out? Or could you answer an essentially equivalent question?

* If Haskell had never had the full laziness transformation (very misleading name!) you would not have seen the space leak in your iterative deepening implementation. We would have gained some predictability in space usage. But what would have been lost?

My answer to the question is "we would have lost pretty much nothing". Anyone who wants the full laziness transformation (poor name) can implement it herself. What's your answer?

Tom
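To see the difference concretely, here is a small runnable sketch (my code, not the book's) with a trace on the square root. Run it unoptimised, e.g. in ghci; with -O, GHC may perform the hoisting in g by itself.

    import Debug.Trace (trace)

    -- As in the book's example: sqrt x is re-evaluated for every argument y.
    g :: Double -> Double -> Double
    g x y = y + trace "sqrt evaluated" (sqrt x)

    -- The hoisting done by hand -- which is all the transformation does:
    -- sqrt x is evaluated once and then shared by every application of g' 4.
    g' :: Double -> Double -> Double
    g' x = let sqrtx = trace "sqrt evaluated" (sqrt x) in \y -> y + sqrtx

    main :: IO ()
    main = do
      let f  = g  4
          f' = g' 4
      print (f 1 + f 2)    -- "sqrt evaluated" is printed twice
      print (f' 1 + f' 2)  -- "sqrt evaluated" is printed once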

On Mon, 21 Dec 2015 12:55:05 +0100, Oleg wrote:
The only reason to have laziness is to avoid recomputations of argument computations should an argument be used more than once -- at the cost of taking memory to store the result of the first evaluation. Thus "performing a time optimization which is a space pessimization" is exactly what laziness is all about -- as the article mentioned earlier argued. Laziness isn't an absolute good -- it is a time-space trade-off, which is not always beneficial.
In paper "Why Functional Programming Matters"[0], John Hughes shows how lazy functional programming can be used for better modularity. A more precise title for the paper would be "Why Lazy Functional Programming Matters". Regards, Henk-Jan van Tuyl [0] http://www.cse.chalmers.se/~rjmh/Papers/whyfp.pdf -- Folding@home What if you could share your unused computer power to help find a cure? In just 5 minutes you can join the world's biggest networked computer and get us closer sooner. Watch the video. http://folding.stanford.edu/ http://Van.Tuyl.eu/ http://members.chello.nl/hjgtuyl/tourdemonad.html Haskell programming --

On Mon, Dec 21, 2015 at 10:50 PM, Henk-Jan van Tuyl wrote:
lazy functional programming can be used for better modularity. A more precise title for the paper would be "Why Lazy Functional Programming Matters".
This is Oleg. He's perfectly aware of the paper. The point he's making is not that laziness is bad, but that it shouldn't be the default. And if you note the recent work on -XStrict, there are good arguments about bolting laziness on top of strictness and not doing a nasty -- and thus necessarily heroic -- shoehorn in the reverse direction. See:

    https://www.reddit.com/r/programming/comments/3sux1d/strict_haskell_xstrict_...

However, the record remains that Oleg has offered little by way of elegant bolting. His lazy programs based on a strict language tend to be cluttered with lazy and force functions that uglify previously elegant code.

His arguments would persuade many more folks if, for instance, he could offer lazy-over-strict translations of Doug McIlroy's power serious one-liners with no loss in elegance:

    http://www.cs.dartmouth.edu/~doug/powser.html

-- Kim-Ee

Kim-Ee Yeoh wrote:
However, the record remains that Oleg has offered little by way of elegant bolting. His lazy programs based on a strict language tend to be cluttered with lazy and force functions that uglify previously elegant code.
His arguments would persuade many more folks if, for instance, he could offer lazy-over-strict translations of Doug McIlroy's power serious one-liners with no loss in elegance:
I wanted to write about ordinary and (inappropriately named, to some) full laziness and about how much elegance is in the eye of the beholder and difficult to evaluate, and about many other things. The quoted paragraph gave a much better topic: a concrete and easy to evaluate challenge: can we re-write Doug McIlroy's code in a strict language without ever using lazy and force or thunks everywhere -- basically maintaining the same structure of the code.

I took the challenge and re-wrote the powser code in OCaml, a strict language. Haskell and OCaml differ in many respects, not just in the order of evaluation. OCaml is in general a little bit more verbose. Also, it does not have overloading. That's why you see not just + but also +. (for float addition) and +% (which I define for infinite series addition).

Here are several examples.

Haskell:

    series f = f : repeat 0

OCaml:

    let series f = I.cons f I.repeat 0.

Things prefixed with I are library functions (just as repeat in Doug McIlroy's code is a library function).

Haskell (using the tying-the-knot trick!). This is probably the most complex code:

    (f:ft) / (g:gt) = qs where qs = f/g : series(1/g)*(ft-qs*gt)

OCaml:

    let ( /% ) fs gs =
      let (f,ft) = I.decon fs and (g,gt) = I.decon gs in
      I.fix @@ I.cons (f /. g) (fun qs -> series (1. /. g) *% (ft -% qs *% gt))

Integration in Haskell:

    int fs = 0 : zipWith (/) fs [1..]

and OCaml:

    let integ = I.cons 0. (fun fs -> I.zip_with (/.) fs (iota 1.))

Tangent as a polynomial in Haskell:

    tans = revert(int(1/(1:0:1)))

and OCaml:

    let tans = revert @@ integ (int 1 /% from_list [1.;0.;1.])

It seems the OCaml code (executed in the bytecode interpreter) is a bit faster than Haskell (in ghci). The complete code is available for inspection at

    http://okmij.org/ftp/ML/powser.ml

There is no lazy/force in sight. Lazy/force could be used to implement the infinite stream data type (what has been prefixed with I), but no user of the stream library needs to know how exactly it is implemented. In particular, all McIlroy's code is written without any use of lazy/force. His code in OCaml has essentially the same structure as the Haskell code, considering the syntactic differences between the two languages.

The point is that well-chosen combinators are far more important than strict/lazy evaluation differences. If a language is expressive enough to support mini-languages (embedded DSLs), we can write infinite series and other DSL code without much trouble. Strict evaluation is not an obstacle to infinite data structures, on-demand evaluation, etc. Once we can define a DSL, we can make it follow any evaluation strategy we want.

Regarding the elegance, I can't help but quote a paragraph from Doug McIlroy's powser page:

    Extensions to handle polynomials make a practical package, doubled in
    size, not as pretty, but much faster and capable of feats like pascal.
    To see the dramatic speedup, try a bigger test like take 20 tans.

Note "not as pretty, but much faster".

I'm afraid I'll have to significantly delay commenting on other points raised in this discussion: I have several deadlines coming up.

On Sat, Dec 26, 2015 at 12:26:30AM +0900, Oleg wrote:
Kim-Ee Yeoh wrote:
However, the record remains that Oleg has offered little by way of elegant bolting. His lazy programs based on a strict language tend to be cluttered with lazy and force functions that uglify previously elegant code.
His arguments would persuade many more folks if, for instance, he could offer lazy-over-strict translations of Doug McIlroy's power serious one-liners with no loss in elegance:
[...]
can we re-write Doug McIlroy's code in a strict language without ever using lazy and force or thunks everywhere -- basically maintaining the same structure of the code.
I took the challenge and re-wrote the powser code in OCaml, a strict language. [...] There is no lazy/force in sight. Lazy/force could be used to implement the infinite stream data type (what has been prefixed with I) but no user of the stream library needs to know how exactly it is implemented. In particular, all McIlroy's code is written without any use of lazy/force. His code in OCaml has essentially the same structure as Haskell code, considering the syntactic differences between the two languages. [...] The point is that well-chosen combinators are far more important than strict/lazy evaluation differences.
Thanks for that, Oleg. I am very much in agreement with you that we can have access to laziness within strict languages with no less elegance than we have access to IO in pure languages.

Tom

Henk-Jan van Tuyl wrote:
In paper "Why Functional Programming Matters"[0], John Hughes shows how lazy functional programming can be used for better modularity. A more precise title for the paper would be "Why Lazy Functional Programming Matters".
The first paragraph of our paper (published 3 years ago)

    http://okmij.org/ftp/continuations/PPYield/index.html#introduction

is as follows:

    Lazy evaluation is regarded as one of the main reasons why functional
    programming matters \cite{hughes:matters-cj}. Lazy evaluation lets us
    write \emph{producers} and \emph{consumers} separately, whereas the two
    are inextricably intertwined in a call-by-value language. This
    separation allows a modular style of programming, in which a variety of
    producers, consumers, and transformers can readily be ``plugged
    together.'' Lazy evaluation is also an elegant implementation of a form
    of coroutines, suspending and resuming computations based on the demand
    for values, giving us memory-efficient, incremental computation `for
    free' \cite{McIlroy:1999:PSP:968592.968597,Bird:1984:UCP,AG-embed}.

But do read the next paragraph and the rest of the paper, and other articles on the web site

    http://okmij.org/ftp/continuations/PPYield/index.html

Our conclusion is that the modularity benefit of lazy evaluation can be held without lazy evaluation, gaining the predictability of the space and time behavior.
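As a one-line reminder of the modularity the quoted paragraph describes (my example, not from the paper): the producer and the consumer are written independently, and lazy evaluation interleaves them.

    -- Only the first five squares of the infinite list are ever computed.
    firstSquares :: [Integer]
    firstSquares = take 5 (map (^ 2) [1 ..])   -- [1,4,9,16,25]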

On 22.12.2015 at 13:31, Oleg wrote:
But do read the next paragraph and the rest of the paper, and other articles on the web site http://okmij.org/ftp/continuations/PPYield/index.html
I have to say I almost stopped reading at "worst in lazy evaluation: its incompatibility with effects". Incompatibility with side effects is a good thing, so that's not "worst", and there are frameworks for doing effects (most notably IO), so there goes "incompatibility".
Our conclusion is that the modularity benefit of lazy evaluation can be held without lazy evaluation, gaining the predictability of the space and time behavior.
I just skimmed the paper, so correct me if I'm wrong.

It seems to show how one can transform a specific class of lazy functions into generators. This seems to miss the point of laziness. Having lazy evaluation means that when writing a function, you don't know (and don't care) how much of the returned data structure is ever explored; that's the caller's decision. This means you do not ever transform your code into a generator, because you don't need to.

As I said, I didn't do more than a quick scan, and I might be in error thinking it's a transformation just for a specific class of lazy functions. More importantly, I might have missed some essential point about how this isn't really a transformation, or that one does not need to transform the code to get a transformer out. So... please correct the omissions :-)

Regards,
Jo

On Tue, Dec 22, 2015 at 12:31 PM, Joachim Durchholz wrote:
It seems to show how one can transform a specific class of lazy functions into generators. This seems to miss the point of laziness. Having lazy evaluation means that when writing a function, you don't know (and don't care) how much of the returned data structure is ever explored, that's the caller's decision. This means you do not ever transform your code as a generator, because you don't need to.
To play the celestial advocate: the ability to not care about strictness/laziness when writing a function is precisely what causes it to be hard to reason about the space/time costs of that function. Yes, it's nice to be able to abstract over evaluation semantics, but that abstraction does not come for free.

Where the balance between cost and benefit tilts is, imo, less important than realizing the nature of the tradeoff. For, there is no single optimum; the objective function for "good programming" must weight many different concerns, and those weights necessarily differ by context and goal.

--
Live well,
~wren

On 23.12.2015 at 04:00, wren romano wrote:
To play the celestial advocate: the ability to not care about strictness/laziness when writing a function is precisely what causes it to be hard to reason about the space/time costs of that function.
Sure. I'm not disputing well-known facts, I'm just wondering why the paper highlights just the problems and does not put them into proportion to the advantages.
participants (7):
- Albert Y. C. Lai
- Henk-Jan van Tuyl
- Joachim Durchholz
- Kim-Ee Yeoh
- Oleg
- Tom Ellis
- wren romano