
I was looking at the real time queues in [1] and I wanted to see what would happen if I tried to write one in Haskell. The easy part was translating the real time queue from [1], p43 into Haskell. The hard part is testing to see if the rotations really happen when they should. Basically, I decided to use Debug.Trace.trace to see when rotations were actually occurring.

I pushed the numbers 1 to 10 into the queue, and then I popped the queue ten times. What I found is that none of the rotations would actually take place until the first time I actually tried to /display the value/ of a popped element. What I realized is that my test driver is lazy. I figured out that I could put a bunch of 'seq' functions in the test driver to get the rotations to happen.

My demonstration code is in: http://hpaste.org/8080

This leads to two questions:

1. If I find a laziness leak, is 'seq' the appropriate way to plug it?
2. Is there any way to systematically search for or detect laziness leaks?
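Since the hpaste link above has long since expired, here is a minimal sketch of what that translation typically looks like. It is a reconstruction from Okasaki's real-time queue and from the names (RTQueue, rtqRear, rtqueue, rotate) that appear in Albert Lai's reply below, not the original paste; the trace calls reproduce Ronald's experiment of watching when rotations fire.

    import Debug.Trace (trace)

    -- Front list, rear list, and a "schedule" that is a suffix of the
    -- front list.  Invariant: length schedule == length front - length rear.
    data RTQueue a = RTQueue
        { rtqFront    :: [a]
        , rtqRear     :: [a]
        , rtqSchedule :: [a]
        }

    -- rotate f r a == f ++ reverse r ++ a, performed one step at a time.
    -- Only ever called when length r == length f + 1.
    rotate :: [a] -> [a] -> [a] -> [a]
    rotate []     (y:_)  a = trace "rotate" (y : a)
    rotate (x:xs) (y:ys) a = trace "rotate" (x : rotate xs ys (y : a))
    rotate _      _      _ = error "rotate: invariant broken"

    -- Smart constructor: matching on the schedule executes one step of
    -- any pending rotation; when the schedule runs out, start a new one.
    rtqueue :: [a] -> [a] -> [a] -> RTQueue a
    rtqueue f r (_:s) = RTQueue f r s
    rtqueue f r []    = let f' = rotate f r [] in RTQueue f' [] f'

    empty :: RTQueue a
    empty = RTQueue [] [] []

    snoc :: a -> RTQueue a -> RTQueue a
    snoc x (RTQueue f r s) = rtqueue f (x : r) s

    uncons :: RTQueue a -> Maybe (a, RTQueue a)
    uncons (RTQueue []     _ _) = Nothing
    uncons (RTQueue (x:fs) r s) = Just (x, rtqueue fs r s)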

oddron:
I was looking at the real time queues in [1] and I wanted to see what would happen if I tried to write one in Haskell. The easy part was translating the real time queue from [1], p43 into Haskell.
The hard part is testing to see if the rotations really happen when they should. Basically, I decided to use Debug.Trace.trace to see when rotations were actually occurring.
I pushed the numbers 1 to 10 into the queue, and then I popped the queue ten times. What I found is that none of the rotations would actually take place until the first time I actually tried to /display the value/ of a popped element. What I realized is that my test driver is lazy. I figured out that I could put a bunch of 'seq' functions in the test driver to get the rotations to happen.
Right, you have a requirement that things are evaluated earlier, but you're representing this with a lazy list, which will evaluate them as late as possible. Using something like weak strictness (seq) or deep strictness (rnf) to enforce the required level of evaluation prior to pushing things onto the queue is one way. Making the queue a stricter structure would also work.
My demonstration code is in: http://hpaste.org/8080
This leads to two questions:
1. If I find a laziness leak, is 'seq' the appropriate way to plug it?
The main ways are:

* Strict data structures, for where you require strictness, i.e.
      data Queue a = Nil | Node !a (Queue a)
* Bang patterns on variables: foo !x = ... push x ...
* seq: foo x = x `seq` ... x ...
* Deep seq: foo x = rnf x `seq` ... x ...
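Spelled out as compilable code, those four options might look like this. A sketch only: note that rnf and deepseq nowadays live in Control.DeepSeq from the deepseq package (which postdates this thread), where the idiomatic spelling is x `deepseq` ... rather than an infix rnf.

    {-# LANGUAGE BangPatterns #-}
    import Control.DeepSeq (NFData, deepseq)

    -- 1. Strict fields: the element is forced when the node is built.
    data StrictQueue a = SNil | SNode !a (StrictQueue a)

    -- A lazy structure, for the remaining three examples.
    data Queue a = Nil | Node a (Queue a)

    -- 2. A bang pattern forces the argument before the body runs.
    pushBang :: a -> Queue a -> Queue a
    pushBang !x q = Node x q

    -- 3. seq forces x to weak head normal form (outermost constructor).
    pushSeq :: a -> Queue a -> Queue a
    pushSeq x q = x `seq` Node x q

    -- 4. deepseq forces x all the way to normal form.
    pushDeep :: NFData a => a -> Queue a -> Queue a
    pushDeep x q = x `deepseq` Node x q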
2. Is there any way to systematically search for or detect laziness leaks?
Profiling, and looking at the Core. Being explicit about the evaluation strategy you want is a fine idea though. -- Don
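To make "profiling, and looking at the Core" concrete, a typical GHC workflow looks roughly like this (flags as in GHC of that era; -auto-all has since been renamed -fprof-auto, and Main.hs is a stand-in module name):

    $ ghc -O2 -prof -auto-all --make Main.hs
    $ ./Main +RTS -p -RTS     # time/allocation profile, written to Main.prof
    $ ./Main +RTS -hc -RTS    # heap profile by cost centre, written to Main.hp
    $ hp2ps -c Main.hp        # render the heap profile to PostScript

    $ ghc -O2 -ddump-simpl Main.hs   # dump the Core, to see what is
                                     # actually strict after optimisation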

Ronald Guida wrote:
I was looking at the real time queues in [1] and I wanted to see what would happen if I tried to write one in Haskell. The easy part was translating the real time queue from [1], p43 into Haskell.
The hard part is testing to see if the rotations really happen when they should. Basically, I decided to use Debug.Trace.trace to see when rotations were actually occurring.
I pushed the numbers 1 to 10 into the queue, and then I popped the queue ten times. What I found is that none of the rotations would actually take place until the first time I actually tried to /display the value/ of a popped element. What I realized is that my test driver is lazy. I figured out that I could put a bunch of 'seq' functions in the test driver to get the rotations to happen.
My demonstration code is in: http://hpaste.org/8080
This leads to two questions:
1. If I find a laziness leak, is 'seq' the appropriate way to plug it?
2. Is there any way to systematically search for or detect laziness leaks?
I want to point out several non-causes of laziness. For a no-magic, no-surprise understanding, it is important to know both what causes laziness and what does not.

A. The lazy list at rtqRear, which is originally a strict list. It is an invariant that every thunk you put there is [] or x:y. (As a result, it is unnecessary to make RTQueue a strict record, since you want two of the fields to be lazy, and the remaining field rtqRear is de facto strict.)

B. rtqueue's thunked call to rotate, i.e.,

    rtqueue f r [] = let f' = rotate f r [] in RTQueue f' [] f'

Originally f' is forced before the record is returned (SML convention). In Haskell the rotate call is thunked as f' and the record is returned. But there will not be an accumulation of such rotate thunks. For example, if you snoc twice in succession and the first instance builds a rotate thunk:

    snoc p (snoc q (RTQueue f r []))
    -> snoc p (rtqueue f (q:r) [])
    -> snoc p (RTQueue f' [] f')    where f' = thunk
    -> rtqueue f' (p:[]) f'         where f' = thunk
    -> f' is now forced by rtqueue's pattern matching on the 3rd param

So if one snoc builds a rotate thunk, the next snoc kills it, and similarly for any interleaving of queue operations. (head and tail add their own extra pattern matching.) Thus it can be proved that Haskell lags behind the original by just one step in this regard. This is OK unless you're done with reducing asymptotic cost and start looking at constant-factor cost.

A true cause of laziness is in accumulating a chain of tails and snocs without intermediate forcing, as observed.
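In code, the "intermediate forcing" Albert describes just means the test driver demands each element as it pops it, instead of stacking up thunked tails and snocs. A minimal sketch, written against the reconstructed queue near the top of the thread (foldl' supplies the forcing on the way in, seq on the way out):

    import Data.List (foldl')

    main :: IO ()
    main = do
        -- foldl' forces each intermediate queue to WHNF, which runs
        -- rtqueue's pattern match on the schedule, so no chain of
        -- thunked snocs builds up while filling the queue.
        let q0 = foldl' (flip snoc) empty [1 .. 10 :: Int]
        drain q0
      where
        -- seq forces each popped element (and with it any pending
        -- rotation) before recursing on the tail.
        drain q = case uncons q of
            Nothing      -> return ()
            Just (x, q') -> x `seq` (print x >> drain q')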

Don Stewart wrote:
2. Is there any way to systematically search for or detect laziness leaks?
Profiling, and looking at the Core. Being explicit about the evaluation strategy you want is a fine idea though.
Albert Y. C. Lai wrote
A true cause of laziness is in accumulating a chain of tails and snocs without intermediate forcing, as observed.
So I just thought of something. If laziness leads to laziness leaks, then is there such a thing as a strictness leak? I realized that the answer is yes.

A lazy leak is a situation where I'm wasting resources to delay a sequence of calculations instead of just doing them now. But in a strict language, I might waste resources to compute things that I'll never need. I would call that a strictness leak.

Now I could ask the dual question, "How do I detect strictness leaks," and I would probably get the same answers: profiling, looking at object code, and being explicit about the evaluation strategy.

Both types of leaks share a lot in common. In both cases, I'm wasting resources. If I have a real-time system, then either type of leak can cause me to miss a deadline. With a strict evaluation strategy, I might miss a nearby deadline because I'm busy calculating things that I don't need until the distant future. With a lazy evaluation strategy, I might miss a distant deadline because I'm lazily putting off a calculation now.

If I were a college student, then this would be a laziness leak: Professor X assigns a report, due in a month. Two days before the report is due, I'll head to the drugstore, load up on caffeine, and work for 48 hours straight to get it done.

And this would be a strictness leak: Professor X assigns a report, due in a month. As soon as I leave the classroom, I'll head to the drugstore, load up on caffeine, and work for 48 hours straight to get it done.

And this would be an effective solution: Professor X assigns a report, due in a month. I'll select 15 days, spaced out over the next month, and allocate four hours per day to work on the report.

By default, a lazy language will procrastinate. By default, a strict language will "anticrastinate". Either way, I can waste resources by blindly accepting the default time management plan. So the real question is "How do I avoid laziness leaks or strictness leaks?" and apparently, the real answers are (1) learn how to recognize when the default plan will waste resources, and (2) learn how to express reasonable evaluation strategies in code.

I would ask, "How do I examine the evaluation order of my code?", but the answer is already available: use a debugger. Haskell already has debugging tools that do exactly what I need. (http://www.haskell.org/haskellwiki/Debugging) In particular, HOOD looks extremely interesting.
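For a taste of what HOOD offers, here is a minimal sketch assuming the hood package: observe labels a value, and runO prints, when the program exits, a report of how much of each observed value was actually demanded.

    import Debug.Hood.Observe (observe, runO)

    main :: IO ()
    main = runO $ do
        let xs = observe "input" [1 .. 10 :: Int]
        -- Only three elements are demanded; HOOD's report at exit
        -- shows exactly how much of "input" was ever evaluated, with
        -- the rest displayed as an unevaluated thunk.
        print (take 3 xs)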

On Tue, 2008-06-03 at 20:12 -0400, Ronald Guida wrote:
Don Stewart wrote:
2. Is there any way to systematically search for or detect laziness leaks?
Profiling, and looking at the Core. Being explicit about the evaluation strategy you want is a fine idea though.
Albert Y. C. Lai wrote
A true cause of laziness is in accumulating a chain of tails and snocs without intermediate forcing, as observed.
So I just thought of something. If laziness leads to laziness leaks, then is there such a thing as a strictness leak? I realized that the answer is yes.
I asked a similar question here, albeit for a much narrower and more technical notion: http://lambda-the-ultimate.org/node/2273#comment-40156 and came to the same conclusion (at least for an even narrower notion).

On 04/06/2008, at 10:12 AM, Ronald Guida wrote:
I would ask, "how do I examine the evaluation order of my code", but the answer is already available: use a debugger. Haskell already has debugging tools that do exactly what I need. (http://www.haskell.org/haskellwiki/Debugging)
In particular, HOOD looks extremely interesting.
I would recommend the GHCi debugger for looking at the evaluation order of code. Single stepping can be very illuminating. Bernie.
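For reference, a single-stepping session goes roughly like this. The transcript is abridged and illustrative rather than verbatim GHCi output, and RTQueue.hs stands in for whatever module is being debugged:

    $ ghci RTQueue.hs
    *Main> :break rotate     -- stop whenever a rotation step runs
    *Main> main
    Stopped at RTQueue.hs:...
    *Main> :step             -- single-step through the reductions
    *Main> :print q          -- inspect a binding in scope at the stop
                             -- WITHOUT forcing it; unevaluated parts
                             -- show up as thunks _t1, _t2, ...
    *Main> :force q          -- force it fully and show the result
    *Main> :continue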

Ronald Guida wrote: [snip]
By default, a lazy language will procrastinate. By default, a strict language will "anticrastinate". Either way, I can waste resources by blindly accepting the default time management plan.
Nice analysis. Would you like to put that (the whole thing, not just that last para) on the wiki? Jules

Ronald Guida wrote:
So I just thought of something. If laziness leads to laziness leaks, then is there such a thing as a strictness leak? I realized that the answer is yes.
A lazy leak is a situation where I'm wasting resources to delay a sequence of calculations instead of just doing them now. But in a strict language, I might waste resources to compute things that I'll never need. I would call that a strictness leak.
Now I could ask the dual question, "How do I detect strictness leaks," and I would probably get the same answers: profiling, looking at object code, and being explicit about the evaluation strategy.
Both types of leaks share a lot in common. In both cases, I'm wasting resources. If I have a real-time system, then either type of leak can cause me to miss a deadline.
I haven't heard the terms "laziness leak" and "strictness leak" before, imho they sound a bit spooky because it's not clear to me what the situation without leak would be. (Time vs Space? Is an O(n) algorithm a strictness leak compared to an O(log n) algorithm?)

Note that lazy evaluation never wastes time; evaluating a term with lazy evaluation will always take fewer reduction steps than doing so eagerly or partly eagerly. But it can waste space (-> "space leak"), for instance by accumulating a big expression like

    (..) -> ((..)+1) -> (((..)+1)+1) -> etc.

instead of evaluating x+1 immediately:

    5 -> 6 -> 7 -> etc.

However, this would be wasted time in case the whole expression is never evaluated but just thrown away. So, it's a trade-off.

The effect you have in mind only appears in real-time systems, where lazy evaluation procrastinates everything by default. So, trying to implement a real-time system in a lazy language is more or less a paradox :) as Okasaki already points out in his book. Eager evaluation may "waste" both time and space compared to an alternative course of reduction.

Regards,
apfelmus

PS: The reduction strategies we compare to don't evaluate under lambdas.
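The accumulating expression apfelmus describes is easy to reproduce; a sketch, with sumLazy and sumStrict as made-up names:

    import Data.List (foldl')

    -- Lazy left fold: builds the whole thunk ((((0+1)+2)+3)+...) in
    -- O(n) space before a single addition is performed.
    sumLazy :: Integer -> Integer
    sumLazy n = foldl (+) 0 [1 .. n]

    -- Strict left fold: forces the accumulator at each step, so the
    -- running total is reduced immediately (1 -> 3 -> 6 -> ...).
    sumStrict :: Integer -> Integer
    sumStrict n = foldl' (+) 0 [1 .. n]

And as apfelmus points out, if the result is never demanded, the lazy version has wasted no additions at all; that is the trade-off in a nutshell.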

apfelmus wrote:
I haven't heard the terms "laziness leak" and "strictness leak" before, imho they sound a bit spooky because it's not clear to me what the situation without leak would be. (Time vs Space? Is an O(n) algorithm a strictness leak compared to an O(log n) algorithm?)
"Leak" refers to a surprise. You didn't expect Firefox to use Omega(n) memory where n is the number of times you refresh a rather plain static HTML page, but it does, and you call it a memory "leak". You didn't expect foldl to lump a big thunk, but it does, and you call it a lazy "leak". Therefore "leak" refers to a program failing your expectation - even if you yourself wrote the program. ("Leak", "bug", "issue"... We surely are very creative in how to avoid calling a shovel a shovel, or an error an error.) The solution is better education, better reasoning, and Intelligent Design. As you write every line of code, you should already know what to expect. No magic, no surprise, just science.

"Albert Y. C. Lai"
I haven't heard the terms "laziness leak" and "strictness leak" before
"Leak" refers to a surprise.
I think the meaning of "leak" is in a bit of flux. Originally, I believe, it referred to a memory leak, where the programmer forgot to call free() before losing the pointer, thus making the program consume memory it can't recover and can't use. With automatic memory management, this doesn't happen, so "memory leak" then started to mean retaining objects longer than necessary.

(Aside: am I the only one who is shocked by the memory consumption of modern programs? I use a simple time tracker (gnotime), a calendar with a handful of entries (evolution), and they both typically consume half to one gigabyte of memory. In all fairness, it seems to be much better under Ubuntu 8.04 than 7.10, but again, they haven't been running for very long yet.)

I'm not sure I'll use terms like strictness and laziness leak; I think it's hard to see what's being lost here, and if you have a laziness leak, it is unclear whether it's too much laziness or too little.
("Leak", "bug", "issue"... We surely are very creative in how to avoid calling a shovel a shovel, or an error an error.)
At least one text insisted on "defect". Too much laziness or strictness may be harmful to performance, but it's only a defect if it means the program fails to meet its requirements. -k -- If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde wrote:
I think the meaning of "leak" is in a bit of flux. Originally, I believe, it referred to a memory leak, where the programmer forgot to call free() before losing the pointer, thus making the program consume memory it can't recover and can't use.
With automatic memory management, this doesn't happen, so "memory leak" then started to mean retaining objects longer than necessary.
I agree. This definition fits the "space leak"

    foldl (+) 0 [1..10000]  ->  ((((0+1)+2)+3)+...)

in the sense that the unevaluated expressions are retained in memory longer than necessary; the difference being of course that it's not garbage collection but beta-reduction that frees the memory in question.

On the other hand, I think that the situation of foldl (+) 0 [1..10000] in a strict language does not fit this definition of leak, because evaluating the list [1..10000] eagerly does not retain memory longer than necessary, it consumes memory earlier than necessary. So, this notion of leak is spot-on.
I'm not sure I'll use terms like strictness and laziness leak; I think it's hard to see what's being lost here, and if you have a laziness leak, it is unclear whether it's too much laziness or too little.
Me too, I don't see a reason to muddy the word with other meanings. "Space leak" is a good word in the sense that "space" describes the "leak"; it's the space that leaks and goes down the drain. Neither laziness nor strictness can leak and be washed away with the rain.
(Aside: am I the only one who is shocked by the memory consumption of modern programs? I use a simple time tracker (gnotime), a calendar with a handful of entries (evolution), and they both typically consume half to one gigabyte of memory. In all fairness, it seems to be much better under Ubuntu 8.04 than 7.10, but again, they haven't been running for very long yet.)
Yeah :( When a piece of software wastes time and memory, they should have written it in Haskell, so that at least the other bugs wouldn't plague me as well.

Regards,
apfelmus

On 6/4/08, apfelmus
Note that lazy evaluation never wastes time; evaluating a term with lazy evaluation will always take fewer reduction steps than doing so eagerly or partly eagerly.
True, but you can still have a "time leak"; this is particularly relevant in soft-real-time apps (like almost every app you use on a regular basis, from your editor to games). A time leak is when a computation that would take X time if evaluated every time step is left for many timesteps without being evaluated, leading to a hitch in responsiveness when it is eventually demanded N frames later, taking N*X time.

Eager applications almost never have this sort of time leak, but it's easy for it to happen with lazy evaluation.

A simple example: consider a variable that holds the number of timesteps since the app launched. Every time step it gets incremented by 1. If the result is evaluated every time step, it takes a constant amount of time per timestep. But if you go a long time without evaluating it, you end up with both a space leak (as the +1 thunks build up) and a time leak as well: when you eventually evaluate it, it takes O(n) time, where n is the number of frames since the variable was last evaluated.

-- ryan
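Ryan's counter compresses to a few lines; a sketch, with runLazy and runStrict as made-up names standing in for the app's per-frame update, and assuming compilation without optimisation so that GHC's strictness analyser does not quietly rescue the lazy version:

    {-# LANGUAGE BangPatterns #-}

    -- Lazy: each frame wraps one more (+1) thunk around the counter.
    -- Whoever finally demands the counter pays for all n additions at
    -- once (the time leak), and the n thunks occupy memory until then
    -- (the space leak).
    runLazy :: Int -> Int -> Int
    runLazy 0 counter = counter
    runLazy n counter = runLazy (n - 1) (counter + 1)

    -- Strict: the bang forces the counter every frame, so each frame
    -- does O(1) work and no debt accumulates.
    runStrict :: Int -> Int -> Int
    runStrict 0 counter  = counter
    runStrict n !counter = runStrict (n - 1) (counter + 1)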

"Ryan Ingram"
On 6/4/08, apfelmus wrote:
Note that lazy evaluation never wastes time; evaluating a term with lazy evaluation will always take fewer reduction steps than doing so eagerly or partly eagerly.
True, but you can still have a "time leak"; this is particularly relevant in soft-real-time apps (like almost every app you use on a regular basis, from your editor to games). A time leak is when a computation that would take X time if evaluated every time step is left for many timesteps without being evaluated, leading to a hitch in responsiveness when it is eventually demanded N frames later, taking N*X time.
Eager applications almost never have this sort of "time leak", but it's easy for it to happen with lazy evaluation.
A simple example: consider a variable that holds the number of timesteps since the app launched, for example. Every time step it gets incremented by 1. If the result is evaluated every time step, it takes a constant amount of time per timestep. But if you go a long time without evaluating it, you end up with both a space leak (as the +1 thunks build up) but a time leak as well--when you eventually evaluate it, it takes O(n) time, where n is the number of frames since the variable was last evaluated.
There won't ever be a space leak without a time leak nor a time leak without a space leak. I'd just call it a leak.

You don't come across space-leaks in strict programs often because data is usually allocated statically even if execution is non-strict. Piping /dev/zero into a program that just sleeps does leak space, though.

Achim Schneider wrote:
You don't come across space-leaks in strict programs often because data is usually allocated statically even if execution is non-strict.
Piping /dev/zero into a program that just sleeps does leak space, though.
It only leaks 8K or whatever size your system buffers pipes until it suspends the writer though... or do I misunderstand your analogy?

Jules Bean
Achim Schneider wrote:
You don't come across space-leaks in strict programs often because data is usually allocated statically even if execution is non-strict.
Piping /dev/zero into a program that just sleeps does leak space, though.
It only leaks 8K or whatever size your system buffers pipes until it suspends the writer though... or do I misunderstand your analogy?
I don't think so. I would say that it leaks 8k memory and infinite time. I did not intend the example to be implementation-dependent, but then that kind of proves my point.

Achim Schneider wrote:
There won't ever be a space leak without a time leak nor a time leak without a space leak. I'd just call it a leak.
Actually I think you can have a space leak without a time leak. For instance, if every time around the main loop I cons data onto a linked list that never gets freed, then I have a space leak. If the list never gets used (or more realistically, if the program only ever uses the first N entries), then there is no time leak. I agree that a time leak implies a space leak, though. Paul.
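Paul's scenario as a sketch (mainLoop and report are made-up names; in a real program each iteration would also be doing useful work):

    -- Each iteration conses one more element onto a history that is
    -- never freed: the space leak.  Consing is O(1) and no work is
    -- deferred, so no time leak accompanies it.
    mainLoop :: Int -> [Int] -> [Int]
    mainLoop 0 history = history
    mainLoop n history = mainLoop (n - 1) (n : history)

    -- The program only ever inspects the first few entries.
    report :: [Int] -> [Int]
    report = take 5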

Paul Johnson
Achim Schneider wrote:
There won't ever be a space leak without a time leak nor a time leak without a space leak. I'd just call it a leak.
Actually I think you can have a space leak without a time leak. For instance if every time around the main loop I cons data onto a linked list that never gets freed then I have a space leak. If the list never gets used (or more realistically, if the program only ever uses the first N entries) then there is no time leak.
Sure there is: you leaked time while constructing the list.

The whole topic seems to degenerate into nit-picking over border cases. One could define a leak as "a property of an error-free program resulting in non-optimal performance", or, much more concisely, "a trap no sufficiently smart programmer runs into".

2008/6/4 apfelmus
[...] But it can waste space (-> "space leak"), for instance by accumulating a big expression like
(..) -> ((..)+1) -> (((..) + 1) + 1) -> etc.
instead of evaluating x+1 immediately
5 -> 6 -> 7 -> etc.
So, it is called a "space leak" because the asymptotic space required is greater in a lazy setting compared to a strict setting. Then, what about:

    sum . map (\x -> 2 * x + 42) $ [1..9999999]

Provided sum works in constant space (needs proper tail calls and the right level of strictness), this runs in constant space in Haskell, deforestation or not. However, the equivalent code in, say, OCaml, will work in linear space in the absence of deforestation. So, in this case, I find it legitimate to talk about a "strictness space leak".

Well, is it? Of course it uses linear space, there is no "leak"! Hmmm. It feels like only lazy evaluation can be accused of space leaks, while strict evaluation cannot. Are we biased in favour of strict evaluation? I mean, if we take strict evaluation as the default (because it's mainstream), it is easy to think that lazy evaluation is cool when better, but unacceptable when worse. Hence the word "leak".

Just my two cents,
Loup
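Loup's example, made concrete; a sketch with fused and unfused as made-up names. His caveat about sum is real: a fully lazy foldl-based sum would itself build a large thunk, which is why the explicit foldl' version is shown.

    import Data.List (foldl')

    -- The constant-space behaviour Loup relies on, written as an
    -- explicit strict left fold over the generated list: one
    -- accumulator, no intermediate list.
    fused :: Integer
    fused = foldl' (\acc x -> acc + (2 * x + 42)) 0 [1 .. 9999999]

    -- The pipelined form.  In Haskell this also runs in constant space
    -- (given a strict sum), because the list is produced and consumed
    -- lazily, one cell at a time.  In a strict language, map
    -- materialises all ten million elements before sum ever runs --
    -- Loup's linear-space behaviour.
    unfused :: Integer
    unfused = sum (map (\x -> 2 * x + 42) [1 .. 9999999])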
participants (12):
- Achim Schneider
- Albert Y. C. Lai
- apfelmus
- Bernie Pope
- Derek Elkins
- Don Stewart
- Jules Bean
- Ketil Malde
- Loup Vaillant
- Paul Johnson
- Ronald Guida
- Ryan Ingram