help understanding lazy evaluation

I am learning Haskell with "Programming in Haskell" (an excellent book BTW). I have a background in several languages, but none of them has lazy evaluation. By now I am getting along with the intuitive idea that things are not evaluated until needed, but there's an example I don't understand (which means the intuitive idea needs some revision :-).

We have factors(), defined on page 39 like this[*]:

    factors :: Int -> [Int]
    factors n = [x | x <- [1..n], n `mod` x == 0]

and we base prime() on it this way:

    prime :: Int -> Bool
    prime n = factors n == [1, n]

Now, the book says prime does not necessarily compute all of the factors of n because of lazy evaluation. Meaning that if n is composite, as soon as some non-trivial divisor appears we halt computation and return False.

My vague intuition said "we either need factors or we don't; we do, because we need to perform the test, so we compute it". That's wrong, so a posteriori the explanation must be something like this:

1. someone knows we want factors() to perform an equality test
2. someone knows an equality test between lists is False as soon as we have a mismatch, left to right
3. thus, instead of evaluating factors completely we are going to build sublists of the result and perform the tests on those against [1, n]

That's a lot of *context* about that particular evaluation of factors, and one step in particular puzzles me. Can anyone explain how lazy evaluation fits there? I suspect the key is the implementation of == together with the fact that list comprehensions are lazy themselves, is that right?

-- fxn

[*] Which notation do you use for functions in text? is f() ok?

Hi
factors :: Int -> [Int]
factors n = [x | x <- [1..n], n `mod` x == 0]
prime :: Int -> Bool
prime n = factors n == [1, n]
My vague intuition said "we either need factors or we don't; we do, because we need to perform the test, so we compute it". That's wrong, so a posteriori the explanation must be something like this:
The key point is that factors doesn't either compute all of the value or none of it; it may compute part of the value. It does this by computing something like

    _ : _
    1 : _
    1 : _ : _

where _ is the unevaluated bit, i.e. it computes one bit of the result at a time. Equality also has this property; it can be defined as:

    (a:as) == (b:bs) = a == b && as == bs
    []     == []     = True
    _      == _      = False

If you have (1:_) == (2:_) then the match will fail instantly.
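A runnable sketch of that equality (my transcription, renamed eqList so it doesn't clash with the Prelude's ==; the undefined tail is my illustration):

    eqList :: Eq a => [a] -> [a] -> Bool
    eqList (a:as) (b:bs) = a == b && eqList as bs
    eqList []     []     = True
    eqList _      _      = False

    -- prints False: the left list ends where the right one continues,
    -- so the undefined tail is never forced
    main :: IO ()
    main = print (eqList [1, 2] (1 : 2 : 3 : undefined))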
That's a lot of *context* about that particular evaluation of factors, in particular step puzzles me. Can anyone explain how lazy evaluation fits there? I suspect the key is the implementation of == together with the fact that list comprehensions are lazy themselves, is that right?
Everything is lazy, down to all subparts. You might get along better with reasoning of the form "to compute this expression, we must demand that expression", rather than your "someone knows we'll need". If you follow example derivations you'll see that there is always a very clear idea of what needs to happen next for the computation to proceed, which explains the laziness quite naturally.
[*] Which notation do you use for functions in text? is f() ok?
Sure, although a little unusual for Haskell, where f() means f applied to the empty tuple. Some people use |f| (generally those who use LaTeX), but generally it can be inferred from the context what is a function.

Thanks

Neil

[*] Which notation do you use for functions in text? is f() ok?
Sure, although a little unusual for Haskell, where f() means f applied to the empty tuple. Some people use |f| (generally those who use LaTeX), but generally it can be inferred from the context what is a function.
Neil's answer was complete; I just want to elaborate on this last point. f is the most logical (and for Haskell the most appropriate) notation. It's an odd historical quirk of mathematical notation that using f(x) for f is still so common.

You people rock. Responses were really helpful and I understand how the computation goes now.

I see I need to reprogram my eyes to expect lazy evaluation in places where I am used to one-shot results. I see lazy evaluation is all around in Haskell builtins.

From a formal point of view, how does this work? The specifications in

http://www.haskell.org/onlinereport/standard-prelude.html

include the lazy aspect of the definitions, albeit they do not require that exact implementation, right? On the other hand, where does Haskell 98 say == is lazy on lists?

-- fxn

On Thu, Aug 23, 2007 at 10:00:00AM +0200, Xavier Noria wrote:
You people rock. Responses were really helpful and I understand how the computation goes now.
I see I need to reprogram my eyes to expect lazy evaluation in places where I am used to one-shot results. I see lazy evaluation is all around in Haskell builtins.
From a formal point of view how does this work?
As is usual for mathematical things, there are many equivalent definitions. My two favorites are:

1. Normal order reduction

In the λ-calculus, lazy evaluation can be defined as the evaluation method (unique up to always giving the same answer) that finishes whenever *any* evaluation order would finish. This is a rather subtle theorem, not a trivial definition, so it could be hard to use.

2. Complete partial ordering

Add a value ⊥ to every type, recursively (so Bool contains ⊥, True, False; while [()] contains ⊥, [], (⊥:⊥), (():⊥), (⊥:[]), ...). The semantics for case contain a new rule:

    case ⊥ of { ... }  ===>  ⊥

Define a partial order ≤ by:

    ⊥ ≤ x                  for all x
    (C a b c) ≤ (D e f g)  if C = D and a ≤ e, b ≤ f, c ≤ g

It is easily verified that this is a complete partial order with bottom element ⊥, and that all functions definable using these operations are monotonic (a ≤ b implies f a ≤ f b). Now, define

    let x = y in z

as

    (λx. z) (fix (λx. y))

where fix :: (α -> α) -> α is the operation that returns the least fixed point of its argument, which by Kleene's fixed-point theorem is guaranteed to exist and be unique. In many ways this is much less obvious than even the previous definition, but it happens to be very useful: strictness analysis is trivial to understand once you grok the application of Kleene's theorem to lazy programs.
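As a small illustration of the fixed-point reading of let (my sketch, not part of the report's definitions):

    fix :: (a -> a) -> a
    fix f = f (fix f)

    -- let xs = 1 : xs in take 3 xs, rewritten with fix
    main :: IO ()
    main = print (take 3 (fix (\xs -> 1 : xs)))  -- [1,1,1]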
The specifications in
http://www.haskell.org/onlinereport/standard-prelude.html
include the lazy aspect of the definitions albeit they do not require that exact implementation, right?
Indeed, you've caught on to an important technical distinction.

Lazy: Always evaluating left-outermost-first.

Non-strict: Behaving as if it were lazy.

Haskell's semantics are non-strict in nature, but there is no requirement for laziness. All optimizing Haskell compilers are non-lazy.
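A tiny observable consequence of non-strictness (my example, not from the report): an argument that is never demanded is never evaluated, whatever machinery the compiler uses underneath.

    -- prints 1 even though the second argument is undefined
    main :: IO ()
    main = print (const 1 undefined)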
On the other hand, where does Haskell 98 say == is lazy on lists?
Chapter 8, Standard Prelude. The code, when interpreted using a non-strict semantics, implements a non-strict ==.

Stefan

Stefan O'Rear wrote:
As is usual for mathematical things, there are many equivalent definitions. My two favorites are:
1. Normal order reduction
In the λ-calculus, lazy evaluation can be defined as the evaluation method (unique up to always giving the same answer) that finishes whenever *any* evaluation order would finish. This is a rather subtle theorem, not a trivial definition, so it could be hard to use.
Hudak in his HSoE book gives a less precise, but more concrete definition of normal-order reduction (shamelessly quoted here): “If a rule or rules can be applied to more than one position in an expression, use the rule corresponding to the *outermost* position in the expression.” Bulat Ziganshin wrote elsewhere:
simple example: consider evaluation of "head [1..]".
With the rule above, we have:

    head [1..]
      {- We cannot apply the definition of head yet, because the first
         element of the list is not visible, so we apply the definition
         of [..] -}
    = head (1 : [(succ 1)..])
      {- Now we could apply either the definition of succ 1, or [..],
         or head. But head is the outermost, so: -}
    = 1

Malte
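The derivation can be checked directly (a one-line program, my addition):

    main :: IO ()
    main = print (head [1..])  -- prints 1; the rest of [1..] is never built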

Stefan O'Rear wrote:
Indeed, you've caught on an important technical distinction.
Lazy: Always evaluating left-outermost-first.
I think most people would rather use the term "normal order" for that; lazy means evaluating in normal order /and/ not evaluating the same expression twice.

normal order:

    (\a -> a + a) (2+2)
    => (2+2) + (2+2)
    => 4 + (2+2)
    => 4 + 4
    => 8

lazy:

    (\a -> a + a) (2+2)
    => (2+2) + (2+2)  -- but we still know that both these (2+2)s are the same
    => 4 + 4
    => 8

That might be slightly confusing because I've used (+), which is strict. It might have been better to use Ss and Ks, but less succinct...
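One way to observe the sharing that distinguishes lazy evaluation from plain normal order (my sketch, using GHC's Debug.Trace): under a lazy implementation the message appears once; under pure normal-order rewriting it would appear twice.

    import Debug.Trace (trace)

    main :: IO ()
    main = let a = trace "evaluating" (2 + 2) :: Int
           in print (a + a)  -- "evaluating" is printed only once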
Non-strict: Behaving as if it were lazy.
non-strict: giving the same answers as if it were normal order. A Haskell implementation could, in principle, conform to the non-strict requirement without doing any of the update-in-place that constitutes laziness.
Haskell's semantics are non-strict in nature, but there is no requirement for laziness. All optimizing Haskell compilers are non-lazy.
????

-- Jón Fairbairn
Jon.Fairbairn@cl.cam.ac.uk

Xavier,

First off, we don't put the () after function names in Haskell.

What's happening is this (experts please correct any mistakes here):

1) You call prime on a number (e.g. 42).

2) In order to evaluate this further, (factors 42) must be evaluated at least partially to give input to == in prime. So (factors 42) is evaluated to give the first list value, which is 1.

3) The == in prime compares the 1 at the head of the list generated by (factors 42) (whose tail hasn't been evaluated yet) with the 1 at the head of [1, 42], and finds that they match, so == can continue processing.

4) (factors 42) resumes to compute another list value, which is 2 (a factor of 42).

5) == in prime compares 2 with 42, finds they don't match, and thus (prime 42) returns False.

So not all the factors have been computed; in fact, only two of them were needed to prove that 42 is not prime.

The interesting aspect of all this is that laziness allowed us to modularize the problem of primality testing into two separate (simpler) functions, instead of having to interleave generation of factors and testing of factors. This makes code easier to write and more modular. Hughes' paper "Why Functional Programming Matters" is a must-read for more on this.

Lazy evaluation can be very tricky to wrap your head around, and there are lots of subtle issues that crop up where you think something is lazy but it's not, or you think something is strict but it's not. There are ways to force lazy/strict behavior, but they're somewhat more advanced.

HTH,
Mike
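A minimal runnable sketch of that modularity point (the 1000000 bound is my illustration, not from the thread): the generator and the test live in separate functions, yet laziness interleaves them.

    factors :: Int -> [Int]
    factors n = [x | x <- [1..n], n `mod` x == 0]

    -- prints [1,2,4]: only candidates 1..4 are ever tested, even though
    -- the generator nominally ranges over a million numbers
    main :: IO ()
    main = print (take 3 (factors 1000000))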

The insight is that the functions called can be lazy internally; the
computation doesn't proceed by either fully evaluating something or not, but
rather by rewriting parts of the computation graph. Here is a trace of the
evaluation of "prime" for integers n > 1:
prime n
=> factors n == [1,n]
=> factors n == 1 : n : [] -- remove syntactic sugar from the list
=> [x | x <- [1..n], n `mod` x == 0] == 1 : n : []
=>* 1 : [x | x <- [2..n], n `mod` x == 0] == 1 : n : [] -- n mod 1 is always 0
=> 1 == 1 && [x | x <- [2..n], n `mod` x == 0] == n : []
=> True && [x | x <- [2..n], n `mod` x == 0] == n : []
=> [x | x <- [2..n], n `mod` x == 0] == n : []
This much computation happens for every single n. One of two things happens here: either we have another factor < n, or we don't.
Case 1: There is at least one other factor < n. Let z be the smallest such
factor:
=>* z : [x | x <- [(z+1)..n], n `mod` x == 0] == n : []
=> z == n && [x | x <- [(z+1)..n], n `mod` x == 0] == []
=> False && [x | x <- [(z+1)..n], n `mod` x == 0] == []
=> False
Case 2: There are no other factors < n.
=>* n : [x | x <- [], n `mod` x == 0] == n : []
=> n == n && [x | x <- [], n `mod` x == 0] == []
=> True && [x | x <- [], n `mod` x == 0] == []
=> [x | x <- [], n `mod` x == 0] == []
=> [] == []
=> True
Exercise: Annotate each line in the above trace with a description of why
the reduction is valid.
To be totally clear with this trace, you would need to desugar the list comprehension and some of the more primitive functions. The lines marked with =>* are a bit handwavy because they skip evaluation steps. Here's the desugaring of this list comprehension and the full definitions of the other functions used in this example:
[x | x <- [1..n], n `mod` x == 0] =>
concatMap ok [1..n]
where ok x = if n `mod` x == 0 then [x] else []
concatMap :: (a -> [b]) -> [a] -> [b]
concatMap f [] = []
concatMap f (x:xs) = f x ++ concatMap f xs
(++) :: [a] -> [a] -> [a]
(x:xs) ++ ys = x : (xs ++ ys)
[] ++ ys = ys
(&&) :: Bool -> Bool -> Bool
True && x = x
False && _ = False
(==) :: [Int] -> [Int] -> Bool -- specialized via typeclass "Eq [Int]"
(x:xs) == (y:ys) = x == y && xs == ys
[] == [] = True
_ == _ = False
All other functions used are primitives.
Exercise: Write out a full execution trace for n == 3 and n == 4 with the
desugaring and Prelude functions given above.

I'm trying to understand lazy evaluation as well. I created an example for myself and I'm wondering if I've got it right.
    let adder n = \x -> n + x in (map (adder 12) [1,2,3]) !! 1
So the first thing I did is draw a graph to represent my expression.

    (map (adder 12) [1,2,3]) !! 1
    =
              (!!)
             /    \
           map     1
          /   \
      adder    (:)--(:)--(:)
        |       |    |    |  \
       12       1    2    3   []

In order to proceed, I need definitions for adder, map and (!!).

    adder n = \x -> n + x

    map f []     = []
    map f (x:xs) = (f x) : map f xs

    (x:xs) !! 0 = x
    (x:xs) !! n = xs !! (n-1)

Here is how I think the evaluation would proceed:

    0 => (map (adder 12) [1,2,3]) !! 1

The top node is (!!). In order to evaluate (!!), I need to expand the left-hand side to get at least the first node of a list. I'll expand "map".

    1 => ( (adder 12 1) : (map (adder 12) [2,3]) ) !! 1

I evaluated "map" once to get a list with an expression for the head and an expression for the tail.

    head: adder 12 1
    tail: map (adder 12) [2,3]

I proceed to evaluate (!!) for one step; this time n /= 0 so I extract the tail of the list and recursively call (!!).

    2 => ( (map (adder 12) [2,3]) ) !! 0

The top node is (!!) again. I need to expand "map" again.

    3 => ( (adder 12 2) : (map (adder 12) [3]) ) !! 0

I evaluate (!!) and this time n == 0 so I match the base case for (!!) and take the head of the list.

    4 => adder 12 2 => (adder 12) 2

In order to proceed, I need to expand (adder 12). The adder function takes one argument, "12", and produces a closure. I'll express it as a "let" expression. Note: it is at this point (steps 4 to 7) that I'm confused about what's supposed to happen with closures and let expressions.

    5 => (let n = 12 in \x -> n + x) 2

I'll substitute "2" into the let statement.

    6 => (let n = 12 in n + 2)

I'll substitute "12" for "n".

    7 => 12 + 2
    8 => 14

Can anyone tell me if I've got this right?

-- Ron
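A quick runnable check of the example (my addition): the whole expression should reduce to 14, matching step 8.

    adder :: Int -> Int -> Int
    adder n = \x -> n + x

    main :: IO ()
    main = print (map (adder 12) [1, 2, 3] !! 1)  -- prints 14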

Ronald Guida wrote:
Can anyone tell me if I've got this right?
Yes, you got it. The let-statement you introduce, which embodies the sharing of the argument n = 12, should probably be present in the first steps, too. But this doesn't really matter; the formalities of graph reduction vary with the formalizer :)

Regards,
apfelmus

Hello Xavier, Thursday, August 23, 2007, 3:08:25 AM, you wrote:
I am learning Haskell with "Programming in Haskell" (an excellent book BTW).
The scheme of lazy evaluation used is called "graph reduction". You may consider it as repeatedly replacing instances of the left-hand sides of function definitions with their right-hand sides. As far as some part of the graph isn't required to compute the final result, it's abandoned and not computed down to a final value.

Simple example: consider evaluation of "head [1..]". [1..] may be represented with the following recursive function:

    list n = n : list (n+1)

so we have "head (list 1)", where head is defined as

    head (x:_) = x

Let's evaluate the expression:

    head (list 1)
    => head (1 : list 2)
    => 1

As you see, "list 2" was just dropped during evaluation. The same applies to your case: as soon as "==" finds different values in the lists, it skips further evaluation, so the rest of the divisors remains uncalculated.

-- Best regards, Bulat                  mailto:Bulat.Ziganshin@gmail.com
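A runnable version of this example (my transcription; note the parentheses in list (n + 1), which the precedence of (:) and (+) requires):

    list :: Int -> [Int]
    list n = n : list (n + 1)  -- an infinite list; only demanded parts are built

    main :: IO ()
    main = print (head (list 1))  -- prints 1; "list 2" is never evaluated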
participants (11): apfelmus, Bulat Ziganshin, Derek Elkins, Jon Fairbairn, Malte Milatz, Michael Vanier, Neil Mitchell, Ronald Guida, Ryan Ingram, Stefan O'Rear, Xavier Noria