
Gabi
Hi Stephen,

Thanks for the answer. Now I'm confused :)
1. What is seq used for if it doesn't force strictness?
2. Why is the accumulator important? In other words, why is the following much slower? Isn't it RT too?
  slowSum :: [Integer] -> Integer
  slowSum []     = 0
  slowSum (x:xs) = x `seq` x + slowSum xs
Ertugrul

Hello Gabi,

The seq function may not behave as you expect it to. The following GHCi session may help you understand it:

  Prelude> const 3 (seq undefined 0)
  3
  Prelude> snd (seq undefined 0, 1)
  1
  Prelude> fst (seq undefined 0, 1)
  *** Exception: Prelude.undefined

When the result of 'seq x y' is demanded, x is evaluated and then y is returned as that result. In the above session the first two expressions never evaluate the seq call. Haskell is lazy, so if you demand only the second value of a tuple, the first one is never evaluated, even if it contains 'seq x y'. However, _if_ it is evaluated, it will evaluate x first, then return y.

Your expression

  x `seq` x + slowSum xs

can be rewritten as

  seq x (x + slowSum xs)

which makes clearer what it means: upon evaluating x + slowSum xs, the value of x is evaluated first. In other words, your function builds the following result thunk:

  slowSum [1,2,3] = 1 + 2 + 3 + 0

instead of:

  slowSum [1,2,3] = THUNK + THUNK + THUNK + 0

But it still builds a thunk, which will eventually have to perform all the additions. Why? Because nothing demands the sum until after all those recursive calls have returned. You are right in that a 'seq' is needed somewhere, but not where you put it. In fact the slowSum function cannot be optimized any further, because it is not tail-recursive:

  slowSum [1,2,3]
    => 1 + slowSum [2,3]

Writing it in prefix notation shows why the recursive call to slowSum is not the last function call, and hence why this is not tail recursion:

  (+) 1 (slowSum [2,3])

The problem with non-tail recursion is that the recursive calls have no access to the "current result". You cannot force the addition from within slowSum, because it is out of scope. If you introduce an accumulation parameter, things change:

  fastSum :: Num a => a -> [a] -> a
  fastSum s []     = s
  fastSum s (x:xs) = fastSum (s+x) xs

Read it like this: 'fastSum s xs' is the sum of the elements of xs plus the value of s. Read that way, the code is self-explanatory, and it is clearly tail-recursive, giving the function full access to the current intermediate result (named s). It builds the following result:

  fastSum 0 [1,2,3]
    => fastSum (0 + THUNK) [2,3]
    => fastSum (0 + THUNK + THUNK) [3]
    => fastSum (0 + THUNK + THUNK + THUNK) []
    => 0 + THUNK + THUNK + THUNK

Well, now it is tail-recursive, so it doesn't eat stack space anymore, but it still builds that result expression instead of just producing the end result. This is where seq comes into play. Now that you have access to the intermediate result, you can force it to be evaluated along the way:

  superFastSum :: Num a => a -> [a] -> a
  superFastSum s []     = s
  superFastSum s (x:xs) = s `seq` superFastSum (s+x) xs

Note how the semantics have changed now:

  superFastSum 0 [1,2,3]
    => 0 `seq` superFastSum (0+1) [2,3]
    => superFastSum (0+1) [2,3]
    => (0+1) `seq` superFastSum ((0+1)+2) [3]
    => superFastSum (1+2) [3]
    => (1+2) `seq` superFastSum ((1+2)+3) []
    => superFastSum (3+3) []
    => 3+3

The result of superFastSum on a non-empty list depends on a recursive call to itself, but it clearly states (through seq) that _before_ that call is evaluated, the value of s is evaluated. So each time it attempts to build a result expression, that expression is immediately reduced to a value, to which the next list element is added.
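If you want to see the difference for yourself, a small self-contained test program along the following lines will do. (The file name, the list length and the RTS flags are arbitrary choices; slowSum and superFastSum are just the definitions from above.)

  -- SumDemo.hs: sum a large list with both functions and watch the
  -- memory behaviour.  superFastSum forces its accumulator at every
  -- step and runs in small constant space; slowSum builds one nested
  -- (+) per list element and needs space proportional to the length.
  module Main where

  slowSum :: [Integer] -> Integer
  slowSum []     = 0
  slowSum (x:xs) = x `seq` x + slowSum xs

  superFastSum :: Num a => a -> [a] -> a
  superFastSum s []     = s
  superFastSum s (x:xs) = s `seq` superFastSum (s+x) xs

  main :: IO ()
  main = do
      print (superFastSum 0 [1..1000000 :: Integer])  -- constant space
      print (slowSum [1..1000000 :: Integer])         -- linear space

Compile and run it with something like 'ghc -O0 -rtsopts SumDemo.hs' and './SumDemo +RTS -s'; the RTS summary should show a clear difference in maximum residency between the two calls.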
The foldl' function can be used to encode such a recursion pattern conveniently. It's just like the superFastSum function, but takes an arbitrary accumulation function instead of (+). It is exported by Data.List and could be implemented like this:

  foldl' :: (a -> b -> a) -> a -> [b] -> a
  foldl' f z []     = z
  foldl' f z (x:xs) = z `seq` foldl' f (f z x) xs

  superFastSum     = foldl' (+) 0
  superFastProduct = foldl' (*) 1
  superFastLength  = foldl' (\x y -> x + 1) 0

I hope that helps.

Supplemental note #1: Haskell compilers generally don't do common subexpression elimination (CSE), because that may change language semantics. The following code will not behave as you may expect it to:

  f x `seq` print (f x)

With CSE in mind, you would expect the seq not to change anything and just print the result, but this is wrong. In fact it will be twice as slow! The reason is that to the compiler the first 'f x' is not the same as the second 'f x'. If you want them to be the same, you need to make this explicit by giving them a common name:

  let y = f x in y `seq` print y

This is called memoizing and will work as expected, i.e. the seq will not slow things down. You can avoid some lets and wheres by using the strict function application operator:

  print $! f x

Supplemental note #2: A better, memoizing foldl' implementation:

  foldl' :: (a -> b -> a) -> a -> [b] -> a
  foldl' f z []     = z
  foldl' f z (x:xs) = let r = f z x in r `seq` foldl' f r xs

Supplemental note #3: I noticed that recent versions of GHC are smart enough to figure out where forcing certain calculations makes no semantic difference. That means you can write the implementations of superFastSum and foldl' without seq:

  superFastSumGHC = fastSum
  foldl'          = foldl

However, for compatibility reasons you should still use seq.

Greets
Ertugrul

--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/