How to think about this? (profiling)

This behaviour by Haskell seems to go against my intuition, I'm sure I just need an update of my intuition ;-) I wanted to improve on the following little example code: foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = foo (n - 1) + foo (n - 2) + foo (n - 3) This is obviously going to run into problems for large values of `n` so I introduced a state to keep intermediate results in: foo :: Int -> State (UArray Int Int) Int foo 0 = return 0 foo 1 = return 1 foo 2 = return 2 foo n = do c <- get if (c ! n) /= -1 then return $ c ! n else do r <- liftM3 (\ a b c -> a + b + c) (foo $ n - 1) (foo $ n - 2) (foo $ n - 3) modify (\ s -> s // [(n, r)]) return r Then I added a convenience function and called it like this: createArray :: Int -> UArray Int Int createArray n = array (0, n) (zip [0..n] (repeat (-1))) main = do (n:_) <- liftM (map read) getArgs print $ evalState (foo n) (createArray n) Then I thought that this still looks pretty deeply recursive, but if I call the function for increasing values of `n` then I'll simply build up the state, sort of like doing a for-loop in an imperative language. I could then end it with a call to `foo n` and be done. I replaced `main` by: main = do (n:_) <- liftM (map read) getArgs print $ evalState (mapM_ foo [0..n] >> foo n) (createArray n) Then I started profiling and found out that the latter version both uses more memory and makes far more calls to `foo`. That's not what I expected! (I suspect there's something about laziness I'm missing.) Anyway, I ran it with `n=35` and got foo n : 202,048 bytes , foo entries 100 mapM_ foo [0..n] >> foo n : 236,312 , foo entries 135 + 1 How should I think about this in order to predict this behaviour in the future? /M -- Magnus Therning (OpenPGP: 0xAB4DFBA4) magnus@therning.org Jabber: magnus@therning.org http://therning.org/magnus Haskell is an even 'redder' pill than Lisp or Scheme. -- PaulPotts

2008/12/16 Magnus Therning
This behaviour by Haskell seems to go against my intuition, I'm sure I just need an update of my intuition ;-)
I wanted to improve on the following little example code:
foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = foo (n - 1) + foo (n - 2) + foo (n - 3)
This is obviously going to run into problems for large values of `n` so I introduced a state to keep intermediate results in:
foo :: Int -> State (UArray Int Int) Int foo 0 = return 0 foo 1 = return 1 foo 2 = return 2 foo n = do c <- get if (c ! n) /= -1 then return $ c ! n else do r <- liftM3 (\ a b c -> a + b + c) (foo $ n - 1) (foo $ n - 2) (foo $ n - 3) modify (\ s -> s // [(n, r)]) return r
Then I added a convenience function and called it like this:
createArray :: Int -> UArray Int Int createArray n = array (0, n) (zip [0..n] (repeat (-1)))
main = do (n:_) <- liftM (map read) getArgs print $ evalState (foo n) (createArray n)
Then I thought that this still looks pretty deeply recursive, but if I call the function for increasing values of `n` then I'll simply build up the state, sort of like doing a for-loop in an imperative language. I could then end it with a call to `foo n` and be done. I replaced `main` by:
main = do (n:_) <- liftM (map read) getArgs print $ evalState (mapM_ foo [0..n] >> foo n) (createArray n)
Then I started profiling and found out that the latter version both uses more memory and makes far more calls to `foo`. That's not what I expected! (I suspect there's something about laziness I'm missing.)
Anyway, I ran it with `n=35` and got
foo n : 202,048 bytes , foo entries 100 mapM_ foo [0..n] >> foo n : 236,312 , foo entries 135 + 1
How should I think about this in order to predict this behaviour in the future?
Immutable arrays are duplicated every time you write to them. Making lots of small updates is going to be /very/ expensive. You have the right idea, though. Saving intermediate results is the right thing to do but arrays aren't the right way to do it. In this case, a lazy list will perform much better.
ack n = ackList !! n where ackList = 0:1:2:zipWith3 (\a b c -> a+b+c) ackList (drop 1 ackList) (drop 2 ackList)
-- Cheers, Lemmih

On Mon, Dec 15, 2008 at 11:33 PM, Lemmih
2008/12/16 Magnus Therning
: This behaviour by Haskell seems to go against my intuition, I'm sure I just need an update of my intuition ;-)
I wanted to improve on the following little example code:
foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = foo (n - 1) + foo (n - 2) + foo (n - 3)
This is obviously going to run into problems for large values of `n` so I introduced a state to keep intermediate results in:
foo :: Int -> State (UArray Int Int) Int foo 0 = return 0 foo 1 = return 1 foo 2 = return 2 foo n = do c <- get if (c ! n) /= -1 then return $ c ! n else do r <- liftM3 (\ a b c -> a + b + c) (foo $ n - 1) (foo $ n - 2) (foo $ n - 3) modify (\ s -> s // [(n, r)]) return r
Then I added a convenience function and called it like this:
createArray :: Int -> UArray Int Int createArray n = array (0, n) (zip [0..n] (repeat (-1)))
main = do (n:_) <- liftM (map read) getArgs print $ evalState (foo n) (createArray n)
Then I thought that this still looks pretty deeply recursive, but if I call the function for increasing values of `n` then I'll simply build up the state, sort of like doing a for-loop in an imperative language. I could then end it with a call to `foo n` and be done. I replaced `main` by:
main = do (n:_) <- liftM (map read) getArgs print $ evalState (mapM_ foo [0..n] >> foo n) (createArray n)
Then I started profiling and found out that the latter version both uses more memory and makes far more calls to `foo`. That's not what I expected! (I suspect there's something about laziness I'm missing.)
Anyway, I ran it with `n=35` and got
foo n : 202,048 bytes , foo entries 100 mapM_ foo [0..n] >> foo n : 236,312 , foo entries 135 + 1
How should I think about this in order to predict this behaviour in the future?
Immutable arrays are duplicated every time you write to them. Making lots of small updates is going to be /very/ expensive. You have the right idea, though. Saving intermediate results is the right thing to do but arrays aren't the right way to do it. In this case, a lazy list will perform much better.
ack n = ackList !! n where ackList = 0:1:2:zipWith3 (\a b c -> a+b+c) ackList (drop 1 ackList) (drop 2 ackList)
Ah, OK that does explain it. I understand your solution, but AFAICS it's geared towards limited recursion in a sense. What if I want to use memoization to speed up something like this foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = sum [foo i | i <- [0..n - 1]] That is, where each value depends on _all_ preceding values. AFAIK list access is linear, is there a type that is a more suitable state for this changed problem? I realise this particular function can be written using scanl: foo :: Int -> Int foo n = ackList !! n where ackList = 0:1:2:(drop 2 $ scanl1 (+) ackList) but I guess it's not always that easy to construct a solution based on scanl. Cheers, M -- Magnus Therning (OpenPGP: 0xAB4DFBA4) magnus@therning.org Jabber: magnus@therning.org http://therning.org/magnus identi.ca|twitter: magthe

On Tue, Dec 16, 2008 at 1:07 PM, Magnus Therning
On Mon, Dec 15, 2008 at 11:33 PM, Lemmih
wrote: 2008/12/16 Magnus Therning
: This behaviour by Haskell seems to go against my intuition, I'm sure I just need an update of my intuition ;-)
I wanted to improve on the following little example code:
foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = foo (n - 1) + foo (n - 2) + foo (n - 3)
This is obviously going to run into problems for large values of `n` so I introduced a state to keep intermediate results in:
foo :: Int -> State (UArray Int Int) Int foo 0 = return 0 foo 1 = return 1 foo 2 = return 2 foo n = do c <- get if (c ! n) /= -1 then return $ c ! n else do r <- liftM3 (\ a b c -> a + b + c) (foo $ n - 1) (foo $ n - 2) (foo $ n - 3) modify (\ s -> s // [(n, r)]) return r
Then I added a convenience function and called it like this:
createArray :: Int -> UArray Int Int createArray n = array (0, n) (zip [0..n] (repeat (-1)))
main = do (n:_) <- liftM (map read) getArgs print $ evalState (foo n) (createArray n)
Then I thought that this still looks pretty deeply recursive, but if I call the function for increasing values of `n` then I'll simply build up the state, sort of like doing a for-loop in an imperative language. I could then end it with a call to `foo n` and be done. I replaced `main` by:
main = do (n:_) <- liftM (map read) getArgs print $ evalState (mapM_ foo [0..n] >> foo n) (createArray n)
Then I started profiling and found out that the latter version both uses more memory and makes far more calls to `foo`. That's not what I expected! (I suspect there's something about laziness I'm missing.)
Anyway, I ran it with `n=35` and got
foo n : 202,048 bytes , foo entries 100 mapM_ foo [0..n] >> foo n : 236,312 , foo entries 135 + 1
How should I think about this in order to predict this behaviour in the future?
Immutable arrays are duplicated every time you write to them. Making lots of small updates is going to be /very/ expensive. You have the right idea, though. Saving intermediate results is the right thing to do but arrays aren't the right way to do it. In this case, a lazy list will perform much better.
ack n = ackList !! n where ackList = 0:1:2:zipWith3 (\a b c -> a+b+c) ackList (drop 1 ackList) (drop 2 ackList)
Ah, OK that does explain it.
I understand your solution, but AFAICS it's geared towards limited recursion in a sense. What if I want to use memoization to speed up something like this
foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = sum [foo i | i <- [0..n - 1]]
That is, where each value depends on _all_ preceding values. AFAIK list access is linear, is there a type that is a more suitable state for this changed problem?
You could use a Map or a mutable array. However, this kind of problem comes up a lot less often than you'd think. -- Cheers, Lemmih

On Tue, Dec 16, 2008 at 12:14 PM, Lemmih
You could use a Map or a mutable array. However, this kind of problem comes up a lot less often than you'd think.
Well, I happen to have a problem just like it right now, hence my interest :-) In order to better understand the different options I thought I'd look at solving simpler problems with similar "shape". Thanks for pointing me in the direction of mutable arrays, I haven't explored those before. /M -- Magnus Therning (OpenPGP: 0xAB4DFBA4) magnus@therning.org Jabber: magnus@therning.org http://therning.org/magnus identi.ca|twitter: magthe

This is actually a perfect case for lazy immutable arrays, if you use a circular program:
import Data.Array
foo' :: Array Int Int -> Int -> Int foo' arr 0 = 0 foo' arr 1 = 1 foo' arr 2 = 2 foo' arr n = arr!(n-1) + arr!(n-2) + arr!(n-3)
foo :: Int -> Int foo n = arr ! n where assocs = [(i, foo' arr i) | i <- [0..n]] arr = array (0,n) assocs
But I haven't checked its performance against your version, so I don't know how good it is. / Emil Magnus Therning skrev:
On Tue, Dec 16, 2008 at 12:14 PM, Lemmih
wrote: You could use a Map or a mutable array. However, this kind of problem comes up a lot less often than you'd think.
Well, I happen to have a problem just like it right now, hence my interest :-) In order to better understand the different options I thought I'd look at solving simpler problems with similar "shape".
Thanks for pointing me in the direction of mutable arrays, I haven't explored those before.
/M

Lemmih wrote:
On Tue, Dec 16, 2008 at 1:07 PM, Magnus Therning
wrote: I understand your solution, but AFAICS it's geared towards limited recursion in a sense. What if I want to use memoization to speed up something like this
foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = sum [foo i | i <- [0..n - 1]]
That is, where each value depends on _all_ preceding values. AFAIK list access is linear, is there a type that is a more suitable state for this changed problem?
You could use a Map or a mutable array. However, this kind of problem comes up a lot less often than you'd think.
And, that example is also easily memoizable by forward chaining (i.e. accumulators). The reason is because `sum` is such a simple function that you can decompose it as total_{t} = total_{t-1} + x_{t}; and given that x_{t} is defined to be the same as total_{t-1} we have: > foo i = foos !! i > where > foos = let s x = x : (s $! x + x) in 0 : 1 : 2 : s 3 You'd need to come up with an example where you replace `sum` with a function that is a non-decomposable combination of its input values. Lists are beautiful for being decomposable, which is why we have such nice functions as foldr and friends, so anything intuitive based on a list is most likely going to be decomposable as well. Coming up with a variadic function which isn't decomposable is extremely uncommon in practice. -- Live well, ~wren

2008/12/16 Magnus Therning
That is, where each value depends on _all_ preceding values. AFAIK list access is linear, is there a type that is a more suitable state for this changed problem?
I realise this particular function can be written using scanl: [...] but I guess it's not always that easy to construct a solution based on scanl.
You can always write something like
foo :: Int -> Int foo = (vals !!) where vals = map foo' [0..] foo' 0 = 0 foo' 1 = 1 foo' 2 = 2 foo' n = sum $ map foo [0..n-1]
which doesn't prevent you from using whatever recursive case you want. Note that if your recursive case depends on all preceding values, then you can't do better than using a linear access data structure like a list unless you need random access. I should point out as well that there are some packages on Hackage for memoization, like: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/data-memocombinat... http://hackage.haskell.org/cgi-bin/hackage-scripts/package/MemoTrie -- Felipe.

Hello, On Tuesday 16 December 2008 13:19, Felipe Lessa wrote:
2008/12/16 Magnus Therning
: That is, where each value depends on _all_ preceding values. AFAIK list access is linear, is there a type that is a more suitable state for this changed problem?
I realise this particular function can be written using scanl: [...] but I guess it's not always that easy to construct a solution based on scanl.
You can always write something like
foo :: Int -> Int foo = (vals !!) where vals = map foo' [0..] foo' 0 = 0 foo' 1 = 1 foo' 2 = 2 foo' n = sum $ map foo [0..n-1]
which doesn't prevent you from using whatever recursive case you want. Note that if your recursive case depends on all preceding values, then you can't do better than using a linear access data structure like a list unless you need random access.
Another possibility would be:
g n = t!n where t = array (0,max 2 n) $ (0,0):(1,1):(2,2):[ (i,t!(i-3) + t!(i-2) + t!(i-1)) | i <- [3..n] ]
using your original example. As noted in the Haskell 98 report, section 16.1 Array Construction, the array function is non-strict in the values of the association list, making this recurrence possible.
...
Best regards Thorkil

Or if you don't want to pay for laziness at all you could build your memo array imperatively (but purely):
import Data.Array.IArray(elems,(!),inRange) import Data.Array.MArray(newArray_,writeArray,readArray) import Data.Array.Unboxed(UArray) import Data.Array.ST(runSTUArray,STUArray) import Control.Monad(forM_) import Data.List(zipWith3)
ackMemoSize :: Int ackMemoSize = 12;
ackList :: [Int] ackList = 0:1:2:zipWith3 (\ i j k -> i+j+k) ackList (tail ackList) (tail (tail ackList))
ackMemo :: UArray Int Int ackMemo = runSTUArray $ do -- the $ works with ghc 6.10, hooray a <- newArray_ (0,ackMemoSize) writeArray a 0 0 writeArray a 1 1 writeArray a 2 2 let op i x | i > ackMemoSize = return () | otherwise = do writeArray a i x y <- readArray a (i-3) op (succ i) $! (2*x-y) -- could use (2*x) intead op 3 (0+1+2) return a
ack :: Int -> Int ack i | inRange (0,ackMemoSize) i = ackMemo ! i | otherwise = error "outsize memorized range for ack"
test = (take (succ ackMemoSize) ackList) == (elems ackMemo) && (ackList !! ackMemoSize) == (ack ackMemoSize)
Which should have very good performance in building ackMemo (the first time it is used). By changing the (2*x-y) to (2*x) I think you get the sum-of-all-previous-entries behavior. Cheers, Chris

Magnus Therning wrote:
This behaviour by Haskell seems to go against my intuition, I'm sure I just need an update of my intuition ;-)
I wanted to improve on the following little example code:
foo :: Int -> Int foo 0 = 0 foo 1 = 1 foo 2 = 2 foo n = foo (n - 1) + foo (n - 2) + foo (n - 3)
Two more ideas: How about -- "loop" keeping the last three elements of the sequence -- O(n) per call, constant memory foo' :: Int -> Int foo' n = go n 0 1 2 where go 0 a _ _ = a go n a b c = go (n - 1) b c (a + b + c) or -- analogue of the folklore fibonacci definition: -- fibs = 0 : 1 : zipWith (+) fibs (tail fibs) foos :: [Int] foos = 0 : 1 : 2 : zipWith3 (\a b c -> a + b + c) foos (tail foos) (tail (tail foos)) [snip]
Then I added a convenience function and called it like this:
createArray :: Int -> UArray Int Int createArray n = array (0, n) (zip [0..n] (repeat (-1)))
main = do (n:_) <- liftM (map read) getArgs print $ evalState (foo n) (createArray n)
[snip]
main = do (n:_) <- liftM (map read) getArgs print $ evalState (mapM_ foo [0..n] >> foo n) (createArray n)
Then I started profiling and found out that the latter version both uses more memory and makes far more calls to `foo`. That's not what I expected! (I suspect there's something about laziness I'm missing.)
Anyway, I ran it with `n=35` and got
foo n : 202,048 bytes , foo entries 100 mapM_ foo [0..n] >> foo n : 236,312 , foo entries 135 + 1
The number of function calls is to be expected: to evaluate foo n for the first time, you need to call foo (n-1), foo (n-2) and foo (n-3), making 4 calls per evaluated value. 36*4 = 144 is pretty close to 135. (The "missing" 9 calls correspond to foo 0, foo 1 and foo 2) The difference of 35 can be explained in the same way: the first version makes 35 fewer explicit calls to 'foo'. Bertram
participants (8)
-
Bertram Felgenhauer
-
ChrisK
-
Emil Axelsson
-
Felipe Lessa
-
Lemmih
-
Magnus Therning
-
Thorkil Naur
-
wren ng thornton