Simple Moving Average of a list of real numbers

Hello ! Could anybody explain me how to calculate simple moving average of a list ? I have found several examples in internet but I completely don't understand how it works. Basically it's necessary to iterate over the list of real numbers and calculate average values over last n items in the list. I just can't imagine how to do it without loops. -- Best regards, Alex

On Mon, Nov 25, 2013 at 4:28 PM, Alexandr M
Hello !
Could anybody explain me how to calculate simple moving average of a list ?
I have found several examples in internet but I completely don't understand how it works.
Basically it's necessary to iterate over the list of real numbers and calculate average values over last n items in the list.
I just can't imagine how to do it without loops.
Who needs loops when you have recursion, or more to the point good combinators ! You can do it in a few lines using only splitAt, length (to check the result of splitAt), zip, and scanl. (and sum, +, - and /, of course). Ask again, if that's not enough. -- Jedaï

It reminds me of k-means.
On Tue, Nov 26, 2013 at 2:33 AM, Chaddaï Fouché
On Mon, Nov 25, 2013 at 4:28 PM, Alexandr M
wrote: Hello !
Could anybody explain me how to calculate simple moving average of a list ?
I have found several examples in internet but I completely don't understand how it works.
Basically it's necessary to iterate over the list of real numbers and calculate average values over last n items in the list.
I just can't imagine how to do it without loops.
Who needs loops when you have recursion, or more to the point good combinators ! You can do it in a few lines using only splitAt, length (to check the result of splitAt), zip, and scanl. (and sum, +, - and /, of course).
Ask again, if that's not enough.
-- Jedaï
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

In functional programming, loops are implemented using recursion and tailcalls. You have the power of loops beyond what "for" and "while" can express. On Tuesday, November 26, 2013, Alexandr M wrote:
Hello !
Could anybody explain me how to calculate simple moving average of a list ?
I have found several examples in internet but I completely don't understand how it works.
Basically it's necessary to iterate over the list of real numbers and calculate average values over last n items in the list.
I just can't imagine how to do it without loops.
-- Best regards, Alex

Hi Alex.
On 25 November 2013 15:28, Alexandr M
Could anybody explain me how to calculate simple moving average of a list ?
I have found several examples in internet but I completely don't understand how it works.
Basically it's necessary to iterate over the list of real numbers and calculate average values over last n items in the list.
I just can't imagine how to do it without loops.
Maybe we can use "inits" from "Data.List". Prelude> import Data.List Prelude Data.List> mapM_ print $ inits [1..10] [] [1] [1,2] [1,2,3] [1,2,3,4] [1,2,3,4,5] [1,2,3,4,5,6] [1,2,3,4,5,6,7] [1,2,3,4,5,6,7,8] [1,2,3,4,5,6,7,8,9] [1,2,3,4,5,6,7,8,9,10] Not a bad start. It gives us all the *initial segments* of a list. Now we need to keep the last n items for each of these. Let's see. Maybe we can define a function using "drop" and "length" as follows. Prelude Data.List> let lastN n xs = drop (length xs - n) xs This function should drop everything but the last n items in a list. If there are less than n items, it will return the list unchanged. Try it on a few inputs to see why this is the case. Also try "drop" on a few inputs. And the type of this function seems sensible enough: Prelude Data.List> :t lastN lastN :: Int -> [a] -> [a] Now: Prelude Data.List> mapM_ print $ map (lastN 3) $ inits [1..10] [] [1] [1,2] [1,2,3] [2,3,4] [3,4,5] [4,5,6] [5,6,7] [6,7,8] [7,8,9] [8,9,10] This seems pretty close to what you want. For each item in the list, we have a corresponding list of n items that are coming just before it. If only we had a function to calculate the average of a list of numbers, then we could: Prelude Data.List> mapM_ print $ map average $ map (lastN 3) $ inits [1..10] NaN 1.0 1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 I've defined average locally here to demonstrate the use of it. I'll leave the definition of it as an exercise to you. Of course, depending on what you want to do with these averages, you can implement the same functionality in different ways. Maybe you only need the last one only or something else. I don't know. Also you can *fuse* the two maps and use only one map, but I wanted to keep things explicit. Hope this helps, Ozgur.

On 11/25/2013 10:28 AM, Alexandr M wrote:
Hello !
Could anybody explain me how to calculate simple moving average of a list ?
I have found several examples in internet but I completely don't understand how it works.
Basically it's necessary to iterate over the list of real numbers and calculate average values over last n items in the list.
I just can't imagine how to do it without loops.
When you need to know the position of something in a list, one easy way to get it is to "zip" the original list with another list of "positions", [1,2,3,...]. Then when you're processing the list, you're not just dealing with some element somewhere in the list, you have the ordered pair (item, position) which you probably know what to do with. Inline below is some code that uses this idea along with a "scan" (similar to a fold) to accomplish the moving average. Here it is in action: Prelude> :l moving_average.hs [1 of 1] Compiling Main ( moving_average.hs, interpreted ) Ok, modules loaded: Main. *Main> let samples = [1,2,3,4,5] :: [Double] *Main> let mavg = moving_average samples *Main> print mavg [(1.0,1.0),(1.5,2.0),(2.0,3.0),(2.5,4.0),(3.0,5.0)] If you only want the averages (and not the positions), the positions are easy to drop: *Main> print $ map fst mavg [1.0,1.5,2.0,2.5,3.0] Code below. I've tried to avoid line wrapping, but buyer beware. module Main where -- | Create a list whose nth entry is a pair containing the average of -- the first n elements of the given samples, along with the -- position n. We use a "scan" to do this; a scan is like a fold -- except it keeps track of the intermediate values and not just the -- final one. -- -- We can use this to our advantage; the nth average is just (n-1) -- times the previous average, added to the current item, and then -- divided by n. So we will thread a pair, (average, position), -- through the list computing each average from the previous one as -- we go. -- -- The result should be a list of all averages we generated along -- the way. If you want to drop the positions from the result, you -- can simply call `map fst` on it. -- moving_average :: (Fractional a) => [a] -> [(a,a)] moving_average [] = [] moving_average samples = -- scanl1 uses the first entry of samples_pos as the first entry in -- the result. Since we don't need to do any averaging of the first -- item (it gets divided by one), this is what we want. scanl1 average samples_pos where -- The indices we'll use for the positions in the list. We use -- fromIntegral on all of them because we want to be able to -- divide by them, so they need to be converted to something -- Fractional (GHC will figure it out). positions = map fromIntegral [1..] -- The list samples_pos contains the original samples "zipped" with -- their position in the list. The entries of samples_pos should be -- pairs (x,y) where x was te original entry in the list, and y -- was its position. This lets us know which number we need to -- divide by for the average samples_pos = zip samples positions -- This is the function that does the work. It takes the previous -- (average, position) pair and the new (value, position) pair and -- uses them to generate the new (average, position). average :: (Fractional b) => (b,b) -> (b,b) -> (b,b) average (prev_avg, prev_pos) (sample, pos) = (new_avg, pos) where prev_sum = prev_avg * prev_pos new_avg = (prev_sum + sample) / pos

On Wed, Nov 27, 2013 at 12:11 AM, Michael Orlitzky
Inline below is some code that uses this idea along with a "scan" (similar to a fold) to accomplish the moving average.
Hi Michael, When I read OP's request, I understood it as what Ozgur had interpreted, i.e. last N items for a /constant/ N, also known as a moving window average. Your running average solution is interesting too! -- Kim-Ee

On 11/26/2013 12:54 PM, Kim-Ee Yeoh wrote:
Hi Michael,
When I read OP's request, I understood it as what Ozgur had interpreted, i.e. last N items for a /constant/ N, also known as a moving window average.
Your running average solution is interesting too!
Oh, sorry. Statistics is down the hall =) You need to modify the scanl1 solution a little bit, but it should still work somewhat-efficiently only making one pass through the list and not using any partial functions. If the "window size" n is fixed, we need more information at each point in the list: 1. The item itself 2. What number we should subtract from the previous sum 3. What the previous divisor was 4. What the new divisor is (I'm assuming you want the first n items to be the running average, i.e. my first implementation.) The divisor that we'll use at each point is, [1, 2, ..., n, n, n...] and that can be zipped with the samples just like before. The other thing that we need is the item that will be "dropped" from the average. If the window is fixed, we'll be dropping one number and adding one number every time we move to a new item in the list (once we've processed n of them, at least). Before we've processed n of them, we don't want to subtract anything from the previous sum -- we want the running average. We can accomplish this by sticking n-1 zeros on the head of the samples, and zipping *that* with the samples. I haven't commented this version, but if you understand the last one, you can figure out what it does. Here's the example Ozgur used: *Main> :l moving_average.hs [1 of 1] Compiling Main ( moving_average.hs, interpreted ) Ok, modules loaded: Main. *Main> let samples = [1,2,3,4,5,6,7,8,9,10] :: [Float] *Main> moving_average 3 samples [1.0,1.5,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0] (They agree.) It can do a million items pretty quickly considering we're in ghci: *Main> let samples = [1..1000000] :: [Float] *Main> :set +s *Main> length $ moving_average 10 samples 1000000 (1.08 secs, 545400640 bytes) -------- module Main where fst3 :: (a,b,c) -> a fst3 (x, _, _) = x moving_average :: (Fractional a) => Int -> [a] -> [a] moving_average _ [] = [] moving_average n samples | n <= 0 = [] | otherwise = map fst3 $ scanl1 average sample_triples where divisors = map fromIntegral $ [1..n] ++ (repeat n) n_agos = (map fromIntegral (replicate (n-1) 0)) ++ samples sample_triples = zip3 samples divisors n_agos average :: (Fractional b) => (b,b,b) -> (b,b,b) -> (b,b,b) average (prev_avg, prev_div, dropme) (sample, divisor, n_ago) = (new_avg, divisor, n_ago) where prev_sum = prev_avg * prev_div new_avg = (prev_sum + sample - dropme) / divisor

Oh, sorry. Statistics is down the hall =)
There is another often used method which might be worth considering. If you don't need the absolute accuracy that the "boxcar" moving average gives, then perhaps a exponentially weighted moving average (aka: exponential filter) will do. It has the advantage of being easier to code, requiring less CPU power (normally), and needing only one variable for long term storage. There a good description at: http://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average Simply put, you save only the result, not the individual sample. To update it with a new sample you first remove the "equivalent" of one sample then add your new sample. How do you remove the oldest sample? You can't, since you didn't save it. Instead you remove 1/n'th (for averaging over n samples) of the previous result, then add the new sample. If you think about it, the longer a sample has been part of the result, the less effect it has on the result - exponentially less. This is why I said it isn't as accurate as the boxcar method. But for many situations (most of the ones I've run into) it suffices and indeed can be better since it dampens (filters) step changes. As an equation: average_new = average_old - (average_old / n) + (sample_new / n) Or tweaking for efficiency: average_new = average_old + ( (sample_new - average_old) / n ) Note: You need to use floating point numbers for the calculation and resulting average, even if the samples are integers. Also, choosing a binary number for n should help speed up the division, if the math pack takes advantage of it. I realize that this doesn't directly address the question which Alex posed, but I thought that perhaps it might be of interest as an alternative. John John Souvestre - New Orleans LA - (504) 454-0899
participants (8)
-
Alexandr M
-
Chaddaï Fouché
-
John Souvestre
-
Kim-Ee Yeoh
-
Michael Orlitzky
-
Ozgur Akgun
-
Patrick Redmond
-
yi lu