Rolling means

Rolling means (sometimes called moving averages) are used in time series analysis to smooth out noise. The value at each time point is replaced with the mean of the values at nearby time points (the window).
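As a small worked illustration, assuming a trailing window of three points and a made-up toy vector, the rolling mean can be computed by hand like this:

```r
# Hypothetical toy series, chosen only for illustration
x <- c(2, 4, 6, 8, 10)
window <- 3

# Trailing window: position i averages x[i - 2], x[i - 1], and x[i]
third  <- mean(x[1:3])  # (2 + 4 + 6) / 3
fourth <- mean(x[2:4])  # (4 + 6 + 8) / 3
fifth  <- mean(x[3:5])  # (6 + 8 + 10) / 3

# The first window - 1 positions have no complete window, so they stay NA
smoothed <- c(rep(NA_real_, window - 1), third, fourth, fifth)
print(smoothed)
```

Other conventions, such as a window centered on each point, are equally valid; the trailing window is used here only because it keeps the indexing simple.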

A natural way of writing this function is

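rollmean1 <- function(x, window = 3) {
  n <- length(x)
  # Positions without a complete trailing window stay NA
  res <- rep(NA_real_, n)
  for (i in seq(window, n)) {
    # Mean of the window values ending at position i
    res[i] <- mean(x[seq(i - window + 1, i)])
  }
  res
}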

This calls mean() once per window, so each element is summed up to window times, which is inefficient. One solution is to keep a running total: at each iteration of the loop, subtract the element that has left the window and add the one that has entered it.

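rollmean2 <- function(x, window = 3) {
  n <- length(x)
  res <- rep(NA_real_, n)
  # Running total of the current window, starting with the first one
  total <- sum(head(x, window))
  res[window] <- total / window
  # seq_len() gives an empty loop when n == window; seq(window + 1, n) would count down
  for (i in window + seq_len(n - window)) {
    # Add the newest value and drop the one that left the window
    total <- total + x[i] - x[i - window]
    res[i] <- total / window
  }
  res
}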

Both versions rely on an explicit loop. Looping code is often more natural to write than vectorized code, but in R it is usually slower. Before moving on to the C++ version in the next exercise, let's write a version that uses vectorization. rollmean1(), rollmean2(), and a random vector, x, are available in your workspace. You will now complete the function definition of rollmean3() and benchmark the performance of these functions.

Calculate initial_total, the sum of the first window elements of x.
Calculate other_totals as initial_total plus the cumulative sum (cumsum()) of lasts minus firsts.
Complete the output vector with the initial total divided by the window, followed by the other totals divided by the window.
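The three steps above can be sketched as follows. This is one possible completion, not necessarily the exercise's reference solution: the scaffold is not shown here, so the definitions of lasts (the values entering the window) and firsts (the values leaving it) are assumptions, as are the n >= window check and the seeded test vector. A copy of rollmean1() is included only so the block runs on its own, and base R's system.time() stands in for a dedicated benchmarking package.

```r
# Copy of rollmean1() from the text, used here as a reference
rollmean1 <- function(x, window = 3) {
  n <- length(x)
  res <- rep(NA, n)
  for (i in seq(window, n)) {
    res[i] <- mean(x[seq(i - window + 1, i)])
  }
  res
}

# Vectorized sketch; lasts and firsts are assumed definitions
rollmean3 <- function(x, window = 3) {
  n <- length(x)
  stopifnot(window >= 1, n >= window)  # assumed guard, not from the exercise

  # Sum of the first complete window
  initial_total <- sum(head(x, window))

  # Values entering the window (x[window + 1], ..., x[n]) and the
  # values leaving it (x[1], ..., x[n - window])
  lasts <- tail(x, -window)
  firsts <- head(x, -window)

  # Each later total is the initial total plus the accumulated changes
  other_totals <- initial_total + cumsum(lasts - firsts)

  # Pad the positions that have no complete window with NA
  c(rep(NA_real_, window - 1), initial_total / window, other_totals / window)
}

set.seed(42)      # arbitrary seed for a reproducible test vector
x <- rnorm(1e4)

# Agreement up to floating-point tolerance
print(all.equal(rollmean1(x), rollmean3(x)))

# Rough timings; results vary by machine
print(system.time(rollmean1(x)))
print(system.time(rollmean3(x)))
```

Because cumsum() accumulates floating-point rounding, the results may differ from rollmean1() in the last few digits, which is why the comparison uses all.equal() rather than identical().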
