Rolling means
Rolling means (sometimes called moving averages) are used in time series analysis to smooth out noise. The value at each time point is replaced with the mean of the values at nearby time points (the window).
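To make the definition concrete, here is a small hand-worked sketch. It assumes a trailing window of 3 (each window ends at the current position, as in the functions in this exercise) and an illustrative vector x that is not part of the exercise.

```r
# Illustrative data (an assumption, not the exercise's x)
x <- c(2, 4, 6, 8, 10)

# Trailing windows of width 3 ending at positions 3, 4 and 5
w3 <- mean(x[1:3])  # mean of 2, 4, 6
w4 <- mean(x[2:4])  # mean of 4, 6, 8
w5 <- mean(x[3:5])  # mean of 6, 8, 10

# The first two positions have no complete window, so they are NA
smoothed <- c(NA, NA, w3, w4, w5)
print(smoothed)
```

The first window - 1 positions stay NA because there are not yet enough earlier values to fill a window.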
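A natural way of writing this function is:
rollmean1 <- function(x, window = 3) {
  n <- length(x)
  # Positions before `window` have no complete window, so they stay NA
  res <- rep(NA, n)
  for (i in seq(window, n)) {
    # Mean of the `window` values ending at position i
    res[i] <- mean(x[seq(i - window + 1, i)])
  }
  res
}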
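This calls mean() once per window, which is inefficient because consecutive windows share most of their elements. One solution is to keep a running sum in a variable, total, so that each iteration of the loop subtracts the element that has left the window and adds the one that has entered it.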
rollmean2 <- function(x, window = 3) {
  n <- length(x)
  res <- rep(NA, n)
  # Sum of the first complete window
  total <- sum(head(x, window))
  res[window] <- total / window
  for (i in seq(window + 1, n)) {
    # Slide the window: add the new element, drop the oldest one
    total <- total + x[i] - x[i - window]
    res[i] <- total / window
  }
  res
}
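The running-total update can be checked on a single step. This is a minimal sketch with an illustrative vector (an assumption, not the exercise's x): sliding the window by one position should give the same sum as recomputing it from scratch.

```r
# Illustrative data (an assumption, not the exercise's x)
x <- c(2, 4, 6, 8, 10)
window <- 3

# Sum of the first window: 2 + 4 + 6
total <- sum(x[1:window])

# Slide the window one step: add x[4], drop x[4 - window] (that is, x[1])
total_next <- total + x[4] - x[4 - window]

# Recomputing the second window from scratch gives the same sum
stopifnot(total_next == sum(x[2:4]))
```

With non-integer data the running total can pick up small floating-point rounding differences relative to recomputing each window, so results are usually compared with all.equal() rather than ==.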
Either way, looping code is often more natural to write than vectorized code, but in R it tends to be slower. Both versions above are inefficient in their own way. Before writing the C++ version in the next exercise, let's write a version that uses vectorization.
rollmean1(), rollmean2(), and a random vector, x, are available in your workspace. You will now complete the function definition of rollmean3() and benchmark the performance of these functions.
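As a rough sketch of how such a benchmark might look outside the course workspace, the snippet below times the two loop-based versions with base R's system.time(). The copies of rollmean1() and rollmean2(), the seed, and the vector length are assumptions made so the snippet runs on its own; the microbenchmark package is a common choice when finer-grained timings are needed.

```r
# Copies of the two loop-based functions, so the snippet is self-contained
rollmean1 <- function(x, window = 3) {
  n <- length(x)
  res <- rep(NA, n)
  for (i in seq(window, n)) {
    res[i] <- mean(x[seq(i - window + 1, i)])
  }
  res
}

rollmean2 <- function(x, window = 3) {
  n <- length(x)
  res <- rep(NA, n)
  total <- sum(head(x, window))
  res[window] <- total / window
  for (i in seq(window + 1, n)) {
    total <- total + x[i] - x[i - window]
    res[i] <- total / window
  }
  res
}

set.seed(42)      # assumption: any seed will do
x <- rnorm(1e5)   # stand-in for the workspace vector x

# Time one run of each version and keep the results for comparison
t1 <- system.time(r1 <- rollmean1(x))
t2 <- system.time(r2 <- rollmean2(x))

# Both versions should agree up to floating-point rounding
stopifnot(isTRUE(all.equal(r1, r2)))

# Elapsed wall-clock seconds for each version
print(c(rollmean1 = t1[["elapsed"]], rollmean2 = t2[["elapsed"]]))
```

Timings depend on the machine and on the vector length, so a single run is only indicative; repeating the measurement gives a steadier picture.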
This exercise is part of the course Optimizing R Code with Rcpp.
Have a go at this exercise by completing this sample code.
# Complete the definition of rollmean3()
rollmean3 <- function(x, window = 3) {
  # Add the first window elements of x
  initial_total <- ___(head(x, window))
  # The elements to add at each iteration
  lasts <- tail(x, -window)
  # The elements to remove
  firsts <- head(x, -window)
  # Take the initial total and add the
  # cumulative sum of lasts minus firsts
  other_totals <- ___ + ___(___ - firsts)
  # Build the output vector
  c(
    rep(NA, window - 1),  # leading NA
    initial_total / ___,  # initial mean
    other_totals / ___    # other means
  )
}
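For comparison, here is a different vectorized formulation, offered as a sketch rather than as the intended completion of the template above. Instead of accumulating differences, it takes differences of a padded cumulative sum. The name rollmean_cumsum() and the handling of inputs shorter than the window are assumptions introduced for illustration.

```r
# Hypothetical helper: rolling mean via differences of a cumulative sum
rollmean_cumsum <- function(x, window = 3) {
  n <- length(x)
  # Assumption: inputs shorter than the window yield all NA
  if (n < window) return(rep(NA_real_, n))
  # cs[k + 1] holds the sum of the first k elements of x (cs[1] is 0)
  cs <- cumsum(c(0, x))
  # Sum of each window = cumulative sum at its end minus the one before its start
  totals <- cs[(window + 1):(n + 1)] - cs[1:(n - window + 1)]
  c(rep(NA_real_, window - 1), totals / window)
}

rollmean_cumsum(c(2, 4, 6, 8, 10))
# NA NA 4 6 8
```

This avoids an explicit loop, but subtracting two large cumulative sums can lose some floating-point precision on long vectors with large values, so it is worth comparing against a loop-based version with all.equal().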