Last observation carried forward
When you have missing data in a time series, one common technique is to carry forward the last value that wasn't missing. This is known as last observation carried forward (LOCF). Missing values at the very start of the series have no earlier observation to copy, so they stay missing. It can naturally be expressed with iterative code. Here's an implementation using R:
na_locf1 <- function(x) {
  current <- NA
  res <- x
  for(i in seq_along(x)) {
    if(is.na(x[i])) {
      # Replace with current
      res[i] <- current
    } else {
      # Set current
      current <- x[i]
    }
  }
  res
}
Like rolling means, it's really difficult to vectorize this code while keeping it readable. However, since this is just a for loop, it can be easily translated to C++.
na_locf1() is provided in your workspace. Convert it to C++ and assign it to na_locf2().
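To show how directly the loop carries over, here is a minimal sketch in plain C++ without Rcpp. It is not the exercise solution: the function name locf_sketch is hypothetical, it works on std::vector<double> rather than NumericVector, and it assumes that NaN stands in for a missing value, whereas Rcpp code would test for R's NA through the Rcpp API.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: last observation carried forward on a plain vector.
// Assumption: NaN marks a missing value (a stand-in for R's NA).
std::vector<double> locf_sketch(const std::vector<double>& x) {
  // Nothing has been observed yet, so start from "missing"
  double current = std::numeric_limits<double>::quiet_NaN();
  // Work on a copy so the input is left untouched
  std::vector<double> res = x;
  for (std::size_t i = 0; i < x.size(); i++) {
    if (std::isnan(x[i])) {
      // Missing: fill with the last observed value (still NaN if none yet)
      res[i] = current;
    } else {
      // Observed: remember it for later gaps
      current = x[i];
    }
  }
  return res;
}
```

The structure mirrors na_locf1() line for line: one running value, one copy of the input, and a single pass over the elements. Only the missing-value test and the container type would change in an Rcpp version.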
This exercise is part of the course
Optimizing R Code with Rcpp
Exercise instructions
- Initialize current to the NumericVector's NA value.
- The if condition should check if the ith element of x is a NumericVector's NA.
- When that condition is true, set the ith element of res to current.
- Otherwise, set current to the ith element of x.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector na_locf2(NumericVector x) {
  // Initialize to NA
  double current = ___::___();

  int n = x.size();
  NumericVector res = clone(x);
  for(int i = 0; i < n; i++) {
    // If ith value of x is NA
    if(___::___(___)) {
      // Set ith result as current
      res[i] = ___;
    } else {
      // Set current as ith value of x
      current = ___;
    }
  }
  return res;
}
/*** R
library(microbenchmark)
set.seed(42)
x <- rnorm(1e5)
# Sprinkle some NA into x
x[sample(1e5, 100)] <- NA
microbenchmark(
  na_locf1(x),
  na_locf2(x),
  times = 5
)
*/