Exercise

Don't change the size of Rcpp vectors

Rcpp vector classes are designed as very thin wrappers around R vectors. The performance implication is that growing or shrinking one means creating a new vector of the right size and copying the relevant data into it. This is slow, so it should be avoided.

If possible, you should structure your code to first calculate the final size of the vector, and then allocate it with that size.

Let's see an example that selects the positive values from a vector, equivalent to x[x > 0] in R. Since you don't know in advance how many positive numbers there will be, it is tempting to start with a zero-length vector and append a value each time you find one. Here, push_back() is a function that appends a value.

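#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector bad_select_positive_values_cpp(NumericVector x) {
  // Start with an empty vector and grow it one element at a time
  NumericVector positive_x(0);
  for(int i = 0; i < x.size(); i++) {
    if(x[i] > 0) {
      // Each push_back() creates a new, longer vector and copies the data
      positive_x.push_back(x[i]);
    }
  }
  return positive_x;
}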

Unfortunately, this function will be slow because it must repeatedly create new vectors and copy the data. See if you can do better!
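
To get a feel for how that copying adds up, here is a rough sketch in plain C++ that only counts copies. It assumes that every append allocates a vector one element longer and copies all existing elements, and it uses std::vector purely as a stand-in so the sketch compiles without R. The names append_by_copy() and count_copies() are hypothetical helpers for illustration, not part of Rcpp.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: append by allocating a new buffer one element longer
// and copying the old contents (the assumed cost model for a thin wrapper).
std::vector<double> append_by_copy(const std::vector<double>& old_values,
                                   double value,
                                   std::size_t& copies) {
  std::vector<double> grown(old_values.size() + 1);
  for (std::size_t i = 0; i < old_values.size(); i++) {
    grown[i] = old_values[i];
    copies++;  // one element copied from the old buffer
  }
  grown[old_values.size()] = value;
  return grown;
}

// Hypothetical helper: total element copies made while appending k values
// one at a time under the cost model above.
std::size_t count_copies(std::size_t k) {
  std::size_t copies = 0;
  std::vector<double> values;
  for (std::size_t m = 0; m < k; m++) {
    values = append_by_copy(values, 1.0, copies);
  }
  return copies;
}
```

Under this model, appending k values copies 0 + 1 + ... + (k - 1) = k(k - 1)/2 elements in total, so the work grows quadratically with the number of values kept.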

Instructions
  • Complete the definition of a more efficient function, good_select_positive_values_cpp(), to select the positive numbers.
    • In the first for loop, if the ith element of x is greater than zero, add one to n_positive_elements.
    • After that for loop, allocate a numeric vector, positive_x, of size n_positive_elements.
    • In the second for loop, again check if the ith element of x is greater than zero.
    • When it is, set the jth element of positive_x to the ith element of x, then add one to j.
  • bad_select_positive_values_cpp() is available in your workspace for comparison. Examine the console output to see the benchmarked difference in running time.
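
The count-then-fill steps above can be sketched in plain C++ as follows. This is a minimal illustration rather than the exercise's reference solution: it assumes std::vector<double> as a stand-in for NumericVector so that it compiles without R, and the function name select_positive_two_pass() is hypothetical. The names n_positive_elements, positive_x, and j follow the instructions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for good_select_positive_values_cpp(), using
// std::vector<double> instead of Rcpp's NumericVector.
std::vector<double> select_positive_two_pass(const std::vector<double>& x) {
  // First pass: count the positive elements so the final size is known
  std::size_t n_positive_elements = 0;
  for (std::size_t i = 0; i < x.size(); i++) {
    if (x[i] > 0) {
      n_positive_elements++;
    }
  }

  // Allocate the result exactly once, at its final size
  std::vector<double> positive_x(n_positive_elements);

  // Second pass: copy each positive element into the next free slot j
  std::size_t j = 0;
  for (std::size_t i = 0; i < x.size(); i++) {
    if (x[i] > 0) {
      positive_x[j] = x[i];
      j++;
    }
  }
  return positive_x;
}
```

The input is scanned twice, but the result is allocated only once, so no element is copied more than once. In the Rcpp version, the same loop structure would apply with NumericVector in place of std::vector<double>.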