Dimensionality Reduction in R


Exercise

Reduce data using feature importances

Now that you have created a full random forest model, you will explore feature importance.

Even though random forest models naturally — but implicitly — perform feature selection, it is often advantageous to build a reduced model. A reduced model trains faster, computes predictions faster, and is easier to understand and manage. Of course, it is always a trade-off between model simplicity and model performance.

In this exercise, you will reduce the data set. In the next exercise, you will fit a reduced model and compare its performance to the full model. rf_fit, train, and test are provided for you.

The tidyverse, tidymodels, and vip packages have been loaded for you.

Instructions

  • Use vi() with the rank parameter to extract the ten most important features.
  • Add the target variable back to the top feature list.
  • Apply the top feature mask to reduce the data sets (see the sketch after this list).
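
As a rough sketch of how these steps could look, assuming the target column is named outcome (a placeholder; substitute the actual target variable of your data) and that rf_fit was trained with variable importance enabled:

# Rank features by importance; with rank = TRUE the scores are converted
# to integer ranks (1 = most important), so Importance <= 10 keeps the top ten
top_features <- rf_fit %>%
  vi(rank = TRUE) %>%
  filter(Importance <= 10) %>%
  pull(Variable)

# Add the target variable back so it survives the column selection
# ("outcome" is a placeholder for the actual target column name)
top_features <- c("outcome", top_features)

# Apply the top feature mask to reduce both data sets
train_reduced <- train %>% select(all_of(top_features))
test_reduced  <- test %>% select(all_of(top_features))

If you prefer not to rely on the rank values, an equivalent approach is to keep the default sorted vi() output, take the first ten rows with slice_head(n = 10), and then pull the variable names.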