Exercise

Removing correlated features

Highly correlated features are a stumbling block for many machine learning models, and they also make a model harder to interpret. Removing them is not guaranteed to improve performance, but the consensus is that this step is generally useful.

The caret package provides a neat function to detect highly correlated features: findCorrelation(). This function expects a pre-calculated correlation matrix (which you can easily get with the cor() function from the stats package) and, optionally, a cutoff for the absolute pairwise correlation (default is 0.9). It returns the indices of the columns it suggests removing. Try it out with a subset of the FIFA19 dataset, available in your workspace as fifa. As before, the dplyr package is already loaded.
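As a rough sketch of the workflow described above, the snippet below builds a correlation matrix with cor() and passes it to findCorrelation(). The data frame toy and its columns a, b, and c are hypothetical stand-ins for the numeric columns of fifa, which is not reproduced here; the example assumes the caret package is installed.

```r
library(caret)  # provides findCorrelation()

# Hypothetical stand-in for the numeric columns of `fifa`
set.seed(42)
n <- 200
toy <- data.frame(a = rnorm(n))
toy$b <- toy$a + rnorm(n, sd = 0.05)  # nearly a copy of `a`
toy$c <- rnorm(n)                     # unrelated to `a` and `b`

# Step 1: pairwise correlation matrix (cor() comes from the stats package)
cor_matrix <- cor(toy)

# Step 2: indices of the columns findCorrelation() flags for removal
to_drop <- findCorrelation(cor_matrix, cutoff = 0.9)

# Step 3: drop the flagged columns, guarding against an empty result
reduced <- if (length(to_drop) > 0) toy[, -to_drop, drop = FALSE] else toy
names(reduced)
```

The guard in step 3 matters because indexing with a negated empty vector would select no columns at all, so the no-correlation case is handled separately.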

Instructions 1/4
  • Glimpse at the fifa data.
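
One way to approach this step, shown on a hypothetical stand-in because the real fifa data frame only exists in the exercise workspace: glimpse() from dplyr prints one line per column with its type and first few values. The column names below are invented for illustration and are not taken from the dataset.

```r
library(dplyr)  # provides glimpse()

# Hypothetical stand-in; in the exercise you would call glimpse(fifa) instead
toy_fifa <- data.frame(
  age     = c(31L, 33L, 26L),
  overall = c(94L, 94L, 92L),
  speed   = c(86, 91, 94)
)

# Prints the dimensions, then each column's name, type, and leading values
glimpse(toy_fifa)
```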