Exercise

Tuning the number of features

The default hyperparameters used by your models are not optimized for your data. The goal of grid search cross-validation is to identify the hyperparameter values that give the best cross-validated model performance. In the video, you saw how the random forest's n_estimators hyperparameter was tuned. Here, you'll practice tuning the max_features hyperparameter. The cv parameter of the grid search (the number of cross-validation folds) is set to 3 so that the code executes quickly.

Hyperparameter    Purpose
max_features      Number of features for best split

A random forest is an ensemble of many decision trees. The n_estimators hyperparameter controls the number of trees in the forest, while the max_features hyperparameter controls the number of features the random forest considers when looking for the best split at each node of a decision tree.
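To make the two hyperparameters concrete, here is a minimal sketch that fits a small forest and inspects it afterwards. The synthetic dataset, the name forest, and the values 25 and 4 are assumptions chosen for illustration; they are not the exercise's data or settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption; the exercise's dataset is not shown here).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# 25 trees, each considering at most 4 features when searching for a split.
forest = RandomForestClassifier(n_estimators=25, max_features=4, random_state=0)
forest.fit(X, y)

# n_estimators determines how many fitted trees the forest holds.
n_trees = len(forest.estimators_)

# Each fitted tree records the resolved number of features per split.
features_per_split = forest.estimators_[0].max_features_

print(n_trees, features_per_split)
```

A smaller max_features makes the individual trees more different from one another, while a larger value lets each split choose from more candidates, which is why the setting is worth tuning.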

A random forest classifier has been instantiated for you as clf.
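As a rough sketch of what a grid search over max_features could look like, the example below builds its own classifier and synthetic data, since the exercise's clf and dataset are not shown here. The candidate values in param_grid and the n_estimators setting are assumptions, not the values the exercise expects.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption standing in for the exercise's dataset).
X, y = make_classification(n_samples=300, n_features=12, random_state=42)

# Stand-in for the pre-instantiated classifier named clf.
clf = RandomForestClassifier(n_estimators=30, random_state=42)

# Hypothetical candidate values for max_features.
param_grid = {"max_features": ["sqrt", "log2", 4, 8]}

# cv=3 keeps the search fast: each candidate is scored on 3 folds.
grid = GridSearchCV(clf, param_grid, cv=3)
grid.fit(X, y)

best_max_features = grid.best_params_["max_features"]
best_score = grid.best_score_  # mean cross-validated accuracy of the best candidate

print(best_max_features, best_score)
```

With 4 candidates and 3 folds, the search fits 12 models plus one final refit on the full data, so the cost grows quickly as the grid or the number of folds increases.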

Instructions 1/4

  • Import GridSearchCV from sklearn.model_selection.