Tuning the number of features
The default hyperparameters used by your models are not optimized for your data. The goal of grid search cross-validation is to identify the hyperparameter values that lead to optimal model performance. In the video, you saw how the random forest's n_estimators hyperparameter was tuned. Here, you'll practice tuning the max_features hyperparameter. The cv argument of GridSearchCV is set to 3 so that the code executes quickly.
| Hyperparameter | Purpose |
|---|---|
| max_features | Number of features for best split |
A random forest is an ensemble of many decision trees. The n_estimators hyperparameter controls the number of trees in the forest, while the max_features hyperparameter controls the number of features each decision tree considers when looking for the best split. A random forest classifier has been instantiated for you as clf.
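For reference, a classifier like clf could be instantiated as sketched below. The hyperparameter values shown are placeholders, not the settings used in the exercise.

# Illustrative sketch only: clf is already created for you in the exercise,
# and these hyperparameter values are placeholders.
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100,     # number of trees in the forest
                             max_features='sqrt',  # features considered at each split
                             random_state=42)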
This exercise is part of the course Marketing Analytics: Predicting Customer Churn in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import GridSearchCV
from sklearn.model_selection import GridSearchCV
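One way the rest of the exercise could be completed is sketched below. The candidate max_features values and the training data names X and y are assumptions; the exercise itself provides the data and the pre-instantiated clf.

# Define a grid of candidate values for max_features (example values)
param_grid = {'max_features': ['sqrt', 'log2', None]}

# Instantiate the grid search with 3-fold cross-validation
grid_search = GridSearchCV(clf, param_grid, cv=3)

# Fit the grid search to the training data (assumed here to be X and y)
grid_search.fit(X, y)

# Print the best value of max_features found by the search
print(grid_search.best_params_)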