Grid search
Hyperparameter tuning can be done in sklearn by supplying candidate values for each hyperparameter; these value lists can be built as plain Python lists or with numpy functions such as np.linspace(). One tuning method, grid search, exhaustively evaluates every combination of the hyperparameter values specified via param_grid. In this exercise, you will use grid search over the hyperparameters of a sample random forest classifier, with the AUC of the ROC curve as the scoring function.
X_train, y_train, X_test, y_test are available in your workspace. pandas as pd, numpy as np, and sklearn are also available in your workspace. Additionally, GridSearchCV() from sklearn.model_selection is available.
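To see concretely what "all combinations" means, here is a minimal sketch using ParameterGrid from sklearn.model_selection, which mirrors how GridSearchCV enumerates the candidate settings (the specific values here are only illustrative):

from sklearn.model_selection import ParameterGrid

# Two values for each of two hyperparameters -> 2 x 2 = 4 candidate models
param_grid = {'n_estimators': [10, 50], 'max_depth': [5, 20]}
for params in ParameterGrid(param_grid):
    print(params)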
This exercise is part of the course
Predicting CTR with Machine Learning in Python
Exercise instructions
- Create the lists of values for the hyperparameters n_estimators and max_depth.
- Create a random forest classifier.
- Set up a grid search to iterate over all hyperparameter combinations.
- Print out the best AUC score using .best_score_, and the best estimator that led to this score using .best_estimator_.
Hands-on interactive exercise
Try this exercise and complete the sample code.
# Create list of hyperparameters
n_estimators = [10, 50]
max_depth = [5, 20]
param_grid = {'n_estimators': ____, 'max_depth': ____}
# Use Grid search CV to find best parameters
print("starting RF grid search.. ")
rf = ____()
clf = ____(estimator = rf, param_grid = ____, scoring = 'roc_auc')
clf.fit(X_train, y_train)
print("Best Score: ")
print(clf.____)
print("Best Estimator: ")
print(clf.____)
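A possible completion is sketched below, assuming X_train and y_train exist as described above; the imports are included for self-containment, even though GridSearchCV is already available in the exercise workspace (the import of RandomForestClassifier from sklearn.ensemble is an assumption about the intended classifier):

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Create lists of candidate values for each hyperparameter
n_estimators = [10, 50]
max_depth = [5, 20]
param_grid = {'n_estimators': n_estimators, 'max_depth': max_depth}

# Use grid search CV to find the best parameters, scored by ROC AUC
print("starting RF grid search.. ")
rf = RandomForestClassifier()
clf = GridSearchCV(estimator=rf, param_grid=param_grid, scoring='roc_auc')
clf.fit(X_train, y_train)

# Report the best AUC score and the estimator that achieved it
print("Best Score: ")
print(clf.best_score_)
print("Best Estimator: ")
print(clf.best_estimator_)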