Hyperparameter tuning with RandomizedSearchCV
As you saw, GridSearchCV can be computationally expensive, especially if you are searching over a large hyperparameter space. In this case, you can use RandomizedSearchCV, which tests a fixed number of hyperparameter settings from specified probability distributions.
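To see what "a fixed number of settings" means in practice, here is a minimal, self-contained sketch on toy data; the dataset, the candidate grid, and the n_iter value are illustrative assumptions, not part of this exercise. A grid search would evaluate all 50 candidate values of C, whereas the randomized search below samples and evaluates only 10 of them.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Toy data, just to make the snippet runnable
X, y = make_classification(n_samples=200, random_state=42)

# 50 candidate values for C, but only n_iter=10 of them are sampled and evaluated
param_distributions = {"C": np.linspace(0.1, 1.0, 50)}
search = RandomizedSearchCV(LogisticRegression(), param_distributions,
                            n_iter=10, cv=5, random_state=42)
search.fit(X, y)
print(search.best_params_)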
Training and test sets from diabetes_df have been pre-loaded for you as X_train, X_test, y_train, and y_test, where the target is "diabetes". A logistic regression model has been created and stored as logreg, along with a KFold variable stored as kf.
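If you want to reproduce this setup outside the exercise environment, one possible sketch follows; diabetes_df is taken as given per the exercise, and the split proportions, KFold settings, and random seeds are assumptions.

# A possible local setup; diabetes_df is assumed to be a pre-loaded DataFrame
# with a binary "diabetes" target column. Split size and seeds are assumptions.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, KFold

X = diabetes_df.drop("diabetes", axis=1).values
y = diabetes_df["diabetes"].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
logreg = LogisticRegression()
kf = KFold(n_splits=6, shuffle=True, random_state=42)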
You will define a range of hyperparameters and use RandomizedSearchCV, which has been imported from sklearn.model_selection, to look for optimal hyperparameters from these options.
This exercise is part of the course Supervised Learning with scikit-learn.
Exercise instructions
- Create params, adding "l1" and "l2" as penalty values, setting C to a range of 50 float values between 0.1 and 1.0, and class_weight to either "balanced" or a dictionary containing 0:0.8, 1:0.2.
- Create the RandomizedSearchCV object, passing the model and the parameters, and setting cv equal to kf.
- Fit logreg_cv to the training data.
- Print the model's best parameters and accuracy score.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the parameter space
params = {"penalty": ["____", "____"],
          "tol": np.linspace(0.0001, 1.0, 50),
          "C": np.linspace(____, ____, ____),
          "class_weight": ["____", {0: ____, 1: ____}]}
# Instantiate the RandomizedSearchCV object
logreg_cv = ____(____, ____, cv=____)
# Fit the data to the model
logreg_cv.____(____, ____)
# Print the tuned parameters and score
print("Tuned Logistic Regression Parameters: {}".format(____.____))
print("Tuned Logistic Regression Best Accuracy Score: {}".format(____.____))