Hyperparameter tuning with RandomizedSearchCV
As you saw, GridSearchCV can be computationally expensive, especially if you are searching over a large hyperparameter space. In this case, you can use RandomizedSearchCV, which tests a fixed number of hyperparameter settings from specified probability distributions.
Training and test sets from diabetes_df have been pre-loaded for you as X_train, X_test, y_train, and y_test, where the target is "diabetes". A logistic regression model has been created and stored as logreg, as well as a KFold variable stored as kf.
You will define a range of hyperparameters and use RandomizedSearchCV, which has been imported from sklearn.model_selection, to look for optimal hyperparameters from these options.
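To make the idea of testing a fixed number of settings concrete, here is a minimal sketch. It assumes synthetic data from make_classification in place of diabetes_df, a continuous uniform distribution for C, and an illustrative n_iter of 5; none of these choices come from the exercise itself.

```python
from scipy.stats import uniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, RandomizedSearchCV

# Synthetic stand-in data (assumption: the exercise uses diabetes_df instead)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
model = LogisticRegression(max_iter=1000)

# C is drawn from a continuous uniform distribution on [0.1, 1.0]
param_distributions = {"C": uniform(loc=0.1, scale=0.9)}

# n_iter fixes how many settings are sampled and evaluated
search = RandomizedSearchCV(
    model, param_distributions, n_iter=5, cv=kf, random_state=0
)
search.fit(X, y)

# One entry per sampled setting, regardless of how large the space is
sampled_settings = search.cv_results_["params"]
print(len(sampled_settings))
```

The number of evaluated settings is controlled by n_iter rather than by the size of the search space, which is what keeps the cost bounded.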
This exercise is part of the course
Supervised Learning with scikit-learn
Exercise instructions
- Create params, adding "l1" and "l2" as penalty values, setting C to a range of 50 float values between 0.1 and 1.0, and class_weight to either "balanced" or a dictionary containing 0:0.8, 1:0.2.
- Create the Randomized Search CV object, passing the model and the parameters, and setting cv equal to kf.
- Fit logreg_cv to the training data.
- Print the model's best parameters and accuracy score.
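As a rough illustration of why random sampling helps here, the sketch below counts how many combinations an exhaustive grid over this space would contain. It assumes the values described in the instructions plus the 50 tol values from the sample code, and uses ParameterGrid only as a counting aid; it is not part of the exercise.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Parameter space as described in the instructions (assumed values)
params = {
    "penalty": ["l1", "l2"],
    "tol": np.linspace(0.0001, 1.0, 50),
    "C": np.linspace(0.1, 1.0, 50),
    "class_weight": ["balanced", {0: 0.8, 1: 0.2}],
}

# Size of the full grid: 2 penalties x 50 tol x 50 C x 2 class weights
n_combinations = len(ParameterGrid(params))
print(n_combinations)
```

An exhaustive search would fit every one of these 10,000 combinations on each fold, whereas RandomizedSearchCV samples only n_iter of them (10 by default).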
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Create the parameter space
params = {"penalty": ["____", "____"],
          "tol": np.linspace(0.0001, 1.0, 50),
          "C": np.linspace(____, ____, ____),
          "class_weight": ["____", {0:____, 1:____}]}
# Instantiate the RandomizedSearchCV object
logreg_cv = ____(____, ____, cv=____)
# Fit the data to the model
logreg_cv.____(____, ____)
# Print the tuned parameters and score
print("Tuned Logistic Regression Parameters: {}".format(____.____))
print("Tuned Logistic Regression Best Accuracy Score: {}".format(____.____))
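
The same workflow can be sketched end to end on stand-in data. This is one possible arrangement under stated assumptions, not the exercise's reference solution: the data comes from make_classification rather than diabetes_df, the random_state values are arbitrary, and logreg is built with solver="liblinear" because that solver accepts both "l1" and "l2" penalties (the pre-loaded logreg may be configured differently).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, RandomizedSearchCV, train_test_split

# Synthetic stand-in for the pre-loaded diabetes data (assumption)
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
# liblinear is an assumption here: it supports both l1 and l2 penalties
logreg = LogisticRegression(solver="liblinear")

# Parameter space following the exercise instructions
params = {
    "penalty": ["l1", "l2"],
    "tol": np.linspace(0.0001, 1.0, 50),
    "C": np.linspace(0.1, 1.0, 50),
    "class_weight": ["balanced", {0: 0.8, 1: 0.2}],
}

# Sample settings at random (n_iter defaults to 10) and cross-validate each
logreg_cv = RandomizedSearchCV(logreg, params, cv=kf, random_state=42)
logreg_cv.fit(X_train, y_train)

# best_params_ holds the winning setting, best_score_ its mean CV accuracy
print("Tuned Logistic Regression Parameters: {}".format(logreg_cv.best_params_))
print("Tuned Logistic Regression Best Accuracy Score: {}".format(logreg_cv.best_score_))
```

Because the settings are sampled at random, the best parameters found depend on random_state and on n_iter; a different seed may select a different combination.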