Hyperparameter tuning with RandomizedSearchCV

As you saw, GridSearchCV can be computationally expensive, especially if you are searching over a large hyperparameter space. In this case, you can use RandomizedSearchCV, which tests a fixed number of hyperparameter settings sampled from specified probability distributions.
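
To get a feel for the difference in cost, the sketch below counts the combinations in a small search space and then draws a fixed number of settings from it. The parameter values are illustrative assumptions, not the exercise's answer; ParameterGrid and ParameterSampler are the scikit-learn helpers for enumerating and sampling candidate settings.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Illustrative search space (assumed values, chosen only for this sketch)
space = {
    "C": np.linspace(0.1, 1.0, 50),
    "tol": np.linspace(0.0001, 1.0, 50),
    "penalty": ["l1", "l2"],
}

# An exhaustive grid contains every combination: 50 * 50 * 2
grid_size = len(ParameterGrid(space))

# A randomized search only draws n_iter settings from the same space
candidates = list(ParameterSampler(space, n_iter=10, random_state=0))

print(f"Grid search would evaluate {grid_size} settings per fold")
print(f"Randomized search evaluates {len(candidates)} settings per fold")
```

The number of sampled settings is controlled by n_iter, so the cost stays fixed no matter how large the search space grows.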

Training and test sets from diabetes_df have been preloaded for you as X_train, X_test, y_train, and y_test, where the target is "diabetes". A logistic regression model has been created and stored as logreg, along with a KFold variable stored as kf.

You will define a range of hyperparameters and use RandomizedSearchCV, which has been imported from sklearn.model_selection, to look for optimal hyperparameters from these options.
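
A range does not have to be an explicit list of values. As a minimal sketch, assuming SciPy is available, a hyperparameter can also be described by a distribution object that exposes an rvs method, such as scipy.stats.uniform. This exercise uses np.linspace instead, so the snippet is only an illustration of the alternative.

```python
from scipy.stats import uniform
from sklearn.model_selection import ParameterSampler

# uniform(loc, scale) covers the interval [loc, loc + scale], here [0.1, 1.0]
distributions = {
    "C": uniform(loc=0.1, scale=0.9),
    "penalty": ["l1", "l2"],  # lists are sampled uniformly
}

# Draw five candidate settings; random_state makes the draw repeatable
samples = list(ParameterSampler(distributions, n_iter=5, random_state=0))
for setting in samples:
    print(setting)
```

The same dictionary could be passed to RandomizedSearchCV as its parameter distributions.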

This exercise is part of the course

Supervised Learning with scikit-learn

Exercise instructions

Create params, adding "l1" and "l2" as penalty values, setting C to a range of 50 float values between 0.1 and 1.0, and class_weight to either "balanced" or a dictionary containing 0:0.8, 1:0.2.
Instantiate a RandomizedSearchCV object, passing the model and the parameters, and setting cv equal to kf.
Fit logreg_cv to the training data.
Print the model's best hyperparameters and accuracy score.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Create the parameter space
params = {"penalty": ["____", "____"],
         "tol": np.linspace(0.0001, 1.0, 50),
         "C": np.linspace(____, ____, ____),
         "class_weight": ["____", {0:____, 1:____}]}

# Instantiate the RandomizedSearchCV object
logreg_cv = ____(____, ____, cv=____)

# Fit the data to the model
logreg_cv.____(____, ____)

# Print the tuned parameters and score
print("Tuned Logistic Regression Parameters: {}".format(____.____))
print("Tuned Logistic Regression Best Accuracy Score: {}".format(____.____))
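
One way the completed template might look is sketched below. Because the preloaded diabetes data is not available outside the exercise, the sketch substitutes a synthetic dataset from make_classification. The KFold settings, the random_state values, and the liblinear solver (picked because it supports both "l1" and "l2" penalties) are assumptions, not the course's actual setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, RandomizedSearchCV, train_test_split

# Synthetic stand-in for the preloaded diabetes data (assumption)
X, y = make_classification(n_samples=400, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Assumed equivalents of the preloaded kf and logreg objects
kf = KFold(n_splits=5, shuffle=True, random_state=42)
logreg = LogisticRegression(solver="liblinear")

# Create the parameter space
params = {"penalty": ["l1", "l2"],
          "tol": np.linspace(0.0001, 1.0, 50),
          "C": np.linspace(0.1, 1.0, 50),
          "class_weight": ["balanced", {0: 0.8, 1: 0.2}]}

# Instantiate the RandomizedSearchCV object (n_iter defaults to 10)
logreg_cv = RandomizedSearchCV(logreg, params, cv=kf, random_state=42)

# Fit the data to the model
logreg_cv.fit(X_train, y_train)

# Print the tuned parameters and score
print("Tuned Logistic Regression Parameters: {}".format(logreg_cv.best_params_))
print("Tuned Logistic Regression Best Accuracy Score: {}".format(logreg_cv.best_score_))
```

The best_score_ attribute is the mean cross-validated score of the best setting, which for a classifier defaults to accuracy.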

In this chapter, you'll be introduced to classification problems and learn how to solve them using supervised learning techniques. You'll learn how to split data into training and test sets, fit a model, make predictions, and evaluate accuracy. You’ll discover the relationship between model complexity and performance, applying what you learn to a churn dataset, where you will classify the churn status of a telecom company's customers.

Exercise 1: Machine learning with scikit-learn Exercise 2: Binary classification Exercise 3: The supervised learning workflow Exercise 4: The classification challenge Exercise 5: k-Nearest Neighbors: Fit Exercise 6: k-Nearest Neighbors: Predict Exercise 7: Measuring model performance Exercise 8: Train/test split + computing accuracy Exercise 9: Overfitting and underfitting Exercise 10: Visualizing model complexity

In this chapter, you will be introduced to regression, and build models to predict sales values using a dataset on advertising expenditure. You will learn about the mechanics of linear regression and common performance metrics such as R-squared and root mean squared error. You will perform k-fold cross-validation, and apply regularization to regression models to reduce the risk of overfitting.

Exercise 1: Introduction to regression Exercise 2: Creating features Exercise 3: Building a linear regression model Exercise 4: Visualizing a linear regression model Exercise 5: The basics of linear regression Exercise 6: Fit and predict for regression Exercise 7: Regression performance Exercise 8: Cross-validation Exercise 9: Cross-validation for R-squared Exercise 10: Analyzing cross-validation metrics Exercise 11: Regularized regression Exercise 12: Regularized regression: Ridge Exercise 13: Lasso regression for feature importance

Having trained models, now you will learn how to evaluate them. In this chapter, you will be introduced to several metrics along with a visualization technique for analyzing classification model performance using scikit-learn. You will also learn how to optimize classification and regression models through the use of hyperparameter tuning.

Exercise 1: How good is your model? Exercise 2: Deciding on a primary metric Exercise 3: Assessing a diabetes prediction classifier Exercise 4: Logistic regression and the ROC curve Exercise 5: Building a logistic regression model Exercise 6: The ROC curve Exercise 7: ROC AUC Exercise 8: Hyperparameter tuning Exercise 9: Hyperparameter tuning with GridSearchCV Exercise 10: Hyperparameter tuning with RandomizedSearchCV

Learn how to impute missing values, convert categorical data to numeric values, scale data, evaluate multiple supervised learning models simultaneously, and build pipelines to streamline your workflow!

Exercise 1: Preprocessing data Exercise 2: Creating dummy variables Exercise 3: Regression with categorical features Exercise 4: Handling missing data Exercise 5: Dropping missing data Exercise 6: Pipeline for song genre prediction: I Exercise 7: Pipeline for song genre prediction: II Exercise 8: Centering and scaling Exercise 9: Centering and scaling for regression Exercise 10: Centering and scaling for classification Exercise 11: Evaluating multiple models Exercise 12: Visualizing regression model performance Exercise 13: Predicting on the test set Exercise 14: Visualizing classification model performance Exercise 15: Pipeline for predicting song popularity Exercise 16: Congratulations