Boosting contest: Light vs Extreme

Although the CatBoost model performed reasonably well, let's try two other flavors of boosting and see which one comes out ahead: the "Light" or the "Extreme" approach.

CatBoost is highly recommended when the data contains categorical features. In this case, all features are numeric, so one of the other approaches might perform better.

Because we are building a regressor, we'll use an additional parameter, objective, which specifies the learning objective, that is, the loss function the model minimizes. To apply squared error, we'll set objective to 'reg:squarederror' for XGBoost and 'mean_squared_error' for LightGBM.
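To make the role of a squared-error objective more concrete, here is a minimal sketch of gradient boosting on one feature, using decision stumps and only the Python standard library. With squared error, the negative gradient of the loss is simply the residual, so each round fits a stump to the current residuals. The helper names (fit_stump, boost, mse) and the toy data are made up for illustration; this is not how XGBoost or LightGBM are implemented internally.

```python
# Toy gradient boosting with a squared-error objective (illustrative sketch only).

def fit_stump(x, residuals):
    """Find the single split on x that minimizes squared error on the residuals."""
    best = None
    # Exclude the largest value so the right-hand side is never empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def boost(x, y, n_estimators=100, learning_rate=0.1):
    """Fit an additive model of stumps, each trained on the current residuals."""
    base = sum(y) / len(y)            # initial prediction: the mean of y
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        # For the loss 0.5 * (y - pred)^2, the negative gradient is y - pred.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        threshold, left_value, right_value = fit_stump(x, residuals)
        stumps.append((threshold, left_value, right_value))
        pred = [pi + learning_rate * (left_value if xi <= threshold else right_value)
                for xi, pi in zip(x, pred)]
    return base, stumps, pred


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data, not the course dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]

base, stumps, pred = boost(x, y)
mse_start = mse(y, [base] * len(y))
mse_end = mse(y, pred)
print('MSE before boosting: {:.3f}, after: {:.3f}'.format(mse_start, mse_end))
```

The n_estimators and learning_rate arguments mirror the roles of the parameters with the same names in the exercise, although the real libraries add regularization and many other refinements.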

In addition, we'll set the n_jobs parameter for XGBoost so that training runs on multiple threads, which reduces computation time.

NOTE: be careful not to use classifiers here, or your session might expire!

This exercise is part of the course Ensemble Methods in Python.

Exercise instructions

Build an XGBRegressor with the parameters: max_depth = 3, learning_rate = 0.1, n_estimators = 100, and n_jobs = 2.
Build an LGBMRegressor with the parameters: max_depth = 3, learning_rate = 0.1, and n_estimators = 100.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Build and fit an XGBoost regressor
reg_xgb = ____.____(____, ____, ____, ____, objective='reg:squarederror', random_state=500)
reg_xgb.fit(X_train, y_train)

# Build and fit a LightGBM regressor
reg_lgb = ____.____(____, ____, ____, objective='mean_squared_error', seed=500)
reg_lgb.fit(X_train, y_train)

# Calculate the predictions and evaluate both regressors
pred_xgb = reg_xgb.predict(X_test)
rmse_xgb = np.sqrt(mean_squared_error(y_test, pred_xgb))
pred_lgb = reg_lgb.predict(X_test)
rmse_lgb = np.sqrt(mean_squared_error(y_test, pred_lgb))

print('Extreme: {:.3f}, Light: {:.3f}'.format(rmse_xgb, rmse_lgb))
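The evaluation step above compares the two regressors by root mean squared error (RMSE). As a rough sketch of what np.sqrt(mean_squared_error(...)) computes, the same quantity can be written with the standard library alone. The rmse helper and the demo values are hypothetical and are not results from the course data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the average squared difference."""
    if len(y_true) != len(y_pred):
        raise ValueError('y_true and y_pred must have the same length')
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up targets and predictions from two imaginary models.
y_true_demo = [3.0, 5.0, 7.0]
pred_a = [2.0, 5.0, 8.0]    # errors: -1, 0, +1
pred_b = [3.0, 5.0, 10.0]   # errors: 0, 0, +3

print('Model A: {:.3f}, Model B: {:.3f}'.format(rmse(y_true_demo, pred_a),
                                                rmse(y_true_demo, pred_b)))
```

A lower RMSE means predictions are closer to the targets on average. Because the errors are squared before averaging, one large miss (as in the second imaginary model) is penalized more heavily than several small ones.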

