Bigger mistakes, bigger penalty

Every error is still an error, but not all errors are equally bad. Sometimes a large prediction error is far more costly than a small one.

Bigger mistakes, bigger penalty: that is one characteristic of the root mean squared error, or RMSE. Because every error is squared before averaging, large errors such as outliers are penalized much more heavily than small ones.
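To make the penalty concrete, here is a minimal sketch in base R. The two error vectors and the helper names `mae` and `rmse_of_errors` are made up for illustration and are not part of the exercise. Both vectors have the same mean absolute error, yet the one containing a single large error has twice the RMSE.

```r
# Two hypothetical sets of prediction errors with the same total absolute error
errors_even    <- c(1, 1, 1, 1)  # four equally small errors
errors_outlier <- c(0, 0, 0, 4)  # one large error, three perfect predictions

# Illustrative helpers (not from any package)
mae            <- function(e) mean(abs(e))     # mean absolute error
rmse_of_errors <- function(e) sqrt(mean(e^2))  # root mean squared error

mae(errors_even)                # 1
mae(errors_outlier)             # 1
rmse_of_errors(errors_even)     # 1
rmse_of_errors(errors_outlier)  # 2
```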

RMSE can be computed with the following formula, where the \(i\)-th squared_diff is the square of the \(i\)-th error and \(n\) is the number of observations.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \mathrm{squared\_diff}_i}$$
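The formula can be traced step by step on a tiny example. This is only a sketch: the vectors `actual` and `predicted` are invented values, not data from the course.

```r
# Hypothetical true values and predictions (illustrative only)
actual    <- c(3, 5, 2, 4)
predicted <- c(2, 5, 4, 4)

# Step 1: square each error
squared_diffs <- (predicted - actual)^2  # 1 0 4 0

# Step 2: average the squared errors, then take the square root
n    <- length(squared_diffs)            # 4
rmse <- sqrt(1 / n * sum(squared_diffs)) # sqrt(1.25), about 1.118

rmse
```

Note that `1 / n * sum(x)` is the same as `mean(x)`, so `sqrt(mean(squared_diffs))` gives the identical result.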

In this exercise, you will calculate the RMSE of your predictions.

Your workspace contains the result of the previous exercise, test_enriched: the test data with a new column, .pred, that holds the model's out-of-sample predictions.

This exercise is part of the course

Machine Learning with Tree-Based Models in R

Exercise instructions

Calculate the element-wise difference between the predictions and the final grades, square it, and save the result as squared_diffs.
Use the formula above to compute the RMSE and save it as rmse_manual.
Use the rmse() function to compute the error and save it as rmse_auto.
Print rmse_manual and rmse_auto to verify that they are the same.
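The manual-versus-function comparison can be rehearsed on toy data first. The sketch below rests on two assumptions the page does not confirm: that rmse() comes from the yardstick package, and that the outcome column is called `final_grade` (a hypothetical name here). The data frame `toy` is invented, and the yardstick call is guarded so the code still runs when the package is not installed.

```r
# Hypothetical toy data; column names mimic an enriched test set
toy <- data.frame(
  final_grade = c(3.0, 2.5, 4.0, 3.5),  # assumed name of the true outcome
  .pred       = c(3.5, 2.5, 3.0, 3.5)   # model predictions
)

# Manual RMSE following the formula
toy_squared_diffs <- (toy$.pred - toy$final_grade)^2
toy_rmse_manual   <- sqrt(1 / nrow(toy) * sum(toy_squared_diffs))
print(toy_rmse_manual)

# Assumption: rmse() is yardstick::rmse(data, truth, estimate)
if (requireNamespace("yardstick", quietly = TRUE)) {
  toy_rmse_auto <- yardstick::rmse(toy, truth = final_grade, estimate = .pred)
  # The result is a one-row tibble; the value sits in the .estimate column
  print(toy_rmse_auto$.estimate)
}
```

If yardstick is available, both printed numbers should agree up to floating-point rounding.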

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Calculate the squared differences
squared_diffs <- (___ - ___)^___

# Compute the RMSE using the formula
rmse_manual <- ___(1 / ___ * ___)

# Compute the RMSE using a function
rmse_auto <- ___(___,
                 ___,
                 ___)

# Print both errors
___
___

