
Check for overfitting

A very high in-sample AUC, such as 99.9%, can indicate overfitting. It is also possible that your dataset is simply very well structured, or that your model really is terrific!

To check which of these is true, you need to produce out-of-sample estimates of your AUC, and because you don't want to touch your test set yet, you can produce these using cross-validation on your training set.

Your training data, customers_train, and the bagged tree specification, spec_bagged, are still available in your workspace.

This exercise is part of the course Machine Learning with Tree-Based Models in R.

Exercise instructions

  • Using fit_resamples(), estimate your roc_auc metric using three CV folds of your training set and the model formula still_customer ~ total_trans_amt + customer_age + education_level.
  • Collect the metrics of the result to display the AUC.

Hands-on interactive exercise

Complete this sample code to finish the exercise.

set.seed(55)

# Estimate AUC using cross-validation
cv_results <- fit_resamples(spec_bagged,
                            ___, 
                            resamples = vfold_cv(___),
                            metrics = ___)

# Collect metrics
___(cv_results)
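
One possible way to fill in the blanks is sketched below so that it runs outside the course workspace. The formula, the three folds, and the roc_auc metric come from the instructions. Everything else is an assumption made for illustration: the simulated customers_train data frame is a hypothetical stand-in for the course data, and the spec_bagged definition (rpart engine via the baguette package, 20 bagged trees) is a guess, since the exercise only says the specification already exists.

```r
library(tidymodels)  # parsnip, rsample, tune, yardstick
library(baguette)    # supplies the rpart engine for bag_tree()

set.seed(55)

# ASSUMPTION: simulated stand-in for customers_train (not the course data)
n <- 300
customers_train <- data.frame(
  total_trans_amt = runif(n, 500, 5000),
  customer_age    = sample(25:70, n, replace = TRUE),
  education_level = factor(sample(c("High School", "Graduate", "Doctorate"),
                                  n, replace = TRUE))
)
# Noisy outcome loosely driven by transaction amount
customers_train$still_customer <- factor(
  ifelse(customers_train$total_trans_amt + rnorm(n, sd = 1500) > 2500,
         "yes", "no")
)

# ASSUMPTION: the course's spec_bagged may be configured differently
spec_bagged <- bag_tree() %>%
  set_mode("classification") %>%
  set_engine("rpart", times = 20)

# Estimate AUC using 3-fold cross-validation on the training set
cv_results <- fit_resamples(
  spec_bagged,
  still_customer ~ total_trans_amt + customer_age + education_level,
  resamples = vfold_cv(customers_train, v = 3),
  metrics = metric_set(roc_auc)
)

# Collect metrics: one row with the mean roc_auc across the folds
collect_metrics(cv_results)
```

If the cross-validated AUC comes out well below the in-sample value, that gap points toward overfitting rather than an unusually good model.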