Compare KNN and median imputation
All of the preprocessing steps in the train() function happen in the training set of each cross-validation fold, so the error metrics reported include the effects of the preprocessing. This includes the imputation method used (e.g. knnImpute or medianImpute). This is useful because it allows you to compare different methods of imputation and choose the one that performs the best out-of-sample.
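To make the fold-wise idea concrete, here is a minimal sketch of what "preprocessing happens in the training set" means for a single fold. It uses caret's preProcess() directly rather than train(); the toy data frame and the row split are hypothetical and only stand in for one cross-validation fold.

```r
library(caret)

# Hypothetical toy data: column a has missing values, column b does not
x <- data.frame(
  a = c(1, 2, NA, 4, 100, NA, 7, 8),
  b = c(5, 3, 8, 1, 9, 2, 6, 4)
)
train_rows <- 1:5  # stands in for the training part of one fold
test_rows  <- 6:8  # stands in for the held-out part of that fold

# Learn the imputation values from the training rows only
pp <- preProcess(x[train_rows, ], method = "medianImpute")

# Apply the learned values to both parts. The observed training values
# of a are 1, 2, 4 and 100, so the training median is 3, and that value
# fills the NA in the held-out rows as well.
train_imputed <- predict(pp, x[train_rows, ])
test_imputed  <- predict(pp, x[test_rows, ])

print(test_imputed)
```

Because the held-out rows never contribute to the imputation values, the error measured on them reflects the imputation step as well as the model.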
median_model and knn_model are available in your workspace, as is resamples, which contains the resampled results of both models. Look at the results of the models by calling dotplot(resamples, metric = "ROC") and choose the one that performs the best out-of-sample. Which method of imputation yields the highest out-of-sample ROC score for your glm model?
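For readers without the prepared workspace, the following is a rough sketch of how objects like these might be built. The simulated data, the fold count, and the name resamps (used here instead of resamples to avoid shadowing the caret function) are assumptions, not the course's actual setup. It also assumes the RANN package (needed by knnImpute) and the pROC package (needed by twoClassSummary) are installed.

```r
library(caret)  # assumes RANN and pROC are also installed

set.seed(42)
n <- 300

# Simulated predictors and a two-class outcome (hypothetical data)
x <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
y <- factor(ifelse(x$a + x$b + rnorm(n) > 0, "yes", "no"),
            levels = c("no", "yes"))

# Knock out 10% of the values in one predictor at random
x$a[sample(n, 30)] <- NA

# Use identical folds for both models so the resampled results are comparable
folds <- createFolds(y, k = 5, returnTrain = TRUE)
ctrl <- trainControl(
  method = "cv",
  index = folds,
  summaryFunction = twoClassSummary,
  classProbs = TRUE
)

# Same glm model, two imputation methods (x/y interface, since x contains NAs)
median_model <- train(x = x, y = y, method = "glm", metric = "ROC",
                      trControl = ctrl, preProcess = "medianImpute")
knn_model <- train(x = x, y = y, method = "glm", metric = "ROC",
                   trControl = ctrl, preProcess = "knnImpute")

# Collect the per-fold results and compare them
resamps <- resamples(list(median_model = median_model, knn_model = knn_model))
print(summary(resamps, metric = "ROC"))
print(dotplot(resamps, metric = "ROC"))
```

Note that knnImpute also centers and scales the predictors before imputing, so the two models differ in more than the fill-in values. Which method scores higher depends on the data, so no result is asserted here.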
This exercise is part of the course
Machine Learning with caret in R
