More on resamples

1. More on resamples

The resamples() function provides a ton of cool methods for comparing models.

2. Comparing models

It's one of my favorite functions in the caret package (thanks Max!) and actually inspired me to write my own package (caretEnsemble) for ensembling lists of caret models.
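
As a rough sketch of how a resamples object might be built, the code below fits a glmnet model and a random forest on the same cross-validation folds and collects their results. The simulated data from twoClassSim() is only a stand-in for the churn data, and the object names (dat, folds, ctrl, model_glmnet, model_rf) are made up for illustration. It assumes the caret, glmnet, and ranger packages are installed.

```r
library(caret)

set.seed(42)

# Simulated two-class data; an assumption standing in for the churn data.
dat <- twoClassSim(500)

# Both models must share the same folds, otherwise their
# fold-by-fold scores are not comparable.
folds <- createFolds(dat$Class, k = 10, returnTrain = TRUE)
ctrl <- trainControl(
  method = "cv",
  index = folds,
  classProbs = TRUE,                 # needed to compute AUC
  summaryFunction = twoClassSummary  # reports ROC (AUC), Sens, Spec
)

model_glmnet <- train(Class ~ ., data = dat, method = "glmnet",
                      metric = "ROC", trControl = ctrl)
model_rf <- train(Class ~ ., data = dat, method = "ranger",
                  metric = "ROC", trControl = ctrl)

# Collect the per-fold results of both models in one object.
resamps <- resamples(list(glmnet = model_glmnet, rf = model_rf))

# Per-model summary of each metric across the 10 folds.
summary(resamps)
```

Passing a named list to resamples() means the names (here glmnet and rf) are what show up as labels in the summary and in any plots.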

3. Box-and-whisker

Let's start with a simple box-and-whisker plot of AUC scores. We can use this to choose the model with the highest average AUC, which in this case is the random forest model.

4. Dot plot

We can also use a dotplot to show the same information in a visually simpler manner.

5. Density plot

A density plot shows the full distribution of AUC scores as a kernel density estimate, and can be a useful way to look for outlier folds with unusually high or low AUC.

6. Scatter plot

We can also use a scatterplot to directly compare the AUC on all 10 cross-validation folds. This plot shows us that on every fold, the random forest model provided higher AUC than the glmnet model, and would make us very confident in choosing the random forest model for this particular churn modeling problem.
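
The four plots described so far can be sketched with the lattice methods that caret provides for resamples objects. The setup is repeated here so the block runs on its own. As before, the simulated data and the object names are assumptions for illustration, not the churn data from the video, so the plots will not match the slides. The last lines check the "higher on every fold" claim numerically using the per-fold values stored in the resamples object.

```r
library(caret)

set.seed(42)

# Simulated stand-in for the churn data (assumption).
dat <- twoClassSim(500)
folds <- createFolds(dat$Class, k = 10, returnTrain = TRUE)
ctrl <- trainControl(method = "cv", index = folds,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

models <- list(
  glmnet = train(Class ~ ., data = dat, method = "glmnet",
                 metric = "ROC", trControl = ctrl),
  rf = train(Class ~ ., data = dat, method = "ranger",
             metric = "ROC", trControl = ctrl)
)
resamps <- resamples(models)

# In a function or a source()d script, wrap each call in print().
bwplot(resamps, metric = "ROC")       # box-and-whisker plot
dotplot(resamps, metric = "ROC")      # dot plot
densityplot(resamps, metric = "ROC")  # kernel density per model
xyplot(resamps, metric = "ROC")       # fold-by-fold scatter plot

# Per-fold AUC values live in columns named "<model>~<metric>".
auc <- resamps$values
rf_wins <- all(auc[["rf~ROC"]] > auc[["glmnet~ROC"]])
rf_wins  # TRUE only if the random forest wins on every fold
```

On the simulated data there is no guarantee that the random forest wins on every fold. The point is only that the scatter plot and the numeric check answer the same question.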

7. Another dot plot

Finally, if we had many models to compare (let's pretend we'd also fit an SVM, a GBM, and a decision tree model), we can still summarize them using the same functions. In this case, I typically choose the dotplot, which gives a very clean summary, even for dozens of models. Here, it seems that the random forest model gives us very good predictions on our churn data.

8. Let’s practice!

Let's explore the resamples plots in more detail.