Cross-validation

Cross-validation is a clever way to detect overfitting, as you have seen: it estimates how well a model performs on data it was not fitted on. In this exercise you will calculate the cross-validated accuracy.

You can go right ahead: the data defaultData and the model are already loaded for you. The accuracy function costAcc() is defined in the first few lines of the code. It serves as your cost function. Leave it as it is and pass it to cv.glm() below. Try it out!

This exercise is part of the course Machine Learning for Marketing Analytics in R.

Exercise instructions

Use 6-fold cross-validation to calculate the accuracy of the model logitModelNew. The function you need is cv.glm() from the boot package. The cross-validated accuracy is stored in the first element of delta in the result.
Compare your cross-validated accuracy with the in-sample accuracy. Remember, it was 0.7922901.
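
To make the idea behind the instructions concrete, here is a minimal sketch of what K-fold cross-validation does by hand. It uses a small simulated data set (the names toy, fold and foldAcc are hypothetical and not part of the exercise), and it assumes that the first element of delta corresponds to the fold-size-weighted average of the held-out cost.

```r
# Hypothetical toy data, NOT the course's defaultData
set.seed(1)
n <- 300
toy <- data.frame(x1 = rnorm(n))
toy$y <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * toy$x1))

# Randomly assign every row to one of K folds
K <- 6
fold <- sample(rep(1:K, length.out = n))

# Fit on K - 1 folds, measure accuracy on the held-out fold
foldAcc <- numeric(K)
for (k in 1:K) {
  train   <- toy[fold != k, ]
  heldOut <- toy[fold == k, ]
  fit <- glm(y ~ x1, family = binomial, data = train)
  p <- predict(fit, newdata = heldOut, type = "response")
  # Same 0.3 threshold as the exercise's accuracy function
  foldAcc[k] <- mean((p > 0.3) == (heldOut$y == 1))
}

# Average the held-out accuracies, weighted by fold size
cvAccManual <- weighted.mean(foldAcc, w = as.numeric(table(fold)))
cvAccManual
```

Because each accuracy is measured on rows the model did not see during fitting, a value clearly below the in-sample accuracy would hint at overfitting.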

Hands-on interactive exercise

Try this exercise by completing this sample code.

library(boot)
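# Accuracy function, used as the cost argument of cv.glm()
# r: observed 0/1 outcomes, pi: predicted probabilities
# confusion.matrix() is assumed to come from the SDMTools package,
# which is preloaded in the exercise environment
costAcc <- function(r, pi = 0) {
  cm <- confusion.matrix(r, pi, threshold = 0.3)
  acc <- sum(diag(cm)) / sum(cm)
  return(acc)
}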

# Cross validated accuracy for logitModelNew
set.seed(534381)
cv.glm(___, ___, cost = ___, K = ___)$delta[1]
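
As a rough illustration of the call pattern, the sketch below runs cv.glm() on simulated data rather than on the exercise objects. The names toy, toyModel and toyAcc are hypothetical, and toyAcc is a base R stand-in for an accuracy cost function, so the block does not depend on confusion.matrix().

```r
library(boot)

# Hypothetical toy data, NOT the course's defaultData
set.seed(1)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * toy$x1))

# Logistic regression fitted on the full toy data
toyModel <- glm(y ~ x1 + x2, family = binomial, data = toy)

# Accuracy as "cost": r = observed 0/1, pi = predicted probability
toyAcc <- function(r, pi = 0, threshold = 0.3) {
  mean((pi > threshold) == (r == 1))
}

# In-sample accuracy, for comparison
inSample <- toyAcc(toy$y, fitted(toyModel))

# 6-fold cross-validated accuracy: cv.glm(data, glmfit, cost, K)
set.seed(2)
cvAcc <- cv.glm(toy, toyModel, cost = toyAcc, K = 6)$delta[1]

c(inSample = inSample, crossValidated = cvAcc)
```

Note that the cost here is an accuracy, so higher values are better, unlike the default mean squared error cost of cv.glm().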
