Accuracy and loss functions

A simple measure of performance in binary classification is accuracy: the proportion of correctly classified observations.

Classification methods such as logistic regression aim to (approximately) minimize the number of incorrectly classified observations. The proportion of incorrectly classified observations can be thought of as a penalty (loss) function for the classifier: the lower the penalty, the better the classifier.
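
As a minimal sketch of how accuracy and this loss relate, the following uses two made-up logical vectors (`observed` and `predicted` are hypothetical and not part of the alc data). The two proportions always sum to one.

```r
# Hypothetical toy data, not taken from the alc dataset
observed  <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
predicted <- c(TRUE, FALSE, FALSE, TRUE, TRUE)

# accuracy: proportion of observations classified correctly
accuracy <- mean(observed == predicted)

# loss: proportion of observations classified incorrectly
error_rate <- mean(observed != predicted)

accuracy    # 0.6
error_rate  # 0.4
```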

Since we know how to make predictions with our model, we can also compute the proportion of incorrect predictions.
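
The steps from model to error rate can be sketched as follows. This is an illustration on simulated data, not the course's model `m` or dataset `alc`: the data frame `toy`, its columns `x`, `y` and `prediction`, and the 0.5 cut-off are assumptions made for the example.

```r
# Simulated toy data (assumption: one numeric predictor, one logical outcome)
set.seed(1)
toy <- data.frame(x = rnorm(100))
toy$y <- rbinom(100, size = 1, prob = plogis(1.5 * toy$x)) == 1

# Fit a logistic regression model
fit <- glm(y ~ x, data = toy, family = "binomial")

# Predicted probabilities on the training data
toy$probability <- predict(fit, type = "response")

# Classify as TRUE when the predicted probability exceeds 0.5
toy$prediction <- toy$probability > 0.5

# Proportion of incorrect predictions
error_rate <- mean(toy$prediction != toy$y)
error_rate
```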

This exercise is part of the course Helsinki Open Data Science.

Exercise instructions

Define the loss function loss_func.
Execute the call to the loss function with prob = 0, meaning you define the probability of high_use as zero for each individual. What is the interpretation of the resulting proportion?
Adjust the code: change the prob argument in the loss function to prob = 1. What kind of prediction does this correspond to? What is the interpretation of the resulting proportion?
Adjust the code again: change the prob argument by giving it the prediction probabilities in alc (the column probability). What is the interpretation of the resulting proportion?

Hands-on interactive exercise

Try this exercise by completing the sample code.

# the logistic regression model m and dataset alc with predictions are available

# define a loss function (mean prediction error)
loss_func <- function(class, prob) {
  # TRUE where the probability is on the wrong side of 0.5 for the observed class
  n_wrong <- abs(class - prob) > 0.5
  # proportion of wrong predictions
  mean(n_wrong)
}

# call loss_func to compute the proportion of wrong predictions in the (training) data
loss_func(class = alc$high_use, prob = 0)
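
To see how the three calls in the instructions behave, here is a small sketch on made-up values. The vectors `class` and `prob` are hypothetical stand-ins for `alc$high_use` and `alc$probability`; the results say nothing about the real data.

```r
# Same loss function as above, repeated so the example is self-contained
loss_func <- function(class, prob) {
  n_wrong <- abs(class - prob) > 0.5
  mean(n_wrong)
}

# Hypothetical observed classes and predicted probabilities
class <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
prob  <- c(0.8, 0.3, 0.6, 0.4, 0.1)

# prob = 0: everyone is predicted FALSE, so the loss is the share of TRUE cases
loss_all_false <- loss_func(class = class, prob = 0)     # 0.4

# prob = 1: everyone is predicted TRUE, so the loss is the share of FALSE cases
loss_all_true <- loss_func(class = class, prob = 1)      # 0.6

# individual probabilities: share of cases on the wrong side of 0.5
loss_model <- loss_func(class = class, prob = prob)      # 0.4
```

With a constant prob of 0 or 1 the function simply reports the proportion of one class, which gives a baseline against which the model's own error can be compared.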
