Binary predictions (1)

Once you have fitted a linear model, you can use it to make predictions. A basic question is how well the model actually predicts the target variable. Let's take a look!

The predict() function makes predictions from a model object. If predict() is not given any new data, it makes predictions for the same data that was used to fit (learn, train) the model.
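As a rough illustration of that default, the sketch below fits a model on simulated data and compares predict() with and without new data. The data frame `d`, its columns `x` and `y`, and the object names are hypothetical stand-ins, not part of the course data.

```r
# Hypothetical toy data standing in for a real data set
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 2 * d$x + rnorm(50)

# Fit a simple linear model
fit <- lm(y ~ x, data = d)

# No new data: one prediction per row used in fitting
p_default <- predict(fit)

# The same rows passed explicitly as new data
p_explicit <- predict(fit, newdata = d)

# The two sets of predictions should agree
same <- isTRUE(all.equal(unname(p_default), unname(p_explicit)))
same

# Genuinely new observations go in through 'newdata'
p_new <- predict(fit, newdata = data.frame(x = c(0, 1)))
length(p_new)
```

Passing `newdata` is only needed when predictions are wanted for observations the model has not seen.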

For a binary response variable, setting the 'type' argument of predict() to "response" returns the predictions as probabilities instead of log-odds, which is the default.
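The relationship between the two scales can be checked on simulated data. This is a minimal sketch under stated assumptions: the data frame `sim`, the coefficients used to generate it, and the object names are made up for illustration and are not the course's alc data.

```r
# Hypothetical simulated data with a binary (logical) response
set.seed(42)
n <- 200
sim <- data.frame(absences = rpois(n, 4))
sim$high_use <- rbinom(n, 1, plogis(-1.5 + 0.2 * sim$absences)) == 1

# Logistic regression
m_sim <- glm(high_use ~ absences, data = sim, family = "binomial")

# Default scale: the linear predictor, i.e. log-odds
log_odds <- predict(m_sim)

# Probability scale
probs <- predict(m_sim, type = "response")

# The inverse logit (plogis) maps log-odds to probabilities
scales_agree <- isTRUE(all.equal(plogis(log_odds), probs))
scales_agree

# Probabilities lie strictly between 0 and 1
range(probs)
```

Log-odds can be any real number, which makes them awkward to threshold directly; probabilities are easier to interpret and compare against a cutoff such as 0.5.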

This exercise is part of the course

Helsinki Open Data Science

Exercise instructions

  • Fit the logistic regression model with glm().
  • Create object probabilities by using predict() on the model object.
  • Mutate the alc data: add a column 'probability' with the predicted probabilities.
  • Mutate the data again: add a column 'prediction' which is TRUE if the value of 'probability' is greater than 0.5.
  • Look at the last ten observations of the data, along with the predictions.
  • Use table() to create a cross table of the columns 'high_use' versus 'prediction' in alc. This is sometimes called a 'confusion matrix'.
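The thresholding and cross-tabulation steps can be sketched on a small hand-made data frame. The values below are invented for illustration (the data frame `toy` is hypothetical and not the alc data), and the probabilities are typed in directly rather than produced by a model.

```r
library(dplyr)

# Hypothetical observed classes and predicted probabilities
toy <- data.frame(
  high_use    = c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE),
  probability = c(0.81, 0.12, 0.55, 0.43, 0.30, 0.67)
)

# Class prediction: TRUE when the probability exceeds 0.5
toy <- mutate(toy, prediction = probability > 0.5)

# Cross table of observed versus predicted classes
conf_mat <- table(high_use = toy$high_use, prediction = toy$prediction)
conf_mat
```

The diagonal cells count the observations where the predicted class matches the observed one; the off-diagonal cells count the two kinds of misclassification.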

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# alc, dplyr are available

# fit the model
m <- glm(high_use ~ failures + absences + sex, data = alc, family = "binomial")

# predict() the probability of high_use
probabilities <- predict(m, type = "response")

# add the predicted probabilities to 'alc'
alc <- mutate(alc, probability = probabilities)

# use the probabilities to make a prediction of high_use
alc <- mutate(alc, prediction = "change me!")

# see the last ten original classes, predicted probabilities, and class predictions
select(alc, failures, absences, sex, high_use, probability, prediction) %>% tail(10)

# tabulate the target variable versus the predictions
table(high_use = alc$high_use, prediction = "change me!")