Making predictions

1. Making predictions

In the last lesson we talked about how to visualize your model. For the rest of this chapter, we'll be looking at how we can use our estimated model to make predictions for observations in our dataset, and for new data.

2. Making predictions for observed data

We'll start by getting predictions for the observations in our kidiq dataset. We first estimate a model predicting the child's IQ from the mom's IQ and whether the mom completed high school. We can then get a predicted score at each iteration of the model for each kid by using the posterior_predict function. This returns a matrix that has a row for every iteration and a column for each observation. This means that we can get a posterior distribution of the predicted score for each child, just like when we were doing posterior predictive model checks in chapter three. In chapter three, we got these predictions using the posterior_linpred function. A benefit of using the posterior_predict function now is that we can also get predictions for new data that weren't used to estimate the model.
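
As a rough sketch of this step, the code below fits the model and calls posterior_predict on the observed data. It assumes the copy of kidiq that ships with the rstanarm package, with columns named kid_score, mom_iq, and mom_hs; the object names stan_model and posteriors are placeholders chosen for illustration.

```r
library(rstanarm)

# Assumption: the kidiq data bundled with rstanarm, which has the
# columns kid_score, mom_iq, and mom_hs (0 = no, 1 = yes).
data("kidiq", package = "rstanarm")

# Estimate the model with rstanarm's default sampling settings.
# refresh = 0 only silences the sampler's progress output.
stan_model <- stan_glm(kid_score ~ mom_iq + mom_hs,
                       data = kidiq, refresh = 0, seed = 123)

# One row per posterior draw, one column per child in kidiq.
posteriors <- posterior_predict(stan_model)
dim(posteriors)

# First 10 draws for the first 5 children.
posteriors[1:10, 1:5]
```

Each column of the result can be treated as a posterior distribution of the predicted score for one child.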

3. Making predictions for new data

To make predictions for new data, we first have to create the data we want to predict. For this example, we will predict the IQ for two children whose mothers both had an IQ of 110: one whose mother completed high school, and one whose mother didn't. For these predictions, we create a new data frame with the same variable names as our observed data. Our data frame then has two columns, one for each predictor in our model, and two rows, one for each prediction that we want to make.
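
A data frame like the one described might be built as follows. The column names mom_iq and mom_hs are an assumption taken from the kidiq data in the rstanarm package, and predict_data is a placeholder name; the key point is that the names match the predictors used when the model was estimated.

```r
# Two rows, one per prediction: both mothers have an IQ of 110,
# one did not complete high school (0) and one did (1).
# Assumption: the predictors are named mom_iq and mom_hs, and
# mom_hs is coded 0/1 as in rstanarm's kidiq data.
predict_data <- data.frame(
  mom_iq = 110,        # recycled to both rows
  mom_hs = c(0, 1)
)

predict_data
```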

4. Making predictions for new data

After creating the new data for predictions, we can supply this data frame to the newdata argument of the posterior_predict function. This creates predictions for the new data at all 4,000 draws from the posterior distributions. Here we can see the predicted scores for these observations at the first 10 iterations. We can also look at a summary for each column. Looking at the summaries, we see that the predicted scores for the observations with a mother who completed high school are consistently higher.
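
Put together, the steps on this slide could look something like the sketch below. As before, it assumes the kidiq data and column names from the rstanarm package, and the names stan_model, predict_data, and new_predictions, as well as the column labels, are illustrative choices rather than anything fixed by the lesson.

```r
library(rstanarm)

# Assumption: kidiq as bundled with rstanarm
# (columns kid_score, mom_iq, mom_hs).
data("kidiq", package = "rstanarm")

stan_model <- stan_glm(kid_score ~ mom_iq + mom_hs,
                       data = kidiq, refresh = 0, seed = 123)

# New data: mother's IQ of 110, without (0) and with (1) high school.
predict_data <- data.frame(mom_iq = 110, mom_hs = c(0, 1))

# Supplying newdata gives one column per row of predict_data
# and one row per posterior draw.
new_predictions <- posterior_predict(stan_model, newdata = predict_data)

# Label the columns so the output is easier to read.
colnames(new_predictions) <- c("No HS", "Completed HS")

# Predicted scores at the first 10 iterations.
new_predictions[1:10, ]

# Summary of the predictive distribution in each column.
summary(new_predictions)
```

Because each draw includes residual variation, individual rows can favor either child; the difference between the two groups shows up in the column summaries.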

5. Let's practice

Now it's your turn to make some predictions about the popularity of songs in the Spotify data!