Compute predictions

In practice, a fitted logistic regression is often used to estimate probabilities and to construct confidence intervals for those estimates. Using the wells dataset and the model 'switch ~ arsenic', assume you have new observations, wells_test, that were not part of the training sample, and you wish to predict the probability of switching to the nearest safe well.

You will do this with the help of the .predict() method.

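Note that .predict() accepts several arguments, including:

exog - the new observations (the test dataset)
transform = True - passes exog through the model formula y ~ x before predicting, so the same transformations used in the fit are applied to the new data. This is the default.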

If exog is not supplied, the probabilities are computed for the training dataset.

The model wells_fit and the datasets wells and wells_test are preloaded in the workspace.

Using the fitted model wells_fit, compute predictions on the test data wells_test and save them as prediction.
Add prediction to the existing data frame wells_test as a new column named prediction.
Using print(), display the columns switch, arsenic and prediction of wells_test. Use the pandas method head() to view only the first 5 rows.
