Compute predictions
In practice, we often want to use the fitted logistic regression to estimate probabilities and to construct confidence intervals for those estimates. Using the wells dataset and the model 'switch ~ arsenic', let's assume you have new observations, wells_test, which were not part of the training sample, and you wish to predict the probability of switching to the nearest safe well. You will do this with the help of the .predict() method.
Note that .predict() takes several arguments:
- exog: the new observations (test dataset)
- transform = True: applies the formula of the fit, y ~ x, to the data
If exog is not defined, the probabilities are computed for the training dataset.
Model wells_fit and datasets wells and wells_test are preloaded in the workspace.
This exercise is part of the course Generalized Linear Models in Python.
Exercise instructions
- Using the fitted model wells_fit, compute predictions on the test data wells_test and save them as prediction.
- Add prediction to the existing data frame wells_test and name the column prediction.
- Using print(), display the first 5 rows of wells_test with the columns switch, arsenic and prediction. Use the pandas function head() to view only the first 5 rows.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Compute predictions for the test sample wells_test and save as prediction
prediction = ____.predict(exog = ____)
# Add prediction to the existing data frame wells_test and assign column name prediction
____[____] = ____
# Examine the first 5 computed predictions
print(____[[____, ____, ____]].head())