Comparing predicted values
In the previous exercise, you fitted both a linear model and a GLM (logistic) regression model to the crab data, predicting y with width. In other words, you wanted to predict the probability that a female crab has a satellite crab nearby, given her width.
In this exercise, you will further examine the estimated probabilities (the output) from the two models and try to deduce if the linear fit would be suitable for the problem at hand.
The usual practice is to test the model on new, unseen data. Such a dataset is called a test sample.
The test sample has been created for you and loaded in the workspace. Note that you need test values for all variables present in the model, which in this example is width.
The crab dataset has been preloaded in the workspace.
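To see why the suitability of the linear fit is in question, the following minimal sketch compares a straight-line prediction with a logistic one over a range of widths. The coefficients and the function names linear_prediction and logistic_prediction are hypothetical, chosen only for illustration; they are not the values fitted to the crab data.

```python
import math

# Hypothetical coefficients for illustration only (NOT fitted to the crab data)
LM_INTERCEPT, LM_SLOPE = -1.8, 0.09
GLM_INTERCEPT, GLM_SLOPE = -12.0, 0.5


def linear_prediction(width):
    """Straight-line 'probability': nothing keeps it inside [0, 1]."""
    return LM_INTERCEPT + LM_SLOPE * width


def logistic_prediction(width):
    """Inverse logit of the linear predictor: always strictly between 0 and 1."""
    eta = GLM_INTERCEPT + GLM_SLOPE * width
    return 1.0 / (1.0 + math.exp(-eta))


widths = [17.0, 22.0, 26.0, 30.0, 34.0]
for w in widths:
    print(f"width={w:5.1f}  linear={linear_prediction(w):6.3f}  "
          f"logistic={logistic_prediction(w):6.3f}")
```

With these assumed coefficients, the straight line drops below 0 for narrow crabs and rises above 1 for wide ones, while the logistic curve stays within the valid range for a probability.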
This exercise is part of the course
Generalized Linear Models in Python
Exercise instructions
- Using print(), view the test set.
- Using the test sample, compute estimated probabilities using .predict() on the fitted linear model model_LM and save as pred_lm. Also, compute estimated probabilities using .predict() on the fitted GLM (logistic) model saved as model_GLM and save as pred_glm.
- Using pandas DataFrame(), combine predictions from both models and save as predictions.
- Concatenate test and predictions and save as all_data. View all_data using print().
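The combining and concatenating steps rely on pandas aligning rows by index. The sketch below illustrates this with made-up numbers: the widths, the prediction values, and the index labels are all hypothetical stand-ins, not output from the fitted models.

```python
import pandas as pd

# Hypothetical test sample with a non-default index
test = pd.DataFrame({'width': [22.5, 26.0, 30.1]}, index=[10, 11, 12])

# Made-up predictions standing in for the output of .predict();
# they share the index of the test sample
pred_lm = pd.Series([0.31, 0.63, 1.02], index=test.index)
pred_glm = pd.Series([0.28, 0.66, 0.93], index=test.index)

# Combine both sets of predictions into one DataFrame
predictions = pd.DataFrame({'Pred_LM': pred_lm, 'Pred_GLM': pred_glm})

# axis=1 places the frames side by side, matching rows on the index
all_data = pd.concat([test, predictions], axis=1)
print(all_data)
```

Because pd.concat() with axis=1 matches rows by index label rather than by position, the two frames line up correctly only when their indexes agree, as they do here.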
Interactive exercise
Complete the sample code to successfully finish this exercise.
# View test set
print(____)
# Compute estimated probabilities for linear model: pred_lm
____ = model_LM.____(____)
# Compute estimated probabilities for GLM model: pred_glm
____ = model_GLM.____(____)
# Create dataframe of predictions for linear and GLM model: predictions
____ = pd.DataFrame({'Pred_LM': ____, 'Pred_GLM': ____})
# Concatenate test sample and predictions and view the results
all_data = pd.concat([____, ____], axis = 1)
print(____)