Comparing predicted values

In the previous exercise, you fitted both a linear model and a GLM (logistic) regression model to the crab data, predicting y with width. In other words, you wanted to predict the probability that a female crab has a satellite crab nearby, given her width.

In this exercise, you will further examine the estimated probabilities (the output) from the two models and judge whether the linear fit is suitable for the problem at hand.

The usual practice is to test the model on new, unseen data. Such a dataset is called a test sample.
The test sample has been created for you and loaded in the workspace as test. Note that you need test values for all variables present in the model, which in this example is just width.

The crab dataset has been preloaded in the workspace.

Using print(), view the test set.
Using the test sample, compute estimated probabilities with .predict() on the fitted linear model model_LM and save them as pred_lm. Likewise, compute estimated probabilities with .predict() on the fitted GLM (logistic) model model_GLM and save them as pred_glm.
Using the pandas DataFrame() constructor, combine the predictions from both models and save the result as predictions.
Concatenate test and predictions and save the result as all_data. View all_data using print().
