Plotting a regression model

1. Plotting a regression model

To conclude this chapter on Poisson regression, you will learn how to visualize the model estimates and compare them to a linear model fit. The comparison is worthwhile because the normal distribution is a good approximation to the Poisson distribution when the mean is above 30 or so.
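As a rough illustration of that approximation, the sketch below compares the Poisson probabilities with a normal density that has the same mean and variance, once for a mean of 3 and once for a mean of 30. The helper names and the choice of "largest gap relative to the normal peak" as the measure are assumptions made for this example only.

```python
import math

def poisson_pmf(k, lam):
    # Evaluate in log space so that k! does not overflow for larger counts
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def relative_gap(lam):
    # Largest pointwise gap between the Poisson pmf and the normal density
    # with matching mean and variance, scaled by the height of the normal peak
    upper = int(lam + 10 * math.sqrt(lam)) + 1
    gap = max(abs(poisson_pmf(k, lam) - normal_pdf(k, lam, lam)) for k in range(upper))
    return gap / normal_pdf(lam, lam, lam)

gap_small = relative_gap(3)
gap_large = relative_gap(30)
print(f"mean 3:  relative gap {gap_small:.3f}")
print(f"mean 30: relative gap {gap_large:.3f}")
```

The gap for the larger mean comes out smaller, which is the sense in which the normal approximation improves as the mean grows.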

2. Import libraries

For visualizations we will use two libraries, seaborn and matplotlib's pyplot module, which we import as sns and plt respectively. As an example, we will take the horseshoe crab model we fitted earlier, where sat is predicted by the width of the female crab.

3. Plot data points

Let's start with the initial plot of data points. First, we define the desired figure size to be 8 by 5. To plot the data points we will use the regplot function from the seaborn library, where x is the width variable, y is the sat variable, and the data argument specifies the dataframe that holds both. We also need to set the argument fit_reg to False, so that the linear model is not fitted.

4. Add jitter

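Since we are dealing with count data, many data points overlap. To see them more clearly, we add a small amount of noise to the plotted points with the argument y_jitter. The noise only changes how the points are drawn; it does not alter the data used for fitting.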
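The two plotting steps so far could look like the sketch below. The crab dataframe here is a simulated stand-in (an assumption, since the real data is not part of this transcript), and the jitter size of 0.3 is likewise an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Simulated stand-in for the crab data (assumption, not the real dataset)
rng = np.random.default_rng(0)
width = rng.uniform(21, 34, size=150)
crab = pd.DataFrame({"width": width,
                     "sat": rng.poisson(np.exp(-3.3 + 0.16 * width))})

# Figure size of 8 by 5, as in the walkthrough
fig = plt.figure(figsize=(8, 5))

# Points only: fit_reg=False suppresses the linear fit,
# y_jitter spreads overlapping counts vertically
ax = sns.regplot(x="width", y="sat", data=crab,
                 y_jitter=0.3, fit_reg=False)
```

In an interactive session, plt.show() would display the figure.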

5. Add linear fit

To compare linear and Poisson regression, we add a linear fit to the visualization by setting fit_reg to True. Additionally, we need to define the color for the fit, since by default everything is plotted in the same color. Using the argument line_kws we set the color to green and label the fit as LM fit; the label is used if we later add a legend to the plot.
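A minimal sketch of the linear-fit step follows. As before, the crab dataframe is simulated (an assumption), and the explicit ax.legend() call is added here only to show the label being picked up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Simulated stand-in for the crab data (assumption, not the real dataset)
rng = np.random.default_rng(0)
width = rng.uniform(21, 34, size=150)
crab = pd.DataFrame({"width": width,
                     "sat": rng.poisson(np.exp(-3.3 + 0.16 * width))})

plt.figure(figsize=(8, 5))

# fit_reg=True draws the linear fit; line_kws styles and labels that line
ax = sns.regplot(x="width", y="sat", data=crab,
                 y_jitter=0.3, fit_reg=True,
                 line_kws={"color": "green", "label": "LM fit"})

# The label only becomes visible once a legend is requested
ax.legend()
```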

6. Add Poisson GLM estimated values

Finally, to add the fit of the Poisson regression, we plot the fitted values of the Poisson model. First, we extract the fitted values from the fitted model with its fittedvalues attribute and add them to the crab dataframe as a fit_values column. Using the seaborn scatterplot function, we plot the fitted values on the current plot, coloring the points red and labeling them Poisson; the label is now displayed in the legend automatically. Notice how the two regression models provide similar predictions over the range of width values where most of the observations occur but diverge for smaller and larger values of width.

7. Predictions

Now that we have visualized the model, let's revisit the computation of predictions as we did for logistic regression. The Poisson fit is given in the figure on the right.

8. Predictions

Now let's assume we would like to estimate the number of satellites for crab widths of 24, 28 and 32 centimeters. Using the predict function, we obtain a prediction of 1.88 for a width of 24, meaning that the mean number of satellites is 1.88; rounding to the nearest integer, we expect 2 satellites.

9. Predictions

Similarly, for a width of 28 we expect 3.6, or about 4, satellites.

10. Predictions

Finally, for a width of 32 we would expect 7 satellites.
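These predictions come from applying the inverse of the log link, so each one is the exponential of the linear predictor. The sketch below does the arithmetic by hand. The intercept and slope are assumed values that do not appear in this transcript; they were chosen because they reproduce the predictions quoted above.

```python
import math

# Assumed coefficients (not stated in the transcript)
intercept = -3.3048
slope = 0.1640

preds = {}
for width in (24, 28, 32):
    # Inverse log link: mean = exp(intercept + slope * width)
    preds[width] = math.exp(intercept + slope * width)
    print(width, round(preds[width], 2), round(preds[width]))
```

Under these assumed coefficients the means round to 1.88, 3.62 and 6.98, that is, 2, 4 and 7 satellites.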

11. Let's practice!

Now let's try some of these visualizations in the exercises.