
Plotting Poisson regression

1. Plotting Poisson regression

In the previous set of exercises, you learned how to interpret coefficients from a Poisson regression. In this video, you will learn how to plot Poisson regressions using geom_smooth() in ggplot2.

2. When to use geom_smooth with Poisson

Plotting a Poisson regression with geom_smooth() works best with continuous predictor variables. For example, if we had dose data and the number of cancer cells per square centimeter, we could use geom_smooth() to draw the fitted trend. For discrete predictor variables, tools like boxplots are better. For example, the daily fire injury data we have been working with is best summarized with boxplots by month or by simply plotting the raw data across time. Other DataCamp courses on ggplot2 cover these tools.

3. Cancer cells dose study

For this example, we will use simulated dose-response data. The data examines the dose of a chemical and the resulting number of cancer cells per unit area. Example data like this could be found in a toxicology study.
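As a rough sketch of what such simulated data might look like, the R code below draws Poisson counts whose expected value grows exponentially with dose. The seed, sample size, dose levels, and coefficients (0.1 and 0.2) are assumptions chosen for illustration, not the values behind the video's figures. Only the names dat, dose, and cells follow this video.

```r
# Hypothetical simulation of dose-response count data (all numbers are assumptions).
set.seed(42)                                   # fixed seed so the example is reproducible

dose <- rep(0:10, each = 10)                   # 11 assumed dose levels, 10 samples each
lambda <- exp(0.1 + 0.2 * dose)                # assumed log-linear mean: log(lambda) = 0.1 + 0.2 * dose
cells <- rpois(length(dose), lambda = lambda)  # Poisson counts of cells per unit area

dat <- data.frame(dose = dose, cells = cells)
head(dat)
```

Because the counts are generated on the log scale, a Poisson regression of cells on dose is the natural model for data like this.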

4. Plot points

When facing a new dataset, the first thing I do is plot the points using geom_point() in ggplot2. In this case, my data is a data.frame called "dat" with columns "dose" and "cells". I map these columns to the x and y aesthetics. However, notice that we cannot see all of the points because some of them overlap.

5. Jitter points

We can solve the problem of overlapping points by jittering them, which moves each point by a small random amount so that the points no longer sit exactly on top of each other. Here, I have controlled how far the points move by setting the width and height options in geom_jitter().
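The two plotting steps so far might look like the sketch below. The data is simulated with assumed values so the example runs on its own, and the jitter amounts of 0.05 are also an assumption, since the video does not state the values it used.

```r
library(ggplot2)

# Hypothetical stand-in for the course data (simulation values are assumptions).
set.seed(42)
dat <- data.frame(dose = rep(0:10, each = 10))
dat$cells <- rpois(nrow(dat), lambda = exp(0.1 + 0.2 * dat$dose))

# Raw points: identical (dose, cells) pairs are drawn on top of each other.
p_points <- ggplot(dat, aes(x = dose, y = cells)) +
  geom_point()

# Jittered points: each point is shifted by at most 0.05 in x and in y.
p_jitter <- ggplot(dat, aes(x = dose, y = cells)) +
  geom_jitter(width = 0.05, height = 0.05)

p_jitter
```

Small jitter values keep the points close to their true positions while still separating exact duplicates.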

6. geom_smooth()

Now that I've plotted my points, I like to add a trend line to help me see if anything is going on. To start off, I add a smooth curve to the plot using geom_smooth() with its default settings. However, this curve seems to overfit the data. For example, notice how it wiggles downward between doses of 5 and 7.
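A minimal sketch of this step, again on assumed simulated data, is shown here. With no arguments, geom_smooth() picks its smoother automatically, and for a dataset of this size it uses a loess fit. Whether the curve shows the same wiggle as in the video depends on the data, so this example may look different.

```r
library(ggplot2)

# Hypothetical stand-in for the course data (simulation values are assumptions).
set.seed(42)
dat <- data.frame(dose = rep(0:10, each = 10))
dat$cells <- rpois(nrow(dat), lambda = exp(0.1 + 0.2 * dat$dose))

# Jittered points plus the default smoother (loess for fewer than 1,000 rows).
p_smooth <- ggplot(dat, aes(x = dose, y = cells)) +
  geom_jitter(width = 0.05, height = 0.05) +
  geom_smooth()

p_smooth
```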

7. GLMs with geom_smooth()

One way to improve the fit is to tell geom_smooth() which modeling function to use through its method argument. We might start off by simply using a GLM, but recall that glm() uses a Gaussian family by default, which corresponds to a linear model, so the trend line is a straight line.
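The sketch below illustrates this on assumed simulated data. Setting method = "glm" without a family gives the Gaussian default, so the fitted line is straight even though the counts curve upward.

```r
library(ggplot2)

# Hypothetical stand-in for the course data (simulation values are assumptions).
set.seed(42)
dat <- data.frame(dose = rep(0:10, each = 10))
dat$cells <- rpois(nrow(dat), lambda = exp(0.1 + 0.2 * dat$dose))

# GLM smoother with the default Gaussian family, which is an ordinary linear fit.
p_gaussian <- ggplot(dat, aes(x = dose, y = cells)) +
  geom_jitter(width = 0.05, height = 0.05) +
  geom_smooth(method = "glm")

p_gaussian
```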

8. Poisson GLM with geom_smooth()

We can change the GLM to a Poisson regression by changing the input to the method.args argument. Specifically, we set the family argument to "poisson" inside of a list. Now, we've plotted a Poisson regression using geom_smooth().
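Put together, the Poisson trend line might be added as in the following sketch. The data is an assumed simulation. The geom_smooth() call uses the method and method.args arguments described above, with the family name written in lowercase as glm() expects.

```r
library(ggplot2)

# Hypothetical stand-in for the course data (simulation values are assumptions).
set.seed(42)
dat <- data.frame(dose = rep(0:10, each = 10))
dat$cells <- rpois(nrow(dat), lambda = exp(0.1 + 0.2 * dat$dose))

# Poisson regression trend line: method.args is passed on to glm().
p_poisson <- ggplot(dat, aes(x = dose, y = cells)) +
  geom_jitter(width = 0.05, height = 0.05) +
  geom_smooth(method = "glm", method.args = list(family = "poisson"))

p_poisson
```

Because the Poisson family uses a log link, the fitted curve is exponential in dose, so it can bend upward with the counts and never drops below zero.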

9. Summary of steps

In summary, to plot a Poisson regression, first jitter the points so they do not overlap. Second, add a Poisson trend line. Third, polish the figure, something covered in other DataCamp courses on ggplot2. For example, to polish the figure we would want axis titles that describe each axis better than the bare variable name does. We might also use a different theme.
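As one possible way to carry out all three steps, the sketch below adds axis titles and a theme to the Poisson plot. The simulated data, the label wording, and the choice of theme_bw() are all assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the course data (simulation values are assumptions).
set.seed(42)
dat <- data.frame(dose = rep(0:10, each = 10))
dat$cells <- rpois(nrow(dat), lambda = exp(0.1 + 0.2 * dat$dose))

p_final <- ggplot(dat, aes(x = dose, y = cells)) +
  geom_jitter(width = 0.05, height = 0.05) +                             # step 1: jitter
  geom_smooth(method = "glm", method.args = list(family = "poisson")) +  # step 2: Poisson trend
  labs(x = "Chemical dose", y = "Cancer cells per unit area") +          # step 3: axis titles
  theme_bw()                                                             # step 3: assumed theme

p_final
```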

10. Let's practice!

Now it's your turn to plot a Poisson regression with ggplot2.