1. Plotting regression
Visualizing a regression is necessary to understand and present the results.
2. Linear line
Plots are essential for a full understanding of the data and any test results. You have collected the data, run group difference analyses and visualized them, likely run a correlation, and run a regression analysis which now needs to be visualized. We will create plots with ggplot2.
Use a scatter plot after deriving the regression line to visualize each data point contributing to the model. When plotting a regression, always put the dependent variable on the y-axis and the independent variable on the x-axis. Recall that the dependent variable is the one we assess as being impacted by the independent variable. Here, the time to eat the pizza is the dependent variable, and enjoyment of the pizza is the independent variable that may impact it. After loading the ggplot2 package, call the ggplot function with the data frame and aes to set the aesthetics, specifying enjoyment as x and time as y. Produce the data points by adding geom_point. This code produces the plot here. To make the plot more informative, we can add the regression line.
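As a rough sketch of the scatter plot described here: the data frame name pizza, the column names Enjoy and Time, and the simulated values are all assumptions made up for illustration, not the lesson's actual data.

```r
library(ggplot2)

# Hypothetical data: names and values are invented for this sketch.
set.seed(42)
pizza <- data.frame(Enjoy = runif(50, min = 0, max = 10))
pizza$Time <- 12 - 0.6 * pizza$Enjoy + rnorm(50, sd = 1)

# Independent variable (Enjoy) on x, dependent variable (Time) on y
p_scatter <- ggplot(pizza, aes(x = Enjoy, y = Time)) +
  geom_point()

p_scatter
```

Storing the plot in an object before printing it makes it easy to add more layers later.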
Add the function geom_smooth. Since the relation in this plot is linear and is assessed with a linear regression, set the method argument to "lm", in quotes. Notice that both the regression line, fit across groups, and its confidence interval are added to the plot.
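A minimal sketch of adding the regression line, again assuming a made-up pizza data frame with hypothetical Enjoy and Time columns:

```r
library(ggplot2)

# Hypothetical data: names and values are invented for this sketch.
set.seed(42)
pizza <- data.frame(Enjoy = runif(50, min = 0, max = 10))
pizza$Time <- 12 - 0.6 * pizza$Enjoy + rnorm(50, sd = 1)

# Scatter plot plus a linear regression line with its confidence band
p_line <- ggplot(pizza, aes(x = Enjoy, y = Time)) +
  geom_point() +
  geom_smooth(method = "lm")

p_line
```

The shaded band is drawn because the se argument of geom_smooth defaults to TRUE; setting se = FALSE would show the line alone.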
3. Linear prediction
If we have predicted values using the regression model, we can include where the predicted value falls in the plot. We can indicate this with a horizontal and vertical line, intersecting at the point of interest. We first need to specify the point we want to predict and derive the prediction. Then we can call the values to create the intersecting lines.
Add geom_hline to draw a horizontal line at the predicted value of the dependent variable by setting yintercept to prediction. Add geom_vline to draw a vertical line at the value we predicted from by setting xintercept to Enjoy.
Notice that this code plots a single regression line, because nothing in the plot specifies groups.
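The prediction step and the two intersecting lines might look something like the following. The data, the names model and new_point, and the choice of 6 as the enjoyment value to predict are assumptions for illustration only.

```r
library(ggplot2)

# Hypothetical data: names and values are invented for this sketch.
set.seed(42)
pizza <- data.frame(Enjoy = runif(50, min = 0, max = 10))
pizza$Time <- 12 - 0.6 * pizza$Enjoy + rnorm(50, sd = 1)

# Fit the linear model and predict Time at one chosen Enjoy value
model <- lm(Time ~ Enjoy, data = pizza)
new_point <- data.frame(Enjoy = 6)  # arbitrary value chosen for the example
prediction <- as.numeric(predict(model, newdata = new_point))

# Horizontal line at the predicted Time, vertical line at the Enjoy value
p_pred <- ggplot(pizza, aes(x = Enjoy, y = Time)) +
  geom_point() +
  geom_smooth(method = "lm") +
  geom_hline(yintercept = prediction) +
  geom_vline(xintercept = new_point$Enjoy)

p_pred
```

Because the prediction comes from the same linear model that geom_smooth fits, the two lines should cross on the regression line.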
4. Denoting groups
If we take advantage of the A/B design, however, we may want to assess the groups, the Cheese and Pepperoni pizzas, as well as time and enjoyment. To indicate groups and automatically create a regression line for each group, set the color argument to topping in the ggplot aesthetics function. Be aware that coloring this way produces two lines because each group is modeled separately; topping is not included as an independent variable in a single model, so this may not be ideal for your model.
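A sketch of the per-group version, assuming a hypothetical Topping column alongside made-up Enjoy and Time values:

```r
library(ggplot2)

# Hypothetical data: names and values are invented for this sketch.
set.seed(42)
pizza <- data.frame(
  Topping = rep(c("Cheese", "Pepperoni"), each = 25),
  Enjoy = runif(50, min = 0, max = 10)
)
pizza$Time <- 12 - 0.6 * pizza$Enjoy +
  ifelse(pizza$Topping == "Pepperoni", 1.5, 0) + rnorm(50, sd = 1)

# color in the top-level aes is inherited by every layer,
# so geom_smooth fits one line per topping
p_groups <- ggplot(pizza, aes(x = Enjoy, y = Time, color = Topping)) +
  geom_point() +
  geom_smooth(method = "lm")

p_groups
```

Each line here comes from a separate fit on that group's rows, which is why it differs from a single model with topping as a predictor.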
5. Denoting groups
To avoid plotting a regression line for each group, do not set color to topping in the ggplot aesthetics; instead, set the color aesthetic to topping inside geom_point.
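Moving the color aesthetic into the point layer could be sketched as follows, with the same hypothetical pizza data and column names as assumptions:

```r
library(ggplot2)

# Hypothetical data: names and values are invented for this sketch.
set.seed(42)
pizza <- data.frame(
  Topping = rep(c("Cheese", "Pepperoni"), each = 25),
  Enjoy = runif(50, min = 0, max = 10)
)
pizza$Time <- 12 - 0.6 * pizza$Enjoy +
  ifelse(pizza$Topping == "Pepperoni", 1.5, 0) + rnorm(50, sd = 1)

# color is set only inside geom_point, so the points are colored by topping
# while geom_smooth still fits a single line to all of the data
p_single <- ggplot(pizza, aes(x = Enjoy, y = Time)) +
  geom_point(aes(color = Topping)) +
  geom_smooth(method = "lm")

p_single
```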
6. Logistic line
If our data is assessed with a logistic regression, the ggplot and geom_point calls remain the same, while the method argument in geom_smooth becomes "glm". We also need to include the argument method.args and set it to list(family = binomial).
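A sketch of the logistic version follows. The binary outcome column Reorder (1 if the person would order again) is purely hypothetical, since the narration does not name the outcome, and the data are simulated.

```r
library(ggplot2)

# Hypothetical data: names and values are invented for this sketch.
set.seed(42)
pizza <- data.frame(Enjoy = runif(100, min = 0, max = 10))
# Made-up binary outcome coded 0/1, more likely to be 1 when Enjoy is high
pizza$Reorder <- rbinom(100, size = 1, prob = plogis(-3 + 0.6 * pizza$Enjoy))

# Same ggplot and geom_point calls; geom_smooth now fits a binomial glm
p_logit <- ggplot(pizza, aes(x = Enjoy, y = Reorder)) +
  geom_point() +
  geom_smooth(method = "glm", method.args = list(family = binomial))

p_logit
```

The outcome needs to be numeric 0/1 (not a factor) for the curve to be drawn on the same axis as the points.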
7. Logistic prediction
We can plot a vertical line to indicate where the value we are predicting from falls on the x-axis, using geom_vline with xintercept set to the value to predict, Enjoy. Recall that the output of predict for a logistic regression is a likelihood, so it is uninformative to plot a horizontal line at the predicted likelihood.
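One way the logistic prediction and its vertical line might be put together is sketched below. The Reorder outcome, the simulated data, the value 6, and the use of type = "response" in predict are all assumptions, not details given in the narration.

```r
library(ggplot2)

# Hypothetical data: names and values are invented for this sketch.
set.seed(42)
pizza <- data.frame(Enjoy = runif(100, min = 0, max = 10))
pizza$Reorder <- rbinom(100, size = 1, prob = plogis(-3 + 0.6 * pizza$Enjoy))

# Fit the logistic model and predict at one chosen Enjoy value
model <- glm(Reorder ~ Enjoy, data = pizza, family = binomial)
new_point <- data.frame(Enjoy = 6)  # arbitrary value chosen for the example
# type = "response" returns the prediction on the 0-1 scale (an assumption here)
prediction <- predict(model, newdata = new_point, type = "response")

# Only a vertical line, marking where the Enjoy value falls on the x-axis
p_logit_pred <- ggplot(pizza, aes(x = Enjoy, y = Reorder)) +
  geom_point() +
  geom_smooth(method = "glm", method.args = list(family = binomial)) +
  geom_vline(xintercept = new_point$Enjoy)

p_logit_pred
```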
8. Let's practice!
Let's practice regression visualization.