Confidence intervals for the average response for all observations
The confidence interval for the average response can be computed for every observation in the dataset. Calling augment() on the model fitted to the twins dataset, without supplying new data, gives a prediction and a standard error for the Foster twin IQ at each observed Biological value.
Note that the regression line is estimated most precisely near the center of the data, so the predicted average response is more variable at extreme Biological IQs than in the middle of the range.
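As a rough illustration of why the interval widens toward the extremes, here is a minimal base R sketch. It uses the built-in cars dataset as a stand-in (an assumption, since the twins data is not reproduced here) and computes the interval as fitted value plus or minus a t critical value times the standard error of the fit. The column names lower_mean_prediction and upper_mean_prediction mirror those in the exercise; everything else is illustrative rather than the course's own code.

```r
# Stand-in data: the built-in `cars` dataset, NOT the course's twins data
model <- lm(dist ~ speed, data = cars)

# Fitted values and standard errors of the mean response at each observed x
fit <- predict(model, se.fit = TRUE)

# Critical t value for a 95% interval, with n - 2 degrees of freedom
alpha <- 0.05
critical_value <- qt(1 - alpha / 2, df = nrow(cars) - 2)

intervals <- data.frame(
  speed = cars$speed,
  .fitted = fit$fit,
  .se.fit = fit$se.fit,
  lower_mean_prediction = fit$fit - critical_value * fit$se.fit,
  upper_mean_prediction = fit$fit + critical_value * fit$se.fit
)

# Interval width is smallest near mean(speed) and grows toward the extremes
intervals$width <- intervals$upper_mean_prediction - intervals$lower_mean_prediction
intervals[order(intervals$speed), c("speed", "width")]
```

The same bounds come out of predict(model, interval = "confidence"), which can serve as a quick cross-check on the manual calculation.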
The foster twin IQ predictions that you calculated last time are provided as predictions. These predictions are shown in a plot using geom_smooth().
This exercise is part of the course
Inference for Linear Regression in R
Exercise instructions
Manually create what geom_smooth() does, using predictions. Provide the aesthetics and data to each geom.
- Add a point layer of Foster vs. Biological, using the data = twins dataset.
- Add a line layer of .fitted vs. Biological, using the data = predictions dataset. Color the line "blue".
- Add a ribbon layer with x mapped to Biological, ymin mapped to lower_mean_prediction and ymax mapped to upper_mean_prediction. Use the data = predictions dataset and set the transparency, alpha, to 0.2.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# This plot is shown
ggplot(twins, aes(x = Biological, y = Foster)) +
  geom_point() +
  geom_smooth(method = "lm")

ggplot() +
  # Add a point layer of Foster vs. Biological, using twins
  ___(aes(___, ___), data = ___) +
  # Add a line layer of .fitted vs. Biological, using predictions, colored blue
  ___ +
  # Add a ribbon layer of lower_mean_prediction to upper_mean_prediction vs. Biological,
  # using predictions, transparency of 0.2
  ___
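
For comparison, the same three-layer pattern (points, fitted line, confidence ribbon) can be sketched on a different dataset. The example below again assumes the built-in cars data as a stand-in, and the data frame band with its columns fitted, lower and upper are hypothetical names chosen for illustration. It is one possible way to reproduce what geom_smooth(method = "lm") draws, not the exercise's reference solution.

```r
library(ggplot2)

# Stand-in data: the built-in `cars` dataset, NOT the course's twins data
model <- lm(dist ~ speed, data = cars)

# 95% confidence interval for the mean response; columns are fit, lwr, upr
ci <- predict(model, interval = "confidence")

# Hypothetical helper data frame holding the line and the band
band <- data.frame(
  speed = cars$speed,
  fitted = ci[, "fit"],
  lower = ci[, "lwr"],
  upper = ci[, "upr"]
)

p <- ggplot() +
  # Raw observations
  geom_point(aes(x = speed, y = dist), data = cars) +
  # Fitted regression line
  geom_line(aes(x = speed, y = fitted), data = band, color = "blue") +
  # Shaded confidence band for the average response
  geom_ribbon(aes(x = speed, ymin = lower, ymax = upper), data = band, alpha = 0.2)

p
```

Passing data and aesthetics to each geom separately, rather than to ggplot(), is what lets the points come from one data frame and the line and ribbon from another.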