Estimation with and without outlier
The data provided in this exercise (hypdata_outlier) has an extreme outlier. A plot is shown of the dataset, together with a linear regression model of response versus explanatory. You will remove the outlying point to see how one observation can affect the estimate of the line.
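To make the effect of a single observation concrete, here is a minimal sketch in base R. It uses simulated data, not the course's hypdata_outlier dataset; the seed, sample size, true coefficients, and the planted outlier at (10, -30) are all assumptions chosen only for illustration.

```r
# Simulated stand-in data (NOT the course dataset): a linear trend plus noise.
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)

# Append one hypothetical extreme point far from the trend as the last row.
sim <- data.frame(explanatory = c(x, 10), response = c(y, -30))

# Fit the same linear model with and without the planted outlier.
fit_all     <- lm(response ~ explanatory, data = sim)
fit_trimmed <- lm(response ~ explanatory, data = sim[-nrow(sim), ])

# Compare the estimated intercepts and slopes side by side.
rbind(with_outlier = coef(fit_all), without_outlier = coef(fit_trimmed))
```

In this simulated setup the single planted point pulls the fitted slope well away from the slope estimated on the remaining 50 observations.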
This exercise is part of the course
Inference for Linear Regression in R
Exercise instructions
- Filter hypdata_outlier to remove the outlier.
- Update the plot, p, to add another smooth layer (use geom_smooth).
  - Like the existing smooth layer, the new one should use the linear regression method and not draw the ribbon.
  - Unlike the existing smooth layer, the new one should use data = hypdata_no_outlier and be colored red.
  - For now, draw only the fitted line, without the confidence bounds (se = FALSE).
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# This plot is shown
p <- ggplot(hypdata_outlier, aes(x = explanatory, y = response)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# Filter to remove the outlier
hypdata_no_outlier <- ___

p +
  # Add another smooth lin. reg. layer, no ribbon,
  # hypdata_no_outlier data, colored red
  ___
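One way the blanks might be completed is sketched below. The real hypdata_outlier dataset is not reproduced on this page, so the sketch builds a simulated stand-in, and the filter condition (explanatory < 5) is an assumption that only fits this stand-in; with the real data the cutoff would have to be read off the plot. The name p_both is also hypothetical.

```r
library(dplyr)
library(ggplot2)

# Simulated stand-in for hypdata_outlier (an assumption, not the course data).
set.seed(4747)
x <- rnorm(50)
hypdata_outlier <- tibble(
  explanatory = c(x, 10),
  response    = c(1 + 2 * x + rnorm(50), -30)  # last row is a planted outlier
)

# This plot is shown in the exercise.
p <- ggplot(hypdata_outlier, aes(x = explanatory, y = response)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# Filter to remove the outlier; the threshold 5 is assumed for the simulated data.
hypdata_no_outlier <- hypdata_outlier %>%
  filter(explanatory < 5)

# Add a second linear regression line, without a ribbon, fitted to the
# filtered data and colored red.
p_both <- p +
  geom_smooth(
    data = hypdata_no_outlier,
    method = "lm",
    se = FALSE,
    color = "red"
  )

p_both
```

Because the new layer receives its own data argument, it is fitted only to the filtered rows, while the points and the original line still come from the full dataset passed to ggplot().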