Do the data come from the population?
Recall that the observed difference (i.e., the difference in proportions in the homes dataset, shown as the red vertical line) was around -0.0078, which seems to fall below the bulk of the density of shuffled differences. It is important to know, however, whether any of the randomly permuted differences were as extreme as the observed difference.
In this exercise, you'll re-create this dotplot as a density plot and count the number of permuted differences that were to the left of the observed difference.
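To make the idea of "permuted differences" concrete, here is a minimal base R sketch of how such differences can be generated and compared with an observed one. The data are simulated stand-ins, and the names `owns`, `group`, `diff_in_props`, and `diff_obs` are hypothetical; they are not taken from the course's datasets.

```r
# Simulated stand-in data (an assumption; the course's homes data is not reproduced here)
set.seed(42)
n <- 200
group <- rep(c("female", "male"), each = n / 2)
owns <- rbinom(n, size = 1, prob = 0.65)  # 1 = owns a home, 0 = does not

# Hypothetical helper: difference in ownership proportions between the two groups
diff_in_props <- function(owns, group) {
  mean(owns[group == "female"]) - mean(owns[group == "male"])
}

# Difference observed in the (simulated) data
diff_obs <- diff_in_props(owns, group)

# Shuffle the outcome to break any link with group, then recompute the difference
diff_perm <- replicate(1000, diff_in_props(sample(owns), group))

# Number of permuted differences at or below the observed difference
n_perm_le_obs <- sum(diff_perm <= diff_obs)
n_perm_le_obs
```

Because shuffling removes any association between the outcome and the group, the permuted differences show how much the statistic varies by chance alone.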
This exercise is part of the course Foundations of Inference in R.
Exercise instructions
The homeown_perm dataset is available in your workspace.
- Using geom_density(), plot the permuted differences.
- Add a vertical red line with geom_vline() where the observed difference falls. diff_orig is provided in your workspace and represents the original value of the difference statistic.
- Count the number of permuted differences that were less than or equal to the observed difference.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Plot permuted differences, diff_perm
ggplot(homeown_perm, aes(x = ___)) +
  # Add a density layer
  ___() +
  # Add a vline layer with intercept diff_orig
  ___(aes(xintercept = ___), color = "red")

# Compare permuted differences to observed difference
homeown_perm %>%
  summarize(n_perm_le_obs = sum(___ <= ___))
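
For readers working outside the course workspace, the sketch below shows one plausible way the completed template could look. It is not the official solution. It assumes that homeown_perm has a numeric column named diff_perm (as the template's first comment suggests) and that diff_orig is a single number; both objects are simulated here, so the resulting plot and count will not match the course's.

```r
library(ggplot2)
library(dplyr)

# Simulated stand-ins for the workspace objects (assumptions, not the course data)
set.seed(1)
homeown_perm <- data.frame(diff_perm = rnorm(1000, mean = 0, sd = 0.01))
diff_orig <- -0.0078

# Density of the permuted differences, with a red line at the observed difference
p <- ggplot(homeown_perm, aes(x = diff_perm)) +
  geom_density() +
  geom_vline(aes(xintercept = diff_orig), color = "red")

# Count the permuted differences at or below the observed difference
counts <- homeown_perm %>%
  summarize(n_perm_le_obs = sum(diff_perm <= diff_orig))

print(p)
counts
```

Dividing n_perm_le_obs by the number of permutations gives the proportion of shuffled differences at least as far to the left as the observed one, which is often easier to interpret than the raw count.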