Dealing with Missing Test Scores
If we want to use SAT scores as our outcome, we should examine missingness. Examine the pattern of missingness across all the variables in nyc_scores using miss_var_summary() from the naniar package. naniar integrates with Tidyverse code styling, including the pipe operator (%>%).
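As a minimal sketch of what miss_var_summary() reports, the snippet below runs it on a small made-up data frame. The name toy_scores, its column names, and its values are assumptions for illustration only; they are not the real nyc_scores data.

```r
library(naniar)
library(dplyr)

# Hypothetical stand-in for nyc_scores; columns and values are illustrative only
toy_scores <- data.frame(
  Borough = c("Bronx", "Bronx", "Queens", "Queens", "Queens"),
  Average_Score_SAT_Math = c(400, NA, 520, 480, NA),
  Average_Score_SAT_Reading = c(410, 395, NA, 470, 505)
)

# One row per variable, with the number and percentage of missing values
toy_miss <- toy_scores %>% miss_var_summary()
toy_miss
```

The result is itself a data frame, so it can be filtered or arranged with the usual dplyr verbs.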
There are 60 missing scores in each subject. Though there are many R packages which help with more advanced forms of imputation, such as MICE, Amelia, and mi, we will continue to use simputation and impute_median().
Create a new dataset, nyc_scores_2, by imputing Math score by Borough. Note that impute_median() returns the imputed variable as type "impute", so you'll convert the variable to numeric in a separate step.
simputation and dplyr are loaded.
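The imputation step described above could be sketched roughly as follows. This is an illustration on a made-up data frame rather than the exercise solution: toy_scores, toy_scores_2, and the column name Average_Score_SAT_Math are assumptions, since the actual column names in nyc_scores are not shown here. In impute_median(), the right-hand side of the formula is used as the grouping variable, so each missing value is filled with the median of its own borough.

```r
library(simputation)
library(dplyr)

# Hypothetical stand-in for nyc_scores; columns and values are illustrative only
toy_scores <- data.frame(
  Borough = c("Bronx", "Bronx", "Bronx", "Queens", "Queens", "Queens"),
  Average_Score_SAT_Math = c(400, 420, NA, 500, 540, NA)
)

# Impute missing Math scores with the median of each Borough group
toy_scores_2 <- impute_median(toy_scores, Average_Score_SAT_Math ~ Borough)

# Convert the imputed column to plain numeric in a separate step
toy_scores_2 <- toy_scores_2 %>%
  mutate(Average_Score_SAT_Math = as.numeric(Average_Score_SAT_Math))

# Check that no missing Math scores remain
sum(is.na(toy_scores_2$Average_Score_SAT_Math))
```

Grouping by borough keeps the imputed values closer to local conditions than a single overall median would, at the cost of relying on fewer observations per group.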
This exercise is part of the course Experimental Design in R.
Have a go at this exercise by completing this sample code.
# Load naniar
___
# Examine missingness with miss_var_summary()
___