Your first missing data visualizations
It can be difficult to get a handle on where the missing values are in your data, and this is where visualization can really help.
The function vis_miss() creates an overview visualization of the missingness in the data. It can also cluster rows based on missingness, using cluster = TRUE, and sort the columns from most missing to least missing, using sort_miss = TRUE.
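To make the three call patterns concrete, here is a minimal sketch on a different dataset. It assumes the naniar package is installed and uses the built-in airquality data (not the exercise's riskfactors data), which has missing values in its Ozone and Solar.R columns. The object names p_all, p_cluster and p_sorted are only illustrative.

```r
# Assumption: naniar is installed; it re-exports vis_miss() from visdat
library(naniar)

# airquality ships with base R and has NA values in Ozone and Solar.R
# Overview of missingness: one row per observation, one column per variable
p_all <- vis_miss(airquality)

# Same overview, with rows clustered by their pattern of missingness
p_cluster <- vis_miss(airquality, cluster = TRUE)

# Same overview, with columns ordered from most missing to least missing
p_sorted <- vis_miss(airquality, sort_miss = TRUE)

# Printing a plot object draws it
p_all
```

Each call returns a plot object, so it can be stored and printed later, as above, or simply called on its own line to draw the plot straight away.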
This exercise is part of the course
Dealing With Missing Data in R
Exercise instructions
Using the riskfactors dataset from naniar:
- Use vis_miss() to visualize the missingness in the data.
- Use vis_miss() with cluster = TRUE to explore some clusters of missingness.
- Use vis_miss() and sort the missings with sort_miss to arrange the columns by missingness.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load naniar, which provides vis_miss() and the `riskfactors` dataset
library(naniar)

# Visualize all of the missingness in the `riskfactors` dataset
vis_miss(___)
# Visualize and cluster all of the missingness in the `riskfactors` dataset
vis_miss(___, ___ = TRUE)
# Visualize and sort the columns by missingness in the `riskfactors` dataset
vis_miss(___, ___ = TRUE)