How do we visualize missing values?
1. How do we visualize missing values?
We now know what missing values are, how they work, and how to count and summarize them. Now let's look at some of the built-in visualizations that come with naniar.
2. Introduction to missing data visualizations in naniar
Data summaries are very useful, but sometimes an idea can be captured more quickly with a visualization. naniar provides a friendly family of missing data visualization functions, each presenting a different visualization of the missingness summaries. In fact, each of these visualizations is a compact shorthand for one of the data summaries. While you could build similar or more complex visualizations yourself from the summary information in the previous lesson, doing so gets repetitive. The visualizations in `naniar` reduce that repetition and speed up iteration, so you can operate closer to the speed of thought.
3. Lesson overview
In this lesson, we cover how to get a bird's eye view of the data, how to look at missings in the variables and cases, and how to generate visualizations for missing spans and across groups in the data.
4. Get a bird's eye view of the missing data
When you first get a dataset, it can be difficult to get a visceral sense of where the missings are. To get an overview of the amount of missingness, use the vis_miss function from the visdat package. vis_miss produces a "heatmap" of the missingness, as if the dataset were laid out as a giant spreadsheet with each cell colored black if missing and gray if present. vis_miss also provides missingness summary statistics, showing the overall percentage of missingness in the legend and the percentage of missings in each variable. These can be turned off through its options, described in the help file.
5. Get a bird's eye view of the missing data
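The vis_miss overview described above can be sketched as follows. This is a minimal example, assuming the visdat package is installed. The choice of R's built-in airquality data is an assumption made for illustration, since the narration does not name a dataset at this point.

```r
# Assumes the visdat package is installed; airquality ships with base R.
library(visdat)

# Heatmap of missingness: one row per case, one column per variable.
# Printing the object (for example, typing p_overview) draws the plot.
p_overview <- vis_miss(airquality)

# The percentage labels in the legend and on the columns can be switched off.
p_plain <- vis_miss(airquality, show_perc = FALSE, show_perc_col = FALSE)
```

The help page `?vis_miss` lists the remaining options.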
vis_miss also allows for clustering of the missing data by setting cluster = TRUE. This orders the rows by missingness so that common co-occurrences are easier to identify.
6. Look at missings in variables and cases
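As a sketch of the clustering option for vis_miss, again assuming visdat is installed and using airquality purely as an illustrative dataset:

```r
# Assumes the visdat package is installed; the dataset choice is illustrative.
library(visdat)

# cluster = TRUE reorders the rows so that cases with similar
# missingness patterns sit next to each other in the heatmap.
p_clustered <- vis_miss(airquality, cluster = TRUE)

# A base R cross-check of one co-occurrence the clustered plot makes visible:
# the number of rows where both Ozone and Solar.R are missing.
both_missing <- sum(is.na(airquality$Ozone) & is.na(airquality$Solar.R))
```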
To quickly show the missingness in variables and cases, we visualize them using gg_miss_var and gg_miss_case. These are the visual analogues of the miss_var_summary and miss_case_summary functions. Both plots show the amount of missingness on the x axis. In gg_miss_var, each point represents the amount of missingness in one variable; in gg_miss_case, each line represents the amount of missingness in one case. Both are ordered so that the most missing is at the top. The ordering in gg_miss_case can be turned off with the option order_cases = FALSE.
7. Look at missings in variables and cases
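The gg_miss_var and gg_miss_case plots, alongside their numeric analogues, might be produced as in the sketch below. It assumes the naniar package is installed; airquality is used as an illustrative dataset.

```r
# Assumes the naniar package is installed; the dataset choice is illustrative.
library(naniar)

# Missingness per variable and per case, ordered with the most missing on top.
p_var  <- gg_miss_var(airquality)
p_case <- gg_miss_case(airquality)

# Keep the cases in their original row order instead.
p_case_unordered <- gg_miss_case(airquality, order_cases = FALSE)

# The numeric summaries that these plots visualize.
var_summary  <- miss_var_summary(airquality)
case_summary <- miss_case_summary(airquality)
```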
gg_miss_var and gg_miss_case also allow for faceting by one variable. This means you can explore missingness in cases and variables across the levels of another group. This plot is faceted by month, showing the number of missings in each variable for each month. Here we see that Ozone in Month 6 has the most missings.
8. Visualizing missingness patterns
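A sketch of the faceted gg_miss_var plot on airquality, assuming naniar is installed. The base R cross-check at the end is an addition for illustration and not part of naniar.

```r
# Assumes the naniar package is installed.
library(naniar)

# One panel per month, each showing the number of missings per variable.
p_by_month <- gg_miss_var(airquality, facet = Month)

# Base R cross-check: count the missing Ozone values within each month.
ozone_missing_by_month <- tapply(is.na(airquality$Ozone), airquality$Month, sum)

# The month with the most missing Ozone values.
worst_month <- names(which.max(ozone_missing_by_month))
```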
To visualize the common combinations of missingness, that is, which variables and cases go missing together, use gg_miss_upset. This powerful visualization shows how many cases fall into each combination of co-occurring missing values. An upset plot of the airquality dataset shows that the only missing values are in Ozone and Solar.R: 35 cases are missing only Ozone, 5 are missing only Solar.R, and 2 are missing both.
9. Visualizing factors of missingness
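The upset plot of airquality can be sketched as below, assuming naniar (and the UpSetR package it relies on for this plot) is installed. The base R tally is an illustrative cross-check of the three combination counts quoted in the narration (35, 5, and 2).

```r
# Assumes naniar and UpSetR are installed.
library(naniar)

# Upset plot of the combinations of variables that go missing together.
# Printing the object draws the plot.
p_upset <- gg_miss_upset(airquality)

# Base R cross-check of the combination counts.
ozone_na <- is.na(airquality$Ozone)
solar_na <- is.na(airquality$Solar.R)

combo_counts <- c(
  ozone_only = sum(ozone_na & !solar_na),
  solar_only = sum(!ozone_na & solar_na),
  both       = sum(ozone_na & solar_na)
)
```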
To explore how missingness in each variable changes across a factor, use gg_miss_fct. This displays a heatmap with the levels of the factor on the x axis, each other variable on the y axis, and the amount of missingness colored from dark purple to yellow. gg_miss_fct does not support faceting.
10. Visualizing spans of missingness
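A minimal sketch of gg_miss_fct, assuming naniar is installed. The narration does not say which data the heatmap was drawn from, so the riskfactors dataset bundled with naniar and its marital column are assumptions chosen for illustration.

```r
# Assumes the naniar package is installed; riskfactors and its marital
# column are an illustrative choice, not taken from the narration.
library(naniar)

# Heatmap of missingness in every other variable, split by the levels
# of the factor passed as fct.
p_fct <- gg_miss_fct(x = riskfactors, fct = marital)
```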
gg_miss_span is the visual analogue of miss_var_span. miss_var_span calculates the number of missings in each span of a given size, in this example every 3000 rows, and gg_miss_span displays the amount of missing values in each span as a filled barplot. gg_miss_span supports faceting.
11. Let's practice!
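The span plot and its numeric analogue could be sketched as follows, assuming naniar is installed. The narration gives only the span size of 3000 rows, so the pedestrian dataset bundled with naniar and its hourly_counts and sensor_name columns are assumptions chosen for illustration.

```r
# Assumes the naniar package is installed; the pedestrian dataset and
# its columns are an illustrative choice, not taken from the narration.
library(naniar)

# Numeric summary: missings in hourly_counts for every span of 3000 rows.
span_summary <- miss_var_span(pedestrian, var = hourly_counts, span_every = 3000)

# The same information as a filled barplot, one bar per span.
p_span <- gg_miss_span(pedestrian, hourly_counts, span_every = 3000)

# Faceted version: one panel per sensor.
p_span_by_sensor <- gg_miss_span(pedestrian, hourly_counts,
                                 span_every = 3000, facet = sensor_name)
```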
Now it's your turn.