How many missing values are there?
One of the first things that you will want to check with a new dataset is whether there are any missing values, and how many there are.
You could use are_na() and count up the missing values, but the most efficient way to count missings is to use the n_miss() function. This will tell you the total number of missing values in the data.
You can then find the percentage of missing values in the data with the pct_miss() function. Its counterpart prop_miss() reports the same quantity as a proportion between 0 and 1.
You can also find the complement to these - how many complete values there are - using n_complete() and pct_complete().
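To make these quantities concrete, here is a minimal sketch in base R of what the counts and proportions mean. The `toy` dataframe is hypothetical and is not the course's dat_hw; the base R expressions are assumed equivalents of what n_miss(), n_complete(), prop_miss(), and pct_miss() report, not the package's own code.

```r
# Hypothetical toy data (not the course's dat_hw): 4 rows, 2 columns
toy <- data.frame(
  height = c(1.70, NA, 1.82, 1.65),
  weight = c(68, 75, NA, NA)
)

# Total missing values across the whole dataframe,
# the quantity n_miss(toy) is described as returning
total_missing <- sum(is.na(toy))

# Missing values in a single variable, as for n_miss(toy$weight)
weight_missing <- sum(is.na(toy$weight))

# Complete (non-missing) values: the complement of the missing count
total_complete <- sum(!is.na(toy))

# Proportion (0 to 1) and percentage (0 to 100) of missing values
prop_missing <- mean(is.na(toy))
pct_missing <- 100 * prop_missing
```

The missing and complete counts always add up to the total number of cells, so either one can be used to check the other.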
This exercise is part of the course
Dealing With Missing Data in R
Exercise instructions
Using the example dataframe of heights and weights dat_hw:
- Use n_miss() on the dataframe dat_hw to count the total number of missing values in the dataframe.
- Use n_miss() on the variable dat_hw$weight to count the total number of missing values in it.
- Similarly, use prop_miss(), n_complete(), and prop_complete() to get the proportion of missings, and the number and proportion of complete values for the dataframe and the variables.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Use n_miss() to count the total number of missing values in dat_hw
n_miss(___)
# Use n_miss() on dat_hw$weight to count the total number of missing values
n_miss(___$___)
# Use n_complete() on dat_hw to count the total number of complete values
n_complete(___)
# Use n_complete() on dat_hw$weight to count the total number of complete values
___(___$___)
# Use prop_miss() and prop_complete() on dat_hw to get the proportion of missing and complete values
prop_miss(___)
prop_complete(___)