How many missing values are there?
One of the first things you will want to check with a new dataset is whether there are any missing values, and how many there are.
You could use are_na() and count up the missing values yourself, but the most direct way to count missings is to use the n_miss() function. This will tell you the total number of missing values in the data.
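To make the comparison concrete, here is a minimal sketch of both approaches. It assumes the naniar package is installed, and it uses a small made-up data frame called toy (not the course's dat_hw) so that the counts are easy to check by hand.

```r
library(naniar)

# Hypothetical toy data for illustration only, not the course's dat_hw
toy <- data.frame(
  height = c(1.72, NA, 1.65, 1.80),
  weight = c(70, 82, NA, NA)
)

# Manual approach: flag each missing value, then add up the flags
sum(are_na(toy$weight))  # 2

# Direct approach: n_miss() counts the missing values in one step
n_miss(toy$weight)       # 2
n_miss(toy)              # 3, across the whole data frame
```

Both routes give the same count; n_miss() simply saves the extra step and reads more clearly.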
You can then find the percentage of missing values in the data with the pct_miss() function.
You can also find the complement to these, that is, how many complete values there are, using n_complete() and pct_complete().
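As a rough illustration of how the four summaries relate, the sketch below runs them on a made-up data frame named toy (a hypothetical example, not the course's dat_hw). It assumes naniar is installed.

```r
library(naniar)

# Hypothetical toy data: 8 cells in total, 3 of them missing
toy <- data.frame(
  height = c(1.72, NA, 1.65, 1.80),
  weight = c(70, 82, NA, NA)
)

n_miss(toy)         # 3 missing cells
n_complete(toy)     # 5 complete cells
pct_miss(toy)       # 37.5 (percent of cells that are missing)
pct_complete(toy)   # 62.5 (percent of cells that are complete)

# Missing and complete counts together cover every cell
n_miss(toy) + n_complete(toy) == nrow(toy) * ncol(toy)
```

The missing and complete summaries are complements: the counts add up to the number of cells, and the percentages add up to 100.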
This exercise is part of the course
Dealing With Missing Data in R
Exercise instructions
Using the example dataframe of heights and weights, dat_hw:
- Use n_miss() on the dataframe dat_hw to count the total number of missing values in the dataframe.
- Use n_miss() on the variable dat_hw$weight to count the total number of missing values in it.
- Similarly, use prop_miss(), n_complete(), and prop_complete() to get the proportion of missings, and the number and proportion of complete values, for the dataframe and the variables.
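The same pattern can be tried on a different dataset before attempting the exercise. The sketch below uses R's built-in airquality data rather than dat_hw, and assumes naniar is installed. It also shows how the prop_*() functions used here relate to the pct_*() functions mentioned earlier: a proportion lies between 0 and 1, and the percentage is that proportion times 100.

```r
library(naniar)

# Whole data frame: counts every missing cell
n_miss(airquality)             # 44

# Single variable: counts missing values in that column only
n_miss(airquality$Ozone)       # 37

# Proportion (0 to 1) versus percentage (0 to 100) for the same column
prop_miss(airquality$Ozone)
pct_miss(airquality$Ozone)

# Complete-value counterparts
n_complete(airquality$Ozone)
prop_complete(airquality$Ozone)
```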
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Use n_miss() to count the total number of missing values in dat_hw
n_miss(___)
# Use n_miss() on dat_hw$weight to count the total number of missing values
n_miss(___$___)
# Use n_complete() on dat_hw to count the total number of complete values
n_complete(___)
# Use n_complete() on dat_hw$weight to count the total number of complete values
___(___$___)
# Use prop_miss() and prop_complete() on dat_hw to get the proportion of missing and complete values
prop_miss(___)
prop_complete(___)