How many missing values are there?
One of the first things you will want to check in a new dataset is whether there are any missing values, and how many there are.
You could use are_na() and count up the missing values yourself, but the most direct way to count missings is to use the n_miss() function. This will tell you the total number of missing values in the data.
You can then find the percentage of missing values in the data with the pct_miss() function.
You can also find the complement to these, how many complete values there are, using n_complete() and pct_complete(). The prop_miss() and prop_complete() functions return the same information as proportions between 0 and 1 rather than percentages.
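To make these quantities concrete, here is a minimal base R sketch of the arithmetic behind the counts. The data frame toy_hw and all variable names are hypothetical stand-ins (the contents of the course's dat_hw are not shown here), and the comments noting which function each line corresponds to describe the expected behaviour, not the functions' actual implementation.

```r
# Hypothetical toy data frame (NOT the course's dat_hw):
# 2 variables x 5 rows = 10 cells in total
toy_hw <- data.frame(
  height = c(1.70, NA, 1.82, 1.65, NA),
  weight = c(68, 75, NA, 59, 80)
)

# Logical matrix: TRUE wherever a cell is missing
missing_mask <- is.na(toy_hw)

# Counts over the whole data frame
n_missing  <- sum(missing_mask)    # 3, the quantity n_miss() reports
n_observed <- sum(!missing_mask)   # 7, the quantity n_complete() reports

# Proportion (0 to 1) and percentage (0 to 100) of missing cells
prop_missing <- mean(missing_mask)   # the quantity prop_miss() reports
pct_missing  <- 100 * prop_missing   # the quantity pct_miss() reports

# The same idea applied to a single variable
n_missing_weight <- sum(is.na(toy_hw$weight))   # 1

print(c(n_missing = n_missing, n_observed = n_observed))
print(c(prop_missing = prop_missing, pct_missing = pct_missing))
```

The missing and complete counts always sum to the total number of cells, which is a quick sanity check when working with a real dataset.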
This exercise is part of the course
Dealing With Missing Data in R
Instructions
Using the example dataframe of heights and weights, dat_hw:
- Use n_miss() on the dataframe dat_hw to count the total number of missing values in the dataframe.
- Use n_miss() on the variable dat_hw$weight to count the total number of missing values in it.
- Similarly, use prop_miss(), n_complete(), and prop_complete() to get the proportion of missings, and the number and proportion of complete values, for the dataframe and the variables.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Use n_miss() to count the total number of missing values in dat_hw
n_miss(___)
# Use n_miss() on dat_hw$weight to count the total number of missing values
n_miss(___$___)
# Use n_complete() on dat_hw to count the total number of complete values
n_complete(___)
# Use n_complete() on dat_hw$weight to count the total number of complete values
___(___$___)
# Use prop_miss() and prop_complete() on dat_hw to get the proportion of missing and complete values
prop_miss(___)
prop_complete(___)