Impute data below range with nabular data

We want to keep track of values we imputed. If we don't, it is very difficult to assess how good the imputed values are.

We are going to practice imputing data and recreate the visualizations from the previous set of exercises by imputing values below the range of the data.
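
As a rough illustration of what "below the range" means, the sketch below uses the built-in airquality data instead of oceanbuoys (an assumption, made only so the example is self-contained) and checks that the values filled in for Ozone sit under the smallest observed value.

```r
library(naniar)

# airquality ships with R; its Ozone column contains NA values
ozone_was_na <- is.na(airquality$Ozone)
ozone_min <- min(airquality$Ozone, na.rm = TRUE)

# Replace every NA with a value shifted below the observed range of its column
aq_imp <- impute_below_all(airquality)

# The imputed Ozone values should all lie below the smallest observed value
all(aq_imp$Ozone[ozone_was_na] < ozone_min)
```

Because the imputed values are pushed outside the observed range, they show up as a separate band in a scatterplot instead of being dropped.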

This is a very useful way to help further explore missingness, and also provides the framework for imputing missing values.

First, we impute the data below the range using impute_below_all() and then visualize the result. Although we can see where the imputed values are in this instance, nothing in the data records which values were imputed, so we need some way to track them. The "track missing data" programming pattern helps with this.
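
A minimal sketch of the tracking pattern, again on the built-in airquality data and not on oceanbuoys (an assumption for self-containment). Coloring the points by the any_missing column is one illustrative way to inspect the result, not necessarily the plot the exercise expects.

```r
library(naniar)
library(dplyr)
library(ggplot2)

aq_imp_track <- airquality %>%
  bind_shadow() %>%        # add a <variable>_NA shadow column for each variable
  impute_below_all() %>%   # impute missing values below the observed range
  add_label_shadow()       # add any_missing, flagging rows that had an NA

# Color points by whether the row originally contained a missing value
p <- ggplot(aq_imp_track,
            aes(x = Ozone, y = Solar.R, colour = any_missing)) +
  geom_point()
p
```

The shadow columns are bound before imputing, so the record of which values were missing survives the imputation step.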

This exercise is part of the course

Dealing With Missing Data in R

Exercise instructions

Using the oceanbuoys data:

  • Impute below the range using impute_below_all().
  • Visualize the newly imputed values with wind_ew on the x-axis and air_temp_c on the y-axis.
  • Impute and track data with bind_shadow(), impute_below_all(), and add_label_shadow().
  • Show the plot and inspect the imputed values.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Impute the oceanbuoys data below the range using `impute_below_all`.
ocean_imp <- impute_below_all(___)

# Visualize the new missing values
ggplot(___, 
       aes(x = ___, y = ___)) +  
  geom_point()

# Impute and track data with `bind_shadow`, `impute_below_all`, and `add_label_shadow`
ocean_imp_track <- bind_shadow(___) %>% 
  ___() %>% 
  ___()

# Look at the imputed values
___