Identify outliers

Consider the distribution, shown here, of the life expectancies of the countries in Asia. The box plot identifies one clear outlier: a country with a notably low life expectancy. Do you have a guess as to which country this might be? Test your guess in the console using either min() or filter(), then proceed to building a plot with that country removed.
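As a rough sketch of how the guess could be checked in the console, the snippet below uses both min() and filter(). It assumes that gap2007 is the gapminder data set (from the gapminder package) restricted to the year 2007; the exercise does not show how gap2007 was built, so that reconstruction and the helper name asia are illustrative only.

```r
library(dplyr)
library(gapminder)

# Assumption: gap2007 is the gapminder data restricted to 2007
gap2007 <- gapminder %>%
  filter(year == 2007)

# Keep only the Asian countries (asia is a hypothetical helper name)
asia <- gap2007 %>%
  filter(continent == "Asia")

# Lowest life expectancy among the Asian countries
min(asia$lifeExp)

# The row(s) that attain that minimum, which reveals the country
asia %>%
  filter(lifeExp == min(lifeExp))
```

min() returns only the value, whereas filter() returns the whole row, so the second call is the one that names the country.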

This exercise is part of the course

Exploratory Data Analysis in R

Exercise instructions

gap2007 is still available in your workspace.

  • Filter gap2007 so that it contains only observations from Asia, then create a new variable called is_outlier that is TRUE for countries with a life expectancy below 50. Assign the result to gap_asia.
  • Filter gap_asia to remove all outliers, then create another box plot of the remaining life expectancies.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Filter for Asia, add column indicating outliers
gap_asia <- ___ %>%
  filter(___) %>%
  mutate(___ = ___)

# Remove outliers, create box plot of lifeExp
gap_asia %>%
  filter(___) %>%
  ggplot(aes(x = ___, y = ___)) +
  ___
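
One possible way to fill in the blanks is sketched below. It is not necessarily the course's reference solution. It assumes gap2007 is the gapminder data filtered to 2007 and uses a constant x = 1 so that a single box is drawn; both choices are assumptions made for illustration.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

# Assumption: gap2007 is the gapminder data restricted to 2007
gap2007 <- gapminder %>%
  filter(year == 2007)

# Filter for Asia, add column indicating outliers
gap_asia <- gap2007 %>%
  filter(continent == "Asia") %>%
  mutate(is_outlier = lifeExp < 50)

# Remove outliers, create box plot of lifeExp
# x = 1 is a placeholder so that all countries fall into one box
p <- gap_asia %>%
  filter(!is_outlier) %>%
  ggplot(aes(x = 1, y = lifeExp)) +
  geom_boxplot()

p
```

Storing is_outlier as a column, rather than filtering on lifeExp directly, keeps the threshold in one place and leaves the flagged rows available for later inspection.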