Dropping levels

The contingency table from the last exercise revealed that there are some levels that have very low counts. To simplify the analysis, it often helps to drop such levels.

In R, this takes two steps: first filter out the rows that belong to the low-count levels, then remove those now-empty levels from the factor variable with droplevels(). Both steps are needed because droplevels() on its own keeps a level that has even 1 or 2 observations; it drops only levels that no longer appear in the dataset.
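A minimal sketch of why both steps are needed, using a small made-up factor rather than the comics data (the values and the level name "Rare" are assumptions for illustration only):

```r
# Toy factor (hypothetical values, not the comics dataset)
align <- factor(c("Good", "Bad", "Good", "Neutral", "Bad", "Rare"))

# "Rare" occurs once, so it is still a used level
table(align)

# droplevels() alone changes nothing here: every level still occurs
levels(droplevels(align))

# Filtering removes the observation, but the level stays with a zero count
kept <- align[align != "Rare"]
table(kept)

# Only now does droplevels() remove the unused level
kept_clean <- droplevels(kept)
levels(kept_clean)
```

After the filter, table(kept) still lists "Rare" with a count of 0, which is the clutter that droplevels() removes.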

This is a part of the course

“Exploratory Data Analysis in R”

Exercise instructions

The contingency table from the last exercise is available in your workspace as tab.

  • Load the dplyr package.
  • Print tab to find out which level of align has the fewest total entries.
  • Use filter() to filter out all rows of comics with that level, then drop the unused level with droplevels(). Save the simplified dataset as comics_filtered.
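The same workflow can be sketched on a toy data frame. Everything here is a hypothetical stand-in: toy, tab_toy, toy_filtered and the level "Rare" are invented for illustration, and this is not the solution to the exercise, which depends on what tab shows for comics.

```r
library(dplyr)

# Hypothetical stand-in for comics, with two categorical variables
toy <- data.frame(
  align  = factor(c("Good", "Bad", "Good", "Bad", "Rare")),
  gender = factor(c("Female", "Male", "Male", "Female", "Male"))
)

# Contingency table; row totals show which align level is rarest
tab_toy <- table(toy$align, toy$gender)
tab_toy
rarest <- names(which.min(rowSums(tab_toy)))

# Step 1: filter out rows with that level; step 2: drop the unused level
toy_filtered <- toy %>%
  filter(align != rarest) %>%
  droplevels()

# The rebuilt table no longer has an empty row for the dropped level
table(toy_filtered$align, toy_filtered$gender)
```

Calling droplevels() on a data frame drops unused levels from every factor column, so it works as the last step of the pipe.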

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Load dplyr
___

# Print tab
___

# Remove align level
comics_filtered <- ___ %>%
  ___(align != ___) %>%
  ___()

# See the result
comics_filtered

This exercise is part of the course

Exploratory Data Analysis in R

Learn how to use graphical and numerical techniques to begin uncovering the structure of your data.

In this chapter, you will learn how to create graphical and numerical summaries of two categorical variables.
