Subsetting rows by categorical variables

Subsetting data based on a categorical variable often involves using the "or" operator (|) to select rows from multiple categories. This can get tedious when you want all states in one of three different regions, for example. Instead, use the .isin() method, which will allow you to tackle this problem by writing one condition instead of three separate ones.

colors = ["brown", "black", "tan"]
condition = dogs["color"].isin(colors)
dogs[condition]
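
To make the comparison concrete, here is a minimal sketch that builds the same subset both ways. The small dogs DataFrame is a hypothetical stand-in with made-up rows, since the original data is not shown here; only the "color" column and the colors list come from the example above.

```python
import pandas as pd

# Hypothetical stand-in for the dogs DataFrame (rows are made up for illustration)
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max"],
    "color": ["brown", "black", "white", "gray", "tan"],
})

colors = ["brown", "black", "tan"]

# Three separate conditions chained with the "or" operator
with_or = dogs[
    (dogs["color"] == "brown")
    | (dogs["color"] == "black")
    | (dogs["color"] == "tan")
]

# One condition with .isin()
with_isin = dogs[dogs["color"].isin(colors)]

# Check whether both approaches selected the same rows
print(with_or.equals(with_isin))
```

Each comparison joined by | needs its own parentheses, which is part of what makes the chained version tedious. With .isin(), adding another category only means extending the list.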

The homelessness DataFrame is available and pandas is loaded as pd.

Exercise instructions

Filter homelessness for cases where the USA census state is in the list of Mojave states, canu, assigning to mojave_homelessness. View the printed result.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# The Mojave Desert states
canu = ["California", "Arizona", "Nevada", "Utah"]

# Filter for rows in the Mojave Desert states
mojave_homelessness = homelessness[____]

# See the result
print(mojave_homelessness)
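
If you want to try the pattern outside the course environment, the sketch below applies it to a toy DataFrame. The "state" column name and every value in it are assumptions made for illustration; they are not taken from the homelessness data, so check the real column names before adapting it.

```python
import pandas as pd

# Toy stand-in for homelessness; the "state" column name is an assumption
toy = pd.DataFrame({
    "state": ["Alaska", "Arizona", "California", "Nevada", "Texas", "Utah"],
    "value": [10, 20, 30, 40, 50, 60],  # placeholder numbers, not real data
})

# The Mojave Desert states
canu = ["California", "Arizona", "Nevada", "Utah"]

# Boolean Series: True where the state appears in canu
in_mojave = toy["state"].isin(canu)

# Keep only the rows where the condition is True
mojave_toy = toy[in_mojave]
print(mojave_toy)
```

The filtered rows keep the order they have in the DataFrame, not the order of the names in the list.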

This exercise is part of the course

Data Manipulation with pandas

Skill level: Beginner
Rating: 4.5+ (277 reviews)

Learn how to import and clean data, calculate statistics, and create visualizations with pandas.

Let’s master the pandas basics. Learn how to inspect DataFrames and perform fundamental manipulations, including sorting rows, subsetting, and adding new columns.

Exercise 1: Introducing DataFrames
Exercise 2: Inspecting a DataFrame
Exercise 3: Parts of a DataFrame
Exercise 4: Sorting and subsetting
Exercise 5: Sorting rows
Exercise 6: Subsetting columns
Exercise 7: Subsetting rows
Exercise 8: Subsetting rows by categorical variables
Exercise 9: New columns
Exercise 10: Adding new columns
Exercise 11: Combo-attack!
