Omitting outliers

Now let's use the student_data dataset to compare the distribution of final grades ("G3") between students who have internet access at home and those who don't. To do this, we'll use the "internet" variable, which is a binary (yes/no) indicator of whether the student has internet access at home.

Since internet access may be less common in rural areas, we'll add subgroups based on where the student lives. For this, we can use the "location" variable, which indicates whether a student lives in an urban ("Urban") or rural ("Rural") location.

Seaborn has already been imported as sns and matplotlib.pyplot has been imported as plt. As a reminder, you can omit outliers in box plots by setting the sym parameter equal to an empty string ("").

Exercise instructions

  • Use sns.catplot() to create a box plot with the student_data DataFrame, putting "internet" on the x-axis and "G3" on the y-axis.
  • Add subgroups so each box plot is colored based on "location".
  • Do not display the outliers.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create a box plot with subgroups and omit the outliers






# Show plot
plt.show()
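
As a rough illustration of the three instructions, here is a minimal sketch that runs on its own. The small student_data DataFrame below is made up for this example and is not the course dataset; only its column names ("internet", "location", "G3") follow the exercise. The sketch hides outliers with showfliers=False, a matplotlib box plot option that seaborn forwards. That choice is an assumption on our part: the exercise text names sym="", and whether that keyword is accepted may depend on the installed seaborn version.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; drop this line to view the plot
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for student_data (NOT the course dataset).
# Only the column names match the exercise.
student_data = pd.DataFrame({
    "internet": ["yes", "yes", "no", "no", "yes", "no", "yes", "no"] * 3,
    "location": ["Urban", "Rural", "Urban", "Rural"] * 6,
    "G3": [10, 12, 8, 9, 15, 0, 11, 13,
           14, 7, 9, 10, 16, 11, 12, 6,
           10, 13, 8, 19, 12, 9, 11, 10],
})

# Box plot of final grades by internet access, with one box per location
# inside each internet group, and outlier markers hidden.
g = sns.catplot(
    x="internet",
    y="G3",
    data=student_data,
    kind="box",
    hue="location",
    showfliers=False,  # assumption: used here in place of sym=""
)

# Show plot
plt.show()
```

With the real student_data already loaded in the exercise environment, only the sns.catplot() call and plt.show() would be needed.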

This exercise is part of the course

Introduction to Data Visualization with Seaborn

Skill level: Beginner

Learn how to create informative and attractive visualizations in Python using the Seaborn library.

Categorical variables are present in nearly every dataset, but they are especially prominent in survey data. In this chapter, you will learn how to create and customize categorical plots such as box plots, bar plots, count plots, and point plots. Along the way, you will explore survey data from young people about their interests, students about their study habits, and adult men about their feelings about masculinity.

  • Exercise 1: Count plots and bar plots
  • Exercise 2: Count plots
  • Exercise 3: Bar plots with percentages
  • Exercise 4: Customizing bar plots
  • Exercise 5: Box plots
  • Exercise 6: Create and interpret a box plot
  • Exercise 7: Omitting outliers
  • Exercise 8: Adjusting the whiskers
  • Exercise 9: Point plots
  • Exercise 10: Customizing point plots
  • Exercise 11: Point plots with subgroups
