Identify Extreme Values

Now that you have created a DataFrame with the percentage of Hispanic racial self-identification by state, you will explore it further, beginning by creating a boxplot using seaborn.

You will also find the states with the largest or smallest percentage of Hispanics identifying as particular races. To do so, you will apply the squeeze() method, which drops any axis of length one. Applied to a single-row DataFrame, it returns a Series; a DataFrame with more than one row and more than one column is returned unchanged.
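
As a minimal sketch of how squeeze() behaves, the snippet below uses a small made-up DataFrame. The column names pct_a and pct_b and the index labels are hypothetical and are not taken from states_hr.

```python
import pandas as pd

# Hypothetical toy data; names and values are made up for illustration.
df = pd.DataFrame(
    {"pct_a": [10.0, 20.0, 30.0], "pct_b": [5.0, 1.0, 3.0]},
    index=["State X", "State Y", "State Z"],
)

# nlargest(1, ...) returns a DataFrame that has exactly one row.
one_row = df.nlargest(1, "pct_a")

# squeeze() drops the length-one row axis, leaving a Series
# indexed by column name and named after the row's index label.
as_series = one_row.squeeze()

# With more than one row and more than one column, nothing is dropped.
multi = df.squeeze()

print(type(one_row).__name__, type(as_series).__name__)
print(as_series.name)
print(multi.shape)
```

Printing the squeezed Series shows one value per column, which tends to be easier to read than a one-row table.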

pandas has been imported. The DataFrame states_hr is loaded; it contains the percentages of racial self-identification for 7 different race categories.

This exercise is part of the course

Analyzing US Census Data in Python

Exercise instructions

Create a boxplot by setting the data parameter to the name of the DataFrame. (orient = "h" will plot the boxplots horizontally.)
Using squeeze, show the state with the largest value in column hispanic_white.
Using squeeze, show the state with the smallest value in column hispanic_other.
Notice that very few Hispanics identify as Asian, but one state is a high outlier. Using squeeze, show the state with the largest value in column hispanic_asian.
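
The pattern in these steps (select one extreme row, then squeeze it) can be sketched on made-up data. Everything in the snippet is an assumption for illustration: the DataFrame toy, its columns state, share_one and share_two, and the values are hypothetical and are not census figures.

```python
import pandas as pd

# Hypothetical values, not real census figures.
toy = pd.DataFrame({
    "state": ["A", "B", "C", "D"],
    "share_one": [62.0, 48.5, 71.2, 55.0],
    "share_two": [0.3, 0.2, 4.1, 0.4],
})

# Row with the largest value in share_one, squeezed to a Series.
top = toy.nlargest(1, "share_one").squeeze()

# Row with the smallest value in share_one.
bottom = toy.nsmallest(1, "share_one").squeeze()

# share_two is small almost everywhere; nlargest picks out the high outlier.
outlier = toy.nlargest(1, "share_two").squeeze()

print(top["state"], bottom["state"], outlier["state"])
```

Both nlargest and nsmallest take the number of rows first and the column name second, so asking for one row always yields a DataFrame that squeeze() can reduce to a Series.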

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import seaborn and matplotlib.pyplot
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

# Create a boxplot
sns.boxplot(data = ____, orient = "h")
plt.show()

# Show states with extreme values in various columns
print(states_hr.nlargest(1, ____).squeeze())
print(states_hr.nsmallest(____).____)
print(____)