Identify Extreme Values
Now that you have created a DataFrame with the percentage of Hispanic racial self-identification by state, you will explore it further, beginning by creating a boxplot using seaborn.
You will also find the states with the largest or smallest percentage of Hispanics identifying as particular races. To do so, you will apply the squeeze() method. This method converts a single-row DataFrame to a Series; it has no effect on a DataFrame with more than one row and more than one column.
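As a minimal sketch of what squeeze() does, consider a small made-up DataFrame. The state labels and numbers below are illustrative assumptions, not values from states_hr.

```python
import pandas as pd

# Toy data: hypothetical labels and values, not taken from states_hr
df = pd.DataFrame(
    {"hispanic_white": [50.0, 60.0], "hispanic_other": [30.0, 25.0]},
    index=["State A", "State B"],
)

one_row = df.head(1)           # still a DataFrame, shape (1, 2)
as_series = one_row.squeeze()  # Series indexed by the column names

unchanged = df.squeeze()       # 2x2 DataFrame: no length-1 axis to squeeze

print(type(one_row).__name__, type(as_series).__name__, type(unchanged).__name__)
```

The resulting Series takes its name from the row label, which is why printing it shows which row was selected.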
pandas has been imported. The DataFrame states_hr is loaded, which has percentages of racial self-identification for 7 different race categories.
This exercise is part of the course
Analyzing US Census Data in Python
Exercise instructions
- Create a boxplot by setting the data parameter to the name of the DataFrame. (orient = "h" will plot the boxplots horizontally.)
- Using squeeze, show the state with the largest value in column hispanic_white.
- Using squeeze, show the state with the smallest value in column hispanic_other.
- Notice that very few Hispanics identify as Asian, but one state is a high outlier. Using squeeze, show the state with the largest value in column hispanic_asian.
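The pattern of selecting a single extreme row and squeezing it can be sketched on toy data. The column names col_a and col_b, the state labels, and the use of state names as the index are assumptions made for illustration; the layout of states_hr may differ.

```python
import pandas as pd

# Toy data: hypothetical columns and labels, not the exercise's states_hr
toy = pd.DataFrame(
    {"col_a": [10.0, 40.0, 25.0], "col_b": [5.0, 9.0, 1.0]},
    index=["State A", "State B", "State C"],
)

# nlargest(1, column) keeps the single row with the highest value in that column;
# squeeze() then turns that one-row DataFrame into a Series
top = toy.nlargest(1, "col_a").squeeze()

# nsmallest(1, column) does the same for the lowest value
bottom = toy.nsmallest(1, "col_b").squeeze()

print(top.name, bottom.name)
```

Because each result is a Series named after the selected row, printing it shows both the row label and all of that row's column values.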
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import seaborn and matplotlib.pyplot
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
# Create a boxplot
sns.boxplot(data = ____, orient = "h")
plt.show()
# Show states with extreme values in various columns
print(states_hr.nlargest(1, ____).squeeze())
print(states_hr.nsmallest(____).____)
print(____)