Identify Extreme Values
Now that you have created a DataFrame with the percentage of Hispanic racial self-identification by state, you will explore it further, beginning by creating a boxplot using seaborn.
You will also find the states with the largest or smallest percentage of Hispanics identifying as particular races. To do so, you will apply the squeeze() method. This method converts a single-row DataFrame to a Series (a DataFrame with more than one row and more than one column is returned unchanged).
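As a rough illustration of what squeeze() does, here is a minimal sketch on a toy DataFrame. The column names, row labels, and numbers are made up for the example and are not taken from states_hr.

```python
import pandas as pd

# Toy data: hypothetical labels and values, not from the census tables
df = pd.DataFrame(
    {"col_a": [10.0, 20.0], "col_b": [1.5, 2.5]},
    index=["row_x", "row_y"],
)

# Selecting with a list keeps the result a DataFrame (1 row, 2 columns)
one_row = df.loc[["row_x"]]

# squeeze() drops the length-1 row axis, giving a Series indexed by column name
squeezed = one_row.squeeze()

# With 2 rows and 2 columns there is no length-1 axis, so nothing changes
unchanged = df.squeeze()

print(type(squeezed).__name__)
print(type(unchanged).__name__)
```

The squeezed Series takes the original row label as its name, which makes the printed output easier to read than a one-row table.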
pandas has been imported. The DataFrame states_hr is loaded, which has percentages of racial self-identification for 7 different race categories.
This exercise is part of the course Analyzing US Census Data in Python.
Exercise instructions
- Create a boxplot by setting the data parameter to the name of the DataFrame. (orient = "h" will plot the boxplots horizontally.)
- Using squeeze, show the state with the largest value in column hispanic_white.
- Using squeeze, show the state with the smallest value in column hispanic_other.
- Notice that very few Hispanics identify as Asian, but one state is a high outlier. Using squeeze, show the state with the largest value in column hispanic_asian.
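The pattern the instructions describe, picking one extreme row and squeezing it, can be sketched on a toy table. Everything here is hypothetical: the region labels, the share_a and share_b columns, and the values are invented for illustration, and the sketch assumes the row labels live in the index, which may not match how states_hr is laid out.

```python
import pandas as pd

# Hypothetical toy table; names and numbers are illustrative only
toy = pd.DataFrame(
    {"share_a": [12.0, 55.0, 31.0], "share_b": [40.0, 25.0, 5.0]},
    index=["Region1", "Region2", "Region3"],
)

# nlargest(1, column) returns a one-row DataFrame with the largest value in that column
top_a = toy.nlargest(1, "share_a").squeeze()

# nsmallest(1, column) does the same for the smallest value
low_b = toy.nsmallest(1, "share_b").squeeze()

# After squeeze(), each result is a Series named after the selected row label
print(top_a)
print(low_b)
```

Both nlargest and nsmallest take the number of rows first and the column name second, so the same call shape works for any column.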
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import seaborn and matplotlib.pyplot
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
# Create a boxplot
sns.boxplot(data = ____, orient = "h")
plt.show()
# Show states with extreme values in various columns
print(states_hr.nlargest(1, ____).squeeze())
print(states_hr.nsmallest(____).____)
print(____)