Combo-attack!
You've seen the four most common types of data manipulation: sorting rows, subsetting columns, subsetting rows, and adding new columns. In a real-life data analysis, you can mix and match these four manipulations to answer a multitude of questions.
In this exercise, you'll answer the question, "Which state has the highest number of homeless individuals per 10,000 people in the state?" Combine your new pandas skills to find out.
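As a quick refresher, here is a minimal sketch of the four manipulations, each applied on its own. The DataFrame and its column names (name, height_cm, weight_kg) are hypothetical and invented purely for illustration; they are not part of the course dataset.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "height_cm": [30, 55, 42, 18],
    "weight_kg": [10, 24, 17, 2],
})

# 1. Sorting rows: largest height_cm first
sorted_rows = df.sort_values("height_cm", ascending=False)

# 2. Subsetting columns: pass a list of column names
two_cols = df[["name", "height_cm"]]

# 3. Subsetting rows: filter with a boolean condition
tall = df[df["height_cm"] > 40]

# 4. Adding a new column: assign the result of a calculation
df["height_m"] = df["height_cm"] / 100

print(sorted_rows)
print(two_cols)
print(tall)
print(df)
```

Each step returns a new DataFrame (or, for the new column, modifies df in place), so the steps can be fed into one another in whatever order a question requires.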
This exercise is part of the course
Data Manipulation with pandas
Exercise instructions
- Add a column to homelessness, indiv_per_10k, containing the number of homeless individuals per ten thousand people in each state, using state_pop for state population.
- Subset rows where indiv_per_10k is higher than 20, assigning to high_homelessness.
- Sort high_homelessness by descending indiv_per_10k, assigning to high_homelessness_srt.
- Select only the state and indiv_per_10k columns of high_homelessness_srt and save as result. Look at the result.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create indiv_per_10k col as homeless individuals per 10k state pop
homelessness["indiv_per_10k"] = 10000 * ____ / ____
# Subset rows for indiv_per_10k greater than 20
high_homelessness = ____
# Sort high_homelessness by descending indiv_per_10k
high_homelessness_srt = ____
# From high_homelessness_srt, select the state and indiv_per_10k cols
result = ____
# See the result
print(result)
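
If you want to check your approach outside the exercise environment, the sketch below runs the same four steps on a tiny stand-in DataFrame. Two things here are assumptions rather than facts from the exercise: the numbers and state labels are made up for illustration (they are not real statistics), and the column holding the count of homeless individuals is assumed to be named individuals, since the text above only names state_pop.

```python
import pandas as pd

# Made-up numbers for illustration only; not real statistics.
# The "individuals" column name is an assumption.
homelessness = pd.DataFrame({
    "state": ["State A", "State B", "State C", "State D"],
    "individuals": [3000, 500, 10000, 800],
    "state_pop": [1_000_000, 500_000, 4_000_000, 200_000],
})

# Add a column: homeless individuals per 10,000 residents
homelessness["indiv_per_10k"] = (
    10000 * homelessness["individuals"] / homelessness["state_pop"]
)

# Subset rows where the rate is greater than 20
high_homelessness = homelessness[homelessness["indiv_per_10k"] > 20]

# Sort the remaining rows by rate, highest first
high_homelessness_srt = high_homelessness.sort_values(
    "indiv_per_10k", ascending=False
)

# Keep only the state and indiv_per_10k columns
result = high_homelessness_srt[["state", "indiv_per_10k"]]

print(result)
```

With this toy data, State B drops out at the filtering step because its rate is 10, and the remaining rows come back ordered from the highest rate to the lowest.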