Adding new columns
You aren't stuck with just the data you are given. Instead, you can add new columns to a DataFrame. This has many names, such as transforming, mutating, and feature engineering.
You can create new columns from scratch, but it is also common to derive them from other columns, for example, by adding columns together or by changing their units.
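As a minimal sketch of these ideas, the snippet below adds columns to a small made-up DataFrame. The `walks` DataFrame, its column names, and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration
walks = pd.DataFrame({
    "day": ["Mon", "Tue"],
    "morning_km": [2.0, 3.5],
    "evening_km": [1.0, 0.5],
})

# Derive a new column by adding two existing columns together
walks["total_km"] = walks["morning_km"] + walks["evening_km"]

# Derive a new column by changing units (kilometres to metres)
walks["total_m"] = walks["total_km"] * 1000

# Create a new column from scratch; the scalar is repeated for every row
walks["source"] = "manual log"

print(walks)
```

Assigning to `walks["new_name"]` creates the column if it does not exist yet and overwrites it if it does.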
homelessness is a DataFrame containing estimates of homelessness in each U.S. state in 2018. The individuals column is the number of homeless individuals not part of a family with children. The family_members column is the number of homeless individuals part of a family with children. The state_pop column is the state's total population.
homelessness is available and pandas is loaded as pd.
This exercise is part of the course Data Manipulation with pandas.
Exercise instructions
- Add a new column to homelessness, named total, containing the sum of the individuals and family_members columns.
- Add another column to homelessness, named p_homeless, containing the proportion of the total homeless population to the total population in each state, state_pop.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Add total col as sum of individuals and family_members
____
# Add p_homeless col as proportion of total homeless population to the state population
____
# See the result
print(homelessness)
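One way the blanks above could be filled in is sketched here. Because the real dataset is not included in this text, the example builds a two-row stand-in for homelessness; the state names and every number in it are made up for illustration, and only the column names come from the exercise description.

```python
import pandas as pd

# Stand-in for the exercise's homelessness DataFrame.
# The rows and all values are invented; only the column names
# follow the exercise description.
homelessness = pd.DataFrame({
    "state": ["State A", "State B"],
    "individuals": [2000.0, 500.0],
    "family_members": [1000.0, 500.0],
    "state_pop": [1_000_000, 500_000],
})

# Add total col as sum of individuals and family_members
homelessness["total"] = homelessness["individuals"] + homelessness["family_members"]

# Add p_homeless col as proportion of total homeless population to the state population
homelessness["p_homeless"] = homelessness["total"] / homelessness["state_pop"]

# See the result
print(homelessness)
```

Both assignments work element-wise, so each row's total and p_homeless are computed from that row's own values.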