Adding new columns
You aren't stuck with just the data you are given. Instead, you can add new columns to a DataFrame. This has many names, such as transforming, mutating, and feature engineering.
You can create new columns from scratch, but it is also common to derive them from other columns, for example, by adding columns together or by changing their units.
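As a rough illustration of the three cases just mentioned (a column from scratch, a unit change, and a sum of columns), here is a minimal sketch. The DataFrame and all of its column names (length_cm, width_cm, and so on) are hypothetical and invented for this example; they are not part of the exercise data.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
boxes = pd.DataFrame({
    "name": ["small", "medium", "large"],
    "length_cm": [120.0, 85.0, 240.0],
    "width_cm": [60.0, 40.0, 100.0],
})

# Create a column from scratch: a scalar is broadcast to every row
boxes["checked"] = False

# Change units: centimeters to meters
boxes["length_m"] = boxes["length_cm"] / 100

# Combine columns: add two existing columns element-wise
boxes["length_plus_width_cm"] = boxes["length_cm"] + boxes["width_cm"]

print(boxes)
```

Assigning to boxes["new_name"] adds the column in place; the arithmetic on the right-hand side is applied row by row without an explicit loop.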
homelessness is a DataFrame containing estimates of homelessness in each U.S. state in 2018. The individuals column is the number of homeless individuals who are not part of a family with children. The family_members column is the number of homeless individuals who are part of a family with children. The state_pop column is the state's total population.
homelessness is available and pandas is loaded as pd.
This exercise is part of the course
Data Manipulation with pandas
Exercise instructions
- Add a new column to homelessness, named total, containing the sum of the individuals and family_members columns.
- Add another column to homelessness, named p_homeless, containing the proportion of the total homeless population to the total population in each state, state_pop.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Add total col as sum of individuals and family_members
____
# Add p_homeless col as proportion of total homeless population to the state population
____
# See the result
print(homelessness)
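One way the blanks above might be filled in is sketched here. Because the real homelessness data is only available inside the exercise environment, this sketch builds a small stand-in DataFrame with the same column names; the state labels and all numbers are made up and are not the 2018 estimates.

```python
import pandas as pd

# Toy stand-in for homelessness: values are invented, not real estimates
homelessness = pd.DataFrame({
    "state": ["State A", "State B"],
    "individuals": [1000.0, 2500.0],
    "family_members": [500.0, 1500.0],
    "state_pop": [1_000_000, 4_000_000],
})

# Add total col as sum of individuals and family_members
homelessness["total"] = homelessness["individuals"] + homelessness["family_members"]

# Add p_homeless col as proportion of total homeless population to the state population
homelessness["p_homeless"] = homelessness["total"] / homelessness["state_pop"]

# See the result
print(homelessness)
```

Note that p_homeless is computed from the total column created one line earlier, so the order of the two assignments matters.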