Joining Tracts and Metropolitan Areas

So that you can focus on how the merge method works, a function that calculates the Index of Dissimilarity has been provided for you. (You will create this function yourself in the next exercise!)

To apply this function, you need to add the MSA identifiers to the tracts DataFrame. You will use state and county, which are present in both DataFrames, as the join keys. At the end, you will use seaborn's stripplot function to show the ten most segregated metros.

The tracts DataFrame that you have used previously is loaded. Population data by MSA is loaded as msa, and the first few rows are displayed in the console. Finally, msa_def is loaded with the counties that make up each MSA.

pandas and seaborn have been loaded with the usual aliases.

Use the nlargest method on the msa DataFrame to return the 50 largest metros by "population".
Both tracts and msa_def have columns "state" and "county". Use the merge method with the on parameter to join on these columns.
Use the merge method to join msa and msa_D on the MSA identifier.
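
The three steps above can be sketched on toy data as follows. This is only an illustration, not the exercise solution: every value is made up, the "msa", "name", "white", and "black" column names are assumptions, and the dissimilarity helper is a hypothetical stand-in for the function the exercise provides (it uses the standard two-group formula, half the sum of absolute differences in tract shares).

```python
import pandas as pd

# Toy stand-ins for the exercise DataFrames (all values are invented).
tracts = pd.DataFrame({
    "state": ["01", "01", "02", "02"],
    "county": ["001", "001", "003", "005"],
    "white": [100, 300, 200, 50],   # assumed column name
    "black": [300, 100, 200, 50],   # assumed column name
})
msa_def = pd.DataFrame({
    "state": ["01", "02", "02"],
    "county": ["001", "003", "005"],
    "msa": ["A", "B", "B"],         # assumed name of the MSA identifier column
})
msa = pd.DataFrame({
    "msa": ["A", "B", "C"],
    "name": ["Metro A", "Metro B", "Metro C"],
    "population": [800, 500, 100],
})

# Step 1: keep the largest metros by population (2 here; the exercise asks for 50).
msa_top = msa.nlargest(2, "population")

# Step 2: attach the MSA identifier to each tract via the shared state/county keys.
tracts_msa = tracts.merge(msa_def, on=["state", "county"])

# Hypothetical stand-in for the provided Index of Dissimilarity function.
def dissimilarity(group):
    white_share = group["white"] / group["white"].sum()
    black_share = group["black"] / group["black"].sum()
    return 0.5 * (white_share - black_share).abs().sum()

# One dissimilarity value per MSA, shaped as a DataFrame so it can be merged.
msa_D = (
    tracts_msa.groupby("msa")[["white", "black"]]
    .apply(dissimilarity)
    .rename("D")
    .reset_index()
)

# Step 3: join the metro table and the per-MSA results on the MSA identifier.
result = msa_top.merge(msa_D, on="msa")
print(result)
```

Because merge defaults to an inner join, metros that drop out of the top-N selection (Metro C here) do not appear in the result.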
