Exercise

Joining Tracts and Metropolitan Areas

To keep the focus on how the merge method works, a function that calculates the Index of Dissimilarity has been provided for you. (You will write this function yourself in the next exercise!)
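For orientation, the Index of Dissimilarity between two groups is conventionally defined as half the sum, over the units in an area, of the absolute difference between each unit's share of the two groups' totals. The block below is only a minimal sketch of that formula, not the function provided in this exercise; the function name `dissimilarity`, the column names, and the toy values are all hypothetical.

```python
import pandas as pd

def dissimilarity(df, col_a, col_b, group_col):
    """Index of Dissimilarity per group: 0.5 * sum(|a_i/A - b_i/B|)."""
    # Each unit's share of its group's total, for both populations.
    share_a = df[col_a] / df.groupby(group_col)[col_a].transform("sum")
    share_b = df[col_b] / df.groupby(group_col)[col_b].transform("sum")
    # Half the summed absolute difference of shares within each group.
    return 0.5 * (share_a - share_b).abs().groupby(df[group_col]).sum()

# Invented toy data: area 1 is fully separated, area 2 is evenly mixed.
toy = pd.DataFrame({
    "msa": [1, 1, 2, 2],
    "group_a": [100, 0, 50, 50],
    "group_b": [0, 100, 50, 50],
})
D = dissimilarity(toy, "group_a", "group_b", "msa")
```

In this toy data the fully separated area scores 1 and the evenly mixed area scores 0, which are the two extremes of the index.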

To apply this function, you first need to add the MSA identifiers to the tracts data frame. You will use state and county, which are present in both data frames, as the join keys. At the end, you will use seaborn's stripplot function to show the ten most segregated metros.

The tracts data frame that you have used previously is loaded. Population data by MSA is loaded as msa, and the first few rows are displayed in the console. Finally, msa_def is loaded with the counties that make up each MSA.

pandas and seaborn have been loaded with the usual aliases.

Instructions
100 XP
  • Use the nlargest method on the msa data frame to return the 50 largest metros by "population".
  • Both tracts and msa_def have columns "state" and "county". Use the merge method with the on parameter to join on these columns.
  • Use the merge method to join msa and msa_D on the MSA identifier.
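The three steps above can be sketched on toy data. Only the "state", "county", and "population" column names come from the exercise text; the "msa" identifier column, the "D" column, and every value below are assumptions made up for illustration, and the sketch keeps the 2 largest metros rather than 50 so the result is easy to inspect.

```python
import pandas as pd

# Toy stand-ins for the exercise's data frames; all values are invented.
msa = pd.DataFrame({
    "msa": [100, 200, 300],  # hypothetical MSA identifier column
    "name": ["Metro A", "Metro B", "Metro C"],
    "population": [500_000, 2_000_000, 1_200_000],
})
msa_def = pd.DataFrame({
    "state": ["01", "01", "02", "02"],
    "county": ["001", "003", "001", "005"],
    "msa": [100, 100, 200, 300],
})
tracts = pd.DataFrame({
    "state": ["01", "01", "01", "02", "02"],
    "county": ["001", "001", "003", "001", "005"],
    "tract": ["000100", "000200", "000100", "000100", "000100"],
})

# Step 1: keep the n largest metros by population (n=2 here, 50 in the exercise).
msa_top = msa.nlargest(2, "population")

# Step 2: attach the MSA identifier to every tract using the shared keys.
tracts_msa = tracts.merge(msa_def, on=["state", "county"])

# Step 3: join per-MSA results back onto the MSA table.
# msa_D here is a hypothetical frame of index values keyed by MSA.
msa_D = pd.DataFrame({"msa": [200, 300], "D": [0.61, 0.48]})
msa_result = msa_top.merge(msa_D, on="msa")
```

By default merge performs an inner join, so tracts whose state and county do not appear in msa_def, and metros outside the top n, drop out of the results.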