Create Function to Calculate D

Calculating the Index of Dissimilarity requires multiple steps and has high reuse potential. In this exercise you will create the function dissimilarity that we used in the previous exercise. The function's input parameters will be a DataFrame of small area geographies (such as tracts) and three column names: the two columns with population counts of Group A and Group B, and the column with the names or geographic identifiers of the container geography (such as states or metro areas).

As a reminder, the formula for the Index of Dissimilarity is:

$$D = \frac{1}{2}\sum{\left\lvert \frac{a}{A} - \frac{b}{B} \right\rvert}$$
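As a quick arithmetic check of the formula, here is a minimal sketch in plain Python. It assumes that $a$ and $b$ are the Group A and Group B counts in one small area and that $A$ and $B$ are the corresponding totals for the container geography. The three tracts and their counts are made up for illustration.

```python
# Hypothetical counts for three tracts in one container geography:
# each tuple is (Group A count, Group B count)
tracts = [(80, 10), (15, 30), (5, 60)]

# Container totals A and B
A = sum(a for a, _ in tracts)
B = sum(b for _, b in tracts)

# Half the sum of absolute differences in each tract's share of the two groups
D = 0.5 * sum(abs(a / A - b / B) for a, b in tracts)

print(round(D, 2))  # 0.7
```

Here the absolute differences are 0.7, 0.15, and 0.55, which sum to 1.4, so D is 0.7.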

pandas has been imported using the usual alias, pd. The groupby and merge are already completed for you in the code below.

Calculate the expression inside the absolute value bars based on the formula. The column names for $A$ and $B$ are formed by adding the suffix "_sum" to the parameters col_A and col_B.
The sum method on a single column returns a Series; use the to_frame() method to convert the Series to a DataFrame.
Test the new function on tracts: calculate White-Black dissimilarity by MSA name.
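The steps above can be sketched as follows. This is one possible version of the function, not necessarily the exercise's own solution: the groupby and merge lines are an assumption about what the provided code does, and the toy DataFrame with its column names (msa_name, white, black) is hypothetical and stands in for the real tracts data.

```python
import pandas as pd

def dissimilarity(df, col_A, col_B, group_by):
    # Sum Group A and Group B populations within each container geography
    grouped = df.groupby(group_by)[[col_A, col_B]].sum()
    # Attach the container totals to each small area; the overlapping
    # total columns receive the "_sum" suffix
    merged = pd.merge(df, grouped, left_on=group_by, right_index=True,
                      suffixes=("", "_sum"))
    # Expression inside the absolute value bars: |a/A - b/B|
    merged["D"] = abs(merged[col_A] / merged[col_A + "_sum"]
                      - merged[col_B] / merged[col_B + "_sum"])
    # Sum within each container, halve, and convert the Series to a DataFrame
    return 0.5 * merged.groupby(group_by)["D"].sum().to_frame()

# Hypothetical toy data standing in for the tracts DataFrame
toy_tracts = pd.DataFrame({
    "msa_name": ["Metro X", "Metro X", "Metro Y", "Metro Y", "Metro Z", "Metro Z"],
    "white":    [100, 0, 50, 50, 75, 25],
    "black":    [0, 100, 50, 50, 25, 75],
})

result = dissimilarity(toy_tracts, "white", "black", "msa_name")
print(result)
```

With this toy data, Metro X is completely segregated (D = 1.0), Metro Y is evenly mixed (D = 0.0), and Metro Z falls in between (D = 0.5).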
