Exercise

# Create Function to Calculate D

Calculating the Index of Dissimilarity requires multiple steps and has high reuse potential. In this exercise you will create the function `dissimilarity` that we used in the previous exercise. The function's input parameters will be a data frame of small area geographies (such as tracts) and three column names: the two columns with population counts of Group A and Group B, and the column with the names or geographic identifiers of the container geography (such as states or metro areas).

As a reminder, the formula for the Index of Dissimilarity is:

$$D = \frac{1}{2}\sum{\left\lvert \frac{a}{A} - \frac{b}{B} \right\rvert}$$
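Here \(a\) and \(b\) are the Group A and Group B counts in each small area, \(A\) and \(B\) are the corresponding totals for the container geography, and the sum runs over the small areas inside that container. As a rough illustration of the arithmetic, the sketch below computes D by hand for a single metro area. The three tracts and their counts are made up for this example and are not taken from the course data.

```python
# Hypothetical counts for three tracts in one metro area (illustrative only)
group_a = [100, 300, 600]  # a: Group A population in each tract
group_b = [500, 300, 200]  # b: Group B population in each tract

total_a = sum(group_a)  # A: Group A total for the metro area
total_b = sum(group_b)  # B: Group B total for the metro area

# Half the sum of absolute differences between each tract's share of A and of B
d = 0.5 * sum(abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b))

print(round(d, 3))  # 0.4
```

A value of 0 would mean both groups are distributed identically across tracts, and a value of 1 would mean they share no tracts at all.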

`pandas` has been imported using the usual alias. The `groupby` and `merge` steps are already completed for you in the code below.

Instructions

**100 XP**

- Calculate the expression inside the absolute value bars based on the formula. The column names for \(A\) and \(B\) are formed by adding the suffix `"_sum"` to the parameters `col_A` and `col_B`.
- The `sum` method on a single column returns a series; use the `to_frame()` method to convert the series to a data frame.
- Test the new function on `tracts`: calculate White-Black dissimilarity by MSA name.
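The steps above can be sketched as follows. This is one possible shape for the function, not necessarily the exercise's reference solution: the parameter names `df` and `group_col`, the intermediate names, and the toy data frame with its column names are all assumptions made for illustration. The real `tracts` data frame is not reproduced here.

```python
import pandas as pd


def dissimilarity(df, col_A, col_B, group_col):
    """Sketch: Index of Dissimilarity for each value of group_col."""
    # Container totals A and B for each group (index is group_col)
    grouped_sums = df.groupby(group_col)[[col_A, col_B]].sum()

    # Attach the totals to every small-area row; the overlapping
    # total columns receive the "_sum" suffix
    tmp = pd.merge(df, grouped_sums, left_on=group_col,
                   right_index=True, suffixes=("", "_sum"))

    # Expression inside the absolute value bars: |a/A - b/B|
    tmp["D"] = (tmp[col_A] / tmp[col_A + "_sum"]
                - tmp[col_B] / tmp[col_B + "_sum"]).abs()

    # Sum within each container, halve, and return a data frame
    return 0.5 * tmp.groupby(group_col)["D"].sum().to_frame()


# Hypothetical stand-in for the tracts data (column names are assumed)
toy_tracts = pd.DataFrame({
    "msa_name": ["X", "X", "Y", "Y"],
    "white": [10, 30, 50, 50],
    "black": [20, 20, 0, 100],
})

result = dissimilarity(toy_tracts, "white", "black", "msa_name")
print(result)
```

With these toy numbers, metro area X comes out at 0.25 and metro area Y at 0.5. Calling the function on the real data would follow the same pattern, passing `tracts` and its actual column names.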