Calculating D for One State
In this exercise you will compute the Index of Dissimilarity for the state of Georgia. Remember that the formula for the Index of Dissimilarity is:
$$D = \frac{1}{2}\sum{\left\lvert \frac{a}{A} - \frac{b}{B} \right\rvert}$$
In this case, Group A will be Whites and Group B will be Blacks. \(a\) and \(b\) represent the White and Black populations of each small geography (tract), and the sum runs over all tracts. \(A\) and \(B\) represent the White and Black populations of the larger, containing geography (Georgia, postal code = GA, FIPS code = 13).
pandas has been imported using the usual alias, and the tracts DataFrame, with population columns "white" and "black", has been loaded.
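To make the formula concrete, here is a minimal sketch in plain Python. The function name dissimilarity and the two-tract counts are made up for illustration and are not part of the course data. Each tract contributes the absolute difference between its share of Group A and its share of Group B, and D is half the total.

```python
def dissimilarity(group_a, group_b):
    """Index of Dissimilarity for two lists of per-tract counts (hypothetical helper)."""
    total_a = sum(group_a)  # A: Group A population of the containing geography
    total_b = sum(group_b)  # B: Group B population of the containing geography
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Made-up counts for two tracts
print(round(dissimilarity([90, 10], [10, 90]), 3))   # mostly separated groups
print(round(dissimilarity([50, 50], [20, 20]), 3))   # identical shares in every tract
print(round(dissimilarity([100, 0], [0, 100]), 3))   # no tract shared at all
```

With these toy inputs, identical shares in every tract give 0 and fully separated groups give 1, which are the two ends of the index's range.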
This exercise is part of the course Analyzing US Census Data in Python.
Exercise instructions
- Create the new DataFrame ga_tracts with only the tracts in Georgia (the "state" column should equal FIPS code "13").
- Provide the column names in a list (use the variables w and b) to print the sum of non-Hispanic Whites and Blacks in Georgia.
- Take the White population of each tract divided by the sum of the White population, and subtract the Black population of each tract divided by the sum of the Black population; use the w and b variables to improve code readability.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Define convenience variables to hold column names
w = "white"
b = "black"
# Extract Georgia tracts
ga_tracts = tracts[____]
# Print sums of Black and White residents of Georgia
print(ga_tracts[____].sum())
# Calculate Index of Dissimilarity and print rounded result
D = 0.5 * sum(abs(
____ / ____ - ____ / ____))
print("Dissimilarity (Georgia):", round(D, 3))
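One possible way to fill in the pattern above is sketched here on a toy DataFrame. The tracts frame below is a made-up stand-in (four rows, invented counts), not the course dataset, and it assumes the "state" column holds FIPS codes as strings, as the instructions suggest. Treat it as an illustration of the steps rather than the official solution.

```python
import pandas as pd

# Toy stand-in for the tracts DataFrame (all values are invented)
tracts = pd.DataFrame({
    "state": ["13", "13", "13", "01"],
    "white": [900, 100, 500, 400],
    "black": [100, 900, 500, 600],
})

# Convenience variables to hold column names
w = "white"
b = "black"

# Keep only rows whose state FIPS code is "13" (Georgia)
ga_tracts = tracts[tracts["state"] == "13"]

# Column sums for both groups, selected with a list of column names
totals = ga_tracts[[w, b]].sum()
print(totals)

# Each tract's share of the White total minus its share of the Black total
D = 0.5 * sum(abs(
    ga_tracts[w] / ga_tracts[w].sum() - ga_tracts[b] / ga_tracts[b].sum()))
print("Dissimilarity (toy data):", round(D, 3))
```

On this toy data both groups total 1500 in the Georgia rows, and the rounded index is 0.533. The real tracts data will give a different value.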