Calculating D in a Loop
Is Georgia's Index of Dissimilarity of 0.544 high or low? Let's compare it to Illinois (FIPS = 17), home of Chicago.
In this exercise we will use a loop to calculate \(D\) for all states, then compare Georgia and Illinois.
Remember that the formula for the Index of Dissimilarity is:
$$D = \frac{1}{2}\sum{\left\lvert \frac{a}{A} - \frac{b}{B} \right\rvert}$$
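To make the formula concrete, here is a minimal worked sketch. It reads \(a\) and \(b\) as the tract-level counts of the two groups and \(A\) and \(B\) as the corresponding totals for the whole area. That is the usual reading of the index but an assumption here, since the symbols are not defined above. The `toy` DataFrame and its counts are invented for illustration and are not Census data.

```python
import pandas as pd

# Hypothetical toy data: three tracts in one area (invented counts)
toy = pd.DataFrame({
    "white": [100, 300, 600],
    "black": [400, 100, 0],
})

A = toy["white"].sum()  # area-wide total for the first group: 1000
B = toy["black"].sum()  # area-wide total for the second group: 500

# Tract shares are 0.1, 0.3, 0.6 versus 0.8, 0.2, 0.0,
# so the absolute differences are 0.7, 0.1, 0.6 and sum to 1.4.
D_toy = 0.5 * sum(abs(toy["white"] / A - toy["black"] / B))
print(round(float(D_toy), 3))  # 0.7
```

A value of 0 would mean both groups are distributed identically across tracts, and a value of 1 would mean they never share a tract.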
pandas has been imported using the usual alias, and the tracts DataFrame with population columns "white" and "black" has been loaded. The variables w and b have been defined with the column names "white" and "black".
This exercise is part of the course
Analyzing US Census Data in Python
Exercise instructions
- Use the unique() method on the "state" column to create a list of state FIPS codes.
- Use a for-loop to store each element of states (that is, each FIPS code) in a variable named state.
- Filter the tracts DataFrame on each value of state, and assign to tmp.
- Calculate \(D\) for each state by applying the formula to tmp, and store the result in the dictionary state_D.
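The first and third steps rely on two general pandas idioms, sketched below on a miniature stand-in for the tracts DataFrame. The names `toy_tracts` and `toy_states` and all of the counts are hypothetical, and the FIPS codes are assumed to be stored as strings because the exercise later looks them up as "13" and "17".

```python
import pandas as pd

# Hypothetical miniature stand-in for the tracts DataFrame (invented counts)
toy_tracts = pd.DataFrame({
    "state": ["13", "13", "17", "17", "17"],
    "white": [200, 800, 100, 400, 500],
    "black": [600, 400, 700, 200, 100],
})

# unique() returns the distinct values in order of first appearance;
# wrapping the result in list() gives a plain Python list to loop over
toy_states = list(toy_tracts["state"].unique())
print(toy_states)  # ['13', '17']

# Comparing a column to a value yields a boolean mask;
# indexing with the mask keeps only the matching rows
tmp = toy_tracts[toy_tracts["state"] == "17"]
print(len(tmp))  # 3
```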
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Get list of state FIPS Codes
states = list(tracts["state"].____)

state_D = {}  # Initialize dictionary as collector
for state in ____:
    # Filter by state
    tmp = ____

    # Add Index of Dissimilarity to Dictionary
    state_D[state] = 0.5 * sum(____)

# Print D for Georgia (FIPS = 13) and Illinois (FIPS = 17)
print("Georgia D =", round(state_D["13"], 3))
print("Illinois D =", round(state_D["17"], 3))
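For reference, one possible way to fill in the loop is sketched below on a small stand-in DataFrame, so it runs without the course data. The names `tracts_demo`, `states_demo`, and `state_D_demo` are hypothetical, and the counts are invented, so the printed values say nothing about Georgia or Illinois. This is one reading of the instructions, not necessarily the course's reference solution.

```python
import pandas as pd

# Hypothetical stand-in for tracts; the counts are invented, not Census data
tracts_demo = pd.DataFrame({
    "state": ["13", "13", "17", "17"],
    "white": [100, 300, 50, 150],
    "black": [300, 100, 200, 0],
})
w = "white"  # column names, as defined in the exercise setup
b = "black"

states_demo = list(tracts_demo["state"].unique())

state_D_demo = {}  # collector: FIPS code -> D
for state in states_demo:
    # Keep only the tracts belonging to the current state
    tmp = tracts_demo[tracts_demo["state"] == state]

    # Each tract's share of the state total for each group,
    # then half the sum of the absolute differences
    state_D_demo[state] = 0.5 * sum(
        abs(tmp[w] / tmp[w].sum() - tmp[b] / tmp[b].sum())
    )

for state, D in state_D_demo.items():
    print(state, round(float(D), 3))
# 13 0.5
# 17 0.75
```

Note that the denominators are the state totals computed from tmp, not from the full DataFrame, so each state's shares sum to 1.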