Calculating D in a Loop
Is Georgia's Index of Dissimilarity of 0.544 high or low? Let's compare it to Illinois (FIPS = 17), home of Chicago.
In this exercise we will use a loop to calculate \(D\) for all states, then compare Georgia and Illinois.
Remember that the formula for the Index of Dissimilarity is:
$$D = \frac{1}{2}\sum{\left\lvert \frac{a}{A} - \frac{b}{B} \right\rvert}$$
pandas has been imported using the usual alias, and the tracts DataFrame with population columns "white" and "black" has been loaded. The variables w and b have been defined with the column names "white" and "black".
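To make the formula concrete, here is a minimal sketch that computes \(D\) for a single area. In the formula, \(a\) and \(b\) are the two groups' populations in each tract, and \(A\) and \(B\) are their totals across the whole area. The toy DataFrame and its counts are hypothetical, invented purely for illustration; they are not real census data. Note that the variable b in the code holds a column name, not the \(b\) of the formula.

```python
import pandas as pd

# Hypothetical toy data: four tracts in one area (not real census counts)
toy = pd.DataFrame({
    "white": [100, 300, 50, 50],
    "black": [20, 20, 80, 80],
})

# Column names, mirroring the w and b variables described in the exercise
w = "white"
b = "black"

# Each tract's share of the area's total population, per group (a/A and b/B)
white_share = toy[w] / toy[w].sum()
black_share = toy[b] / toy[b].sum()

# Half the sum of the absolute differences between the two shares
D = 0.5 * sum(abs(white_share - black_share))
print("Toy D =", round(D, 3))
```

For these made-up counts the shares differ by 0.1, 0.5, 0.3 and 0.3, so \(D\) works out to 0.6.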
This exercise is part of the course
Analyzing US Census Data in Python
Exercise instructions
- Use the unique() method on the "state" column to create a list of state FIPS codes.
- Use a for-loop to store each element of states (that is, each FIPS code) in a variable named state.
- Filter the tracts DataFrame on each value of state, and assign to tmp.
- Calculate \(D\) for each state by applying the formula to tmp, and store the result in the dictionary state_D.
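The steps above can be tried out on a small stand-in before touching the real data. The sketch below is one possible way to follow them, not the course's reference solution. It assumes a hypothetical tracts DataFrame with two made-up state labels, "X1" and "X2", and invented population counts.

```python
import pandas as pd

# Hypothetical stand-in for the tracts DataFrame (labels and counts are made up)
tracts = pd.DataFrame({
    "state": ["X1", "X1", "X2", "X2"],
    "white": [90, 10, 50, 50],
    "black": [10, 90, 30, 70],
})
w = "white"
b = "black"

# List of the distinct state codes
states = list(tracts["state"].unique())

state_D = {}  # Collector: state code -> Index of Dissimilarity
for state in states:
    # Keep only the tracts belonging to the current state
    tmp = tracts[tracts["state"] == state]

    # Apply the formula using state-level totals as denominators
    state_D[state] = 0.5 * sum(abs(tmp[w] / tmp[w].sum() - tmp[b] / tmp[b].sum()))

for state, D in state_D.items():
    print(state, "D =", round(D, 3))
```

With these invented counts, "X1" (where the two groups are concentrated in different tracts) gets a \(D\) of 0.8, while "X2" (where they are spread more evenly) gets 0.2. The denominators are computed from tmp rather than tracts so that each state's shares sum to 1.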
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Get list of state FIPS Codes
states = list(tracts["state"].____)
state_D = {} # Initialize dictionary as collector
for state in ____:
    # Filter by state
    tmp = ____
    # Add Index of Dissimilarity to Dictionary
    state_D[state] = 0.5 * sum(____)
# Print D for Georgia (FIPS = 13) and Illinois (FIPS = 17)
print("Georgia D =", round(state_D["13"], 3))
print("Illinois D =", round(state_D["17"], 3))