
Calculating D in a Loop

Is Georgia's Index of Dissimilarity of 0.544 high or low? Let's compare it to Illinois (FIPS = 17), home of Chicago.

In this exercise we will use a loop to calculate \(D\) for all states, then compare Georgia and Illinois.

Remember that the formula for the Index of Dissimilarity is:

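$$D = \frac{1}{2}\sum_{i}{\left\lvert \frac{a_i}{A} - \frac{b_i}{B} \right\rvert}$$

Here \(a_i\) and \(b_i\) are the populations of the two groups in tract \(i\), and \(A\) and \(B\) are the two groups' totals across all tracts in the state.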
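
To make the formula concrete, here is a minimal sketch that applies it to a single area with three tracts. The counts are made up for illustration and are not Census data; the list names are likewise only for this example.

```python
# Hypothetical toy data: three tracts in one area (not real Census counts)
white = [100, 300, 600]   # per-tract population of the first group
black = [400, 100, 0]     # per-tract population of the second group

A = sum(white)  # total of the first group across all tracts
B = sum(black)  # total of the second group across all tracts

# Half the sum of absolute differences between each tract's share of each group
D = 0.5 * sum(abs(a / A - b / B) for a, b in zip(white, black))
print("D =", round(D, 3))
```

The tract shares are 0.1, 0.3, 0.6 for the first group and 0.8, 0.2, 0.0 for the second, so the absolute differences sum to 1.4 and D is 0.7.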

pandas has been imported under its usual alias, pd, and the tracts DataFrame with population columns "white" and "black" has been loaded. The variables w and b have been defined with the column names "white" and "black".

This exercise is part of the course

Analyzing US Census Data in Python

Exercise instructions

  • Use the unique() method on the "state" column to create a list of state FIPS codes.
  • Use a for loop to iterate over states, storing each element (that is, each FIPS code) in a variable named state.
  • Filter the tracts DataFrame on each value of state, and assign to tmp.
  • Calculate \(D\) for each state by applying the formula to tmp, and store the result in the dictionary state_D.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Get list of state FIPS Codes
states = list(tracts["state"].____)

state_D = {}  # Initialize dictionary as collector
for state in ____:
    # Filter by state
    tmp = ____
    
    # Add Index of Dissimilarity to Dictionary
    state_D[state] = 0.5 * sum(____)

# Print D for Georgia (FIPS = 13) and Illinois (FIPS = 17)    
print("Georgia D =", round(state_D["13"], 3))
print("Illinois D =", round(state_D["17"], 3))
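
For reference, one possible way to fill in the blanks is sketched below. It is not necessarily the course's own solution. So that it runs on its own, it builds a small hypothetical tracts DataFrame with made-up counts in place of the real Census data; it assumes, as in the sample code, that FIPS codes are stored as strings.

```python
import pandas as pd

# Hypothetical stand-in for the preloaded tracts DataFrame (not real Census counts)
tracts = pd.DataFrame({
    "state": ["13", "13", "13", "17", "17"],
    "white": [100, 300, 600, 500, 500],
    "black": [400, 100, 0, 100, 900],
})
w = "white"
b = "black"

# Get list of state FIPS codes
states = list(tracts["state"].unique())

state_D = {}  # Initialize dictionary as collector
for state in states:
    # Filter by state
    tmp = tracts[tracts["state"] == state]

    # Add Index of Dissimilarity to dictionary:
    # each tract's share of each group, differenced, then half the absolute sum
    state_D[state] = 0.5 * sum(abs(tmp[w] / tmp[w].sum() - tmp[b] / tmp[b].sum()))

print("Georgia D =", round(state_D["13"], 3))
print("Illinois D =", round(state_D["17"], 3))
```

With the toy data this gives 0.7 for state "13" and 0.4 for state "17"; the real tracts data will produce different values.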