
What is a choropleth?

1. What is a choropleth?

In this chapter, you will learn about a special map for visualizing geospatial data called a choropleth.

2. Definition of a choropleth

A choropleth map is a thematic map that uses color gradation to compare regions. In this choropleth, the average age of residents is shown for the town of Milton Keynes in southeast England. The regions shown are LSOAs, Lower Layer Super Output Areas. LSOAs are demographic regions used by the United Kingdom's Office for National Statistics. Notice that the higher average ages are associated with darker shades.

3. Normalization

If your data does not contain a statistical variable like average age, you may have to calculate a standardized measure. This choropleth shows 2017 gross domestic product (GDP) per capita. It is normalized by population size and adjusted for purchasing power parity (PPP), which accounts for the cost of living in each country.

4. Density

In this example, we will be working with the schools_in_districts data to create a normalized value, school_density, to use for plotting a choropleth. Here is the schools_in_districts GeoDataFrame that we have created by spatially joining the schools and the school_districts.
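As a rough sketch of how a GeoDataFrame like schools_in_districts might be produced, the snippet below spatially joins a few school points to two district polygons with geopandas. The toy geometries, the "name" column, and the district labels are hypothetical and are not the lesson's data.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical toy data: two rectangular districts and three school points.
school_districts = gpd.GeoDataFrame(
    {"district": ["1", "2"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(2, 0), (6, 0), (6, 4), (2, 4)]),
    ],
)
schools = gpd.GeoDataFrame(
    {"name": ["School A", "School B", "School C"]},
    geometry=[Point(1, 1), Point(3, 1), Point(5, 3)],
)

# Spatial join: each school row picks up the attributes of the
# district polygon that it intersects (the default join predicate).
schools_in_districts = gpd.sjoin(schools, school_districts, how="inner")
print(schools_in_districts[["name", "district"]])
```

Each row of the result is still a school, now carrying the "district" value of the polygon that contains it.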

5. Get counts

We can easily count the number of schools in each district after the spatial join. First we group the data by district and then we get the size of each group. Here we store that information in school_counts.
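The counting step can be sketched as follows. A plain pandas DataFrame stands in for the joined data here, since groupby() and size() behave the same way on a GeoDataFrame; the rows are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the spatially joined data: one row per school,
# with the district that school falls in.
schools_in_districts = pd.DataFrame(
    {
        "name": ["School A", "School B", "School C",
                 "School D", "School E", "School F"],
        "district": ["1", "2", "2", "3", "3", "3"],
    }
)

# Group by district, then take the size of each group.
# The result is a pandas Series indexed by district.
school_counts = schools_in_districts.groupby("district").size()
print(school_counts)
```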

6. Add counts

With a little work, we can add those counts to the school_districts GeoDataFrame. First we convert school_counts, a pandas Series, to a DataFrame with the to_frame() method, reset the index, and name the columns "district" and "school_count". Next, we merge that DataFrame, school_counts_df, with school_districts to create a new GeoDataFrame, districts_with_counts. Here is the head. We now have counts of schools in districts, but the counts are not normalized data. Let's create a new value that accounts for the varying size of each school district.
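A minimal sketch of these steps, assuming toy district polygons and made-up counts rather than the lesson's data:

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical district polygons.
school_districts = gpd.GeoDataFrame(
    {"district": ["1", "2"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(2, 0), (6, 0), (6, 4), (2, 4)]),
    ],
)

# Hypothetical counts, shaped like the output of groupby("district").size().
school_counts = pd.Series([1, 2], index=pd.Index(["1", "2"], name="district"))

# Series -> DataFrame, move the district index into a column, name the columns.
school_counts_df = school_counts.to_frame().reset_index()
school_counts_df.columns = ["district", "school_count"]

# Merging with the GeoDataFrame on the left keeps the geometry column.
districts_with_counts = school_districts.merge(school_counts_df, on="district")
print(districts_with_counts.head())
```

Calling merge() on the GeoDataFrame, rather than on the plain DataFrame, is what keeps the result a GeoDataFrame.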

7. Divide counts by areas

We can divide the school count for each district by that district's area. This gives us a normalized value for each district that takes into account how large the school district is. First we use the GeoSeries area attribute to create a column in the districts_with_counts GeoDataFrame that stores the area of each school district. Then we create another column that holds the school_density, a normalized value, by dividing the school_count by the area. Now we have a data column that can be used to plot a choropleth.
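The area and density columns might be computed along these lines. The polygons and counts are hypothetical, and the toy coordinates have no coordinate reference system; with real data, areas are only meaningful when the geometries are in a projected CRS, since areas computed in degrees are not.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical districts with school counts already merged in.
districts_with_counts = gpd.GeoDataFrame(
    {"district": ["1", "2"], "school_count": [1, 2]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),  # area 4 square units
        Polygon([(2, 0), (6, 0), (6, 4), (2, 4)]),  # area 16 square units
    ],
)

# Area of each polygon, in squared units of the coordinate system.
districts_with_counts["area"] = districts_with_counts.geometry.area

# Normalized value: schools per unit of area.
districts_with_counts["school_density"] = (
    districts_with_counts["school_count"] / districts_with_counts["area"]
)
print(districts_with_counts[["district", "school_count", "area", "school_density"]])
```

In this toy data, district 2 has more schools but a lower density than district 1, which is the kind of difference that raw counts would hide. With matplotlib installed, a call such as districts_with_counts.plot(column="school_density", legend=True) would then shade each district by its density.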

8. Let's Practice!

In the exercises for this chapter, you will be using a DataFrame of building permits issued in Nashville between September 2015 and September 2018. According to the US Census Bureau, the population of Nashville grew by 106 people each day in 2017. Understanding where building construction occurred will help us understand which areas were most affected by that population growth. You will also use a GeoDataFrame with the polygons that define each city council district. The Nashville City Council is the legislative authority for the Metropolitan Nashville government, and each geographically drawn district elects a council member to represent it. Let's take a look at those datasets and practice creating normalized data for a choropleth!