
Neighborhood Segregation Over Time

1. Neighborhood Segregation Over Time

Welcome back. In this lesson, we will apply methods we already know to look at neighborhoods in Cook County, home to Chicago, one of the most segregated cities in America.

2. Histograms

We'll create a histogram with a new seaborn function, `distplot`. This is a simple but powerful way to explore univariate data. Examine the first few rows of the DataFrame `counties`, and you'll see that it has county populations. The function is called as `sns.distplot`, with the name of a list-like object in parentheses, in this case a pandas Series. Setting `kde=False` ensures that the vertical axis shows the count of cases (counties) in each bin. The histogram shows that a large number of counties have extremely low populations, and a small number have very high populations. This pattern of large swathes of lightly populated space and small areas of intense urbanization is found the world over.

3. Histograms

Now examine the first few rows of the `tracts` DataFrame, and see that it has Census tract populations. Call `distplot` the same way. Note that the vast majority of tracts have populations of a few thousand. The Census Bureau aims for an optimum tract size of about 4,000 people, so tracts vary greatly in area in order to keep their populations similar. High-population tracts, of which there are a few, are usually large apartment complexes or group quarters.

4. National Historical GIS

The data that we'll use in this lesson comes from the University of Minnesota's National Historical GIS (NHGIS) project. The site does not have an API, and if you want to use the site after completing this course, you'll have to create a free account. Nonetheless, NHGIS offers a few benefits over using the data provided directly by the Census Bureau.

5. NHGIS vs. Census Bureau FTP

* First, as the name implies, NHGIS has historical data. It goes back to the first American Census of 1790. The earliest Census available to download from Census FTP servers is 1980, and only 1990 and more recent Censuses are available via API.
* Second, NHGIS also supplies geospatial data, and the data selection process makes it easy to grab the right GIS file to match the demographic data.
* Third, NHGIS provides time series data that has been standardized to consistent geographic boundaries. For example, tracts change slightly from Census to Census, and NHGIS will adjust population to account for these tract changes.
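To make the third point concrete, here is a rough sketch of the general idea behind adjusting population for boundary changes: reallocating older counts to newer tracts with a weighted crosswalk. This is not NHGIS's actual procedure, which is more elaborate. The tract IDs, populations, weights, and column names below are all hypothetical.

```python
import pandas as pd

# Hypothetical 1990 tract populations (IDs and counts are invented)
pop_1990 = pd.DataFrame({
    "tract_1990": ["A", "B"],
    "pop": [4000, 3000],
})

# Hypothetical crosswalk: the share of each 1990 tract's population
# that falls inside each 2010 tract. Weights for one 1990 tract sum to 1.
crosswalk = pd.DataFrame({
    "tract_1990": ["A", "A", "B"],
    "tract_2010": ["X", "Y", "Y"],
    "weight": [0.75, 0.25, 1.0],
})

# Attach 1990 populations to the crosswalk and allocate them by weight
merged = crosswalk.merge(pop_1990, on="tract_1990")
merged["pop_alloc"] = merged["pop"] * merged["weight"]

# Sum the allocated pieces to get 1990 population in 2010 tract boundaries
pop_1990_in_2010 = merged.groupby("tract_2010")["pop_alloc"].sum()
print(pop_1990_in_2010)
```

Here tract A was split between X and Y, while tract B sits entirely inside Y, so the total population is preserved even though the boundaries changed.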

6. Let's Practice!

In the exercises, we will use a time series that contains estimates of 1990 population in the Census tracts of 2010. This will allow us to calculate population changes for consistent geographies. Let's go!
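As a rough preview of that calculation, the sketch below computes absolute and percent population change for a few tracts. The DataFrame, its column names (`pop_1990`, `pop_2010`), and all values are hypothetical; the NHGIS files used in the exercises may be named and laid out differently.

```python
import pandas as pd

# Hypothetical time series table: 1990 population already estimated
# within 2010 tract boundaries, alongside the 2010 count.
tracts = pd.DataFrame({
    "tract": ["X", "Y", "Z"],
    "pop_1990": [3000, 4000, 2500],
    "pop_2010": [3300, 3600, 2500],
})

# Absolute change, and change as a percentage of the 1990 population
tracts["pop_change"] = tracts["pop_2010"] - tracts["pop_1990"]
tracts["pct_change"] = 100 * tracts["pop_change"] / tracts["pop_1990"]

print(tracts)
```

Because both columns refer to the same 2010 boundaries, the differences reflect population change rather than boundary change.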