1. Basic Mapping with Geopandas
We've now worked with Census population counts and ACS estimates. People have to live somewhere. Let's put them on a map! Mapping can be a powerful tool for communication and exploratory analysis. In this lesson we will introduce basic mapping with geopandas.
2. Geospatial Data - Further Learning
We will only scratch the surface of the complexities of working with geospatial data.
If you want to go further, check out these dedicated DataCamp courses.
3. Loading Geospatial Data
Geopandas is conventionally imported using the alias gpd.
Geospatial data can be read in by passing a filepath to the geopandas `read_file` function. The GPKG ending means this file is a GeoPackage, just one of many file formats for geospatial data. Other common formats include shapefiles and GeoJSON.
The resulting object ...
...is a GeoDataFrame.
4. Geopandas DataFrames
GeoDataFrames inherit many of the methods and attributes of pandas DataFrames.
For example, the columns attribute shows the GeoDataFrame's column names.
Indexing and selecting data, including selecting columns by keys, also works the same way as for DataFrames.
5. Geopandas DataFrames
head also works...
...displaying the first few rows.
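As a rough sketch of these DataFrame-style operations, the snippet below builds a small GeoDataFrame in memory and then looks at its columns, selects by key, and calls head. The three square "states" and their values are invented for illustration; only the "has_computer" column name comes from this lesson.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy data: three unit squares side by side, with made-up counts.
states = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"], "has_computer": [120, 300, 80]},
    geometry=[box(i, 0, i + 1, 1) for i in range(3)],
)

print(states.columns)                  # column names, as on a DataFrame

names = states["name"]                 # one key returns a pandas Series
subset = states[["name", "geometry"]]  # a list of keys keeps the GeoDataFrame

print(states.head(2))                  # first two rows
```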
6. Geopandas geometry Column
GeoDataFrames are special in that they have a geometry column, which is named "geometry" by default.
This column holds the coordinates for the geographic features in the DataFrame. These geometries are Polygons. GeoDataFrames can also store Points or LineStrings.
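To make the three geometry types concrete, here is a small sketch that stores one of each in a single geometry column. The "kind" labels and the coordinates are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point, Polygon

# One feature of each geometry type, with invented coordinates.
shapes = gpd.GeoDataFrame(
    {"kind": ["city", "road", "state"]},
    geometry=[
        Point(0.5, 0.5),
        LineString([(0, 0), (1, 1)]),
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    ],
)

print(shapes.geometry.name)                # name of the geometry column
print(shapes.geometry.geom_type.tolist())  # type of each geometry
```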
7. Geopandas Plotting
Creating a map with Geopandas is very easy, using the GeoDataFrame plot method. Calling GeoDataFrame.plot() without parameters creates a single-color map of the geometries.
Because Alaska and Hawaii are geographically separated from the other states, they are often shown in inset maps, which would add considerable complexity to the plot construction. We're keeping it basic, so these exercises will only show the contiguous United States.
We can also color the states based on a data value. Coloring areas this way is referred to as
8. Choropleth Maps
choropleth mapping. Since Summary File and ACS data are reported by geographic areas, choropleth mapping is a go-to method for displaying data for states, Census tracts, and other geographies.
To create a choropleth, just pass a column name, such as "has_computer", to the column parameter of the plot method.
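The two plotting calls described so far can be sketched side by side. This is only an illustration on toy data: the square "states" and their counts are invented, and the Agg backend line is there so the sketch runs without a display (drop it in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook
import matplotlib.pyplot as plt

import geopandas as gpd
from shapely.geometry import box

# Toy data: three unit squares with made-up household counts.
states = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"], "has_computer": [120, 300, 80]},
    geometry=[box(i, 0, i + 1, 1) for i in range(3)],
)

fig, (ax_plain, ax_choro) = plt.subplots(1, 2, figsize=(8, 3))

# No parameters beyond the target axes: a single-color map.
states.plot(ax=ax_plain)
ax_plain.set_title("plot()")

# Passing a column name colors each shape by its value: a choropleth.
states.plot(column="has_computer", ax=ax_choro, legend=True)
ax_choro.set_title('plot(column="has_computer")')
```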
This is the result. Wasn't that easy? The default color scheme is matplotlib's viridis colormap. Unfortunately, we have committed two cartographic no-nos with this map. First, cartographers recommend using darker colors to represent higher numeric values. California has the most households with computers, but is represented by the lightest color in the palette.
9. Choropleth Maps
We can fix that using the cmap parameter
to choose a colormap, such as Yellow-Orange-Red (named "YlOrRd" in matplotlib), that goes from light to dark instead of dark to light. There is another problem with this map, however. This map shows the raw count of households with computers. High-population states like California and Texas are going to be the darkest, and that will pretty much be true for any population-linked phenomenon. What we really want is to show a rate of computer ownership.
10. Choropleth Maps
We can calculate a new column in a GeoDataFrame the same as in a DataFrame. Here we calculate the percentage of households with computers, by dividing by the total number of households and multiplying by 100.
The resulting map looks really different! Utah stands out as having the highest rate of computer ownership.
The cmap parameter can accept any colormap in the matplotlib package.
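Here is a hedged sketch of the rate calculation and the light-to-dark colormap together. The slide's actual code is not in this transcript, so the column names "total_households" and "pct_computer" are assumptions, and the three toy "states" and their counts are invented. The Agg backend line only lets the sketch run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook

import geopandas as gpd
from shapely.geometry import box

# Toy data: "A" has the largest count but not the highest rate.
states = gpd.GeoDataFrame(
    {
        "name": ["A", "B", "C"],
        "has_computer": [900, 450, 95],
        "total_households": [1000, 600, 100],  # hypothetical column name
    },
    geometry=[box(i, 0, i + 1, 1) for i in range(3)],
)

# New columns are computed exactly as on a pandas DataFrame.
states["pct_computer"] = 100 * states["has_computer"] / states["total_households"]

# Map the rate instead of the raw count, using a light-to-dark colormap.
ax = states.plot(column="pct_computer", cmap="YlOrRd", legend=True)

print(states[["name", "has_computer", "pct_computer"]])
```

In this toy table the highest raw count and the highest percentage belong to different rows, which is the same effect that makes the rate map look so different from the count map.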
11. Matplotlib Sequential Colormaps
A large number of matplotlib colormaps are available. However, until you get some experience with cartographic visualization, I highly recommend that you restrict yourself to this set of sequential, light-to-dark colormaps.
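One way to check whether a colormap runs light to dark is to compare the brightness of its first and last colors. The sketch below does that for a few names from matplotlib's sequential group and for viridis. The slide's exact list is not reproduced in this transcript, so the names chosen here are an assumption, and the brightness formula is a simple weighted sum of red, green, and blue rather than a full perceptual measure.

```python
import matplotlib.pyplot as plt


def brightness(rgba):
    """Approximate brightness of an RGBA color as a weighted sum of R, G, B."""
    r, g, b, _ = rgba
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def is_light_to_dark(name):
    """True if the colormap's lowest value is brighter than its highest."""
    cmap = plt.get_cmap(name)
    return brightness(cmap(0.0)) > brightness(cmap(1.0))


# A few sequential colormaps (an illustrative subset), plus the default.
for name in ["Blues", "Greens", "YlOrRd", "YlGnBu", "viridis"]:
    print(name, is_light_to_dark(name))
```

The sequential names come out as light to dark, while viridis runs the other way, which is why the first choropleth gave the highest count the lightest color.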
12. Let's practice!
That was your crash course in geopandas. Let's get mapping!