1. Projections and Coordinate Reference Systems
You can construct a GeoDataFrame from a DataFrame, as long as you have the required pieces in place: a geometry column and a coordinate reference system, or CRS.
2. Projections
Before we talk about coordinate reference systems it's helpful to talk about projections. Map projections are necessary for representing the earth in two-dimensional space.
3. Many approaches to map projection
In this comic from XKCD twelve different types of map projections are shown, along with a humorous take on what your favorite projection says about you.
The most common projection is the Mercator projection, shown in the upper left corner. WGS84 (which is short for the World Geodetic System 1984 standard) is not itself a projection; it is the geodetic reference system used by the Global Positioning System, or GPS. A variation of the Mercator projection built on WGS84 coordinates, often called Web Mercator, is the projection used in most web mapping applications.
4. Coordinate Reference Systems
Setting the coordinate reference system for a GeoDataFrame tells geopandas how to interpret the longitude and latitude coordinates. Distance units are also dependent on the CRS being used.
The most common coordinate reference systems are EPSG:4326 and EPSG:3857, both of which are based on WGS84. EPSG stands for European Petroleum Survey Group, the entity that developed these systems. EPSG:4326 stores longitude and latitude in decimal degrees and is used with applications like Google Earth, while EPSG:3857 is a projected system measured in meters and is used in most web map applications.
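To see how the units depend on the CRS, here is a minimal sketch that inspects both codes with pyproj. This assumes pyproj is installed (geopandas relies on it for CRS handling); the variable names are only illustrative.

```python
from pyproj import CRS

# Look up the two CRS codes mentioned above.
geographic = CRS.from_epsg(4326)
web_mercator = CRS.from_epsg(3857)

# A geographic CRS stores angles; a projected CRS stores linear distances.
print(geographic.is_geographic, web_mercator.is_projected)

# The unit of the first axis of each CRS.
print(geographic.axis_info[0].unit_name)
print(web_mercator.axis_info[0].unit_name)
```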
5. Creating a geometry column
Geometry is a special data structure, and is a required component of GeoDataFrames.
If you have longitude and latitude, you can use the geopandas points_from_xy() function to create the geometry column.
Here are the first four rows of the schools DataFrame. We can create a Point geometry from Longitude and Latitude.
Now the schools data has a geometry column and is ready to be used to build a GeoDataFrame.
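As a rough sketch of that step: the schools DataFrame below is a hypothetical stand-in with made-up names and coordinates, not the real data. Only the Longitude and Latitude column names come from the lesson.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical stand-in for the schools DataFrame (values are made up).
schools = pd.DataFrame({
    "School Name": ["School A", "School B", "School C", "School D"],
    "Latitude": [36.02, 36.25, 36.07, 36.11],
    "Longitude": [-86.66, -86.83, -86.93, -86.75],
})

# points_from_xy takes x (longitude) first, then y (latitude),
# and returns one Point per row.
schools["geometry"] = gpd.points_from_xy(schools["Longitude"], schools["Latitude"])

print(schools.head(4))
```

Note the argument order: longitude is the x value and latitude is the y value, so swapping them would put the points in the wrong place.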
6. Creating a GeoDataFrame from a DataFrame
To construct a GeoDataFrame from the schools DataFrame, use the GeoDataFrame constructor, passing it the schools DataFrame, the crs to use, and the geometry to use.
Here we have set the schools_geo CRS to EPSG:4326, and we've specified the geometry column that we just created as the new GeoDataFrame's geometry.
We can look at the first four rows and see that the schools_geo data is identical to the schools data. Only the datatype has changed from a DataFrame to a GeoDataFrame.
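Here is one way that construction might look. The schools data is again a made-up stand-in, and passing the geometry column by name is just one option; you could also pass the column itself.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical stand-in for the schools DataFrame (values are made up).
schools = pd.DataFrame({
    "Latitude": [36.02, 36.25, 36.07, 36.11],
    "Longitude": [-86.66, -86.83, -86.93, -86.75],
})
schools["geometry"] = gpd.points_from_xy(schools["Longitude"], schools["Latitude"])

# Build the GeoDataFrame: the data, the CRS, and the geometry column to use.
schools_geo = gpd.GeoDataFrame(schools, crs="EPSG:4326", geometry="geometry")

print(type(schools_geo))
print(schools_geo.crs)
print(schools_geo.head(4))
```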
7. Changing from one CRS to another
Notice that the schools_geo geometry uses decimal degrees to measure distance from the reference lines: the Prime Meridian and the Equator. You can convert the geometry to measure distance in meters, using the .to_crs() method.
Here we convert the geometry column of schools_geo to EPSG:3857. The resulting measurements are in meters. Note that the original Longitude and Latitude columns remain in decimal degree units, because the .to_crs() method only transforms the geometry column.
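A minimal sketch of the conversion, again using made-up coordinates in place of the real schools data. The name schools_3857 is only illustrative.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical stand-in for the schools data (values are made up).
schools = pd.DataFrame({
    "Latitude": [36.02, 36.25],
    "Longitude": [-86.66, -86.83],
})
schools_geo = gpd.GeoDataFrame(
    schools,
    crs="EPSG:4326",
    geometry=gpd.points_from_xy(schools["Longitude"], schools["Latitude"]),
)

# to_crs returns a new GeoDataFrame; schools_geo itself is left unchanged.
schools_3857 = schools_geo.to_crs(epsg=3857)

# The geometry is now in meters, while Longitude and Latitude stay in degrees.
print(schools_3857.head())
```

Because .to_crs() returns a new object by default, assign the result to a variable if you want to keep working with the converted geometry.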
8. Let's practice!
Now it's time to put your new knowledge about coordinate reference systems and constructing geometries into practice!