1. Working with raster data
Remember from the first chapter of this course that we have two main types of geospatial data: vector data and raster data. In this course, we have learnt a lot about vector data.
In practice, however, you will often encounter both vector and raster formats and will have to combine them. Although going into much detail is out of scope, in this final lesson we want to give you some basic tips for reading, plotting, and combining raster data with vector data.
2. Raster data
Raster data represent the world as a grid, where each pixel in that grid takes a continuous or discrete value.
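To make the grid idea concrete, here is a minimal sketch using plain numpy arrays. The values and the land cover codes are made up for illustration; they are not taken from the rasters shown in the course.

```python
import numpy as np

# Continuous raster: every pixel holds a measured quantity
# (here, made-up rain probabilities between 0 and 1).
rain_chance = np.array([
    [0.10, 0.25, 0.40],
    [0.15, 0.50, 0.80],
    [0.05, 0.30, 0.95],
])

# Discrete raster: every pixel holds a class code
# (hypothetical codes: 1 = water, 2 = forest, 3 = urban).
land_cover = np.array([
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
], dtype=np.uint8)

# Continuous values support arithmetic summaries such as a mean...
mean_rain = rain_chance.mean()

# ...while discrete values are usually summarised by counting classes.
classes, counts = np.unique(land_cover, return_counts=True)
class_counts = dict(zip(classes.tolist(), counts.tolist()))
```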
3. Raster data
An example of continuous data is the chance-of-rain map shown here.
On the right, we show an example of a raster with discrete pixel values: different land cover types in the USA.
4. Raster data with multiple bands
Raster data can have a single band, where each pixel has a single value, as in the previous examples.
Or, they can have multiple bands, as is typically the case for satellite images, shown here.
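As a rough illustration of the difference, the sketch below builds a single-band and a three-band grid as numpy arrays. The band-first layout (bands, rows, columns) matches what rasterio returns when reading a file; the array sizes and values are arbitrary.

```python
import numpy as np

# Single-band raster: one value per pixel, shape (rows, cols).
elevation = np.zeros((4, 5), dtype=np.float32)

# Multi-band raster: one value per band per pixel,
# stored band-first as (bands, rows, cols).
rgb = np.zeros((3, 4, 5), dtype=np.uint8)
rgb[0] = 255  # fill the first band (red in this made-up example)

n_bands, n_rows, n_cols = rgb.shape
red_band = rgb[0]      # a single band is again a 2D grid
pixel = rgb[:, 0, 0]   # all band values of the top-left pixel
```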
5. The rasterio package
We will interact with raster data through the rasterio package.
This package provides a Pythonic interface to the GDAL library, which contains state-of-the-art functionality for working with raster data. Rasterio can read and write many raster file formats, and it can also process rasters in several ways.
6. Opening a raster file
Opening a raster file is done by calling rasterio's "open" function. However, unlike the vector case, this does not yet read any data into memory; only the metadata are read at this point.
This allows us to peek into the file and see, for example, how many bands the file contains;
or how many pixels the raster is wide or high.
In this example, we are reading a raster file with elevation data for the whole world. It contains a single band for the elevation.
7. Raster data = numpy array
If we want to read the actual data, we can call the read method of the opened raster.
This operation reads all the raster data into memory and stores it as a numpy array. This is useful because, once we have the data as a numpy array, we can apply most of the familiar Python stack to raster data.
8. Plotting a raster dataset
Visualizing a raster file is just as easy thanks to rasterio. Passing the source object (what rasterio.open returns) to the show function from rasterio.plot will create a matplotlib figure. Remember that what we see here is elevation data.
9. Extracting information based on vector data
Given this raster of the elevation, we might want to know the elevation at a certain location or for each country.
For the countries example, as shown here, we want to extract the pixel values that fall within a country polygon, and calculate a statistic for it, such as the mean or the maximum.
Such functionality to extract information from a raster for given vector data is provided by the rasterstats package.
10. Extract raster values with rasterstats
To extract the pixel values at point locations, we use the point_query function, passing it the GeoSeries, the path to the raster file, and the interpolation method.
For extracting the pixel values for polygons, we use a similar function called zonal_stats. Here we can specify the statistics to compute.
But let's look at an example with the elevation data and the countries.
11. Extract raster values with rasterstats
We pass the country geometries and the path to the raster file to the zonal_stats function, and we specify that we want to calculate the mean of all pixel values that fall within a country.
If we then assign the result to a new column of the DataFrame, we can show the countries with the highest average elevation.
12. Let's practice!
Let's do some final exercises.