Image reducer in Python

fields

Google Earth Engine (GEE) has a very nice feature called ‘image reducer’ and, frankly, it is incredibly useful. Say, for example, you have a field boundary and you want to know the mean, median, maximum and minimum NDVI values for it. In GEE you can use the reducer to get these values. You can also loop through multiple fields and get the values AND you can loop through multiple images to get the values. You can get the ‘stats’ for any image or any area that you parse in: DEM data, raw satellite bands, products from satellite bands, or in fact any raster data set. It is very useful.

I have wondered for a while if that was possible to do in Python. The advantages for getting this information locally for images that I am working on and plotting the results make for a convincing case. I see this as another step in looking at a satellite image without ever looking at a satellite image. What could be the use here? Tracking NDVI in a field over time? Or better multiple fields? Or, what about extracting values at point locations and then saving them back to those points?

It can all be done with rasterstats, a Python Library by perrygeo

Install using pip

pip install rasterstats

Let’s take a look at an example

Aim: To get the average NDVI values for points and then for four fields.

I’ve created 2 NDVI images at different times in the year. They are shown below. It doesn’t matter what images you use. I am using single band images but you could loop through multi band images.

2fields

First up, let’s read a point Shapefile and get the points in wkt format (POINT X Y) as this is what rasterstats takes as an input. Open up a Shapefile using ogr and loop through the features extracting the centroid, returning the XY.

from osgeo import ogr
import os

shapefile = ".../points.shp"
driver = ogr.GetDriverByName("ESRI Shapefile")
dataSource = driver.Open(shapefile, 0)
layer = dataSource.GetLayer()

for feature in layer:
    geom = feature.GetGeometryRef()
    print geom.Centroid().ExportToWkt()

Using rasterstats you need your image and the centroids. The code is below.

from rasterstats import point_query

point = "POINT (725233.602128884 5664901.89211738)"
point_query(point, ".../NDVI_example_1.tif")

It is that simple; define the point and then using point_query parse the point and then the image. It will return the value at that point. If we add this to the above code we can loop through all the features and get the raster values.

shapefile = "../points.shp"
driver = ogr.GetDriverByName("ESRI Shapefile")
dataSource = driver.Open(shapefile, 0)
layer = dataSource.GetLayer()

for feature in layer:
    geom = feature.GetGeometryRef()
    point = geom.Centroid().ExportToWkt()
    print point_query(point, "../NDVI_example_1.tif")

Make sure your projections match!

What about polygons?

This time we will use zonal_statistics.

from rasterstats import zonal_stats
stats = zonal_stats(".../polys.shp"".../NDVI_example_1.tif")

print [f['mean'] for f in stats]
print [f['max'] for f in stats]

We parse the polygon shapefile (polys.shp) and the NDVI image (NDVI_example_1.tif) to zonal_stats [line 2]. Then [lines 3 & 4] print out to a list the mean for every polygon [3] and the max for every polygon [4].

If you want to run through multiple images then reuse my Shapefile count code instead looking for .tif images.

for root, dirs, files in os.walk(".../"):
    for file in files:
        if file.endswith(".tif"):
            raster = (os.path.join(root, file))
            stats = zonal_stats(".../polys.shp",  raster)
            print "filename: ", file
            print [f['mean'] for f in stats]

An example output looks like this:

filename: NDVI_example_1.tif
[0.17415108794477352, 0.6178150208015752, 0.2820635024507548, 0.4951216243519921]
filename: NDVI_example_2.tif
[0.7579913160863959, 0.6720901117092226, 0.7985192637380338, 0.6334189456158745]

These could potentially be graphed and trend lines generated.

This has been a relatively simple example, but it should be easy to scale. If you have hundreds of field boundaries and an image every month, getting out the mean is super easy with raster stats. Or you could always use Google Earth Engine. The hardest part is perhaps displaying the results in a usable way.

Recap

  • rasterstats can extract values from raster images using point Shapefiles
  • rasterstats can perform the role of the image reducer in Google Earth Engine
  • Loop through all your images and all your features to quickly get the mean, maximum and minimum values for polygons.

This code is on my GitHub as a Jupyter Notebook

I am trying to build up a series of useful lightweight satellite processing notebooks that could be used or scaled on a variety of projects. If you want to use the code feel free; I will keep them here.

I am a freelancer able to help you with your projects. I offer consultancy, training and writing. I’d be delighted to hear from you.

I have grouped all my previous blogs (technical stuff / tutorials / opinions / ideas) at http://gis.acgeospatial.co.uk.

Feel free to connect or follow me; I am always keen to talk about Earth Observation.

I am @map_andrew on twitter