If you are a Geospatial Programming beginner, one of the quickest ways to get started (once you have grasped the basics) is to plot some data. Data analysis often begins with inspecting and plotting data. It is a great way to check that you have the data correctly loaded. With geospatial data we have more than a few choices: we can inspect the metadata (perhaps learning something about the location or position of the data), print the data table associated to a feature or features, we can plot a histogram of raster values and of course we can physically plot the data visually.
This blog is all about trying to show this in the simplest possible way and at the end, as a bonus, we will use the vector data to clip the raster data. I am going to be using two core Geospatial Libraries – Rasterio (for Raster data) and GeoPandas (for Vector data). I will be using Python 3.7.4 for these examples. I have found that Python 3.6.x and Rasterio on the JPEG2000 file has a read problem, so go with 3.7 if you want to follow along.
As ever, this code is available as a Jupyter Notebook here – this is the best way to learn; go ahead and download the code.
First off, get the imports we are going to use.
import numpy as np import geopandas as gpd import matplotlib.pyplot as plt import os import rasterio from rasterio import plot from rasterio.plot import show from rasterio.mask import mask import os
I have downloaded a Sentinel 2b level 2a image acquired over the UK on 26th February 2019 and stored it on my computer. It unzips to show various folders, subfolders and files. In this example I am going work with the 10m True Colour Image (also known as TCI). In this first code block, search the folder for this file.
for root, dir, files in os.walk(r'.\S2B_MSIL2A_20190226T111049_N0211_R137_T30UXB_20190226T172349.SAFE'): for file in files: if file.endswith("TCI_10m.jp2"): TCI = (os.path.join(root, file)) print(TCI)
We now have a variable (it’s a filename) called TCI. We will use this variable (filename) to plot. Our data will be called RGB and first off we open it using the JP2000 driver and then we plot it using the rasterio.show method. Note that the transform method assigns the location of the raster in the plot.
rgb = rasterio.open(TCI, driver='JP2OpenJPEG') #RGB show(rgb.read(), transform=rgb.transform)
That is a nice start! In the notebook I have added the code to make this plot a little larger. Next open a vector file using GeoPandas and plot. In this example I am using a geojson file, but you can use any vector file format supported by GeoPandas, including a shapefile.
gdf = gpd.read_file('boundary.geojson') gdf.plot()
gpd.read_file reads in our vector data as gdf (a GeoDataFrame object) and gdf.plot() plots it.
It’s not that impressive as it is, just a simple rectangular boundary. You can try print(gdf.head()) to inspect the first five rows of the tabular data (there is only one row in this case though). Now let’s plot them together. Note that both datasets have the same projection: in the notebook I show how to get this information, but for this simple example, take it from me they ae both in EPSG 32630.
fig, ax = plt.subplots(figsize=(12, 10)) show(rgb.read(), transform=rgb.transform, ax=ax) gdf.plot(ax=ax, color='white', alpha=.75) ## alpha is the transparency setting plt.show()
In the first line of code here I am creating a figure with a size of (12,10) and an axis object (ax). In the second line I am plotting the image (this will be at the back, try switching line 2 and 3 around if you are not sure) and setting the image in the figure. In line 3 I am calling the GeoPandas plot method and getting it to appear on the same plot as the raster, setting the colour as white and the alpha value (read transparency setting) as 0.75 so you can see the image beneath it. Finally line 4 plt.show() to plot the vector on top of the raster.
And that is it plotting the two datasets together.
What if we just want to plot the area in the slightly transparent white box? We can do this using the rasterio.mask method. If you are a geospatial person you probably would describe this as clip.
masked, mask_transform = mask(dataset=rgb,shapes=gdf.geometry,crop=True) show(masked, transform=mask_transform)
Rasterio creates two objects from the mask method, the image (masked) and the location or position of that clipped image (mask_transform). We need to tell the mask method what data to mask (rgb) and the shape to mask it to (gdf.geometry) and finally we parse crop=True to crop to the data. The last line plots the data.
Now that we have clipped or masked (depending on how you prefer), let’s save this file as a Geotiff. This is nice and easy with Rasterio – the trick is to get the meta data correct. In this example we will ‘borrow’ and update the meta data from the orginal jpg2000 file.
profile = rgb.meta WIDTH = masked.shape ## get the dimensions of the image we are writting out HEIGHT = masked.shape profile.update(driver='GTiff', transform=mask_transform, height = HEIGHT, width = WIDTH) print(profile) ## check on the updated profile
In the first line create a new variable called profile and assign it the .meta properties of the rgb (remember it is a rasterio object). Next we need the dimensions of our new raster (if you are not sure what mask.shape related to then print it out). Once we have these dimensions we can parse these updated parameters using profile.update. Now we are ready to write out the new clipped raster as a Geotiff.
with rasterio.open('clip.tif', 'w', **profile) as dst: dst.write(masked)
The nice thing about Python (and perhaps other languages?) is that the construct of the above syntax is very similar for the read and write of data. In this example, create a file called ‘clip.tif’, set it to write (the ‘w’ flag) and add the profile parameters. We open this file as a Rasterio object called dst and then write the clipped data we have just created (masked). That is it – we don’t have to call a close method as we are using the with statement.
If you like you can read this image back in and plot it (with the same code as we used at the start) or just go ahead and use it in any geospatial software.
I hope this, while rather simple, guide is helpful to someone. If you are just learning programming getting little quick wins like read/plot/write data can make a massive difference to your confidence. The code is available here to download.
I am a freelancer able to help you with your projects. I offer consultancy, training and writing. I’d be delighted to hear from you. Please check out the books I have written on QGIS 3.4
I have grouped all my previous blogs (technical stuff / tutorials / opinions / ideas) at http://gis.acgeospatial.co.uk.
Feel free to connect or follow me; I am always keen to talk about Earth Observation.