Community GBDX Notebooks

GBDX Notebooks are a great way of accessing a vast array of satellite data. You can get yourself a trial account here:

https://notebooks.geobigdata.io/

No more downloading satellite imagery; just process it in the cloud. When you consider the sheer volume of satellite data that Digital Globe holds, and how long it would take to download, processing it on their servers is pretty compelling. However, if you want to use it beyond the evaluation phase, and on WorldView data, there is a cost involved. The minimum price at the time of writing is $100 per month, which gives you 1,000 sq km of imagery each month, with 8 GB of notebook memory and 20 GB of storage (plus some other things). If you were trialling an idea using Earth Observation data, this seems like a cost-effective way of testing it out.

So give it a go.

However…

I saw this tweet while following #satsummit.

I don’t know how ‘new’ this tier is, but open data is cool. It is not just Sentinel and Landsat that are available on the open tier, but IKONOS data too.

IKONOS was deactivated in March 2015, so while the data will not necessarily be ‘current’, the sensor was operational for over 15 years and produced high-resolution imagery throughout that time. The catalog must be pretty large. Again, a compelling starting point for testing on VHR (very high resolution) data.

An Introduction to Community Notebooks

From the GBDX notebooks page this is what you get for ‘free’:

A restricted free version of GBDX Notebooks with access to Ikonos, Landsat, and Sentinel-2 imagery types. Community work is public to all users under an MIT license.

Plus

  • 6 GB Notebook Memory
  • 20 GB Notebook Disk Space

Any work that you do in these notebooks is publicly available. The benefit of this is that there are many great notebooks you can study and reuse for your own challenges and studies. In fact, there are loads of tutorials on how to access and process satellite data!

At the time of writing there are the following:

The applied use cases are particularly excellent. Within the platform you can click on Details to view a notebook and step through it, or you can click on ‘Clone and Open’, which will clone the notebook to your account library and open it as a fully accessible Python Jupyter notebook. This means that you can run the code for yourself or make adjustments. The final thing to note is that a huge effort has obviously gone into the documentation in these notebooks. It really helps to know what is going on in each code block, and it makes a massive difference to a user.

Creating your first Notebook

I am going to assume that you have set up a community (or higher level) account. To create your first notebook click on + New Notebook and select Start with Imagery.

This opens a notebook called ‘Introduction to Image Access with GBDXTools’. It steps you through the process of finding an image, reprojecting images, clipping and viewing images, converting them to NumPy arrays and saving images as GeoTIFFs. You can adapt this notebook and save it if needed.

Creating your own notebook: an example using Sentinel-2 data

Click on + New Notebook and select Blank. This will open an empty notebook that looks like this:

If you are not familiar with Jupyter Notebooks then have a look at this guide I recently wrote.

Python for Geospatial work flows part 2: Use Jupyter Notebooks

Getting imagery in

Some really nice work has been done here to make this as easy as possible; in fact, you do not have to write any code at all. Click on Open underneath the Imagery button. This will open a world map on the left-hand side of your screen. Start zooming in to an area you are interested in. In this example I am zooming in to Dover, UK. When you are zoomed in close enough, a draw-bounding-box icon will appear on the map. Click on this to draw an AOI and then click Search.

All the available imagery will appear beneath the map. You can filter this based on cloud cover. Choose an image and click on ‘Insert Code For 1 Images’.

By clicking the button, the first code cell in the Jupyter notebook is populated automatically with your image request. In my case:

from gbdxtools.task import env
from gbdxtools import CatalogImage

bbox = env.inputs.get('bbox', '1.2840270996093752, 51.0832371168014, 1.4069366455078125, 51.14639813691828')

catalog_id1 = env.inputs.get('catalog_id1', 'e89d5a29-1119-5c0a-a007-a03341d5bc48')
image_id1 = CatalogImage(catalog_id1, bbox=list(map(float, bbox.split(","))))  # list() materialises Python 3's lazy map iterator
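A quick aside on the generated snippet: the bbox is just a comma-separated string of lon/lat values (minx, miny, maxx, maxy), and in Python 3 `map()` returns a one-shot iterator, so if you want to reuse the coordinates it is safer to materialise them as a list. A minimal sketch, using the bbox string from my Dover example:

```python
# The bbox string inserted by the UI: "minx, miny, maxx, maxy" in lon/lat.
bbox = '1.2840270996093752, 51.0832371168014, 1.4069366455078125, 51.14639813691828'

# Parse into floats once, then the values can be reused freely.
coords = [float(v) for v in bbox.split(",")]
minx, miny, maxx, maxy = coords

print(coords)
print("width (deg): ", maxx - minx)
print("height (deg):", maxy - miny)
```

This gives you the AOI extent in degrees, which is handy when checking roughly how much imagery you are about to pull.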

This is really helpful, as it automatically gets me the catalog ID value. Now, to display the image, all I need to do is plot it. Use the code below:

%matplotlib inline
image_id1.plot(w=10, h=10)

Pretty neat.

Now that we have plotted our image, we can do pretty much anything we wish with it. I have run an unsupervised cluster analysis on one band and then extracted one of the clusters as a mask.

First off, get a single band.

import numpy as np
print (image_id1.shape)
## get the first band
band1 = image_id1[0,:,:]
print (band1.shape)

Then you can plot it. After that, run the following clustering code, which I previously wrote about here in more detail. In short, we need to reshape our array before it is passed to the clustering algorithm; then, after clustering, reshape the labels back to the single-band image dimensions.

from sklearn import cluster

## flatten the band to an (n_pixels, 1) array for the clusterer
X = band1.reshape((-1,1))
print (X.shape)

## unsupervised k-means with 8 clusters
k_means = cluster.KMeans(n_clusters=8)
k_means.fit(X)

## reshape the labels back to the image dimensions
X_cluster = k_means.labels_
X_cluster = X_cluster.reshape(band1.shape)
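Before masking, it is worth a quick sanity check on how the pixels were distributed across the clusters; `np.bincount` gives the pixel count per label. Here is a self-contained sketch where a synthetic random band stands in for the real `band1` extracted above (the synthetic data is purely a stand-in for illustration):

```python
import numpy as np
from sklearn import cluster

# Synthetic stand-in for a single image band; on the platform you
# would use the band1 array extracted from the CatalogImage.
rng = np.random.RandomState(0)
band1 = rng.randint(0, 255, size=(50, 50)).astype(float)

# Same reshape-cluster-reshape pattern as the main example.
X = band1.reshape((-1, 1))
k_means = cluster.KMeans(n_clusters=8, n_init=10, random_state=0)
k_means.fit(X)
X_cluster = k_means.labels_.reshape(band1.shape)

# Pixel count per cluster -- a quick check before choosing which
# cluster to extract as a mask.
counts = np.bincount(X_cluster.ravel(), minlength=8)
for label, count in enumerate(counts):
    print(f"cluster {label}: {count} pixels")
```

If one cluster holds almost all the pixels, the band may need stretching or a different number of clusters before the mask step is meaningful.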

If the code above does not work in the environment, then run the pip install command below and run it again.

!pip install scikit-learn ### the package is named scikit-learn on PyPI (imported as sklearn)

Plot the image to see your classification. Finally, in this example, set all the values where the class value is >= 7 to 1 and everything else to 0, then plot.

## zero out clusters 0-6, then set cluster 7 to 1
temp = np.less(X_cluster, 7)
np.putmask(X_cluster, temp, 0)
temp = np.greater_equal(X_cluster, 7)
np.putmask(X_cluster, temp, 1)

print (X_cluster)

import matplotlib.pyplot as plt  # needed for plotting

plt.figure(figsize=(20,20))
plt.imshow(X_cluster)
plt.show()
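As a side note, the two putmask calls can be collapsed into a single vectorised comparison, which also leaves the original label array untouched. A small sketch, using synthetic labels in place of the real k-means output:

```python
import numpy as np

# Synthetic k-means labels standing in for the X_cluster array above
# (values 0-7, one label per pixel).
rng = np.random.RandomState(1)
X_cluster = rng.randint(0, 8, size=(10, 10))

# One comparison replaces both putmask calls: pixels in cluster 7
# become 1, everything else becomes 0.
mask = (X_cluster >= 7).astype(np.uint8)

print(mask)
```

The boolean-comparison form is the more idiomatic NumPy style, and keeping `X_cluster` intact means you can extract other clusters later without re-running the clustering.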

Publishing your work

When you have finished and commented the code (it helps others understand), click on the Publish Notebook button as shown below:

Then add a description and tags (so others can find it).

And here it is:

https://notebooks.geobigdata.io/hub/tutorials/5bc45bf7e9c92b5d7a00f914?tab=code

I have also added the code to my Satellite_Imagery_Python series on GitHub here.

Recap

In this post I looked at Community GBDX Notebooks and showed you:

  • Where to get a community account
  • How to load imagery using the intuitive interface that writes the code for you
  • How to plot a Sentinel-2 10 m resolution image
  • How to extract a single band and cluster it
  • How to mask a cluster using NumPy
  • How to share your work

Final thoughts

Python with Jupyter notebooks is the language of Earth Observation / satellite imagery today. It is such a simple way to interact with the data once you build up experience with the language. The community notebooks are an incredible resource; huge kudos to Digital Globe for making so many well-documented guides available.

There is a really great intro video on analysis-ready data at Digital Globe from the ARD and STAC interoperability workshop (huge thanks to the Radiant Earth Foundation for releasing these) here:

Paraphrasing from the above video

“You can’t do too much in a notebook but enough to ‘experiment and learn’… When you deploy a notebook into GBDX it becomes the task.”

This is a rapid Python prototyping environment for satellite data. You don’t have to download the data, just start using it.

 

I am a freelancer able to help you with your projects. I offer consultancy, training and writing. I’d be delighted to hear from you.

I have grouped all my previous blogs (technical stuff / tutorials / opinions / ideas) at http://gis.acgeospatial.co.uk.

Feel free to connect or follow me; I am always keen to talk about Earth Observation.

I am @map_andrew on Twitter.