Colaboratory notebooks and GDAL


Last year I heard about Colaboratory by Google and, now that I am using Jupyter Notebooks, it seems the perfect opportunity to explore it further. I previously wrote about how Jupyter Notebooks are a perfect match for satellite imagery processing. If you would like to read about that, the post is here. Otherwise, let’s take a look at Colaboratory and see if GDAL will play nicely with it.

“Colaboratory is a Google research project created to help disseminate machine learning education and research. It’s a Jupyter notebook environment that requires no setup to use and runs entirely in the cloud.”

There is a great deal of potential to use this for data science and, of course, for anyone involved in Earth Observation, Python, image processing and Machine Learning. All you need is a Google account and then you are good to go. It is pretty powerful, and you can also use a GPU accelerator in combination with TensorFlow.

## example from the tensorflow colab tutorial
## https://colab.research.google.com/notebooks/welcome.ipynb#scrollTo=mwdQ1INEZKkb

import tensorflow as tf
import numpy as np

with tf.Session():
  input1 = tf.constant(1.0, shape=[2, 3])
  input2 = tf.constant(np.reshape(np.arange(1.0, 7.0, dtype=np.float32), (2, 3)))
  output = tf.add(input1, input2)
  result = output.eval()

result

Check out this brilliant post for one of the amazing things you can do with Colaboratory:

https://medium.com/deep-learning-turkey/google-colab-free-gpu-tutorial-e113627b9f5d

GDAL and colaboratory notebooks – this is a guide for Python 2.7

I have written up a Python notebook that you can download from my GitHub page to set up a Colaboratory notebook with GDAL, load an image and display it. GDAL can be troublesome to install, so hopefully this guide will take away some of the pain of setup and leave you to get on with the data science.

  1. As this will be running on an Ubuntu server, run apt-get update
  2. Run apt-get install libgdal-dev. Do this before trying to install GDAL, otherwise you will get errors. The -y flag answers yes to the install prompt. This is the longest part (but not too long)
  3. Install python-gdal, again with the -y flag
  4. Install NumPy and SciPy (not compulsory, but why not?)
  5. Test the installation by importing GDAL. No error? It should be ok.

#Step 1
!apt-get update
#Step 2
!apt-get install libgdal-dev -y
#Step 3
!apt-get install python-gdal -y
#Step 4
!apt-get install python-numpy python-scipy -y
#Step 5
import gdal ## fingers crossed!
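
If the import in step 5 fails, it can help to check which import style your GDAL build supports. The following is a minimal sketch and not part of the original notebook: find_gdal is a hypothetical helper name, and it assumes that some builds expose the bindings as osgeo.gdal while others also accept a bare import gdal.

```python
import importlib


def find_gdal():
    """Return the GDAL Python module if it can be imported, else None.

    Hypothetical helper: tries the osgeo.gdal import path first, then the
    bare gdal module name used in step 5 above.
    """
    for name in ("osgeo.gdal", "gdal"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


gdal = find_gdal()
if gdal is None:
    print("GDAL is not importable yet; re-run the install steps")
else:
    # VersionInfo("RELEASE_NAME") returns the release string of the library
    print("GDAL version: " + gdal.VersionInfo("RELEASE_NAME"))
```

If this prints a version number, the rest of the guide should work with the gdal name it leaves behind.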

Now that GDAL is installed, the next challenge is to get some data in. There is a guide on how to do it here. The way that works best for me is to clone a Git repository. For this example you can clone my Geospatial course example here.


Update 26th September 2018

On further experimentation I have found that these commands also install GDAL, and in fact this has become my preferred way.

!apt update
!apt upgrade
!apt install gdal-bin python-gdal python3-gdal ## this should replace steps 2 and 3 above

You could try these instead of the above. Email me if you find a better way! info@acgeospatial.co.uk

Anyway back to the blog…


The commands below tell you what to do:

# list the current files - it will return datalab
!ls 
# clone my github geospatial programming course
! git clone --recursive https://github.com/acgeospatial/Geospatial_Course_Example/
# check to see if it has installed
!ls # should return 2 folders
# access the data
%cd Geospatial_Course_Example
# list the files in this folder
!ls # should return 4 files one of which is a .jp2 image
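
The same check can be done from Python rather than the shell, which is handy if you want the file name in a variable. This is a small sketch under my own assumptions: find_rasters is a hypothetical helper name, and the "*.jp2" pattern simply matches the image type mentioned above.

```python
import glob
import os


def find_rasters(folder, pattern="*.jp2"):
    """Return a sorted list of paths in `folder` that match `pattern`."""
    return sorted(glob.glob(os.path.join(folder, pattern)))


# List any .jp2 images in the current working directory
for path in find_rasters("."):
    print(path)
```

Run after changing into the cloned folder, this should list the .jp2 image that the next step opens.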

Now just access the data like you would with any other piece of code. For example if you want to display one band of the image with a ‘hot’ colour map and a colour bar then run the code below.

import matplotlib.pyplot as plt
raster_ds = gdal.Open("L2A_T30UXB_20170102T111442_TCI_60m.jp2", gdal.GA_ReadOnly)
image_gdal = raster_ds.GetRasterBand(1).ReadAsArray()
print(image_gdal.shape) # the band is a NumPy array; this form works in Python 2.7 and 3

plt.imshow(image_gdal, cmap = "hot", interpolation='nearest', aspect='auto')
plt.colorbar()
plt.show()
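
One thing to watch for: by default gdal.Open returns None rather than raising an error when it cannot open a file, so a wrong path only shows up later as a confusing AttributeError. Below is a hedged sketch of a guard around the two read lines above. read_band is a hypothetical helper name, and passing the GDAL module in as an argument is my own choice so that the sketch does not depend on how GDAL was imported.

```python
def read_band(gdal_module, path, band_index=1):
    """Open `path` read-only and return one band as an array.

    Hypothetical helper: `gdal_module` is the imported GDAL module.
    Raises IOError if GDAL hands back None instead of a dataset.
    """
    dataset = gdal_module.Open(path, gdal_module.GA_ReadOnly)
    if dataset is None:
        # Default GDAL behaviour on failure is to return None, not raise
        raise IOError("GDAL could not open %s" % path)
    return dataset.GetRasterBand(band_index).ReadAsArray()
```

With GDAL imported as above, read_band(gdal, "L2A_T30UXB_20170102T111442_TCI_60m.jp2") would stand in for the gdal.Open and ReadAsArray lines. If gdal.UseExceptions() has been called, GDAL raises its own error first and the None check is never reached.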

Hopefully this very quick guide will save you some time setting up your Colaboratory notebook! You can enable the GPU by opening the Runtime menu and selecting ‘Change runtime type’. I think these runtime environments stay up for 12 hours of computing time, which should be enough to do some pretty cool things.
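
After changing the runtime type, it is worth confirming that the GPU is actually visible. This is a minimal sketch, assuming TensorFlow is installed in the runtime; tf.test.gpu_device_name() returns an empty string when no GPU is found. The gpu_name variable is just an illustrative name.

```python
try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

# An empty string means TensorFlow could not see a GPU
gpu_name = tf.test.gpu_device_name() if tf is not None else ""

if gpu_name:
    print("GPU found: " + gpu_name)
else:
    print("No GPU visible to TensorFlow")
```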

Recap

  • Colaboratory access – you need a google account
  • GDAL can be hard to install, so follow the steps above for pain-free access to an incredibly powerful tool
  • Get data loaded in by cloning a Git repo
  • Use the notebook as you would any other.

This is a Python 2.7 guide and has not been tested on Python 3; I’d be keen to hear about your experiences with this. I am indebted to countless resources online over the years that I have used to set up GDAL. Today OSGeo4W makes things so easy on Windows. This is my very small contribution to hopefully help someone get up and running after I have been helped so many times in the past.

This code is on my GitHub as a Jupyter Notebook

I am trying to build up a series of useful lightweight satellite processing notebooks that could be used or scaled on a variety of projects. If you want to use the code, feel free; I will keep them here.

I am a freelancer able to help you with your projects. I offer consultancy, training and writing. I’d be delighted to hear from you.

I have grouped all my previous blogs (technical stuff / tutorials / opinions / ideas) at http://gis.acgeospatial.co.uk.

Feel free to connect or follow me; I am always keen to talk about Earth Observation.

I am @map_andrew on Twitter