Identifying buildings in medium-resolution satellite data using Monteverdi software

buildings classified

This is the final part in a series on using Planet’s Open California dataset. I’ve summarised it all here. Detailed mapping of building footprints is fast becoming one of the key uses of, and challenges for, very high spatial resolution Earth Observation data. Acquiring these footprints remotely, accurately and at speed is where the leading edge of Earth Observation science currently sits. It is being done using vast amounts of training data and machine learning on high-powered computers. With that in mind, let’s take a look at what medium resolution data and an unsupervised k-means classification can do.

It doesn’t seem possible to map buildings accurately at this resolution with the data set I am using. It is possible to identify a reasonable number of buildings though, and that is a promising start for medium resolution mapping. By medium resolution I mean a pixel size of 2-20 m.

In this blog I am going to use Monteverdi to classify my Sentinel-2A image, which I pan-sharpened with Planet data. The aim is to classify buildings and then convert them to polygons.

Monteverdi in its own words:

Monteverdi is an application for capacity building to provide simple remote sensing data analysis tools for non-experienced users.

The easiest way to install Monteverdi (on Windows) is perhaps through the OSGeo4W installer. Once installed you can run the program from within the OSGeo4W folder.


Opening the satellite image in the viewer.


There are various tools for the classification of satellite data. For example you can build classes for supervised classification or (as in this case) run a KMeans clustering (unsupervised) classification.


Change the parameters to suit your needs. The results can be saved as a tif file. I like to view this data in QGIS.
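To make the KMeans step less of a black box, here is a minimal pure-Python sketch of the idea: each pixel is treated as a vector of band values and repeatedly assigned to its nearest cluster centre. This is only an illustration and not Monteverdi's implementation; the function name, the pixel values and the choice of two clusters are all made up for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two band-value vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(pixels, k, iterations=20, seed=0):
    """Cluster pixel vectors into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k randomly chosen pixels as the initial cluster centres.
    centroids = rng.sample(pixels, k)
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: give each pixel the label of its nearest centre.
        labels = [
            min(range(k), key=lambda c, p=p: squared_distance(p, centroids[c]))
            for p in pixels
        ]
        # Update step: move each centre to the mean of its member pixels.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # leave an empty cluster's centre where it is
                centroids[c] = tuple(sum(band) / len(members) for band in zip(*members))
    return centroids, labels


# Hypothetical 3-band pixel values: three dark pixels and three bright ones.
pixels = [
    (10, 12, 11), (12, 11, 13), (11, 10, 12),
    (200, 210, 205), (205, 200, 210), (210, 205, 200),
]
centroids, labels = kmeans(pixels, k=2)
print(labels)
```

The number of clusters and the number of iterations are the kind of parameters you would tune. Because the method is unsupervised, the cluster numbers carry no meaning by themselves; you still have to inspect the result to decide which class corresponds to buildings.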


The buildings are in grey and, admittedly, a little difficult to see in the image above. Let’s zoom in and just show the buildings – image below.


It’s a reasonable result, though I would expect a supervised classification to do better. To convert the classified raster to a vector, run gdal_polygonize (installed as gdal_polygonize.py on some systems).

gdal_polygonize inraster.tif outvector.shp

Watch out for loss of projection. I wrote about a solution to this previously here. The resulting vector file can get pretty big. All the polygons classified as buildings are shown in yellow.

buildings classified
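Polygonizing turns each connected region of pixels sharing the same value into one polygon, which is why the output contains so many small objects. The sketch below illustrates that grouping with a simple flood fill over a tiny classified grid. It is a conceptual illustration rather than GDAL's actual algorithm; the grid values and the 3 m pixel size are assumptions made for the example.

```python
from collections import deque


def region_sizes(grid, target):
    """Pixel counts of the 4-connected regions whose value equals target."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited target pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and grid[ny][nx] == target:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


# Hypothetical classified raster: 5 is the building class, 1 is everything else.
classified = [
    [5, 5, 1, 1],
    [5, 1, 1, 5],
    [1, 1, 5, 5],
    [5, 1, 1, 1],
]
pixel_size_m = 3.0  # assumed pixel size, not taken from the actual image

sizes = region_sizes(classified, target=5)
areas_sqm = [n * pixel_size_m ** 2 for n in sizes]
print(sizes)      # [3, 3, 1]
print(areas_sqm)  # [27.0, 27.0, 9.0]
```

A single isolated pixel becomes its own polygon, which is the kind of tiny object that later needs removing with an area filter.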

I need to filter this data twice: first to keep only the buildings, and second to remove any tiny polygons that are not big enough to be a building. The second filter needs an area attribute, so add an area field and calculate the area into it. I did this in the QGIS field calculator using $area.

Using ogr2ogr, I filtered outvector.shp on the building class (DN = 5) to create buildings.shp:

ogr2ogr buildings.shp outvector.shp -where "DN = 5"

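Next, I filtered buildings.shp on areas greater than 50 sqm, again with ogr2ogr. The expression must be quoted, otherwise the shell treats > as an output redirect:

ogr2ogr filteredbuildings.shp buildings.shp -where "area > 50"

This gave me 11,271 objects (buildings).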
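The same two-stage filter can be sketched in plain Python to show the logic: keep a polygon only if it belongs to the building class and its area exceeds the threshold. The feature dictionaries, the function names and the coordinates are hypothetical, and the shoelace formula used here assumes simple rings in a projected coordinate system with units of metres.

```python
def polygon_area(ring):
    """Planar area of a simple polygon ring using the shoelace formula."""
    total = 0.0
    # Pair every vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def filter_buildings(features, building_class=5, min_area=50.0):
    """Keep features of the building class that are larger than min_area."""
    return [
        f for f in features
        if f["DN"] == building_class and polygon_area(f["ring"]) > min_area
    ]


# Hypothetical polygons: DN is the class value, ring is a list of (x, y) in metres.
features = [
    {"DN": 5, "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},  # 100 sqm building
    {"DN": 5, "ring": [(0, 0), (3, 0), (3, 3), (0, 3)]},      # 9 sqm, too small
    {"DN": 2, "ring": [(0, 0), (20, 0), (20, 20), (0, 20)]},  # large, wrong class
]

kept = filter_buildings(features)
print(len(kept))  # 1
```

With ogr2ogr the two conditions could equally be combined into a single -where expression using AND, which avoids writing the intermediate shapefile.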

Buildings from satellite

There is still some noise in this data; the golf course on the western side and the Franklin Canyon Reservoir contain erroneous buildings. But it’s not too bad for medium resolution mapping. According to this Wikipedia article, the population of Beverly Hills in 2010 was 34,109, which works out to approximately three people per building, if my building count is correct 🙂
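The people-per-building figure is a simple division of the two numbers quoted above. A quick check, with variable names chosen just for this example:

```python
population_2010 = 34109  # Beverly Hills population in 2010, as quoted above
building_count = 11271   # polygons left after the class and area filters

people_per_building = population_2010 / building_count
print(f"{people_per_building:.2f} people per building")  # 3.03 people per building
```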

Series on Open California, including overview, code, ML, classification is here

All my previous blogs (technical stuff / tutorials / opinions / ideas) are grouped here

I am @map_andrew on twitter