GeoPandas and PostGIS

  • blog

In late June GeoPandas 0.8 was released.

This release has made massive steps in Input/Output (IO). A compelling thing about GeoPandas is its simple way to read a GIS vector dataset (Shapefile, GeoJson etc). It’s one line of code, or two if you include the import statement.

import geopandas as gpd
gpd.read_file('your_shapefile.shp')

Now, with the new release, read and write to Post GIS is possible; I think that is huge. Here is a simple guide to getting setup and running.

Setup

Set yourself up with PostgreSQL and PostGIS. I am running Windows 10 OS, I have installed PostgreSQL 12 (download) and PostGIS 3. The best guide to stepping through the step up is on the bostongis page here. You can get up and running on Windows via the wizard. There are loads of excellent blogs on PostGIS – I highly recommend this one for a fantastic introduction.

Update your version of GeoPandas. On conda use:

conda install -c conda-forge geopandas

or via pip:

pip install geopandas

Or, if you already have GeoPandas installed, try:

pip install --upgrade geopandas

And, because we are going to connect to PostGIS, we will need to have either psycopg2 or GeoAlchemy2 installed (see dependancies in the GeoPandas docs). For this blog I am using GeoAlchemy2 and you can install this via pip:

pip install geoalchemy2

Once you are done, open your Python prompt or Jupyter Notebook and type:

import geopandas as gpd
print(gpd.__version__)

You should get 0.8.0 back:

GeoPandas and PostGIS

Now we should be good to go. Assuming all is installed correctly then this block of code will send a shapefile called boundaries to a database.

from sqlalchemy import create_engine
import geopandas as gpd
engine = create_engine('postgresql://user:password@localhost/database')
gdf = gpd.read_file('Boundaries.shp')
gdf.to_postgis(name="bounds", con=engine)

The first two lines are import statements. The third line is your connection to your database. I have highlighted in bold the bits you are going to need to change to make this work for you. I have a database called spatial and, in my case in pgAdmin4 (installed as part of PostgreSQL), my bounds looks like this:

if you click the ‘eye’ icon next to geometry you should be able to visualise this data – really neat trick.

We now have our data in PostGIS. Let’s extract this out using GeoPandas. As you probably have guessed, its only a few lines of code:

sql = 'SELECT geometry as geom, id FROM bounds'
gdf_bounds = gpd.read_postgis(sql, con=engine)
gdf_bounds.plot()

The first line is a simple SQL statement, the second reads it into GeoPandas using the connection engine we previously created and the third plots it up.

That’s it. GeoPandas again makes IO operations with the minimum of code. The code for this tutorial is on my Github page here

I am a freelancer able to help you with your projects. I offer consultancy, training and writing. I’d be delighted to hear from you.

Feel free to connect or follow me; I am always keen to talk about Earth Observation.

I am @map_andrew on twitter