I have written a lot in the past about using Python for GIS and Earth Observation. If you have not used Python before and you are looking to get started here are a few recommendations to get you up and running. This blog post is for anyone who is new to programming with Python. There will be a second post that will give an introduction to Jupyter Notebooks.
This post is a series of steps showing how I would set up Python 3 today, with Windows 10 as my operating system.
1. Python environments can be confusing.
Image from https://m.xkcd.com/1987/
I have found that, if you are starting completely from scratch, Anaconda is the best place to start. Use the link below to download it.
2. Install Python 3. Why? The Python 2.7 clock is ticking https://pythonclock.org/ – don’t get caught out in 2020, as 2.7 will no longer be maintained. You can have both Python 3 and Python 2.7 running on your machine if you want to make things more complex (see the xkcd cartoon above) – but unless you have to, don’t. For ease of access, add a link to the Anaconda command prompt on your desktop.
3. Assuming that you have installed Anaconda as above, use this cheat sheet to familiarise yourself with the common commands. They are mostly simple and intuitive, but a reference is always useful.
4. Why Anaconda? Well, I think that it has all the packages that you need for geospatial programming. Below is a list of what the packages do, but first here is a list of commands to get you up and running:
conda install -c conda-forge gdal
conda install -c conda-forge opencv
conda install -c anaconda scikit-learn
conda install -c anaconda scikit-image
conda install -c conda-forge rasterstats
conda install -c anaconda scipy
conda install -c anaconda rasterio
conda install -c conda-forge geopandas
conda install -c anaconda psycopg2
conda install -c conda-forge shapely
conda install -c anaconda netcdf4
This is loosely what they do:
- gdal – open source ‘geo’ processing
- opencv – computer vision
- scikit-learn – machine learning
- scikit-image – image processing
- rasterstats – statistics on rasters
- rasterio – reader / writer of geospatial rasters
- scipy – scientific python
- geopandas – geospatial data in python
- psycopg2 – connecting to Postgres / PostGIS
- shapely – manipulation and analysis of planar geometric objects
- netcdf4 – read and write .nc files
When you install these Python libraries, other libraries may be installed at the same time. These are known as dependencies of the original library. For example, geopandas depends on shapely and pandas, so installing geopandas installs both of them as well. It has also traditionally pulled in fiona (read / write geospatial files).
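If you are curious which dependencies a library declares, a rough sketch like the one below can list them from inside Python. It assumes Python 3.8 or later (for importlib.metadata), and the helper names requirement_names and declared_dependencies are my own, not part of any library. It reads the Python package metadata rather than conda's own records, so the result may differ from what conda reports.

```python
import re
from importlib import metadata


def requirement_names(requirements):
    """Extract sorted, lower-cased package names from requirement strings."""
    names = set()
    for requirement in requirements:
        # Skip optional extras such as "pytest; extra == 'test'".
        if re.search(r"extra\s*==", requirement):
            continue
        # The package name is the leading run of name characters,
        # e.g. "numpy" in "numpy>=1.14".
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
        if match:
            names.add(match.group(0).lower())
    return sorted(names)


def declared_dependencies(distribution_name):
    """Return the declared dependencies of an installed distribution.

    Returns None if the distribution is not installed in this environment.
    """
    try:
        requirements = metadata.requires(distribution_name) or []
    except metadata.PackageNotFoundError:
        return None
    return requirement_names(requirements)


if __name__ == "__main__":
    # "geopandas" is just an example; use any package you have installed.
    print(declared_dependencies("geopandas"))
```

If the package is not installed the helper returns None, so it is safe to run before you have installed everything in step 4.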
5. Sometimes the package you are looking for is not available in Anaconda. That can be frustrating for beginners; we have all been there. However, Anaconda allows packages to be installed via pip install. If you have searched for the package in Anaconda using the following command and the library has not been found…
conda search PACKAGENAME
… you can try the following command:
pip install PACKAGENAME
For more information about managing packages in anaconda have a look here.
6. You can check to see if a package is installed by opening up the Anaconda command prompt and typing:
python
which will start the interpreter and tell you the version of Python you are running, and then:
import PACKAGENAME
If it runs without an error you have installed that library.
If it does raise an error, restart the Anaconda command prompt and run the associated conda install command from step 4.
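The same check can be scripted, which is handy when you want to test several libraries in one go. This is only a minimal sketch: can_import is a hypothetical helper name of my own, and the three library names are examples taken from step 4.

```python
import importlib
import sys


def can_import(module_name):
    """Return True if importing module_name succeeds, False on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Print the Python version first, as the interpreter banner would.
    print("Python", sys.version.split()[0])
    for name in ["rasterio", "geopandas", "shapely"]:
        status = "installed" if can_import(name) else "missing"
        print(f"{name}: {status}")
```

Anything reported as missing can then be installed with the matching conda install command.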
-- update December 2018 --
Not all packages follow the import PACKAGENAME pattern. Where this is not true I have added the correct import statements below (please don’t type the text in brackets):
import sklearn (for scikit-learn)
import skimage (for scikit-image)
import cv2 (for opencv)
-- end of update --
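One way to keep track of these differences is a small lookup table. The sketch below covers only the three names listed in the update; IMPORT_NAMES and import_name_for are hypothetical names of my own. Any other package falls back to its install name, which is an assumption that will not hold for every library.

```python
# Install names that differ from their import names (from the update above).
IMPORT_NAMES = {
    "scikit-learn": "sklearn",
    "scikit-image": "skimage",
    "opencv": "cv2",
}


def import_name_for(install_name):
    """Return the module name to use in an import statement.

    Falls back to the install name when no special case is recorded.
    """
    return IMPORT_NAMES.get(install_name, install_name)


for package in ["scikit-learn", "opencv", "rasterio"]:
    print(f"conda package {package!r} -> import {import_name_for(package)}")
```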
7. Finally, while it is not an Anaconda step, I strongly recommend installing these two pieces of software:
And while you are at it you may as well get set up with Postgres / PostGIS (after all, you installed psycopg2, right?):
And a simple guide:
We have taken a very quick walk through setting up Python on a Windows computer and installing the libraries required to become a Python geospatial professional. Anaconda is definitely not the only way to go, though experience tells me it is the easiest for a beginner. The reason, for me at least, is that installing libraries like rasterio and shapely never seemed to quite work on a ‘standard’ Python install. I did get them working in a virtual environment, which is an excellent way of doing testing. The Anaconda way is the quickest way I can see to get set up, and I recommend that attendees of my courses install it.
It is worth noting that packages are being updated all the time (as is Python). You may notice that when you are installing a library it wants to downgrade a package. This is quite normal behaviour and for the most part should not cause you any concern as you are ultimately building an environment optimised for you. In a Python interpreter you can always check the version of a library you are using after import by running the command:
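Many libraries, though not all, expose their version as a __version__ attribute on the imported module. The sketch below assumes that convention; library_version is a hypothetical helper name of my own, and it returns None for a library that does not follow the convention.

```python
import importlib


def library_version(module_name):
    """Import a module and return its __version__ string, or None if absent."""
    module = importlib.import_module(module_name)
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    # "rasterio" is just an example; use the import name of any library.
    try:
        print("rasterio", library_version("rasterio"))
    except ImportError:
        print("rasterio is not installed in this environment")
```

Note that the helper takes the import name, not the install name, so scikit-learn would be checked as sklearn.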
If I have missed anything in the above please give me a shout and I will update. I am happy to log any differences between operating systems as well.
This series is really meant to be for complete beginners who have no familiarity with these environments. If you have any comments / questions or suggestions leave them below or get in contact with me on my website.
In the second part I will post a beginner’s guide to using Jupyter Notebooks.
I am a freelancer able to help you with your projects. I offer consultancy, training and writing. I’d be delighted to hear from you. Please check out the books I have written on QGIS 3.4
I have grouped all my previous blogs (technical stuff / tutorials / opinions / ideas) at http://gis.acgeospatial.co.uk.
Feel free to connect or follow me; I am always keen to talk about Earth Observation.