Quite often I send or get sent shapefiles (mostly zipped up). These files can get quite big, so I wondered if there was any way of reducing the file size!

In this notebook I wanted to run some checks / experiments on whether I could reduce the size of a shapefile. I tried various options: simplifying the geometry, converting a field to a categorical type, and running a memory-reducing function from a Kaggle competition. But the only thing that really made a difference to the final size of the written shapefile was… only writing the columns of data I needed.
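To give a feel for that last point, here is a minimal sketch rather than the notebook's actual code. It assumes geopandas and shapely are installed, and the data and column names (`site_id`, `notes`, `owner`) are made up for illustration. It writes the same points twice, once with every column and once with only the columns needed, then compares the total size of the files on disk.

```python
import os
import tempfile

import geopandas as gpd
from shapely.geometry import Point

# Hypothetical data: 1,000 points with two wide text columns we don't need.
n = 1000
gdf = gpd.GeoDataFrame(
    {
        "site_id": range(n),
        "notes": ["some long free-text note " * 4] * n,
        "owner": ["an owner name that is fairly long"] * n,
    },
    geometry=[Point(i, i) for i in range(n)],
    crs="EPSG:4326",
)


def shapefile_size(frame, folder, name):
    """Write frame as a shapefile and return the total bytes of all its files."""
    frame.to_file(os.path.join(folder, name + ".shp"))
    # A shapefile is several files (.shp, .shx, .dbf, ...), so add them all up.
    return sum(
        os.path.getsize(os.path.join(folder, f))
        for f in os.listdir(folder)
        if f.startswith(name + ".")
    )


with tempfile.TemporaryDirectory() as tmp:
    full_size = shapefile_size(gdf, tmp, "full")
    # Keep only the columns we actually need (geometry must stay).
    slim_size = shapefile_size(gdf[["site_id", "geometry"]], tmp, "slim")

print(f"all columns:         {full_size:,} bytes")
print(f"needed columns only: {slim_size:,} bytes")
```

The attribute table of a shapefile lives in the `.dbf` file, which stores every record at a fixed width, so each text column you drop saves space on every single row. How big the saving is will depend on your own data.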

Even so, this work may still be of some use for memory optimisation. Especially if you have an incredibly large dataset, there are some amazing memory savings to be had!
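As a rough illustration of the kind of in-memory saving I mean, here is a small sketch assuming pandas is installed. It is not the Kaggle function itself, just the same general idea: convert a repetitive text column to a categorical type and downcast an integer column. The column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical attribute table: a repetitive text column and an integer column.
n = 100_000
df = pd.DataFrame(
    {
        "land_use": ["residential", "commercial", "industrial", "park"] * (n // 4),
        "count": range(n),
    }
)

# deep=True also counts the memory used by the strings themselves.
before = df.memory_usage(deep=True).sum()

slim = df.copy()
# Only four distinct values, so store them once and keep small integer codes.
slim["land_use"] = slim["land_use"].astype("category")
# Shrink the integer column to the smallest unsigned type that fits the values.
slim["count"] = pd.to_numeric(slim["count"], downcast="unsigned")

after = slim.memory_usage(deep=True).sum()

print(f"before: {before:,} bytes")
print(f"after:  {after:,} bytes")
```

These savings apply to the data while it sits in memory. As noted above, they did not carry over to the size of the shapefile once it was written to disk.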

I’ve added comments where needed to the code here. I hope this has been of use.

