Usually this blog is about things made here. This post isn’t.
Shapely and Fiona are essential Python tools for geospatial programming written by Sean Gillies. Use them instead of ESRI’s Python toolchain. They are free, stable, and mean you can post your code on GitHub and nonrich people will be able to run it.
Recently Brian Timoney and James Fee have been writing about how geospatial work is more and more programming and less GUI-driven operations in the ArcGIS mold. They’ve been pointing a lot to Python for this.
To make this a bit more concrete, here are some quick recipes of how to do things with Shapely that I have recently done and may be useful.
The shapefile are a historical format that has outlived its term. I do not endorse them philosophically, but as they are the lingua franca of most open source and closed source tools, and are easily transmuted into anything with ogr2ogr, I will write about them as if they are okay.
A sidenote: this toolchain will work on Windows, but I’d strongly recommend against it. I will not attempt to describe how to install this stuff on Windows, since that road is filled with tears. I am sorry.
Python is an open-source programming language. It doesn’t have anything to do with GIS itself, but has become one of the key languages to use for GIS. This is mainly because it’s very commonly available and integrates well with the C++ code which forms the basis of a lot of GIS functionality (GEOS, Mapnik, and OGR are written in C++).
Plus, Python is known for being pretty easy to use - MIT switched to using it and it’s a language that doesn’t bother you too much with its own baggage or complexity.
Shapely does manipulating and analyzing data. It’d based on GEOS, the standard library for doing that kind of thing, that is very fast. With Shapely, you can do things like buffers, unions, intersections, centroids, convex hulls, and lots more. It does it all quite efficiently.
Fiona does reading and writing data formats. For this it uses OGR, the most popular open-source conversion system. OGR is extremely powerful and supports many, many formats - it’s used by Mapnik, a tile rendering engine, to support more types of data, and used by people like me every day to convert formats.
The go-between is a really simple format called Well-Known Text and a slightly more efficient format called Well-Known Binary. This lets Shapely worry about tricky spatial calculations and Fiona worry about tons of file formats, and work together efficiently.
Why not use GEOS or OGR directly? GEOS, for one thing, doesn’t provide bindings to Python, so you’d need to use C++. You don’t want to write C++ for data-munging scripts. OGR does provide Python bindings, but they’re extremely ‘un-Pythonic’ - they don’t behave like other things in the language, and they’re very error-prone.
So, in a typical script, you would use Fiona for input and output, and use Shapely for making and manipulating geodata. Got it? Let’s go.
Okay, so you have a CSV file called
some.csv with latitude and longitude values, and you want to turn it into a Shapefile.
name,lat,lon Chicago,41.88-87.63 Kansas City,39.101,-94.584
First: grab the documentation to Python’s CSV reader. It’s a good one, and pretty simple to use. Using one of the code examples on that page, you can make
This isn’t meant to be a Python tutorial (use codecademy for that), but the elements are as such:
shapelylibrary, by saying
for x in y, where
ycan be a list of lines in a CSV file, or shapes, or numbers, or anything that’s ‘iterable’
withstatement is used when you’re opening a file for a while. This code keeps
some.csvopen for as long as you need to print out each row, and then closes it.
Save this as
python process.py yields the output:
Next up, making points. Import Shapely’s idea of a point with
from shapely.geometry import Point and then make some - it takes coordinates in longitude, latitude order, and since CSV is implicitly as text rather than numbers, we use Python’s
float() function to convert:
Okay, now to save those points. Let’s bring in Fiona, and save these points to a shapefile.
Pretty simple, right? You define the kinds of features you’re putting in,
'Point', the properties they’ll have, and then it’s as simple as opening an output file and calling
output.write with a feature object.
Next up: classic GIS operations. How about buffering points? First, let’s use our previous calculation as input (so you should have
Sidenote: these tools work in native projections. The projection we’re using here is EPSG:4326, so we’re working in latitude and longitude. This is why the buffers we create will be odd-looking on a map that uses the spherical mercator projection.
That’s readable with Fiona too:
And you can go over each features in that
input and turn it into a shape that Shapely can read:
And Shapely provides a nice buffer method which you can use on nearly any kind of geometry - just call
shape.buffer(1.0) or any other radius. So, all together:
Beautiful, right? Here’s that script as a gist.
I hope this is something of a tantalizing introduction to this toolchain. This kind of code is not only neat because it’s open-source and free, so you can share it with other people and use it on machines without expensive licenses, but also because it turns things that are manual into things that can be reusable.
If there’s some bunch of operations you do often, you can use Python’s sys.argv and instead of hardcoding filenames into a script, you can use its values:
Then you can call the script like
python script.py foo.shp and easily run it on directories of files.
The full documentation for this code is in the Shapely manual.