Technically, Geographic Information Systems (GIS) are
computer systems, infrastructure, software: an
entire industry dedicated to "spatial data"... blah, blah, blah,
blah
Really, GIS is mostly about "data used to make maps"
Spatial data is just regular data, with location info attached.
Table of fire stations in Austin - via data.austintexas.gov
Table of fire stations in Austin with lat/lon coordinates - via data.austintexas.gov
Map of fire stations in Austin - via data.austintexas.gov
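A minimal sketch of "regular data, with location info attached", in plain Python. The station names, addresses, and coordinates here are made up for illustration; they are not taken from data.austintexas.gov.

```python
# Hypothetical rows of "regular" data; values are illustrative only.
stations = [
    {"name": "Station A", "address": "100 Example St"},
    {"name": "Station B", "address": "200 Sample Ave"},
]

# Hypothetical (lat, lon) pairs keyed by station name.
locations = {
    "Station A": (30.2672, -97.7431),
    "Station B": (30.3072, -97.7560),
}

def attach_location(rows, locations):
    """Return copies of rows with 'lat' and 'lon' keys added."""
    spatial_rows = []
    for row in rows:
        lat, lon = locations[row["name"]]
        spatial_rows.append({**row, "lat": lat, "lon": lon})
    return spatial_rows

spatial_stations = attach_location(stations, locations)
```

The original rows are left untouched; the "spatial" version is the same table with two extra columns.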
Injecting spatial data into your everyday data munging is an easy
way to punch up the impact of your products:
don't just show what, show where.
(Plus, maps just look cool.)
Spatial data comes in two flavors: vector and raster
Most likely, most of the data you'll work with (or create) will be vector data...
...But let's talk about rasters for a minute anyway.
Rasters store data in pixel grids:
each pixel (or cell) stores
a single value.*
Aerial imagery
Elevation models
Remote sensing
Raster GIS data is the same as "regular" digital images, just with location data attached.
Rasters are really good at representing information that uniformly covers a broad area (e.g., elevation).
Not so good for "discrete" entities (e.g., fire station locations, congressional districts).
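A rough sketch of the "pixel grid plus location data" idea, using only plain Python lists. The elevation values, the origin, and the cell size are all assumptions chosen for illustration; real rasters carry this georeferencing in their file metadata.

```python
# A tiny elevation "raster": rows of cells, each holding a single value.
elevation = [
    [120, 125, 130],
    [118, 122, 128],
    [115, 119, 124],
]

# Assumed georeferencing: map coordinates of the grid's top-left corner
# and the size of one square cell, in hypothetical map units.
origin_x, origin_y = 1000.0, 2000.0
cell_size = 10.0

def cell_center(row, col):
    """Return the (x, y) map coordinates of the center of a cell."""
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size  # rows count downward
    return (x, y)

def value_at(x, y):
    """Return the value of the cell covering map coordinates (x, y)."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if not (0 <= row < len(elevation) and 0 <= col < len(elevation[0])):
        raise ValueError("point falls outside the raster")
    return elevation[row][col]
```

Every location inside the grid gets a value, which is why this layout suits continuous surfaces better than discrete entities.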
Vector data: points, lines, and polygons.
Not pixels.
Just like geometric points are defined by an (x, y) pair,
geospatial points can be defined by (lat, lon) coordinates.
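One way to picture points, lines, and polygons is as plain coordinate sequences. The sketch below assumes flat (planar) x/y coordinates with made-up values; lengths and areas computed directly on lat/lon degrees would not be meaningful distances.

```python
import math

# Illustrative geometries in hypothetical planar units.
point = (2.0, 3.0)
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # open ring

def line_length(coords):
    """Sum the straight-line distance between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    # Pair each vertex with the next one, wrapping back to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

No pixels involved: the geometry is fully described by its vertices.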
You've found some GIS data. Now what?
QGIS. Free & open-source - also offers a whole lot more than visual previews. Edit, query, analyze. Extend existing functionality via Python.
Mostly, GIS data is stored like any other data, just with some added "spatial" components.
Options: Use a database, serialize your data, or download a shapefile and immediately convert it into something else.
GIS data is traditionally stored in a spatial database.
That's just a regular database, plus extra spatial-enabled features:
Optional:
Spatialite. Spatially-enabled SQLite. Great for sharing data or development, probably not for production.
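For a feel of the "regular database" half of that statement, here is a sketch using Python's built-in sqlite3 with plain lat/lon columns and a bounding-box query. This is deliberately not Spatialite: the table name, column names, and the second row are hypothetical, and a spatial database would replace the hand-written range filter with geometry columns, spatial functions, and spatial indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?)",
    [
        ("Texas Capitol Building", 30.274656, -97.740323),
        ("Hypothetical Place North", 30.400000, -97.700000),  # made up
    ],
)

def places_in_bbox(conn, min_lon, min_lat, max_lon, max_lat):
    """Return names of places whose coordinates fall inside the box."""
    rows = conn.execute(
        "SELECT name FROM places "
        "WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? "
        "ORDER BY name",
        (min_lon, max_lon, min_lat, max_lat),
    )
    return [name for (name,) in rows]

downtown = places_in_bbox(conn, -97.76, 30.25, -97.72, 30.29)
```

This works for points in a box, but questions like "which district contains this point?" are where the spatial-enabled features earn their keep.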
(Vector) GIS data can also be stored in a serialized format:
just add location data to existing formats.
{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [-97.740323, 30.274656]
  },
  "properties": {
    "name": "Texas Capitol Building",
    "address": "1100 Congress Ave, Austin, TX 78701"
  }
}
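A feature like the one above can be produced from an ordinary row with nothing but the standard json module. This is a sketch: the helper name row_to_feature and the 'lat'/'lon' key names are assumptions for illustration.

```python
import json

def row_to_feature(row):
    """Convert a dict with 'lat' and 'lon' keys into a GeoJSON Feature."""
    properties = {k: v for k, v in row.items() if k not in ("lat", "lon")}
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON positions are ordered [longitude, latitude].
            "coordinates": [row["lon"], row["lat"]],
        },
        "properties": properties,
    }

rows = [
    {
        "name": "Texas Capitol Building",
        "address": "1100 Congress Ave, Austin, TX 78701",
        "lat": 30.274656,
        "lon": -97.740323,
    },
]

collection = {
    "type": "FeatureCollection",
    "features": [row_to_feature(row) for row in rows],
}
geojson_text = json.dumps(collection, indent=2)
```

Note the coordinate order: longitude first, then latitude, the reverse of how "lat/lon" is usually spoken.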
If you can tolerate XML:
KML. Google's XML flavor. Import into Google Earth.
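For comparison, the same point written as a KML Placemark, sketched with the standard library's ElementTree. The helper name point_to_kml is hypothetical, and this covers only the bare minimum of KML (a name and a point).

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def point_to_kml(name, lon, lat):
    """Return a minimal KML document holding one named point."""
    # Serialize KML elements without a namespace prefix.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinates are written as "longitude,latitude".
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")

kml_text = point_to_kml("Texas Capitol Building", -97.740323, 30.274656)
```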
The most common format of GIS data right now:
most ready-made GIS vector
data from public data sources will be in shapefile format.
Reading & writing GIS data
pyshp. Pure Python, reads (and writes) shapefile data. That's all.
import shapefile
shp = shapefile.Reader(my_shapefile_path)
fields = shp.fields[1:] # shp.fields[0] == deletion flag
field_names = [field[0] for field in fields]
records = shp.shapeRecords()
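A common next step is pairing field names with each record's values. The sketch below uses stand-in data shaped like what pyshp returns (a deletion-flag entry first, then one entry per attribute field); the helper name records_to_dicts and all values are made up for illustration.

```python
def records_to_dicts(fields, records):
    """Pair each record's values with the shapefile's field names.

    `fields` is shaped like pyshp's reader.fields: the first entry is
    the deletion flag, so it is skipped.
    """
    field_names = [field[0] for field in fields[1:]]
    return [dict(zip(field_names, record)) for record in records]

# Stand-in data mimicking pyshp output; not read from a real shapefile.
fields = [
    ("DeletionFlag", "C", 1, 0),
    ["NAME", "C", 50, 0],
    ["ACRES", "N", 10, 2],
]
records = [["Example Park", 12.5], ["Sample Park", 3.25]]

parks = records_to_dicts(fields, records)
```

With a real reader, shp.fields and shp.records() would take the place of the stand-in lists.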
GDAL/OGR: Massive library of raster & vector GIS tools.
Pros
Cons
Caveat: Many Python libs still require GDAL/OGR installations and headers. Worth it, but can be a bumpy ride the first time around.
Fiona. Elegant Python API for OGR (vectors) - plus fio, a command-line tool
>>> import fiona
>>> with fiona.open('path/to/my_awesome_shapefile.shp', 'r') as collection:
...     print(len(collection))
...     # collections contain one object for each feature in input GIS data
...     print(collection[1])
...     # collections are iterable; each feature is a GeoJSON-like record
$ fio info my_awesome_shapefile.shp
# json-ified data summary (e.g., feature count, bounding box coords, etc)
Supports every format OGR supports (not just shapefiles), makes reading/munging/writing GIS data a breeze.
Rasterio does the same for rasters. And yes, there's a command line tool: rio.
Combine Fiona + Shapely to chain reading, converting, and analysing GIS data:
>>> import fiona
>>> import shapely.geometry
>>> with fiona.open('city_parks.shp', 'r') as collection:
...     parks = [shapely.geometry.shape(c['geometry']) for c in collection]
...
>>> park = parks[0]  # grab the first park just for demo purposes
>>> park.geom_type
'Polygon'
>>> park.area
20868.47980
>>> park.buffer(10.0).area
26877.85083
>>> (park.centroid.x, park.centroid.y)
(3100374.119480808, 10106879.690095564)
Geocoding services
Make maps.