Introduction to GeoJSON
Geographic data is ubiquitous in our lives. It is found in small applications like planning a vacation, and in complex worldwide topics such as population health and demographic studies. While the terms and systems involved may seem daunting at first, understanding geographic data can open up a lot of possibilities for visualizing and understanding the world.
Nearly every web application that works with geographic data represents it as GeoJSON. This is a short overview of GeoJSON and its structure to help make more sense of it.
In later articles in this series, I'll cover more interesting things you can do with geographic data and how to actually work with it, but this will provide a good foundation to build upon.
As the name implies, GeoJSON is just a format for representing geographic data as JSON. It defines six types of geometries, and a few different containers that group them together or provide extra information.
All images below are from geojson.io, a great tool for playing with and visualizing GeoJSON.
Points and Positions
The simplest geometry is the Point. Points are commonly used to indicate places of interest and other markers on maps where you just want to show the location, and don't need an area or boundary.
A Point contains just a single position. In GeoJSON, a "position" is an array with two numbers: the longitude and the latitude, in that order.
You may be accustomed to the more common "latitude first, longitude second" ordering. This is a common source of confusion when getting started, so it helps to think in terms of [x, y] coordinates instead.
A position array may have a third value to represent elevation, but this is usually omitted unless the application requires it.
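If your data arrives in latitude-first order, a tiny helper can keep the swap in one place. This is only a sketch: the name to_position and the sample numbers are my own, not part of GeoJSON or any library.

```python
def to_position(latitude, longitude, elevation=None):
    """Convert a latitude/longitude pair into a GeoJSON position."""
    # GeoJSON order is x (longitude) first, then y (latitude).
    position = [longitude, latitude]
    if elevation is not None:
        # The optional third value is the elevation.
        position.append(elevation)
    return position


print(to_position(-34.93, 138.6))
print(to_position(-34.93, 138.6, elevation=50))
```

Funneling every conversion through one function like this makes it harder to flip the order by accident.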
Every GeoJSON geometry has a type field indicating what type of geometry it is, and a coordinates field that defines the shape itself. A point representing a random location in southern Australia looks like this:
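As a minimal sketch, here is a Point built as a Python dictionary and serialized to JSON. The coordinates are an assumption chosen for illustration (roughly Adelaide), not values from the original example.

```python
import json

# Illustrative coordinates (an assumption): roughly Adelaide, South Australia.
longitude = 138.6
latitude = -34.93

point = {
    "type": "Point",
    # A single position: [longitude, latitude].
    "coordinates": [longitude, latitude],
}

print(json.dumps(point))
```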
You may also place multiple points into a single geometry using the MultiPoint type. In this case, the coordinates are an array of positions.
MultiPoint should only be used when the points are plausibly part of the same object. There are better ways to represent multiple, independent points, which I'll cover later.
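A MultiPoint might be sketched like this. The three positions are made up for illustration; imagine them as, say, the entrances of a single building.

```python
import json

multi_point = {
    "type": "MultiPoint",
    # An array of positions, each one [longitude, latitude].
    "coordinates": [
        [138.60, -34.93],
        [138.62, -34.91],
        [138.65, -34.95],
    ],
}

print(json.dumps(multi_point, indent=2))
```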
LineString is useful when drawing lines on the map, such as when suggesting a route between two places. It contains a set of points that join together, connect-the-dots style, to form a line. As with MultiPoint, it contains an array of positions, but a valid LineString must contain at least two positions.
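Here is one way to sketch a LineString along with the two-position rule. The helper name make_line_string and the route coordinates are hypothetical, chosen only for this example.

```python
import json


def make_line_string(positions):
    """Build a LineString geometry, rejecting fewer than two positions."""
    if len(positions) < 2:
        raise ValueError("a LineString needs at least two positions")
    return {"type": "LineString", "coordinates": [list(p) for p in positions]}


# A made-up route with three positions, each [longitude, latitude].
route = make_line_string([
    [138.60, -34.93],
    [138.70, -34.90],
    [138.85, -34.88],
])

print(json.dumps(route))
```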
GeoJSON also provides the MultiLineString type, which represents a set of related lines. Its coordinates are an array of LineString coordinates, with each array member drawing a single line.
Although the lines are considered part of the same geometry, they are not connected together.
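A MultiLineString could be sketched as follows, with two made-up lines that belong to the same geometry but don't touch each other.

```python
import json

# Each inner list is the coordinates array of one LineString.
first_line = [[138.60, -34.93], [138.70, -34.90]]
second_line = [[138.80, -34.95], [138.90, -34.97], [139.00, -34.99]]

multi_line_string = {
    "type": "MultiLineString",
    "coordinates": [first_line, second_line],
}

print(json.dumps(multi_line_string, indent=2))
```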
Making Geometries More Useful
A geometry on its own is just a shape, with no way to attach any other information to it. GeoJSON solves this with the Feature type, which wraps a geometry and allows you to add any metadata you want.
The id property is optional but can be useful for linking a Feature back to a particular object in a database, or something similar.
There is no defined schema for properties; you can put any JSON object in there.
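Putting those pieces together, a Feature might look like the sketch below. The id value and everything inside properties are invented for illustration; your application decides what goes there.

```python
import json

feature = {
    "type": "Feature",
    # Optional identifier, e.g. a database key (hypothetical value).
    "id": "store-42",
    # The wrapped geometry.
    "geometry": {"type": "Point", "coordinates": [138.6, -34.93]},
    # Free-form metadata: any JSON object.
    "properties": {"name": "Example Store", "open": True},
}

print(json.dumps(feature, indent=2))
```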
Groups of Objects
FeatureCollection allows you to group multiple Feature objects together in a single object.
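As a rough sketch, a FeatureCollection holding two independent points could be built like this. The names and coordinates are placeholders.

```python
import json

feature_collection = {
    "type": "FeatureCollection",
    # The member Feature objects live in the "features" array.
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [138.60, -34.93]},
            "properties": {"name": "First place"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [144.96, -37.81]},
            "properties": {"name": "Second place"},
        },
    ],
}

print(json.dumps(feature_collection, indent=2))
```

Unlike a MultiPoint, each point here carries its own properties, which is why this form suits unrelated locations.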
The GeometryCollection type is similar, but for raw geometries instead of Feature objects. You probably won't encounter GeometryCollection often. Usually, one of the "Multi" objects or a FeatureCollection is a better fit, but it's good to know about.
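For completeness, here is a sketch of a GeometryCollection mixing a Point and a LineString. The coordinates are made up.

```python
import json

geometry_collection = {
    "type": "GeometryCollection",
    # Raw geometries go in the "geometries" array; there are no properties.
    "geometries": [
        {"type": "Point", "coordinates": [138.60, -34.93]},
        {
            "type": "LineString",
            "coordinates": [[138.60, -34.93], [138.70, -34.90]],
        },
    ],
}

print(json.dumps(geometry_collection, indent=2))
```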
A Polygon is commonly used to draw boundaries around geographic regions or other areas. It is one of the most used geometries in GeoJSON, but it does add some complexity.
A Polygon must be explicitly closed. That is, the first and last points in the Polygon must be identical for it to be valid.
Furthermore, the points in a Polygon's outer boundary must be in counterclockwise order. This rule makes it easier for renderers to figure out which side of the polygon is the outside vs. the inside, without needing to do additional calculations.
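Both rules can be checked with a few lines of code. The sketch below uses the shoelace formula and treats longitude and latitude as flat x and y values. That is a simplifying assumption that works for small shapes but not for ones crossing the antimeridian or a pole. The function names are my own.

```python
def is_closed(ring):
    """A ring is closed when it has at least four positions and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def signed_area(ring):
    """Shoelace formula: positive for counterclockwise rings, negative for clockwise."""
    total = 0.0
    for start, end in zip(ring, ring[1:]):
        total += start[0] * end[1] - end[0] * start[1]
    return total / 2.0


def is_counterclockwise(ring):
    return signed_area(ring) > 0


square = [[0, 0], [4, 0], [4, 4], [0, 4], [0, 0]]
print(is_closed(square), is_counterclockwise(square))
```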
There is no limit on the number of segments in a polygon. It can be as simple as a triangle or complex enough to follow the contours of a shoreline.
LineString above would be represented like this:
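As a stand-in illustration, here is a sketch of a simple four-sided Polygon. The coordinates are my own, not the ones from the original example. Notice that the first and last positions are identical, and that the ring sits inside one extra level of array nesting.

```python
import json

# One closed ring, listed counterclockwise: east, north, west, then back to the start.
ring = [
    [138.0, -35.0],
    [139.0, -35.0],
    [139.0, -34.0],
    [138.0, -34.0],
    [138.0, -35.0],
]

polygon = {
    "type": "Polygon",
    # An array of rings, even when there is only one.
    "coordinates": [ring],
}

print(json.dumps(polygon))
```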
You'll note that although there's just a single shape here, the array nesting looks more like a MultiLineString. This is because Polygon allows you to specify holes inside the shape as well.
For example, picture a Polygon that is a square with two smaller square holes inside it. The first entry in its coordinates represents the outer ring that gives the overall shape, and the other entries define the holes inside the polygon.
While the outer ring of the Polygon has its points in counterclockwise order, the inner holes' points must be in clockwise order, which again helps renderers to determine which side of the edge is inside the polygon.
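A square with two square holes might be sketched like this, using simple made-up coordinates so the ring directions are easy to follow. The signed_area helper is a flat-plane shoelace calculation included only to show each ring's direction.

```python
def signed_area(ring):
    """Shoelace formula: positive means counterclockwise, negative means clockwise."""
    total = 0.0
    for start, end in zip(ring, ring[1:]):
        total += start[0] * end[1] - end[0] * start[1]
    return total / 2.0


outer = [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]  # counterclockwise
hole_a = [[1, 1], [1, 4], [4, 4], [4, 1], [1, 1]]     # clockwise
hole_b = [[6, 6], [6, 9], [9, 9], [9, 6], [6, 6]]     # clockwise

polygon_with_holes = {
    "type": "Polygon",
    # Outer ring first, then any number of holes.
    "coordinates": [outer, hole_a, hole_b],
}

for ring in polygon_with_holes["coordinates"]:
    direction = "counterclockwise" if signed_area(ring) > 0 else "clockwise"
    print(direction)
```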
Ensuring that the points are in the correct order may seem difficult, but geometry libraries generally handle this for you. They also provide functions such as turf's rewind or PostGIS's ST_ForcePolygonCCW, which will take a polygon and "fix" the point order so that the outer line is counterclockwise and the inner ones are clockwise.
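To give a feel for what such a "fix" involves, here is a rough sketch in plain Python. It is not how turf or PostGIS implement it. It assumes flat x and y coordinates, and rewind_polygon is a name I made up.

```python
def signed_area(ring):
    """Shoelace formula: positive means counterclockwise, negative means clockwise."""
    total = 0.0
    for start, end in zip(ring, ring[1:]):
        total += start[0] * end[1] - end[0] * start[1]
    return total / 2.0


def rewind_polygon(coordinates):
    """Return Polygon coordinates with a counterclockwise outer ring and clockwise holes."""
    fixed = []
    for index, ring in enumerate(coordinates):
        is_ccw = signed_area(ring) > 0
        want_ccw = index == 0  # only the first ring is the outer boundary
        # Reversing a closed ring flips its direction and keeps it closed.
        fixed.append(list(ring) if is_ccw == want_ccw else list(reversed(ring)))
    return fixed


# Both rings below are wound the wrong way on purpose.
backwards = [
    [[0, 0], [0, 10], [10, 10], [10, 0], [0, 0]],  # clockwise outer ring
    [[1, 1], [4, 1], [4, 4], [1, 4], [1, 1]],      # counterclockwise hole
]
print(rewind_polygon(backwards))
```

In a real project I'd still reach for the library functions, since they handle details this sketch ignores.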
By now you can probably guess what a MultiPolygon is. These are very useful in representing geographic regions such as the state of Michigan or anything with islands, which can't be represented as a single contiguous shape.
It starts to get hard to read with all the nested arrays, but the coordinates for a MultiPolygon are just an array of Polygon coordinates. This example shows a MultiPolygon that contains two polygons: the square with holes from above, and a simple triangle.
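A sketch of that structure, using made-up coordinates, might look like this. Building each polygon's coordinates separately first makes the nesting easier to read.

```python
import json

# Polygon 1: a square (counterclockwise) with two square holes (clockwise).
square_with_holes = [
    [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
    [[1, 1], [1, 4], [4, 4], [4, 1], [1, 1]],
    [[6, 6], [6, 9], [9, 9], [9, 6], [6, 6]],
]

# Polygon 2: a simple triangle with a single counterclockwise ring.
triangle = [
    [[20, 0], [25, 0], [22.5, 5], [20, 0]],
]

multi_polygon = {
    "type": "MultiPolygon",
    # An array of Polygon coordinate arrays.
    "coordinates": [square_with_holes, triangle],
}

print(json.dumps(multi_polygon))
```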
That covers all the different types of objects that you can create in GeoJSON. In future articles in this series, we'll look more into actually processing data and working with these objects in real applications.