Adding elevation data to OSRM
13 Apr 2015
Photo by Richard Masoner - licensed under Creative Commons by-sa
OSRM is quite an impressive piece of routing software: it is open source, extremely fast, and also very configurable and extensible. I’ve worked with it quite a lot in the last year while creating Leaflet Routing Machine, by far my most ambitious open source project.
Recently, I’ve started working on a new hobby project that uses both Leaflet Routing Machine and OSRM: a bike planner for Sweden, built on OpenStreetMap data and open source software. It’s called cykelbanor.se (cykelbanor means “bike lanes” in Swedish).
OSRM comes with a profile for bike routing which is pretty good, but one thing that quite quickly becomes obvious is that any sane bike router has to avoid going too much uphill. Comparing OSRM’s results with some routes I frequently ride turned up suggestions that someone familiar with the terrain would never make; and Göteborg, where I live, isn’t even that hilly.
Looking around, there are several OSRM issues that discuss adding elevation data to OSRM (#271 and #546, for example). From reading them, and some other posts, it is apparent that lots of people have worked on this, and some might have working solutions. However, I could find no comprehensive documentation on how to actually do this. Well, we have to do something about that, right?
In the end, we need to tell OSRM that the average speed of a road section that is uphill is lower than a flat one, and that of a downhill road is higher. More specifically, OSRM needs this information for all ways it should route through, which roughly means all ways tagged with “highway” (which means some kind of road or path, not an actual highway, in OpenStreetMap parlance).
With this information, OSRM will see that going uphill takes longer (and downhill somewhat shorter), and route around hills unless going uphill makes the route significantly shorter.
My approach to get this information into OSRM is to add a preprocessing step, which calculates the amount of uphill and downhill for each way in OpenStreetMap. From reading the issues mentioned above, it seems I’m far from the first with this approach.
The preprocessing generates a file that the OSRM profile, the code responsible for generating the travel times for OSRM ways (the “cost” of going a certain way), can read and use. An architectural side note is that preprocessing, in contrast to integrating this into OSRM, also decouples the actual handling of raw elevation data from the rest of OSRM, making it easier to test separately and use for other purposes.
So, let’s get to the actual implementation. I chose to do this in Node.js, but a lot of other technologies would probably have worked equally well.
The preprocessing is done by a tool with the somewhat boring name osm-slope. It takes an OpenStreetMap PBF or OSM XML file as input, and outputs a JSON file with entries for all ways tagged with the highway tag. You can download PBF data for your region from, for example, Geofabrik.
The JSON file serves as a database for the OSRM profile (the Lua script) when osrm-extract runs, with the way’s id as key and the uphill and downhill data (climb and descent) as values.
You can find osm-slope on GitHub.
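To make the idea concrete, here is a minimal sketch of how climb and descent could be aggregated per way and keyed by way id. It is not osm-slope’s actual code: the function name `climbAndDescent`, the sample way ids and elevations, and the exact JSON shape are all assumptions made for illustration.

```javascript
// Hypothetical helper: sum the elevation gained and lost along a way.
// `elevations` holds one elevation (in meters) per node, in way order.
function climbAndDescent(elevations) {
  let climb = 0;
  let descent = 0;
  for (let i = 1; i < elevations.length; i++) {
    const diff = elevations[i] - elevations[i - 1];
    if (diff > 0) {
      climb += diff;
    } else {
      descent -= diff;
    }
  }
  return { climb, descent };
}

// Made-up ways: way id mapped to the elevations of its nodes.
const ways = {
  '1001': [12, 15, 14, 20],
  '1002': [30, 30, 25]
};

// Assumed database shape: way id as key, climb and descent as values.
const database = {};
for (const [wayId, elevations] of Object.entries(ways)) {
  database[wayId] = climbAndDescent(elevations);
}

console.log(JSON.stringify(database));
```

Note that summing per way like this throws away where on the way the climbing happens, which is the kind of aggregation trade-off discussed further down.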
Custom OSRM profile
The data from osm-slope is used by a custom version of OSRM’s bike profile. I haven’t extracted this into a separate project, but the files are available from cykelbanor.se’s config. The current version uses Lua CJSON to read the database from osm-slope, so you need to install it.
Apart from the trivial part that reads the JSON file, the important part is the calculation that turns a way’s climb and descent into a speed.
The calculation gets a bit messy since each way can consist of flat parts as well as uphill and downhill. Note that the current function for speed depending on gradient is pretty basic and I’m far from sure it accurately models actual bike physics.
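As a rough illustration of the idea, and not the function the profile actually uses, the sketch below adjusts a way’s average speed from its length, climb and descent. The real profile is written in Lua; this is JavaScript only to match the rest of the tooling. The model (a fixed time penalty per meter climbed and a smaller bonus per meter descended), the function name and every constant are assumptions.

```javascript
// All constants are illustrative assumptions, not values from the real profile.
const FLAT_SPEED_KMH = 15;             // assumed speed on flat ground
const SECONDS_PER_CLIMB_METER = 6;     // assumed time lost per meter climbed
const SECONDS_PER_DESCENT_METER = 1.5; // assumed time gained per meter descended
const MIN_SPEED_KMH = 3;               // never slower than walking the bike
const MAX_SPEED_KMH = 40;              // never faster than a sane descent

// Hypothetical helper: average speed over a way, given its length in meters
// and the total climb and descent (also in meters) along it.
function averageSpeedKmh(lengthMeters, climb, descent) {
  if (lengthMeters <= 0) return FLAT_SPEED_KMH;
  const secondsAt = (kmh) => lengthMeters / (kmh / 3.6);
  const seconds = secondsAt(FLAT_SPEED_KMH)
    + climb * SECONDS_PER_CLIMB_METER
    - descent * SECONDS_PER_DESCENT_METER;
  // Clamp the travel time so the resulting speed stays within sane bounds.
  const clamped = Math.min(
    Math.max(seconds, secondsAt(MAX_SPEED_KMH)),
    secondsAt(MIN_SPEED_KMH)
  );
  return (lengthMeters / clamped) * 3.6;
}

console.log(averageSpeedKmh(1000, 0, 0));  // flat
console.log(averageSpeedKmh(1000, 50, 0)); // uphill, slower than flat
console.log(averageSpeedKmh(1000, 0, 50)); // downhill, faster than flat
```

Working with travel time rather than speed directly makes it easy to combine flat, uphill and downhill contributions for a single way, since the times simply add up.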
Underlying osm-slope is a module I wrote for reading and querying elevation data.
From one of my earlier projects, I had some experience with working with elevation data in the HGT format, and knew there was some pretty good data available.
Having read about previous work, I also wanted to make sure my solution offered reasonable performance: some earlier attempts were apparently quite slow, making the import process run for hours.
Hence, I wrote a small Node module to read HGT files and query them for elevation based on a geographic coordinate (latitude and longitude). It supports interpolation, since most elevation data is still not that high resolution.
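For readers unfamiliar with the format, here is a minimal sketch of such a lookup with bilinear interpolation. It is not node-hgt’s implementation or API: the function names and the tiny 2x2 demo tile are assumptions. What it relies on from the HGT format is that samples are big-endian signed 16-bit integers stored row by row starting at the northern edge, and that a tile covers one degree in each direction from its south-west corner. It also assumes the coordinate lies inside the tile.

```javascript
// Read one sample from an HGT-style buffer: big-endian signed 16-bit
// integers, stored row by row starting at the northern edge.
function sample(buffer, size, row, col) {
  return buffer.readInt16BE((row * size + col) * 2);
}

// Bilinear interpolation of the elevation at (lat, lng) within a tile whose
// south-west corner is (swLat, swLng) and which has size x size samples.
function getElevation(buffer, size, swLat, swLng, lat, lng) {
  const x = (lng - swLng) * (size - 1);     // column position, 0 = west edge
  const y = (swLat + 1 - lat) * (size - 1); // row position, 0 = north edge
  const col0 = Math.min(Math.floor(x), size - 2);
  const row0 = Math.min(Math.floor(y), size - 2);
  const fx = x - col0;
  const fy = y - row0;
  const nw = sample(buffer, size, row0, col0);
  const ne = sample(buffer, size, row0, col0 + 1);
  const sw = sample(buffer, size, row0 + 1, col0);
  const se = sample(buffer, size, row0 + 1, col0 + 1);
  const north = nw + (ne - nw) * fx; // interpolate along the northern row
  const south = sw + (se - sw) * fx; // interpolate along the southern row
  return north + (south - north) * fy;
}

// Tiny 2x2 demo tile; real tiles are much larger (for example 1201x1201).
const tile = Buffer.alloc(2 * 2 * 2);
[10, 20, 30, 40].forEach((h, i) => tile.writeInt16BE(h, i * 2));

console.log(getElevation(tile, 2, 57, 11, 57.5, 11.5)); // 25
```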
To improve performance, I also used a memory-mapped solution, although to be fair, I’ve not benchmarked it against other approaches.
Finally, it also has a solution for managing and automatically downloading a set of HGT files (tiles), using the nice search tools on imagico.de (you don’t need to use their web frontend, node-hgt will query it for you).
This module can be used standalone any time you need elevation data; just install node-hgt:
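Managing tiles mostly comes down to knowing which file covers a given coordinate. HGT files are named after the south-west corner of the one-degree tile they cover, so the mapping can be sketched as below. The function name `tileName` is hypothetical and this is not node-hgt’s own code.

```javascript
// Hypothetical helper: name of the HGT tile that covers a coordinate.
// Tiles are named after their south-west corner, e.g. N57E011.hgt.
function tileName(lat, lng) {
  const swLat = Math.floor(lat);
  const swLng = Math.floor(lng);
  const latPart = (swLat < 0 ? 'S' : 'N') + String(Math.abs(swLat)).padStart(2, '0');
  const lngPart = (swLng < 0 ? 'W' : 'E') + String(Math.abs(swLng)).padStart(3, '0');
  return latPart + lngPart + '.hgt';
}

console.log(tileName(57.7, 11.97)); // N57E011.hgt
```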
npm install --save node-hgt
As mentioned earlier, I had some fears that adding elevation would be a time-consuming process, making the already quite tedious task of preparing data for OSRM even longer. So far, it’s not been as bad as I thought.
My numbers come from handling a complete OSM extract of Sweden, in PBF format. It currently contains just over 833,000 ways tagged with highway. The test was performed on my more than two-year-old System76 laptop; it’s not too bad, but far from high performance.
Preprocessing with osm-slope takes just under three minutes (2:54), and osrm-extract takes 5:55, compared to the baseline 3:55 it takes with the standard bicycle profile distributed with OSRM. That adds up to well over twice the time (+125%), which is quite a lot, although nowhere near as bad as the three hours someone mentioned in previous OSRM discussions (it’s not at all clear what the data size was, though).
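For anyone who wants to check the arithmetic behind that percentage, here it is spelled out; the helper name `toSeconds` is just for this snippet.

```javascript
// Convert a "minutes:seconds" string to seconds.
function toSeconds(minSec) {
  const [minutes, seconds] = minSec.split(':').map(Number);
  return minutes * 60 + seconds;
}

const baseline = toSeconds('3:55');                           // standard profile
const withElevation = toSeconds('2:54') + toSeconds('5:55');  // osm-slope + osrm-extract
const overhead = (withElevation - baseline) / baseline;

console.log(`${Math.round(overhead * 100)}%`); // 125%
```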
Some parts of this toolchain are still a bit hardwired and could use some more config options.
Also, while these tools work great for the dataset I’m using, Sweden, I’m not confident they would scale to much larger data sets. For example, both osm-slope and the OSRM profile hold the way elevation database in memory, which will obviously not scale well. An idea here could be to use LevelDB or a similar in-process database, to avoid the overhead of TCP/IP etc.
osm-slope also aggregates data in a way that I’m not entirely sure keeps enough data to calculate good speeds in all cases. I need to verify this, or store more detailed data.
To actually display the elevation profile in cykelbanor.se, I hacked together a small (you could say micro) service that takes GeoJSON as input, and adds elevation data to its coordinates. Like everything else, it’s on GitHub: elevation-service, or you can just post your GeoJSON to http://data.cykelbanor.se/elevation/geojson.
This service was trivial to write once I already had node-hgt.
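The core of such a service can be sketched in a few lines. This is an illustration under assumptions, not the code of elevation-service: the function `addElevation` and the stand-in `fakeLookup` are hypothetical, and I am assuming the elevation is stored as the optional third element that GeoJSON allows in a position.

```javascript
// GeoJSON positions are [longitude, latitude], with an optional third
// element for altitude. This returns a copy with elevation filled in.
function addElevation(lineString, lookup) {
  return {
    ...lineString,
    coordinates: lineString.coordinates.map(
      ([lng, lat]) => [lng, lat, lookup(lat, lng)]
    )
  };
}

// Stand-in for a real elevation lookup (for example one backed by HGT tiles).
const fakeLookup = (lat, lng) => Math.round(lat + lng);

const route = {
  type: 'LineString',
  coordinates: [[11.97, 57.7], [11.98, 57.71]]
};

console.log(JSON.stringify(addElevation(route, fakeLookup)));
```

Returning a new object instead of mutating the input keeps the original GeoJSON reusable, which is convenient when the same route is also drawn on the map.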