How I created an API to correct GPS to Sea Level.
A simple solution to a complicated problem.
A Quick request. If you like what I write then please follow me. Under the new rules — I need some more followers to have the partner status I had before. Thanks
GPS is wonderful!
But it has problems with altitude!
- Vertical GPS measurements are just less accurate (because of geometry), and
- We don’t know where the surface of the Earth is!
Putting aside the first (since fixing geometry is beyond the scope of the article :), this article documents how to create a solution for “finding the surface of the Earth” or more accurately how to convert from GPS coordinates to a height above Mean Sea Level (MSL) which is defined for the purpose of this API as the WGS84 Orthometric Height.
What is GPS’ problem with altitudes?
I previously investigated the problems analysing and using GPS altitudes, especially on smartphones and other small devices, in this article:
How to draw a gradient profile using GPS track data and polynomials
Understanding GPS altitude and how to get a reasonable result.
NOTE — throughout this article I will talk about GPS. I should actually be talking about GNSS as the generic name for satellite positioning systems including GPS, GLONASS, Galileo, BeiDou etc but I have chosen to use the ideomatic “GPS” instead. The concepts are all the same for all systems.
Without rehashing the argument, the problems come down to the following:
- There are fundamental geometrical reasons why the accuracy of GPS in the vertical plane is less than the accuracy in the horizontal plane. The rough rule of thumb usually quoted (but which I have never seen totally justified) is that whilst the horizontal accuracy (pace other factors such as multi-path interference) is of the order of 2m, the vertical accuracy is of the order of 10m, and
- The concept of height is complicated. The problem is “height above what?” and in some cases “height in which direction?”.
Survey-grade GPS equipment (such as differential or RTK GPS etc) gets around these problems by:
- using fixed base stations to remove systemic errors to get, in some cases, cm level accuracy, and
- applying corrections from the sensor coordinates using models such as EGM2008. This will be explained below.
However, handheld GPS terminals, especially smartphones, generally cannot do the first and often cannot or do not do the second.
This article particularly relates to apps on smartphones. In this case, there is no way of reducing the systemic errors (if I am wrong please let me know in the comments) except by using the “smoothing over time” approach in the article linked above.
We will be looking at how to address the second problem.
There follows a very TL;DR introduction to the science. If you don’t care and just want to see the Implementation details then click HERE.
To start at the beginning. The GPS device calculates where it is in relation to the GPS satellites. The GPS satellites similarly know where they are in relation to each other and the control stations on the ground.
But, people want to know where they are on the map!
Converting from space-referenced positions to earth-referenced positions is the science of Geodesy. This is a particularly complicated area of study, to which I cannot possibly do justice in an article like this.
NOTE — the following was corrected slightly after comments from various reviewers
The TL;DR version is:
- GPS devices natively create the coordinates in geocentric (i.e. XYZ) coordinates but the devices and APIs used for navigation (which is is the use case here and covers smartphones) will usually report their coordinates in a coordinate system called EPSG:4979 (not EPSG:4326 since that is a 2D system) often referred to as “WGS84” (survey-grade devices, as discussed above, will probably be applying additional corrections and conversions which will be specified in the relevant documentation),
- EPSG codes are a standardized way of referring to complicated and highly technical definitions of coordinate systems and will be used throughout this article,
- EPSG:4979 is a Geodetic Coordinate system meaning that coordinates are reported as Latitude and Longitude (LatLng) with respect to a reference ellipsoid that approximates the shape of the Earth, and as an Altitude (Alt) above the ellipsoid normal to the ellipsoid (the ellipsoidal or geodetic height). The reference ellipsoid used in EPSG:4979 is the WGS84 ensemble reference ellipsoid.
Interesting but Tangential Fact #1 — Not all LatLngs are the Same
You often see coordinates just quoted as latitude and longitude.
But, the coordinates only have meaning with respect to the ellipsoid used. It has to do with the specific position used as the origin of the ellipsoid and the specific direction of the normal to the ellipsoid etc.
There are many commonly used ellipsoids, mostly for technical reasons in local areas where a different ellipsoid will provide a better match to the local “shape of the Earth”, whilst WGS84 is designed to be the least bad global fit.
An example is the British National Grid (EPSG:27700) which is based on a geodetic coordinate system OSGB36 (EPSG:4277) which uses a reference called the Airy 1830 ellipsoid that most closely matches the shape of the UK.
If you mistakenly map GPS coordinates (i.e. EPSG:4979) as if they were EPSG:4277 coordinates (or vice versa) you will get an error of about 100m.
Yes, I am speaking from experience :).
Interesting but Tangential fact #2 — Things change over Time
To make things even more complicated, the Earth is a pretty dynamic place which changes shape and where “real world” things that we care about (like continents and cities) are moving constantly.
Because of this, reference systems like WGS84 are dynamic and frequently updated. That is why the standard EPSG:4979 refers to the WGS84 ensemble which includes a number of frame realizations each of which has a relevant time period or epoch. To add to this, the different varieties of GNSS use different frame realizations …
The document linked above notes that in Australia “WGS84 coordinates collected 20 years ago will have experienced 1.4 metres of apparent horizontal motion”. I have heard anecdotally that there are parts of California that do that in a year.
For my use case, this is irrelevant since I am calculating the height change in the last couple of kilometres travelled. However, if you are in any way storing the location for long-term retrieval then you should bear this in mind and probably store the epoch and source as well.
Heights and Gravity Models
Getting back to heights …
So, we have determined that the GPS coordinates give us the ellipsoidal Altitude.
But, on land, there is a centuries-old tradition of measuring heights above Mean Sea Level (MSL) and at sea … most people assume that the sea is at sea level!
Only three problems:
- what does MSL mean,
- where is MSL (especially when you are thousands of miles inland), and
- what direction to measure in.
The answers to all of those questions are “it’s complicated”! See this article for more details. The story of the UK’s quest for a reliable MSL datum alone is worth a read — let alone how we have ended up with one that slopes :).
The solution has been the creation of “Gravity Models” that attempt to follow the surfaces of equal gravity potential that most closely match an arbitrary definition of the average sea level across an area or for the whole globe. This is then used as the datum for height measurements.
The height above the Geoid is called the Orthometric Height.
The shape defined by the Gravity Model is the Geoid. There are many different Geoids defined for different purposes but the use case we are addressing needs to have a single solution that works anywhere in the world. Therefore, we are going to use the EGM2008 model and Geoid. This is the latest available version of the most commonly used global Geoid (note, there was supposed to be EGM2020 by now but this seems to be still in development).
The orthometric height calculated going from the WGS84 ellipsoid to the EGM2008 Geoid can be called the WGS84 Orthometric Height. This is the closest we can get to a global GPS height above MSL or just “height”!
The Gravity Model is defined in terms of some complicated spherical harmonic equations, so the most usual way of using the model is to download precalculated grids of values, usually calculated at 2.5' or 1' intervals(e.g. from Proj). For reference, 1' of latitude is 1 nautical mile or about 1.8km. 1' of longitude varies between approx. 1 nautical mile at the equator to zero at the poles. So, the takeaway is that we are talking about km scale grides.
EGM2008, like all global models, probably makes some compromises in local accuracy in comparison to “local” MSL. Analyses of these errors would be highly technical and seem to be relatively rare, but what I have found suggests that it is a good model unless you want to go down to cm level of accuracy in the altitude which, given the definitional problems I have outlined, seems fair. Certainly not a problem for this use case.
Interesting but Tangential Fact #3 — Down is not always Down!
The direction of “down” is called the plumb line and is defined by local gravity potential (i.e. which way would a plumb line point).
The orthometric height is defined as the distance in the direction of the plumb line to the Geoid.
However, as we know, the gravity potential changes as we move. This means that if you are standing next to (but not above) some particularly heavy rock then your plumb line will be deflected towards that heavy area and away (slightly) from the “centre of the Earth” (which we know anyway is highly subjective)!
All of the sources I have ever seen say that orthometric height is the difference between ellipsoidal height and the height of the Geoid at that point. However, that can not be true because of the problem above and because the direction of the vector of ellipsoidal height (and all heights are vectors) depends on the reference ellipsoid used!
However, this API uses the common equation for the calculation of the orthometric height.
Similarly, as pointed out in the Wikipedia article, since the gravitational potential varies by location if you calculate a gravity equipotential surface that is (say) 100m above the Geoid at one point it will not be 100m above the Geoid at (most) other points. This kind of confuses our definition of height!
So, enough of the science, let's get to the implementation.
The Use Case
As defined here, the application is one that is used when travelling by train and displays the current altitude and the current altitude profile for the track.
How to draw a gradient profile using GPS track data and polynomials
Understanding GPS altitude and how to get a reasonable result.
But, the difference between WGS84 and MSL can be quite large (about 48m in the UK) and it becomes annoying when you are next to the sea but your app says that you are at 48m altitude (or when you are entering the Severn Tunnel that goes beneath the sea at 46m altitude!). It does nothing for people’s trust in the app no matter how well you explain it!
So, the use case is to create a performant and global way of correcting the Altitude.
The application in question (www.trackbash.co.uk) uses my favourite NGX/GAE architecture as described here:
This means that I would have to choose between doing the functionality at the front end in Angular and Typescript or as a backend Python API.
As discussed above, the solution to convert between EPSG:4979 and EGM2008 uses the grid files and spherical interpolation using the geodetic coordinates.
The grid files are large (approx 80MiB for the 2.5" versions and 400MiB for the 1" versions) and obviously, no one is going to be happy with an application that tries to load those on the local device.
The calculations are doable on modern devices but are going to take power and time. Any map-based application already needs to be careful of processor lag and overheating so best to avoid the calculation if possible.
Finally, there are some libraries that can help. Probably the most widely used and respected is
PROJ and is a C++ library. There is a JS spin off
Proj4JS but it does not have as much functionality as the C++ equivalent. Finally,
PROJ comes with a CDN for accessing the grid files and a Python wrapper
Based on these facts, it was decided that creating an API method to do the transformation was the scaleable and safe solution.
PROJ library and the
PyProj wrapper, the problem becomes a relatively case of a transformation between coordinate reference systems, thanks to this StackExchange article for the pointers.
The supplied GPS coordinates are in EPSG:4979.
The output coordinates we want are in EPSG:9518, a combination of WGS84 and EGM2008.
Finally, what we are looking for is the correction for Geoid height at the location since that is what we will apply in the frontend to received GPS coordinates, we will be applying that transformation to a LatLng pair with Altitude zero. This gives a number that we need to add to the GPS Alt to get the height above MSL.
The architecture described above makes it very easy to add an API, so the following two files were all that were needed.
NOTE: details of the Auth processes have not been published for security reasons. See the architecture article for details
As you can see, the actual functionality is very simple thanks to PROJ:
- PROJ networking is turned on to access the CDN,
- the definitions for EPSG:4979 and EPSG:9518 are accessed,
- A transformation is created between the two coordinate systems, and
- on-demand using the supplied LatLng values, the transformation is applied.
PROJ itself takes care of accessing normative definitions for the coordinate systems and downloading and caching the relevant parts of the transformation grid as well as applying the transformation.
Using a real implementation on the smallest tier of Google App Engine accessed using real devices from real network locations yields:
- approx 2 second response times for the first request that includes waking up the GAE instance and accessing the required grids,
- approx 30ms response times for follow-on requests.
The Application Implementation.
The next part is to implement the API in the actual application.
The plumbing is nothing special and the details of how to access a backend API are described in detail in the Architecture article linked above.
The key problem is when to query the API and apply the correction.
The key points in the process flow are:
- JS Geolocation API updates are received every few seconds,
- In my app, I am then only processing the updates (and adding to the profile) if there is more than about 5m of movement to prevent chewing up a lot of cycles when the train is stationary,
- finally, I am querying nominatim to get location information.
The frequency of the last step is based on an algorithm that would fill an article in itself, in order to get suitably local information without doing hundreds of API calls. The result is that the API is called between every 5 and 10km of travel. Which can still mean “every minute” if your train is travelling at 300kph (yes — trains are the fastest vehicles you are allowed to use your GPS in).
The decision was, of course, to call the geoid_correction API using the same trigger as the nominatim API and to apply the correction to every GPS coordinate set update added to the profile (which is just an addition operation on the Alt coordinate), thus relying on the much quoted Tobler’s second law!
So far, this all seems to be working very well. Any errors caused if we only fetch correction values relatively infrequently are masked by the inherent systemic errors in the GPS.