GPS Trip Matching and Strava API

Problem Solving

Jake Batsuuri
Jan 27 · 9 min read

Many endurance sports require the participant to follow a pre-set course and achieve specific control points by a specific time. Confirmation of a participants adherence to the pre-set course and control guidelines has been accomplished through the use of paper systems, photographs and manned control points.

The use of a participants GPS file, generated from the participants bike computer, sport watch or smartphone is gaining adoption. However that requires the event coordinator to manually gather, compare and manage the participants GPS file for verification purposes. This can be a complicated, and time consuming activity for event coordinators who have limited technical expertise.

So we could automate this task…


We have an event organizer who will input the race track in the form of GPS coordinates. We assume the GPS coordinates form a non closing path. There is a starting point and an ending point.

A typical GPS point object may look like this:

We would have an array of these, where order matters:

Task #1

So given the race track, we will also have the GPS data of the racer. Their GPS object may look like this:

Our task is to form a path with the race track, then find the nearest point on the track from the current iteration of the racer’s GPS object.

Image for post
Image for post

As you can see once we know which two points the current iteration of the racer’s GPS object is, its a simple matter of finding the perpendicular line joining the two closest race track points.

Image for post
Image for post

Subtask #1A

We have the current GPS point and an array of coordinates. So we have to find the 2 closest coordinates.

To do this we have to find the distance between the current GPS point and each individual track coordinate. This is simple enough, with this handy function. The only thing is this formula assumes, perfectly round globe, I’m sorry Mark Sargeant. Jokes aside, we will do a more sophisticated altitude adjustment later on.

Then sort by smallest to largest. And take the top 2 coordinates.

Can we apply heuristics to reduce computation?

Well, from the first point to the last, as we calculate the distances. We can imagine something like the following result:

  • Really big distance
  • Big distance
  • Medium distance
  • Close distance
  • Really close distance
  • Really close distance
  • Close distance
  • Medium distance

Basically after finding a minimum, the distance does seem to get bigger and bigger as we go through the array. So why should we check the entire array, because the race track could be thousands of points, shouldn’t we do an early stopping strategy and be pretty confident that it’s the subsection.

The problem with that is that the race track could be loopy and winding. And actually come even closer later on.

Image for post
Image for post

Furthermore, it could be the case that the closest 2 points are actually not the points we’re looking for.

Because chronologically speaking, green point just came after another green point from the other direction.

Image for post
Image for post

So in essence, even though the purple points are the two shortest distance points, what we are actually looking for is the green points.

To get the right closest two points, we introduce a biasing parameter into our distance calculation. Graphically this would be:

Image for post
Image for post

Since we don’t know exactly how much weight we should give the biasing parameter. I think, the best approach would be to pretty much let the biasing parameter decide which pair gets chosen from the lets say top 3 pairs. Because I’m assuming at most it will be surrounded on 3 sides.

At this point, we have the right pair of coordinates.

Subtask #1B

So given 2 points, we can create a line joining them.

With a line and a point, we can create another line joining those. This is our end goal, in the sense that we need to check if we should consider this GPS point acceptable.

That’s the basic math. But of course, we have to consider tolerances and sampling rates. Further more, we have so far assumed that the path joining two coordinates on the race track is a line. Which of course is not true, in an ideal world, we’d have smoothed curves. But for short distance, it’s a good compromise to significantly speed up calculation, I’m not a mathematician but we talking polynomial time savings, ya bish.

  • Sampling rate of the GPS maybe 5 to 10 seconds on the low end and every second on the high. As the cyclist gets slower, the dots get closer and assuming a line between them is not an issue. However on the high end, a cyclist can go about 10 m/s. With a slow GPS that samples every 10 seconds. Our cyclist could be 100m away.
  • Given that a city block can be about 100m by 300m. Our cyclist could be at the next intersection. So the tolerance should be around 100m and maybe the coordinates could be 100m apart at the complicated parts and if it’s a straight line, no limitation. Meaning if the race track is from point A to point B and no other midway check points, it would be a straight line on the surface of the earth. And a typical race of hundreds of kilometers, we might have to use the haversine formula.
  • The next consideration is the accuracy of GPSs themselves. We can reasonably expect about 5m accuracy under normal conditions and maybe 30m under really bad conditions.

With all this information, and with consideration for all of the worst conditions and most complicated paths. We can start to build our solution.

The haversine distance is curved, which is more accurate but eventually we do have to find the distance from the point to the line, using this formula:

Image for post
Image for post

where Ax + By + C = 0 is the line equation and (m, n) is the point. This stuff assumes cartesian plane. So we will have to convert our longitude and latitude (λ, φ) into (x, y).

You can simply use the horizontal axis x to denote longitude λ, the vertical axis y to denote latitude φ. The ratio between these should not be 1:1, though. Instead you should use cos(φ0) as the aspect ratio, where φ0 denotes a latitude close to the center of your map. And r is the radius of the Earth (6,371km).

  • x = r λ cos(φ0)
  • y = r φ

Now we need a φ0, which is really just a latitude of the centroid between the 3 data points in the iteration.

Image for post
Image for post

So now basically you would convert the yellow points into x and y coordinates. Then turn it into a line equation, and write in Ax + By + C = 0 form.

Then turn the green point latitude and longitude into x and y. Plug it into:

Image for post
Image for post

And see if the distance is less than than our tolerance, which I think should be 50m to 100m.

All that’s left is writing the code to loop through every thing.

Strava API

The strava API has an object called segment which looks like this:

Specifically the polyline element which is an encoded polyline that in this example is:

Which looks like:

Image for post
Image for post


Which we can decode into an array of coordinates. Therefore making this whole problem solvable.

One Additional Consideration

There could be a situation where the race takes place over multiple days. And there could be multiple segments. But that’s still very easy to check. From several segments, we can check for partial matching on the race track. Obviously with way more lenient tolerances.


Everything connected with Code & Tech!

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store