MySQL 8 and maps. A power combo?

Jochem Fuchs
Spartner
Published in
5 min readAug 7, 2019

--

Some time ago I wrote a blog about “The best way to locate, in MySQL 8”. In that article I promised to follow up with some metrics to show how MySQL 8 performs. I’ve been dragging my feet, for the simple reason that I don’t believe it would be a very interesting read. Feel free to leave a comment if you think otherwise!

Instead, let’s have some fun with a practical example we’re going to build together!

What are we making?

You’ve probably all seen them and have used them many times with varying results. Whether you’re looking for stores, real estate or a place to stay, they all work on the same principle. You’re looking for location based items, and you want to be able to scroll through the map to find more.

In some cases these maps are lightning fast, but in many cases it actually takes a few seconds to load the results on the map.

We’ll first take a look into why that is, then we’ll discuss some often used optimisations to mitigate that, and finally I’ll guide you through building a solution that is fast without any of those optimisations.

So why is the traditional approach slow?

In a database without (or with limited) spatial functions and without using optimisation or caching techniques, you would have to calculate the distance of each point in your database from the centre of the currently visible map. Scrolling the map would restart that entire process each time.

If you’ve read my previous article, you know I mentioned the Haversine formula. It’s this formula that is calculated for EACH and EVERY point in your database, at EVERY request. We’re disregarding caching for now, as this does not solve the issue, just mitigates the resulting slowness to some extent.

Let’s have a quick look into how this would look:

SELECT *
,(6371 * Acos(Cos(Radians(map_center_latitude)) * Cos(Radians(lat)) * Cos(Radians(lng) - Radians(map_center_longitude)) + Sin(Radians(map_center_latitude)) * Sin(Radians(lat)))) AS distance
FROM locations
HAVING distance < 25
ORDER BY distance;

Now apart from the actual formula being very hard to read, you can quickly see the issue here. For each row in the database it would need to calculate the distance to the map_center. Then the results of that calculation are sorted and those that match the HAVING clause are returned.

Even with indexing of lat and lon you’d still see the database scanning all rows. You can see this when we use EXPLAIN:

EXPLAIN SELECT * ,(6371 * Acos(Cos(Radians(map_center_latitude)) * Cos(Radians(lat)) * Cos(Radians(lon) - Radians(map_center_longitude)) + Sin(Radians(map_center_latitude)) * Sin(Radians(lat)))) AS `distance` FROM `locations` HAVING `distance` < 25 ORDER BY `distance`\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: locations
partitions: NULL
type: index
possible_keys: NULL
key: lat_2
key_len: 10
ref: NULL
rows: 9980
filtered: 100.00
Extra: Using where; Using index; Using filesort
1 row in set, 1 warning (0.00 sec)

Now imagine having to do this same query over and over, with only the map_center changing. Even with some kind of debounce solution in the frontend and caching in the backend, it will always feel sluggish.

Ok, but some maps are very fast, how?

There are many ways to approach this, but most of them will be some form of caching, preloading or even NoSQL solutions.
Let’s quickly run through a few of the options and their downsides.

  1. Caching
    This seems like the obvious first step. It’s quick, and it’s easy to implement in most frameworks like Laravel. But how would you implement caching if your only variable is every possible coordinate within your search area? In the end, this solution isn’t really viable.
  2. Preload all the things!
    While I haven’t been able to prove it, some of the maps I’ve studied for this article seem to load with the entire list of markers already there. There is no additional loading when scrolling, it’s just already there. And this is probably a perfect solution if you have a fairly limited area to search (though the Netherlands seems to be small enough), and with a set of points that don’t change very often and can therefore be cached easily.
    The limit, of course, is the size of the dataset. The bigger that is, the longer it will take for the frontend to load and interpret it. Until at some point you reach a tipping point and suddenly your frontend is the bottleneck. And you can’t really control the different browsers and internet connections the end users will possess.
  3. NoSQL / ElasticSearch / Algolia and others
    This is a well known and very viable solution for our problem. Many of these kinds of services/systems have built-in features for quickly searching in location based data. And if you’re already using something like this, then there really is no need for looking any further.
    If you aren’t already using such a system, then the investment of implementing one might not be worth it. After all, the location based searching is just one of many features they offer, and many of these services are paid subscriptions, or they might simply be very expensive to run and implement yourself. That’s budget you might not have or a system you might not be able to support.
  4. PostgreSQL with PostGis
    Ok, so this is cheating a bit. If you have the know-how and experience to use PostgreSQL, then by all means go for it. In fact ALMOST ALL of the advantages MySQL 8 offers over older versions have already existed in PostgreSQL for a long time.
    So what’s the downside you may ask? It’s still a niche product, relatively speaking. There is far less information, far less tools, and a far smaller community than there is for MySQL. The learning curve is steeper and if you don’t already have experience with it, it’s most likely too costly to implement it for a fairly simple feature like we’re going to build here.
    But that’s pretty much the only downside I can think of. So if you do have the knowledge and experience, you’re pretty much set and you can probably follow my guide here and implement the same app with PostgreSQL as well.

So where does MySQL 8 come in?

Well I’m glad you asked! Next week we’ll continue with a quick tutorial to build a demo implementation of this concept. The end result will look something like this:

Written by Jochem Fuchs, Software engineer at Maatwebsite.

Do you need help working with geolocation in your app or do you need help building a Laravel application? At Maatwebsite, we are there to help you on a commercial basis. Contact us via info@maatwebsite.nl or via phone +31 (0)10 744 9312 to discuss the possibilities.

--

--