Elasticsearch — Why and How

Andrea Gioco
Sep 9, 2018

During the last year, I worked at an agency in London where I was lucky enough to change projects quite often. I also had the freedom to choose the technologies that I thought best suited the challenges I was faced with.

The main technologies that I usually work with are PHP (with Laravel) and MySQL (yes, the standard approach for web developers…).

On each of the last 3 projects I worked on, I found it necessary to use a different tool to solve particular (but quite common) use cases.

I’m going to tell you what use cases I’m talking about and why I chose Elasticsearch (ES).

Use case #1: Text search

The first project that I worked on was a classic blog post platform which was open to user contributions. Users were able to create posts with a title, body, comments and categories.

As part of the site, users could type into a search text field, and the text would then be matched against specific fields (title, body, etc.) of the published content.

The match was intended to be a partial match, for example in MySQL it would have been something like:

SELECT content_id FROM content WHERE title LIKE '%string%' OR content LIKE '%string%' OR …

You don’t need to be a SQL expert to know that these types of queries will kill the performance of your project fairly quickly: a pattern with a leading wildcard can’t use a regular index, so MySQL has to scan every row.

It was time to explore a new solution.

Solution

This was the first situation where I decided to use Elasticsearch, and it ended up being the right choice to make.

Elasticsearch is a great fit for textual searches against large amounts of text, and it gives you functionality like “fuzzy queries” to help users find what they are really looking for with exceptional performance (a fuzzy query is a particular type of query that also returns inexact matches, such as terms that differ from the search term by a few characters, which lets you build the classic “did you mean..?”).
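As a rough illustration of what such a search can look like, here is a minimal sketch of a request body that matches the user’s text against several fields and tolerates typos. It uses the multi_match query with the fuzziness option; the field names title and body are assumptions taken from the description above, and the helper name is hypothetical. Note that this is token-based full-text matching, not a literal substring match like the LIKE query.

```python
import json


def build_fuzzy_search(text, fields=("title", "body")):
    """Build a search body that matches `text` against several fields.

    "fuzziness": "AUTO" lets Elasticsearch pick the allowed edit distance
    from the length of each term, so small typos still match.
    The field names are assumptions, not the author's real mapping.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                "fuzziness": "AUTO",
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON that would be sent to the _search endpoint.
    print(json.dumps(build_fuzzy_search("elasticserch"), indent=2))
```

The same structure can be passed to the PHP client as a nested array.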

Use case #2: Typeahead with suggester

On my second project, I had to develop an API for a common scenario: a location typeahead.

On the website, the users were able to search for a location (city, street or postcode) with an autocomplete field. If you work as a web developer, this is a challenge you will probably need to solve at some point in your career.

At first, my mind went to the big G solution, the Google API product “Place Autocomplete”, which would have done the job with very little effort.

But I had a restriction: we were using a data source provided by a third party company that we were working with. This data source contained a huge number of records, each with a location name, latitude, longitude and id.

The MySQL solution would have been something like this:

SELECT location_id, location FROM location WHERE location LIKE 'string%' ORDER BY location;

This would have worked fine, but it was too slow for the standards that we were aiming for.

Solution

For the second time, performance was the main reason why I chose Elasticsearch, but this time I used the “completion suggester” feature.

The completion suggester in Elasticsearch relies on a different way of mapping and querying your documents: you map a field with the completion type and then ask for suggestions for a given prefix. The result of this type of query is a list of documents whose suggestion field begins with the searched string.

Quoting the Elasticsearch documentation:

The completion suggester provides auto-complete/search-as-you-type functionality.

This was exactly what I needed.
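To make this more concrete, here is a minimal sketch of the three pieces involved: a mapping with a completion field, a document that feeds that field, and the suggest request for a typed prefix. The field name name_suggest, the suggestion name location_suggest and the helper names are all hypothetical. The mapping layout assumes a recent Elasticsearch version without mapping types; older versions nest properties under a type name.

```python
import json

# Hypothetical name for the field that powers the autocomplete.
SUGGEST_FIELD = "name_suggest"


def completion_mapping():
    """Index mapping with a field of type `completion`."""
    return {"mappings": {"properties": {SUGGEST_FIELD: {"type": "completion"}}}}


def location_document(location_id, name, lat, lon):
    """A location record; the suggest field receives the text to complete."""
    return {
        "location_id": location_id,
        "name": name,
        "lat": lat,
        "lon": lon,
        SUGGEST_FIELD: {"input": [name]},
    }


def completion_query(prefix, size=5):
    """Suggest request: return up to `size` entries starting with `prefix`."""
    return {
        "suggest": {
            "location_suggest": {
                "prefix": prefix,
                "completion": {"field": SUGGEST_FIELD, "size": size},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(completion_mapping(), indent=2))
    print(json.dumps(location_document(1, "London", 51.5074, -0.1278), indent=2))
    print(json.dumps(completion_query("lon"), indent=2))
```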

Use case #3: Position search with radius

Using the location that we have just obtained, we started a new search to find all the properties within a 10-mile radius of the specified location.

A few years earlier, I had encountered a similar problem, at a time when ES was still a mystery to me.

At the time, my approach was to convert the latitude and longitude into a grid system (working with integers instead of doubles), save these new values in a MySQL database and then use these integers to search for the records I needed.

This solution gives you good performance, but working with a flat grid system instead of trigonometric functions means less positional precision.

An alternative solution is to use the trigonometric functions provided by MySQL, but then the distance has to be computed for every row, so none of the indexes you set on your table can be used, which results in bad performance.
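For reference, the trigonometry in question is usually the haversine formula for the great-circle distance between two points. The sketch below is only an illustration of that formula in Python (the function name and the mean Earth radius value are assumptions, and it is not the query from the original project); a SQL version has to evaluate the same expression for each candidate row.

```python
import math

# Mean Earth radius in miles (an approximation; the Earth is not a perfect sphere).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(origin, point, radius_miles=10.0):
    """True if `point` lies within `radius_miles` of `origin` (both (lat, lon))."""
    return haversine_miles(*origin, *point) <= radius_miles


if __name__ == "__main__":
    central_london = (51.5074, -0.1278)
    print(within_radius(central_london, (51.55, -0.10)))
```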

This is where Elasticsearch comes to the rescue!

Solution

Elasticsearch provides a specific data type called geo_point, which I used to map the position of my properties (with a few other details). It also provides a query named geo_distance that searches the indexed positions starting from a defined latitude, longitude and radius (10 miles in this example).

The results can then be easily sorted by distance, and the radius can be specified in miles or kilometers (I still can’t understand why in the world we still use different measurement systems... please start using km, everywhere... please).

In this case, my decision wasn’t driven just by performance, but also by the simplicity that Elasticsearch provides for this type of query (and by the comparative complexity of the MySQL query you would need if you chose the trigonometry solution).
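Here is a minimal sketch of such a request: a geo_distance filter combined with a _geo_distance sort so that the closest properties come first. It assumes the position is stored in a geo_point field called location; that field name and the helper name are hypothetical.

```python
import json


def properties_within(lat, lon, radius="10mi", field="location"):
    """Search body: documents whose `field` lies within `radius` of (lat, lon),
    sorted from nearest to farthest.

    `field` must be mapped as geo_point; its name here is an assumption.
    The radius accepts units such as "10mi" or "16km".
    """
    origin = {"lat": lat, "lon": lon}
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {"distance": radius, field: origin}
                }
            }
        },
        "sort": [
            {"_geo_distance": {field: origin, "order": "asc", "unit": "mi"}}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(properties_within(51.5074, -0.1278), indent=2))
```

Putting the geo_distance clause in a filter means it only includes or excludes documents and does not take part in relevance scoring, which is all a radius search needs.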

Use case #4: Search points within a border

This is the juicy one and probably the most complicated use case in this article: searching for properties in London based on a specified borough/district.

This is now quite a standard use case for applications used to search properties but it was a new scenario for me, so I had to do some research before I started coding.

The first problem was finding the boroughs’ borders, which were going to be used as the perimeter for my search. Then I needed to work out how to store these borders, and how to use them to perform a search.

I found a free government data source that provides a list of all the counties/boroughs in the UK, and thanks to this source I came across a new (for me at least) type of data structure: GeoJSON.

GeoJSON is a standard that defines a way to describe a shape composed of a list of points, with each point specified by a longitude and a latitude (in that order). You can also include in your GeoJSON a list of properties that describe your shape (I found a nice tool called geojson.io where you can inspect your GeoJSON and create one from scratch).
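To give an idea of the structure, here is a small made-up GeoJSON feature written as a Python dictionary. The name and the coordinates are invented for illustration and do not describe a real borough; the points to notice are that each position is [longitude, latitude] and that a polygon ring ends on the same point it starts from.

```python
import json

# A made-up rectangular "borough"; real borders have many more points.
borough = {
    "type": "Feature",
    "properties": {"name": "Example Borough"},
    "geometry": {
        "type": "Polygon",
        # One outer ring; each position is [longitude, latitude].
        "coordinates": [[
            [-0.15, 51.50],
            [-0.10, 51.50],
            [-0.10, 51.53],
            [-0.15, 51.53],
            [-0.15, 51.50],  # the ring closes on its first point
        ]],
    },
}


def ring_is_closed(ring):
    """A polygon ring needs at least 4 positions and identical first/last points."""
    return len(ring) >= 4 and ring[0] == ring[-1]


if __name__ == "__main__":
    print(ring_is_closed(borough["geometry"]["coordinates"][0]))
    print(json.dumps(borough, indent=2))
```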

Now that I had my boroughs structured in a cool way, I needed to understand how to store them.

Elasticsearch helped me again!

Solution

With Elasticsearch you can easily save (and use) your GeoJSON-structured information using the geo_shape data type.

Not only does the geo_shape data type give you a well-structured way to store your shapes, but you can also then use these shapes to perform complex queries like:

Give me all the points contained in a specific shape
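One possible way to express that request is sketched below: the borough borders live in their own index, in a geo_shape field, and the property search refers to a stored border through indexed_shape instead of repeating all its coordinates. The index name boroughs, the field names boundary and location, and the helper names are hypothetical. The sketch also assumes an Elasticsearch version whose geo_shape query accepts the type used for the property position; on older versions that field may itself need to be mapped as geo_shape.

```python
import json


def borough_index_mapping():
    """Mapping for a hypothetical `boroughs` index holding one border per document."""
    return {
        "mappings": {
            "properties": {
                "name": {"type": "keyword"},
                "boundary": {"type": "geo_shape"},
            }
        }
    }


def properties_in_borough(borough_id, index="boroughs", field="location"):
    """Search body: documents whose `field` lies within a pre-indexed border.

    `indexed_shape` tells Elasticsearch to load the shape from the document
    `borough_id` in `index`, reading it from the `boundary` field.
    """
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_shape": {
                        field: {
                            "indexed_shape": {
                                "index": index,
                                "id": borough_id,
                                "path": "boundary",
                            },
                            "relation": "within",
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(borough_index_mapping(), indent=2))
    print(json.dumps(properties_in_borough("camden"), indent=2))
```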

Well, in this case, Elasticsearch didn’t just give me a powerful alternative; it gave me an easy solution to a problem that I didn’t know how to solve.

Any other use case?

These are the use cases that have led me to use Elasticsearch so far, but I’m not an expert and I’m pretty sure that there are many other situations where ES can be a great solution.

If you have any other use cases or any different suggestions to solve the cases I specified, please feel free to leave a comment and let me know.

How did I interface my PHP application with Elasticsearch?

The first thing that I’d like to mention is that ES has a great companion tool called Kibana, which you can use to interact with your indexes and which gives you a lot of useful information about them.

Kibana console tool

Before you start coding in PHP like a lunatic, I suggest taking a few minutes to understand how Elasticsearch works, how to run queries and how to store your data. You can perform tests using Kibana or, if you prefer, you can interact with your ES instance directly with curl.

You will notice that all the Elasticsearch interactions are HTTP API calls (by default to localhost:9200) that use a JSON body to perform any action, from creating an index to running a query against it.

That said, to start building a layer between your PHP application and the Elasticsearch data, I suggest using the library provided by Elasticsearch, which is named Elasticsearch-PHP. It’s completely free and covers pretty much everything you need.

All you need to do is structure the data that you pass to the Elasticsearch library as an array, which the library then converts into JSON for you.
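To show how little is involved, here is a minimal sketch that prepares two such calls with nothing but the Python standard library: one that creates an index and one that runs a search. The index name posts and the helper name are hypothetical, and the requests are only built, not sent, so the snippet runs without an Elasticsearch instance.

```python
import json
import urllib.request

# Default address of a local Elasticsearch instance.
ES_URL = "http://localhost:9200"


def build_request(method, path, body=None):
    """Prepare an HTTP request for Elasticsearch with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        f"{ES_URL}/{path.lstrip('/')}",
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )


# Create an index named "posts" (hypothetical name).
create_index = build_request("PUT", "posts")

# Run a query against it.
search = build_request(
    "POST", "posts/_search", {"query": {"match": {"title": "elasticsearch"}}}
)

if __name__ == "__main__":
    print(create_index.get_method(), create_index.full_url)
    print(search.get_method(), search.full_url, search.data.decode("utf-8"))
    # To actually send a request (requires a running instance):
    # with urllib.request.urlopen(search) as response:
    #     print(json.load(response))
```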

I won’t go too deep into this. I just wanted to mention a few key points, and here are the pages that I found useful in the Elasticsearch-PHP documentation:

Elasticsearch-PHP
Dealing with JSON arrays and objects
Indexing documents
Search operations

The end.

Feel free to add any useful comments, and I’ll also kindly accept useless ones :D

I hope you found something interesting in this article and if so, like/clap/whatever or just let me know.

Andrea Gioco

Software engineer, Born in Milan, Live in London, PHP, Java and Flutter