Using ElasticSearch for Scalable Applications

One of the problems we face on a daily basis at Treehouse is how to appropriately scale our web application with the amount of data we have. Whether that data is posts and answers in our Community forums, video content produced by our teachers, or simply the incredible amount of points and badges students earn daily, there is a delicate balance in how we handle things in a way that allows us to ensure data integrity while also providing quick and efficient access to that data.

Our primary web application is a Rails application that uses MySQL for data storage, but we’ve introduced a variety of different technologies into the stack to alleviate the burden on MySQL when appropriate. Over the past few years, that’s included Amazon CloudSearch and Redis, but more recently we’ve started to incorporate ElasticSearch as opportunities have arisen.

Why ElasticSearch?

There are plenty of reasons to use ElasticSearch, but let’s walk through some that have been the most rewarding for us at Treehouse.

Because it’s fast.

We have a lot of data and sometimes we need to be able to search that data. Because MySQL’s primary purpose is data storage, even with indices on all the right tables, a search can place an astounding amount of strain on a database cluster. A single half-second query might not sound like much, but as more students visit Treehouse, those queries add up and hammer our database servers, and the burden spreads to the rest of our infrastructure.

A query that might take half a second to fetch through MySQL? Indexed in ElasticSearch correctly, it could take a mere 10–30 milliseconds to retrieve. That’s quite a difference! We simply use ElasticSearch for indexing searchable data and once we’ve found the results we’re looking for, lean back on MySQL to fetch the full set of data for items in the result set on demand as they are needed.
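The search-then-fetch flow described above can be sketched roughly as follows. This is a minimal illustration in plain Ruby, not our production code: `search_hits` stands in for the IDs and scores a search index would return, `courses_by_id` stands in for rows in MySQL, the sample data is made up, and `hydrate` is a hypothetical helper name.

```ruby
# Hypothetical stand-in for a search response: IDs and scores only,
# already sorted by relevance.
search_hits = [
  { id: 3, score: 9.1 },
  { id: 1, score: 4.7 },
  { id: 2, score: 1.2 }
]

# Hypothetical stand-in for full rows stored in the primary database.
courses_by_id = {
  1 => { id: 1, title: "Ruby Basics" },
  2 => { id: 2, title: "Build a Blog" },
  3 => { id: 3, title: "Rails Search" }
}

# Look up the full record for each hit, preserving the relevance order
# from the search step and skipping IDs the database no longer has.
def hydrate(hits, records_by_id)
  hits.map { |hit| records_by_id[hit[:id]] }.compact
end

results = hydrate(search_hits, courses_by_id)
puts results.map { |course| course[:title] }.inspect
```

The point of the split is that the search index only needs to hold the searchable fields, while the database remains the source of the full record.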

Because sometimes we want to search denormalized data.

MySQL is a relational database and our tables are normalized. We store our users in one table, badges they earn in another, and points they earn in yet another. Logically, this makes perfect sense, until you need to perform a search against data from two or more of these tables simultaneously.

Using ElasticSearch, we can build a variety of indices with different schemas that take different pieces of data into account. Perhaps in one index, we include community forum posts along with the student’s location, name, and the track they’re currently enrolled in. This would have been a beast to search using MySQL with at least 2 table joins involved, but when we index all of this data together, it becomes fairly simple because we’re performing the search against a single index with a predefined schema.
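To make the idea of a denormalized index concrete, here is a rough sketch of flattening a forum post, its author, and the author's track into one document. The `Student`, `Track`, and `Post` structs, their fields, and the `forum_post_document` helper are all hypothetical; in a Rails app these would be ActiveRecord models backed by separate tables.

```ruby
# Hypothetical, simplified records standing in for three separate tables.
Student = Struct.new(:id, :name, :location, :track_id)
Track   = Struct.new(:id, :title)
Post    = Struct.new(:id, :student_id, :title, :body)

# Flatten a post plus its related student and track into a single
# document, so a search reads one index instead of joining tables.
def forum_post_document(post, student, track)
  {
    post_id: post.id,
    title: post.title,
    body: post.body,
    student_name: student.name,
    student_location: student.location,
    track_title: track.title
  }
end

# Made-up sample data for illustration.
track   = Track.new(7, "Rails Development")
student = Student.new(42, "Sam", "Portland", track.id)
post    = Post.new(1001, student.id, "How do I add an index?", "I am stuck on migrations.")

document = forum_post_document(post, student, track)
puts document
```

The trade-off is that the document has to be rebuilt whenever any of its source records changes, which is why keeping the index in sync matters so much.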

Because it can be intelligently fine-tuned to learn and grow to match how students are using the search.

ElasticSearch has something going for it that MySQL or other database servers cannot offer: its own intelligence. Like many search technologies, ElasticSearch provides aggregations (formerly facets) and bucketing, which allow you to organize data more clearly. For example, we might have our courses on Treehouse aggregated by their topic so that you can quickly view all courses related to iOS development.
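As an illustration of the course example, a terms aggregation request body could be built like this. It is only a sketch: the `topic` field, the `courses_by_topic` bucket name, and the `topic_aggregation` helper are assumptions, not our actual schema.

```ruby
require "json"

# Build the request body for a terms aggregation, which buckets
# documents by the distinct values of one field.
# The field name and bucket name are hypothetical.
def topic_aggregation(field: "topic", bucket_limit: 10)
  {
    size: 0, # return only the buckets, not the matching documents
    aggs: {
      courses_by_topic: {
        terms: { field: field, size: bucket_limit }
      }
    }
  }
end

puts JSON.pretty_generate(topic_aggregation)
```

The response to a request like this would contain one bucket per topic with a document count, which is what makes "show me everything about iOS development" cheap to answer.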

Using these tools, we also have access to partial results and score boosting/weighting, which can help us determine if some results are more relevant than other results. For example, we might decide that if we were to index all of the videos available on Treehouse, ones that are published more recently are given a score boost because they are likely more relevant.
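One way to express the recency boost from the video example is a function score query with a date decay function. The sketch below only builds the request body; the `title` and `published_at` field names, the 30-day scale, and the `recent_videos_query` helper are assumptions for illustration.

```ruby
require "json"

# Build a query that matches on title and then favors recent videos.
# The gauss decay function lowers a document's score the further its
# published_at date is from now, so newer videos rank higher.
# Field names and the default scale are hypothetical.
def recent_videos_query(text, scale: "30d")
  {
    query: {
      function_score: {
        query: { match: { title: text } },
        functions: [
          { gauss: { published_at: { origin: "now", scale: scale, decay: 0.5 } } }
        ],
        boost_mode: "multiply" # combine the text score with the decay factor
      }
    }
  }
end

puts JSON.pretty_generate(recent_videos_query("rails"))
```

With these settings, a video published one `scale` (30 days) ago has its text score multiplied by roughly the `decay` value of 0.5, and older videos fall off further.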

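One of the most important benefits of this approach is that as more and more students search, the results can become more and more refined based on previous searches conducted. ElasticSearch does not do this on its own; it depends on feeding data about past searches back into the index (Searchkick, for example, can boost results using recorded conversions). That’s huge.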

Ultimately, these types of options put a lot of control into the hands of our engineering team to help craft a system that yields the best results.

Ways We Don’t Use ElasticSearch

Just as there are plenty of good reasons to use ElasticSearch, there are plenty of ways to abuse or misuse it as well.

We don’t use ElasticSearch as a primary data store.

There are a number of reasons why we don’t use ElasticSearch as a primary data store, but one stands out clearly above all: it can lose writes. We have no interest in losing students’ hard-earned progress, so we make sure our data is permanently stored where we feel it is safest. In our case, that’s a regularly backed up MySQL database.

ElasticSearch can be used in plenty of creative ways, but we choose to use it largely for the purpose written right in its name, its search functionality, and leave data storage to technologies that we know can handle the responsibility.

We try to handle synchronization using realtime tools in lieu of “nightly” batch indexing jobs.

One of the pitfalls of having multiple types of data stores is the increased likelihood of inconsistencies and the extra work of keeping those data stores in sync. We can’t afford to rely on large batch indexing jobs that run long after the data has changed because in most cases, we want search results to reflect, in real time, what is available, especially when it comes to course content.

For that reason, we reindex individual records whenever the database record is touched. There are a number of really great Ruby gems out there that handle this painlessly, including the official ElasticSearch gem and the gem we use at Treehouse, Searchkick. These allow us to quickly add indexing by simply adding a directive on our ActiveRecord models, like so:

class User < ActiveRecord::Base
  # Mixes in the search indexing behavior for this model.
  include Aspects::Searchable
end

There’s room for performance growth here, though! As we continue to grow and use ElasticSearch, we may decouple this type of indexing from ActiveRecord models and instead add instrumentation to the model’s life cycle that sends a notification to an observer class to trigger an asynchronous reindex.
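A rough sketch of what that decoupling might look like is below. It uses only Ruby's standard library so it can run on its own; `ReindexNotifier` and `AsyncReindexer` are hypothetical names, and a real version would more likely publish from a model callback and hand the work to a background job system rather than an in-process thread.

```ruby
# Hypothetical publisher that a model's life cycle hook would call.
class ReindexNotifier
  def initialize
    @subscribers = []
  end

  def subscribe(&block)
    @subscribers << block
  end

  def publish(record_type, record_id)
    @subscribers.each { |subscriber| subscriber.call(record_type, record_id) }
  end
end

# Hypothetical observer that reindexes on a background thread.
class AsyncReindexer
  attr_reader :reindexed

  def initialize
    @jobs = Queue.new
    @reindexed = []
    @worker = Thread.new { work }
  end

  # Called by the notifier; only enqueues, so the save is not blocked.
  def call(record_type, record_id)
    @jobs << [record_type, record_id]
  end

  # Stop accepting jobs and wait for the worker to drain the backlog.
  def shutdown
    @jobs.close
    @worker.join
  end

  private

  def work
    # Queue#pop returns nil once the queue is closed and empty.
    while (job = @jobs.pop)
      # A real worker would send the record to the search index here.
      @reindexed << job
    end
  end
end

notifier  = ReindexNotifier.new
reindexer = AsyncReindexer.new
notifier.subscribe { |type, id| reindexer.call(type, id) }

# A model's after-commit hook would publish events like these.
notifier.publish("User", 42)
notifier.publish("Post", 1001)

reindexer.shutdown
puts reindexer.reindexed.inspect
```

The model only announces that something changed; it no longer needs to know that a search index exists, and the web request does not wait on the reindex.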

Final Thoughts

As you can see, ElasticSearch has great potential for improving search functionality as a system begins to scale. The important take-away is that it is not a fix-all, but simply a tool that we have integrated into our stack for a very particular role. It does not replace other tools — it adds to them.

Aimee is a Web Developer on the Engineering team at Treehouse. We’re on a mission to design, build, and maintain the best online learning service in the world. Sound awesome? Join us!
