Simplify Relational Database + Elasticsearch architecture with TiDB

Li Shen
6 min readJan 16, 2023

--

RDS + Elasticsearch is a popular combination

As a developer, you know that a relational database (RDS) such as MySQL, or PostgreSQL is great for storing structured data and handling online transactional workloads. However, when it comes to handling analytics, and indexing large amounts of data, a different storage engine such as Elasticsearch is often needed. It is a popular choice for online services with hybrid workloads.

The architecture would be like this:

  • Using RDS as the primary database (source-of-truth) for the operational data
  • Using Binlog/CDC (Change Data Capture) to push the latest updates from RDS to the downstream
  • Transforming and loading the data into Elasticsearch
  • In the application, routing simple queries to RDS and analytical queries to Elasticsearch

The benefit of this architecture is obviously.

  • Elasticsearch would scale easier and better than RDS.
  • Offload the complex and computing-intensive workload from RDS. You can get better performance and lower latency for filter and aggregation.
  • Elasticsearch is a more natural solution to handle log and JSON data.

With this architecture, developers could add new features which require fast filtering (on multiple dimensions) and aggregation on large data volume, like the real-time dashboard of a SaaS application. At the same time, since some heavy workload is offloaded from your primary database, it is also good for RDS’s stability.

What are the drawbacks of this architecture?

RDS + Elasticsearch architecture can work well together but it also introduces other challenges. There are a few pain points to consider when using this architecture:

The complexity for Developers

Developers need to handle two types of API: SQL and Elasticsearch API. This is not only about different wired protocols, but also different behaviors while processing data. Developers need to build two modules to interact with both side. This would introduce complexity to the code.

There should be a routing module or rules in the application to decide which backend storage should a particular query go to. How to manage the routing information? How about handling the ad-hoc queries?

All of these burdens will introduce more engineering work and slow down the speed of new feature development.

Data consistency and replication lag

Keeping data consistent across RDS and Elasticsearch can be challenging, as the two systems have different data models and update mechanisms. Developers will need to ensure that data is properly indexed and updated in both systems. Or it could cause trouble for applications.

For example, to answer a request which needs to consult Elasticsearch first to filter the data, then query RDS according to the result from Elasticsearch for the details. What if the data in the two engines are not consistent? You will meet errors or incorrect results. This will give your customers a bad experience or even cause financial loss.

The data consistency issue and replication lag will happen from time to time. Getting data from RDS to Elasticsearch can be a complex process, involving data extraction, transformation (potentially), and loading (ETL). Keeping the fragile process running may meet unpredictable issues. This process can be time-consuming and error-prone and may require significant development effort.

How to do the schemas changes?

To build new features, developers need to do schema changes in the upstream RDS from time to time. But how about the downstream Elasticsearch? Could it be consistent with RDS? What if the newly coming data can not match the current schema?

The schema in Elasticsearch is a mapping that describes the the fields in the JSON documents along with their data type, as well as how they should be indexed in the Lucene indexes that lie under the hood.

If the schema in Elasticsearch can not match the data, it will meet an error or can not index the data.

The complexity of monitoring and maintenance

Monitoring the performance and availability of both RDS and Elasticsearch can be complex, as it requires monitoring different metrics, alarms, and logs for both services. Maintenance of both services is also different, so developers need to be familiar with both systems.

High cost

The complex architecture requires more instances (or services) to have several components in the system. It would cost you more money to buy instances or services. When the data volume becomes large, the cost would significantly high. Besides the hardware and service cost, it also requires more experts in your team who are familiar with two (or even more) systems.

Is there a good way to address those challenges? The key is how to simplify the architecture.

Why TiDB

TiDB is an open-source, distributed SQL database. It is already adopted by more than 2000 companies for production, like Square, Databricks, Airbnb, Pinterest, and Flipkart, just to name a few.

TiDB provides a highly available, horizontally scalable, and fault-tolerant architecture that allows it to handle high write and read loads. It also has built-in support for distributed indexes, which eliminates the need for separate indexing and analytics engines like Elasticsearch. Additionally, it’s compatible with the MySQL protocol, making it easy for developers to use and integrate with their existing applications.

With those capabilities, using TiDB to replace the RDS + Elasticsearch architecture is possible. Since TiDB doesn’t support full-text search now. We only consider the scenarios in which developers use Elasticsearch as indexing of the RDS or using Elasticsearch for analytics queries.

  • Easy to scale out

With the distributed nature, horizontal scaling is pretty easy for TiDB. The only thing you need to do is add more nodes to your cluster.

  • Fast Filtering

TiDB supports global-index. Developers do not need to care about whether the data is sharded or not. With the smart SQL planner and distributed computing framework, TiDB could leverage indexes and multiple nodes to do fast filtering.

  • Fast Aggregation

TiDB has a parallel and vectorized computing engine. Developers could do fast aggregation by introducing other systems, like a data warehouse. Here is the TPC-H benchmark report of TiDB.

One of the main advantages of TiDB over RDS + Elasticsearch is simplified architecture. Developers don’t need to worry about data consistency and lag. Applications could be built on a unified API to simplify the development. DevOps do not need to worry about maintaining the complex and error-prone system.

Another advantage of TiDB is its support for the SQL language. This means that you can use the same SQL queries and schema design that you use with MySQL, making it easy to migrate your application to TiDB. With RDS + Elastic, you may need to learn new query languages and data modeling techniques, which can be a significant obstacle to adoption.

TiDB also provides advanced features such as automatic data sharding, real-time analytics, and built-in backup and disaster recovery. These features can help simplify your database architecture and improve the reliability and performance of your application.

Rea-world Use Case

NFTScan is a multi-chain NFT data infrastructure that supports access to leading blockchain networks such as Ethereum, Solana, and BNBChain. Previously, NFTScan used MySQL and Elasticsearch as their core database solution. MySQL stored all application data, including that for analytics and processing, from enterprises and end consumers.

NFTScan experience the pain points of MySQL + Elasticsearch architecture mentioned in this post and decided to move to TiDB. After the migration, the whole architecture is simplified greatly and gets better performance. For more details, please refer to this blog post.

Besides NFTScan, I have seen several similar use cases in which developers benefit a lot from TiDB.

--

--

Li Shen

Author of TiDB, Focus on Modern Infrastructure Software, Opinions are my own