Tarantool: in-memory DBMS and application server

Applications for which data access speed is critical are increasingly often powered by in-memory DBMSs. But real-life projects also need fault-tolerant storage, support for transactions, secondary indexes and stored procedures, and other functions provided by regular database management systems.

Data storage solutions can be divided into two broad categories:

  1. In-memory systems that store data in RAM, with latency of under a millisecond, and can process hundreds of thousands of requests per second on a single CPU core. Such systems can be optimized to handle high workloads, that is, a large number of parallel requests (Memcached, Redis).
  2. On-disk systems that provide the feature set of a full-blown open-source DBMS (MySQL, PostgreSQL and others): data persistence, support for transactions, integrity, availability and so on.

However, in real-life projects, people often need the best of both worlds. Suppose you’re creating an authorization and authentication service that works with user profiles and handles a great number of simultaneous requests. You need to ensure data integrity and support secondary and composite indexes, as well as stored procedures that help detect hacking attempts and other malicious activity, all while keeping latency as low as possible. One seemingly good idea is to bundle several products: a caching solution that handles all read requests and an on-disk DBMS that handles write requests and then syncs the data with the cache. However, in this setup the cache data and the DBMS data often become out of sync, and the synchronization logic can be tricky to implement. Usually, this task falls to a developer who has to write the code from scratch, with all the possible scenarios in mind. It also takes quite a few servers to run two products side by side. As a result, neither product works as expected.

Modern developers tend to resort to different solutions for different types of requests, so the individual products end up not being used to their fullest. In such mixed configurations, data is read from the cache, so the capabilities of the main DBMS are wasted. As for write requests, it’s impossible to implement transactions that span both the cache and the on-disk DBMS. Therefore, a hybrid system won’t fully support transactions, which means it is likely to hold inconsistent data at some point in time. As if that wasn’t enough, another downside is the increased effort required to develop and support such a system.
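
The missing cross-system transaction can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: two dictionaries stand in for the on-disk DBMS and the cache, and the names write_profile, read_profile and CacheUnavailable are hypothetical. It shows only one failure scenario, a write that reaches the DBMS but never reaches the cache.

```python
# Hypothetical stand-ins: two dicts play the on-disk DBMS and the cache.
database = {"alice": "old-hash"}
cache = dict(database)


class CacheUnavailable(Exception):
    """Raised when the second step of a write cannot reach the cache."""


def write_profile(user, value, cache_up=True):
    database[user] = value            # step 1: the DBMS commits the change
    if not cache_up:
        raise CacheUnavailable(user)  # step 2 never runs; nothing rolls back step 1
    cache[user] = value               # step 2: refresh the cache


def read_profile(user):
    # Reads are served from the cache and fall back to the DBMS on a miss.
    return cache[user] if user in cache else database.get(user)


try:
    write_profile("alice", "new-hash", cache_up=False)
except CacheUnavailable:
    pass

stale = read_profile("alice")
print(stale, database["alice"])  # the two stores now disagree
```

Because the two steps are not part of one transaction, readers keep seeing the old value until some extra repair logic notices the mismatch.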

In-memory DBMS

Fully integrating products from different vendors is no mean task, and the ideal solution would be to use a single product that has all the required features. One such product is Tarantool, an open-source all-in-one document-oriented in-memory DBMS and application server that combines the best of caching and persistence offered by full-fledged DBMSs. This system has been developed by Mail.Ru Group since 2008 and is now used by Avito, Yota, Badoo, Qiwi and other companies.

Tarantool stores its data in RAM, which makes read requests very fast. For write requests, before updating any RAM-resident data, it saves the required changes to the on-disk write-ahead log (WAL), so in-memory and on-disk data always stay in sync. Write requests are also relatively fast: for Tarantool, writing data just means sequentially appending new transactions to the end of the WAL. Even regular hard-disk drives (and SSDs too) have high sequential write speed, up to 100 MB/s, and sequential writes are tens of times faster than random reads and writes. To put it into perspective, with 100 bytes per transaction, Tarantool running on a server with an HDD can write to the WAL at 100 MB/s, executing up to a million transactions per second. The bottleneck here is not the HDD, but the CPU.
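
The million-transactions figure follows from simple division, which the short Python sketch below spells out. It assumes the bandwidth figure means 100 megabytes (not megabits) per second and ignores record headers and the cost of flushing to disk, so the result is an upper bound rather than a measurement. The function name is hypothetical.

```python
# Back-of-the-envelope WAL throughput: sequential bandwidth / record size.
SEQUENTIAL_WRITE_BYTES_PER_SEC = 100 * 1_000_000  # 100 MB/s, figure from the text
BYTES_PER_TRANSACTION = 100                       # figure from the text


def wal_transactions_per_second(bandwidth_bytes, record_bytes):
    """Upper bound on appended records per second, ignoring headers and fsync cost."""
    return bandwidth_bytes // record_bytes


max_tps = wal_transactions_per_second(
    SEQUENTIAL_WRITE_BYTES_PER_SEC, BYTES_PER_TRANSACTION
)
print(max_tps)  # 1000000
```

If the figure were read as 100 megabits per second instead, the same division would give only 125,000 transactions per second, which is why the unit matters here.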

On the other hand, regular on-disk DBMSs arrange data in memory in a particular way and use locks and structures designed specifically for the on-disk scenario. That’s why, even if you put a disk image into RAM, on-disk DBMSs perform worse than in-memory ones, which are designed for RAM-resident data from the start. With on-disk DBMSs, even if you use a cache (buffer pool), the number of random disk access operations remains very large. For example, using B/B+ trees (MySQL, PostgreSQL) requires random disk access for write requests, even though data is written sequentially to the WAL. Even relatively new LSM trees (Google LevelDB, Facebook RocksDB), which don’t need random disk access for write requests, can’t eliminate the problem of random reads.

When all the data fits in memory, Tarantool demonstrates higher read and write performance than traditional on-disk DBMSs, while not violating any of the ACID (atomicity, consistency, isolation, durability) requirements. If, however, the data is too large to fit in RAM, Tarantool offers a specialized technology based on LSM trees. And if Tarantool is used solely for caching, the WAL can be disabled altogether. Since Tarantool supports the Memcached protocol, it can be regarded as a Memcached substitute. To sum up, whether you’d like to store your data entirely in RAM or on disk or both on disk and in memory, Tarantool supports any scenario.

Apart from maintaining the WAL, Tarantool takes periodic database snapshots, which help during a restart (after a power outage, for example): first, all the data from the latest snapshot is loaded into RAM, and then all the post-snapshot transactions contained in the WAL are executed sequentially. This approach significantly speeds up recovery after failures: say, it takes about 5 minutes to recover a 20 GB database from a snapshot. Yahoo! Cloud Serving Benchmark (YCSB) results show that Tarantool outperforms some in-memory DBMSs: Memcached, Redis, Aerospike and VoltDB.
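
The interplay of log-first writes, snapshots and replay can be sketched in a few dozen lines of Python. This is a toy model under stated assumptions, not Tarantool’s implementation: the class TinyStore, the file names wal.log and snapshot.json and the JSON record format are all hypothetical, and real systems deal with concerns (checksums, concurrent writers, log rotation) that are left out here.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store: log every write to disk first, then update RAM."""

    def __init__(self, directory):
        self.wal_path = os.path.join(directory, "wal.log")
        self.snap_path = os.path.join(directory, "snapshot.json")
        self.data = {}
        self._recover()

    def _recover(self):
        # Step 1: load the latest snapshot into RAM, if there is one.
        if os.path.exists(self.snap_path):
            with open(self.snap_path, encoding="utf-8") as snap:
                self.data = json.load(snap)
        # Step 2: replay the writes logged after that snapshot, in order.
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as wal:
                for line in wal:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Append to the log and flush it before touching the in-memory data.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps([key, value]) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        self.data[key] = value

    def snapshot(self):
        # Dump RAM to disk atomically, then empty the log it now covers.
        tmp_path = self.snap_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as snap:
            json.dump(self.data, snap)
            snap.flush()
            os.fsync(snap.fileno())
        os.replace(tmp_path, self.snap_path)
        open(self.wal_path, "w", encoding="utf-8").close()


with tempfile.TemporaryDirectory() as directory:
    store = TinyStore(directory)
    store.put("alice", "admin")
    store.snapshot()
    store.put("bob", "guest")  # this write exists only in the WAL
    del store                  # simulate a crash: RAM contents are lost
    recovered = TinyStore(directory).data

print(recovered)  # {'alice': 'admin', 'bob': 'guest'}
```

The snapshot keeps the log short, so a restart reads one bulk file and replays only the recent tail instead of the whole write history.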

As a demonstration of Tarantool’s capabilities, below are a chart and a table detailing its caching performance on a Microsoft Azure virtual machine with two CPU cores and 14 GB of RAM.

I’d be remiss not to mention support for asynchronous master-master and master-slave replication. In heterogeneous, physically distributed data centers, it’s often necessary to scale both read and write operations. Tarantool allows adding and updating data on servers in two data centers (of course, if the same data is updated in both, some changes may get overwritten) and then automatically replicates the changes between the data centers in asynchronous mode. Also supported is a master-slave configuration, where only one server is the source of changes. Tarantool even supports MySQL-to-Tarantool replication.

Application server

Tarantool isn’t only a DBMS, but also a Lua application server that allows calling procedures written in C and Rust. The Lua language is quite simple and developers usually don’t have any difficulties mastering its syntax and core functionality. But despite its simplicity, it’s a full-blown high-level programming language used for creating scripts in games (for example, World of Warcraft), client applications (Adobe Lightroom, VLC, Wireshark) and web servers (Apache HTTP Server, Nginx).

Tarantool is to Lua approximately what Node.js is to JavaScript. Tarantool has non-blocking I/O, green threads in user space (threads are controlled by a virtual machine, not the OS), cooperative multitasking and other features. Despite some differences in the implementations, you can do with Tarantool whatever you can do with Node.js, but the converse is false. Say, unlike Tarantool, Node.js doesn’t have a built-in DBMS and can’t perform real-time processing of hundreds of gigabytes of RAM-resident data to calculate complex correlations and ratings, detect anomalies and so on. In Tarantool, data and the code that processes it are placed as close to each other as possible. When processing data, Tarantool can issue requests to external data stores based on MySQL and PostgreSQL, acting as a single point of entry, and present results via RESTful services.
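
For readers unfamiliar with green threads and cooperative multitasking, here is a minimal Python sketch of the idea. It is an analogy built on Python generators, not Tarantool’s Lua implementation, and the names worker and run_round_robin are hypothetical. Each task runs on a single OS thread and gives up control only where it explicitly yields.

```python
from collections import deque


def worker(name, steps, log):
    """A green thread: does one unit of work, then voluntarily yields."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Single-threaded scheduler; a task stops running only when it yields."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # the task has finished, so drop it
        queue.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Since a task is never interrupted between two yield points, code that touches shared in-memory data between those points needs no locks, which is one reason this model suits an in-memory DBMS.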

Modern server-side applications quite often work as a middle layer between the database and the client (browser, mobile app or API client), that is, they construct database queries, convert the result to JSON and present it to the client. Tarantool alone can take care of the whole pipeline (data processing and transformation).

A large number of modules extend Tarantool’s functionality. For example, there’s a module that implements queues so that Tarantool, apart from being an all-in-one DBMS and application server, becomes a queue server suitable for building key industrial systems. On the other hand, there are projects where Tarantool performs only one task, acting as a Memcached substitute or a MySQL cache. This flexibility gives users the freedom to choose exactly what they need for their specific case.

Working with Tarantool

You can start using Tarantool directly in your browser by going to tarantool.org/en/try.html and executing a sample script or by following the installation and setup instructions for macOS, popular Linux distributions or FreeBSD. The most convenient way to run Tarantool on Linux, macOS or Windows is via Docker:

docker run --name mytarantool -d tarantool/tarantool:1.8

Then you need to connect to the Tarantool console running in the Docker container called mytarantool and execute the console command in interactive mode:

docker exec -it mytarantool console

After that, it’s necessary to create a space for storing data — it’s analogous to a collection in MongoDB or a table in traditional relational DBMSs:

tarantool> box.schema.space.create('customers')

Tarantool data model

Data is stored in tuples inside spaces. You can define indexes on tuple fields.
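
To make the terms concrete, the following Python sketch models a space as a set of tuples with a unique primary key and optional secondary indexes on other fields. It is a conceptual illustration only: the class Space and its methods are hypothetical and are not Tarantool’s API, and real indexes are ordered structures rather than plain dictionaries.

```python
class Space:
    """Toy space: tuples keyed by a unique primary field, plus secondary indexes."""

    def __init__(self, primary_field=0):
        self.primary_field = primary_field
        self.tuples = {}     # primary key -> tuple
        self.secondary = {}  # field position -> {field value -> set of primary keys}

    def create_index(self, field):
        # Build the index over tuples that are already stored.
        index = {}
        for key, row in self.tuples.items():
            index.setdefault(row[field], set()).add(key)
        self.secondary[field] = index

    def insert(self, row):
        key = row[self.primary_field]
        if key in self.tuples:
            raise KeyError(f"duplicate primary key: {key!r}")
        self.tuples[key] = row
        # Keep every secondary index up to date with the new tuple.
        for field, index in self.secondary.items():
            index.setdefault(row[field], set()).add(key)

    def get(self, key):
        return self.tuples.get(key)

    def select(self, field, value):
        # Look up matching primary keys in the index, then fetch the tuples.
        keys = self.secondary[field].get(value, set())
        return sorted(self.tuples[k] for k in keys)


customers = Space()
customers.create_index(field=2)
customers.insert((1, "Alice", "Berlin"))
customers.insert((2, "Bob", "Paris"))
customers.insert((3, "Carol", "Berlin"))
berliners = customers.select(field=2, value="Berlin")
print(berliners)  # [(1, 'Alice', 'Berlin'), (3, 'Carol', 'Berlin')]
```

The secondary index is what lets the lookup by city avoid scanning every tuple in the space.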

Box is the main module that contains Tarantool’s core functionality and provides a full-blown Lua interpreter, where you can define variables and functions and load and call third-party Lua packages. It’s possible to issue requests to Tarantool from other programming languages. Currently supported are connectors for Node.js, PHP, Go, Java, .NET, R, Erlang, C, Perl and Python. Tarantool 1.8 also supports the ODBC and JDBC connectors.

Tarantool IIoT

Tarantool can run both on powerful servers with hundreds of gigabytes of RAM on board and on regular virtual machines offered by cloud service providers. But what’s even more important is that Tarantool works on ARM-powered devices, which means it can be used in Internet of Things and Industrial Internet of Things applications. There’s even a specialized Tarantool version for IIoT devices: Tarantool IIoT. It supports major protocols for working with sensors (MQTT and MRAA) that generate massive amounts of data to be processed in real time. Tarantool IIoT’s toolbox allows creating scripts that describe how to obtain data from sensors installed on various industrial devices, how to process this data and how to save it to a local database. Tarantool’s built-in features help replicate data to IIoT hubs, and from there to one or several data centers. As a result, developers don’t need to worry about unreliable channels between devices and a data center or painstakingly model the client-server interaction. They can just focus on implementing the business logic instead.

Modern applications need fast data access in many situations. Replicating on-disk DBMSs is a suboptimal solution, as it entails a greater number of servers, more difficult support and higher costs. That’s why developers increasingly choose in-memory DBMSs. However, most in-memory solutions can’t act as the main database. Tarantool, an all-in-one document-oriented DBMS and application server, allows designing systems that combine the features of several products (a cache and a traditional DBMS) and work both on powerful servers and on ARM-powered IoT devices. In-memory and on-disk data is always in sync, which ensures its consistency. Tarantool’s tools enable writing the code that processes data right where it resides.