Tarantool vs competitors: racing in Microsoft Azure
Tarantool is a NoSQL DBMS developed and widely used at Mail.Ru Group. To get a better idea of Tarantool use cases, see our earlier articles on the subject.
Mail.Ru Group has recently published a virtual machine with Tarantool at the Microsoft Azure Marketplace:
- Tarantool page at the Microsoft Azure Marketplace
- Tarantool, an open-source DBMS from Mail.Ru, certified and published on the Microsoft Azure Marketplace
We decided to check how well Tarantool works in Microsoft Azure compared with similar solutions: Azure Redis Cache, Bitnami Memcached, Aerospike and VoltDB. By “how well” we actually mean “how fast”, as we’ll be benchmarking throughput in RPS (requests per second).
Azure Redis Cache
We’ll need an instance of Azure Redis Cache running on a Basic C4 (13GB) virtual machine without SSL (for a fair comparison we don’t need SSL, and the Basic tier avoids replication). Azure Redis Cache is offered as a service, so we have no access to the virtual machine: we don’t know its settings, nor can we change them. The approximate monthly fee for a Redis Cache instance of this size is $156.
Tarantool
We’ll need one virtual machine, Tarantool VM on Standard D11 (14GB, with HDD). The approximate monthly fee for this machine is $145. We’ll be testing Tarantool in two modes: with write-ahead logging (WAL) enabled for data persistence, and without it, because we cannot say for sure whether persistence is enabled for the Azure Redis Cache instance.
To change the write mode, open the file “/etc/tarantool/instances.enabled/example.lua” and set the “wal_mode” option: “none” disables the WAL, “write” enables it, and “fsync” additionally bypasses the filesystem cache while maintaining the WAL.
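For reference, the relevant fragment of that file might look like this (a minimal sketch, assuming the instance is configured through box.cfg as usual; everything else in the file is omitted):

```lua
box.cfg {
    -- "none"  : no write-ahead log (fastest, no persistence)
    -- "write" : append every request to the WAL
    -- "fsync" : append and flush to disk, bypassing the filesystem cache
    wal_mode = 'write',
}
```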
We will use a HASH index in Tarantool, which should be a fair match for the index types used in the competing databases.
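A HASH primary index is declared when the space is created; a minimal sketch (the space name and field layout are ours, not from the benchmark, and the syntax follows recent Tarantool versions):

```lua
-- Create a space with an unsigned primary key backed by a HASH index.
s = box.schema.space.create('bench')
s:create_index('primary', { type = 'HASH', parts = {1, 'unsigned'} })
```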
Bitnami Memcached
We took a Standard D11 virtual machine with a pre-installed instance of Bitnami Memcached.
At the Microsoft Azure Marketplace, there is one more Memcached solution — Memcached Cloud by Redis Labs — which is presently available only in the USA, so we couldn’t test it this time.
Upon starting the virtual machine with Memcached, we disabled SASL authentication by removing the -S option from the configuration file memcached.conf.
Memcached cannot ensure data persistence.
Aerospike
For Aerospike, we took the official image published at the Microsoft Azure Marketplace (Standard D11).
VoltDB
Unfortunately, VoltDB is not published at the Microsoft Azure Marketplace yet, so we had to take an empty VM image (Ubuntu 14.04 LTS) and install VoltDB manually from the source files. On the bright side, we were pleasantly surprised by the out-of-the-box web administration page that displays live graphs of many statistics, including the current RPS.
The benchmark
Let’s start with a so-called “sync/async test”. Here we’ll be using a synchronous interface, but deeper inside we’ll be working with the connection in asynchronous mode. This kind of test allows us to simulate multiple synchronous clients sharing a single connection. To eliminate any doubts about equal conditions for Redis Cache, Tarantool VM and Memcached, let’s factor the common test logic out into an abstract class named NoSQLConnection, and then derive three classes from it: TarantoolConnection, RedisConnection and MemcachedConnection (see the benchmark source code).
The abstract class has two queues (plain std::list instances): OutputQueue, for queries to be sent to the socket, and InputQueue, for replies received from the socket. It also has two methods, SendThreadFunc and ReceiveThreadFunc, each running in a separate thread; whenever the corresponding queue is not empty, they send or receive information in batches via the methods Send and Receive (pure virtual methods implemented in the derived classes).
The synchronous interface is implemented by the method DoSyncQuery, which puts a query into the OutputQueue and waits for the reply in the InputQueue. The virtual machine running the tests should be powerful enough (we used Standard D3) and geographically close to the database (we used the “West US” location).
Due to the specifics of the client libraries for Aerospike and VoltDB (an embedded event loop), we had to tailor our test code for them individually.
Well, everything is ready now. Let the race begin! Within the initial range of 1 to 10 “client” threads (adding one thread at a time), the throughput stays close to the fully synchronous mode (a single thread effectively is the synchronous mode). The graphs show that the throughput grows more or less linearly. Redis and Memcached run side by side, Tarantool is faster, Aerospike is the leader, while VoltDB is the slowest.
On the next graph (10 to 100 threads, in increments of 10 threads), the linear growth continues for Tarantool, Redis and Memcached, while Aerospike and VoltDB slow down, each at a different point.
Further on, the number of threads reaches 1000 (in increments of 100 threads). Here the rapid growth is over, and for Memcached there is no growth at all.
Finally, we increase the number of threads from 1000 to 8000 (in increments of 1000 threads). The growth stops for all the racers. After 4000 threads, Memcached went down: it closed its connections, so we couldn’t test it at the highest workloads. VoltDB went down even earlier, at 3000 threads.
All in all, this test reveals Tarantool as the absolute leader for high workloads (for moderate workloads, the leader is Aerospike).
What about a purely synchronous test?
Now let’s launch our sync/async test with a single thread to make it purely synchronous. But wait: to simulate multiple clients, we need multiple connections. All right, let’s launch several synchronous tests in parallel and then sum up the results.
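Assuming each synchronous client runs as a separate process that prints its RPS, the launch-in-parallel-and-sum step can be sketched like this (sync_bench is a hypothetical stand-in that prints a fixed number, so the pipeline runs as-is):

```shell
# Stand-in for one synchronous client process printing its measured RPS.
sync_bench() { echo 12500; }

# Launch 8 "clients" in parallel and collect their per-process RPS figures.
for i in $(seq 1 8); do sync_bench & done > /tmp/rps.txt
wait

# Sum them up to get the aggregate throughput.
awk '{ total += $1 } END { print total }' /tmp/rps.txt
```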
We didn’t run this test for Aerospike and VoltDB.
As you can see, in the purely synchronous test our racers hit a “ceiling” that is lower than in the sync/async test. This is the result of network overhead.
Prices
All the tested solutions — Tarantool, Memcached, Aerospike and VoltDB — are distributed for free, so in Azure we pay only for the virtual machines they run on. We mostly used Standard D11 machines (14GB of RAM) offered at approximately $145/month. The price for running an instance of Azure Redis Cache was slightly higher: approximately $156/month for a Basic C4 instance (13GB of RAM). Let’s illustrate these figures with graphs.
Hmm, the prices are pretty much the same. How do we compare them, then? Well, as we saw earlier, the tested databases have different throughput (measured in RPS). What if we calculate the price of processing, say, 10 billion requests instead? Let’s start with the price of 10 billion write requests from 1000 clients.
VoltDB is clearly an outsider here. Let’s remove it from the comparison.
Now let’s remove Aerospike and Memcached to better estimate the leaders’ result.
Now let’s check what happens if we process 10 billion read requests from 100 clients.
Let’s take a closer look at the leaders.
Conclusion
During the tests, Tarantool loaded the CPU by up to 70% in the sync/async test and by up to 100% in the purely synchronous test. The graphs show that Tarantool VM outperformed the other racers in all tests, regardless of the WAL mode. It’s noteworthy that enabling or disabling the WAL doesn’t affect Tarantool’s read rate (on the graphs, the orange and grey curves stick closely together), because Tarantool doesn’t use the disk to read data. Moreover, Tarantool VM proved to be the cheapest solution both in per-month and per-request terms.