Caching 1 billion tweets on a laptop
“Why would I need to do that — to cache 1 billion tweets on a laptop?” a curious reader may ask. Well, think about it: people love all kinds of challenges and competitions. Whether it’s finding out who’s the fastest 100-meter runner, who can eat the most burgers in three minutes, or even who’s the smartest math student at the local community college (okay, maybe that last one’s not so exciting). We’re fascinated by feats of human (and sometimes machine) achievement, and we can’t resist sharing the weirdest records from the Guinness Book with our friends.
So, let’s consider this task of caching a billion tweets as one of those challenges — an attempt to set a new record, or perhaps satisfy the whims of a boss with grand ambitions (let’s call him E.M. from a company that starts and ends with “X”). Imagine if he asked you to figure out if it’s possible to use the combined power of your company’s fleet of laptops (say, 1,000 employees) to offload all tweets from the data center, sell the servers, and buy some shiny new AI gear to write and read tweets for you. Sounds fun, right? Well, let’s dive in and see how feasible it really is.
First, we need to find 1 billion tweet messages. Then, we’ll write an application to load these tweets onto… well, some kind of server. Since we’re dealing with caching, our usual suspects are Redis and Memcached, right? While Redis and Memcached are well-known and highly regarded, used by millions of companies worldwide, there’s a newcomer in the field: Memcarrot. Let’s add it to our list of contenders.
Disclaimer: Memcarrot is the product of Carrot Data, and I am its lead developer.
It turns out we don’t need to write our own client application (as my grandma used to say, “all the good books have already been written”). Luckily for us, there’s an existing tool that can help with our challenge: Membench. It’s the first benchmark application specifically designed to test memory usage in caching servers. Currently, it supports both Memcached and Redis, which works perfectly since Memcarrot speaks Memcached’s language. Membench allows you to test the performance of caching servers (specifically memory usage) across 10 diverse real-world datasets — Twitter data being one of them. So far, so good…
Disclaimer: Membench is also our product and it is completely free, I mean — open source. Try it.
Membench requires some initial setup (you can find all the instructions on its GitHub page). You'll need to download and extract the twitter_sentiments dataset. After that, the process is straightforward: start the server of your choice, run Membench, check the memory usage, and compare the results. Membench extracts Twitter messages from this dataset and loads them into the caching server using a compound key "KEY:n", where n ranges from 0 to 1,000,000,000. This gives an average key size of 13 bytes. The average tweet size is 76 bytes, and we also need to account for the expiration time—4 bytes for Memcached, 8 bytes for Redis, and only 2 bytes (!!!) for Memcarrot (how we fit an expiration time ranging from 1 second to 16,536 minutes with 99.7% accuracy into just two bytes is a topic for a separate discussion). Altogether, we need approximately 91–97 bytes per tweet, which is just the raw data, excluding any overhead added by the specific caching software. This is 91–97 GB of RAM for 1 billion messages. Our laptop's maximum RAM size is just 64GB (MacBook Pro M1), so it does not appear possible. Let us prove it to our boss.
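That raw-data estimate is easy to reproduce. Here is the arithmetic from the paragraph above as a tiny Python sketch (the byte counts come straight from the text):
N = 1_000_000_000  # tweets to cache
AVG_KEY = 13       # "KEY:n", averaged over n = 0..999,999,999
AVG_TWEET = 76     # average tweet payload in twitter_sentiments
EXPIRE = {"Memcached": 4, "Redis": 8, "Memcarrot": 2}  # expiry field, bytes

for server, exp in EXPIRE.items():
    per_item = AVG_KEY + AVG_TWEET + exp
    print(f"{server}: {per_item} B/tweet -> {per_item * N / 1e9:.0f} GB raw")
# Memcached: 93 B/tweet -> 93 GB
# Redis:     97 B/tweet -> 97 GB
# Memcarrot: 91 B/tweet -> 91 GB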
A few words about our dataset before we continue… It contains 1.6 million unique tweets, but we need 1 billion. Membench simply reuses them, so there will be around 600+ copies of each message in a dataset of 1 billion. This is fine, because neither Redis, Memcached, nor Memcarrot prioritizes data deduplication in this scenario. Caching servers are not deduplication software; they serve a different purpose.
Test setup
Hardware: Mac Studio M1, 64GB RAM
OS: Mac Sonoma 14.6
Command line to run Membench:
bin/membench.sh -b twitter_sentiments -n 1000000000 -t 8
We specify the total number of records to load (-n 1000000000), the dataset (-b twitter_sentiments), and the number of application threads (-t 8).
Redis 7.2.5
We launch redis-server with the default configuration, where maxmemory=0 (no maximum memory limit). We've already come to the realization that it's nearly impossible to fit 1 billion tweets into 64GB of RAM—at least for Redis and Memcached. So, we'll start by testing with 100 million tweets first.
Start Redis server:
$ redis-server
Start Membench:
bin/membench.sh -b twitter_sentiments -n 100000000 -t 8 -l redis -p 6379
- Membench output: RPS=112K. It took 890 seconds to load 100 million tweets into Redis. While Membench isn't fully optimized for Redis yet (it doesn't use pipelining, which could improve performance; see the sketch after these results), this clearly highlights the limitations of Redis's single-threaded implementation.
- RAM usage: 20.24GB, which gives an estimate of 202GB for 1 billion tweets.
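For the curious, pipelining batches many commands into a single network round trip, which is exactly what Membench currently skips. Here is what a pipelined loader could look like with the redis-py Python client. This is purely an illustration of the technique, not Membench's actual code, and it is scaled down to 1 million keys:
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379)
tweet = b"x" * 76                     # stand-in for a real ~76-byte tweet
pipe = r.pipeline(transaction=False)  # batch commands without MULTI/EXEC

for n in range(1_000_000):
    pipe.set(f"KEY:{n}", tweet, ex=3600)  # ex is the TTL in seconds
    if len(pipe) >= 1000:                 # one round trip per 1000 commands
        pipe.execute()
pipe.execute()                            # flush whatever remains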
Memcached 1.6.29
Start Memcached:
$ memcached -m 60000 -v
Start Membench:
bin/membench.sh -b twitter_sentiments -n 200000000 -t 8
We load 200M tweets to get an estimate for 1 billion.
- Membench output: RPS=678K; time to load 200M tweets is 294 seconds.
- RAM usage: 32.90GB, which gives an estimate of 164.5GB for 1 billion tweets.
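As an aside, every set in this run travels over the plain memcached protocol, which is also the protocol Memcarrot speaks. A minimal sketch of the equivalent set-with-TTL loop, written with the pymemcache Python client (an illustration only; Membench does not use pymemcache):
from pymemcache.client.base import Client  # pip install pymemcache

client = Client(("localhost", 11211))
tweet = b"x" * 76                # stand-in for a real ~76-byte tweet

for n in range(1_000_000):       # scaled down from 200M for illustration
    client.set(f"KEY:{n}", tweet, expire=3600)  # TTL in seconds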
Memcarrot 0.14
We ran this benchmark on a standalone server (not in a Docker container). Since the default configuration isn't sufficient for a 1-billion-tweet benchmark, we need to tweak the settings before running the test:
- The configuration file conf/memcarrot.cfg can be found in the memcarrot installation directory. These are the settings for this run:
#
# Data segment size
# Default is 4MB (Memory)
data.segment.size=16m
#
# Maximum storage limit to use for cache
# Default - 1GB
storage.size.max=60g
#
# Initial size of the MemoryIndex hash table, i.e. size = 2**index.slots.power
# Default: 16 - 64K slots (2**16). To reduce object collision probability, keep this
# number close to the maximum for your application. Say your estimate for the maximum
# number of objects in the cache is 100M. Then the maximum size of the memory index
# hash table is 100M/200 = 0.5M, and the maximum value for this option is log2(0.5M) ~ 20
# (a quick sanity check for this run's value of 23 follows the config listing)
#
index.slots.power=23
#
# Class name for main queue index format implementation
# BaseIndexFormat costs 20 bytes of RAM per object
# There are several formats to choose from whose RAM overhead ranges between 6 and 20 bytes
# The most optimal for performance-size are:
# index.format.impl=com.carrotdata.cache.index.SuperCompactBaseNoSizeWithExpireIndexFormat - 11 bytes per object
# index.format.impl=com.carrotdata.cache.index.SubCompactBaseNoSizeWithExpireIndexFormat - 12 bytes per object overhead
#
#index.format.impl=com.carrotdata.cache.index.BaseIndexFormat
index.format.impl=com.carrotdata.cache.index.SuperCompactBaseNoSizeWithExpireIndexFormat
#
# Compression dictionary size
# Usually the bigger, the better, but not always
#compression.dictionary.size=65536 (default)
compression.dictionary.size=1048576
#
# Compression level (-1 to 22 for ZSTD)
# Recommended options: -1 (faster) and 3 (still fast, compression is better)
# There is no need to go below -1 (this will be covered in the future by an LZ4 codec,
# which has comparable compression ratios but better speed)
#
compression.level=10
# Make cache persistent
save.on.shutdown=true
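As promised above, we can sanity-check index.slots.power=23 by applying the rule of thumb from the config comment (roughly 200 objects per hash-table slot) to our 1 billion objects:
import math

max_objects = 1_000_000_000          # we plan to cache 1 billion tweets
slots = max_objects / 200            # rule of thumb: ~200 objects per slot
power = math.ceil(math.log2(slots))  # log2(5,000,000) ~ 22.3
print(power)                         # 23, matching index.slots.power=23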
Start Memcarrot:
bin/memcarrot.sh start
Start Membench (first try 100M tweets):
bin/membench.sh -b twitter_sentiments -n 100000000 -t 8
- Membench output: nice, RPS=796K; time to load 100M tweets is 125 seconds.
- RAM usage: yes, Memcarrot is a Java application, almost entirely written in Java with some native code to support data compression, yet it's using only 5.5GB. We might have a winner here. Let's go ahead and try loading 1 billion tweets.
- Membench output: RPS=728K; time to load 1 billion tweet messages is 1373 seconds.
- RAM usage: 50.83GB for 1 billion tweet messages, or roughly 51 bytes per tweet versus 91–97 bytes of raw data, thanks to compression. This is 4x less than Redis and 3.2x less than Memcached. We have a winner!
(Chart: RAM usage comparison.)
What about persistence?
Both Redis and Memcarrot support data persistence, which allows them to survive a host reboot. How do they compare?
Redis 7.2.5 (100M tweets dataset)
It took 143 seconds to save the 100M tweet dataset, which extrapolates to approximately 1430 seconds for the 1 billion tweet dataset.
Memcarrot 0.14 (1B tweets dataset)
It took 51 seconds to save the cache on shutdown, which is ~28x faster than Redis.
Summary
Memcarrot 0.14 uses 4 times less memory than Redis 7.2.5 and 3.2 times less memory than Memcached 1.6.29 on the Twitter dataset. Additionally, Memcarrot 0.14 is up to 28 times faster in cache saving and loading operations compared to Redis 7.2.5.
So, will X be able to cache all tweets on employees’ laptops? The answer is — no, of course not. Even with Memcarrot’s outstanding memory performance, only slightly over 1 billion tweets can be cached on a single laptop, or 1 trillion across the entire company’s laptop fleet, which is roughly equivalent to a year’s worth of tweets (give or take). In comparison, Memcached and Redis fall far behind, handling approximately 312 million and 250 million messages on a single laptop, respectively. Now we know who’s going to earn a spot in the Guinness Book of Records. 😄
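Those per-laptop figures follow from the measured footprints if we assume roughly 51GB of cache-usable RAM per 64GB machine (about what Memcarrot actually consumed, leaving the rest to the OS). A quick Python check:
USABLE_GB = 51  # assumed cache budget per 64GB laptop (the OS keeps the rest)
GB_PER_BILLION = {"Redis": 202, "Memcached": 164.5, "Memcarrot": 50.83}

for server, gb in GB_PER_BILLION.items():
    tweets_m = USABLE_GB / gb * 1000  # millions of tweets per laptop
    print(f"{server}: ~{tweets_m:.0f}M tweets per laptop")
# Redis:     ~252M
# Memcached: ~310M
# Memcarrot: ~1003M (just over 1 billion)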
References
- Memcarrot: https://github.com/carrotdata/memcarrot
- Membench: https://github.com/carrotdata/membench
- Carrot Data: https://trycarrots.io
- Redis: https://redis.io
- Memcached: https://memcached.org