Understanding Database Partitioning in Distributed Systems: Key-Value Data

Priya Patidar
The Developer’s Diary
8 min read · Jan 15, 2024

In the realm of distributed systems, database partitioning is a fundamental concept that often gets overshadowed by more talked-about topics like replication. But what exactly is database partitioning, and why does it hold such importance?

Database Partitioning: A Simple Definition

Imagine a library with thousands of books. If all these books were stacked on a single, giant shelf, finding a specific book would be like looking for a needle in a haystack. Now, what if we divide these books into several shelves, categorizing them by genres, authors, or publication years? Suddenly, finding that one book becomes much more manageable. That’s essentially what database partitioning does. It divides a large database into smaller, more manageable segments, often based on specific criteria like the range of data or the type of data.

Why Partitioning Matters

Partitioning is crucial for several reasons:

  • Performance Improvement: Just like our organized library, partitioning makes data retrieval faster and more efficient.
  • Scalability: As the amount of data grows, it becomes easier to expand a partitioned database across multiple servers.
  • Availability: If one partition experiences issues, it doesn’t necessarily cripple the entire database.

Partitioning vs. Replication: Understanding the Difference

Image ref: Designing Data-Intensive Applications

In previous articles on Medium.com, we’ve discussed replication extensively. It’s important to note that partitioning and replication are not the same. While replication involves creating copies of data for redundancy and availability, partitioning is about dividing the database to improve manageability and performance. Think of replication as having multiple copies of a book for backup, whereas partitioning is about organizing these books into different shelves.

Partitioning is an integral part of designing an efficient distributed system. By understanding and implementing it effectively, we can ensure our systems are not only robust but also scalable and performant. In our next sections, we’ll dive deeper into the types of partitioning and how to implement them in real-world scenarios.

Balancing the Load: Partitioning Key-Value Data

Imagine you’re at a party and there’s a giant cake to share. Now, if everyone gets an equal slice, it’s fair and satisfying. But what if one person ends up with a much larger piece? That’s a bit like what happens when we partition key-value data in distributed systems, except instead of cake, we’re dividing data and workload among multiple nodes.

Why Even Partitioning Matters

When we have a massive amount of data in a system, we need to decide which data goes to which node. The goal is simple: we want to spread the data and the workload evenly across all nodes. Think of it as having 10 workers and 10 equal piles of work. If each worker takes one pile, the work gets done efficiently. But what if it’s not even?

The Problem of Skewed Partitioning

Sometimes, partitioning can be unfair, leading to what we call ‘skewed’ distribution. This is like having 9 workers idle and 1 worker overwhelmed with all the work. In database terms, this creates a ‘hot spot’ — a node with a disproportionately high load. It’s like if all party guests crowd around one tiny section of the cake, leaving the rest untouched.

Random Assignment: A Double-Edged Sword

One way to tackle this is by randomly assigning records to nodes. This is like slicing the cake without looking, hoping each piece is roughly the same size. It’s great for distributing data evenly, but it makes reads harder: with no rule mapping a key to a node, you don’t know which node holds a given record, so you may have to query all of them. The partitioning schemes below keep lookups predictable while still spreading the load.
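
To make the read problem concrete, here is a minimal in-memory sketch (the node count and keys are made up): writes pick a node at random, so data spreads evenly on average, but a read has no idea where a key landed and must check every node.

```python
import random

NODES = 4
storage = [{} for _ in range(NODES)]  # one dict stands in for each node

def write(key, value):
    """Randomly pick a node; on average the data spreads evenly."""
    node = random.randrange(NODES)
    storage[node][key] = value

def read(key):
    """With no rule mapping keys to nodes, every node must be checked."""
    for node_data in storage:
        if key in node_data:
            return node_data[key]
    return None

write("user:42", {"name": "Ada"})
print(read("user:42"))
```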

Partitioning by Key Range

Image ref: Designing Data-Intensive Applications

Let’s talk about partitioning by key range, a method that’s a bit like organizing a bookshelf by genres. But instead of books, we’re dealing with data, and instead of genres, we’re using key ranges.

Partitioning by Key Range: The Basics

Imagine you’re dividing a long list of contacts into sections. One way to do this is to assign each section to a continuous range of letters (A-F, G-L, and so on). In database partitioning, we do something similar by assigning each partition to a continuous range of keys. Knowing which partition holds which range makes it easier to find what you’re looking for.
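
Here is a toy sketch of that lookup, with illustrative boundary letters: the partitioner keeps a sorted list of boundary keys and binary-searches it to find the partition that owns a given key.

```python
import bisect

# Boundary keys between partitions (illustrative values only).
# Partition 0: keys < "G", partition 1: "G" <= keys < "M",
# partition 2: "M" <= keys < "T", partition 3: keys >= "T".
BOUNDARIES = ["G", "M", "T"]

def partition_for(key: str) -> int:
    """Binary-search the boundaries to find the partition owning this key."""
    return bisect.bisect_right(BOUNDARIES, key)

for name in ["Alice", "Grace", "Tariq"]:
    print(name, "-> partition", partition_for(name))
```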

Choosing Range Boundaries Wisely

It’s crucial to select range boundaries carefully to distribute data evenly. For instance, partitioning alphabetically may seem straightforward, but it’s not always balanced. Think about it: there are far more words starting with ‘A’ than with ‘X’, ‘Y’, or ‘Z’.

Efficiency in Sorting

Each partition can maintain a sorted order, using structures like SSTables or LSM trees, to make range queries more efficient.

A Practical Example: Network Sensors

Consider an application that stores data from network sensors, where the key is the timestamp of measurement (like ‘2024-01-15-10-30-00’). Range queries, like asking for all measurements within a certain hour, are highly efficient in this setup.
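
A small sketch of why this is efficient, with made-up readings: if the partition stores its keys in sorted order, the one-hour query reduces to two binary searches and a slice rather than a full scan.

```python
import bisect

# Sorted timestamp keys within one partition (illustrative data).
keys = [
    "2024-01-15-10-05-00",
    "2024-01-15-10-30-00",
    "2024-01-15-10-55-00",
    "2024-01-15-11-10-00",
]
values = [17.2, 18.0, 18.4, 19.1]

def range_query(start, end):
    """Return all (key, value) pairs with start <= key < end via binary search."""
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_left(keys, end)
    return list(zip(keys[lo:hi], values[lo:hi]))

# All measurements between 10:00 and 11:00.
print(range_query("2024-01-15-10-00-00", "2024-01-15-11-00-00"))
```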

The Challenge of Hot Spots

But there’s a catch. For certain access patterns, this can lead to hot spots. If the key is a timestamp, and we’re always writing new data, all the writes end up in the same partition. This can overload one partition while leaving others idle.

A Smart Solution: Partition by More Than Time

To avoid this, we can partition by more than just the timestamp. For instance, by adding a sensor name as a prefix to the timestamp (like ‘SensorA_2024-01-15-10-30-00’), we first partition by sensor, then by time. Assuming many sensors are active at the same time, their writes land on different partitions, so the write load spreads more evenly.
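
Here is a rough sketch of the compound key, with invented sensor names and boundaries: because the sensor name comes first, the range partitioner splits by sensor before time, so simultaneous writes from different sensors land on different partitions.

```python
import bisect

# Illustrative boundaries over the compound "sensor_timestamp" key space.
# Partition 0: keys < "SensorB", 1: "SensorB".."SensorD", 2: keys >= "SensorD".
BOUNDARIES = ["SensorB", "SensorD"]

def make_key(sensor: str, timestamp: str) -> str:
    """Compound key: partition primarily by sensor, then order by time."""
    return f"{sensor}_{timestamp}"

def partition_for(key: str) -> int:
    return bisect.bisect_right(BOUNDARIES, key)

now = "2024-01-15-10-30-00"
for sensor in ["SensorA", "SensorC", "SensorE"]:
    key = make_key(sensor, now)
    print(key, "-> partition", partition_for(key))
```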

A Trade-Off

This approach does have a trade-off. To fetch values from multiple sensors, we need to perform separate range queries for each sensor, which can be a bit more complex but ensures a more balanced system.
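
A minimal illustration of that trade-off, using a toy in-memory store with made-up readings: fetching one hour of data across several sensors means issuing one range query per sensor and merging the results afterwards.

```python
# Toy sorted key-value store standing in for the partitions (illustrative data).
DATA = {
    "SensorA_2024-01-15-10-05-00": 17.2,
    "SensorA_2024-01-15-10-40-00": 17.9,
    "SensorB_2024-01-15-10-20-00": 21.3,
}

def range_query(start, end):
    """Pretend range scan over the sorted key space."""
    return {k: v for k, v in sorted(DATA.items()) if start <= k < end}

def fetch_time_range(sensors, start, end):
    """With sensor-prefixed keys, one range query per sensor is needed."""
    merged = {}
    for sensor in sensors:
        merged.update(range_query(f"{sensor}_{start}", f"{sensor}_{end}"))
    return merged

print(fetch_time_range(["SensorA", "SensorB"],
                       "2024-01-15-10-00-00", "2024-01-15-11-00-00"))
```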

Partitioning by Hash of Key

Image ref: Designing Data-Intensive Applications

Hash Function: The Great Equalizer

In many distributed data systems, a hash function is used to avoid uneven data distribution, or what we call ‘hot spots.’ A good hash function is like a magical machine: it takes skewed, uneven data and turns it into something uniformly distributed. Imagine it like a blender turning a variety of fruits into a smooth mix. Once you’ve got this smooth mix, you can easily divide it into different containers, or in database terms, partitions.

How It Works

After hashing, each partition gets a range of these hashes. Any key whose hash value falls within a partition’s range is stored in that partition. It’s a bit like sorting different types of fruit based on their blended color.
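
As a sketch of the idea (using MD5 purely as a stable, well-distributed hash, not for security): the key is hashed, and the hash space is split into equal ranges, one per partition. Adjacent keys usually end up in different partitions.

```python
import hashlib

NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    """Hash the key, then map the hash into one of NUM_PARTITIONS equal ranges."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")   # value in [0, 2**64)
    return hash_value * NUM_PARTITIONS // 2**64

for key in ["2024-01-15-10-30-00", "2024-01-15-10-30-01", "2024-01-15-10-30-02"]:
    print(key, "-> partition", partition_for(key))
```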

The Trade-Off: Range Queries

However, this method has a downside. Remember how key-range partitioning let us easily find data in a specific range? Hash partitioning scrambles this order. Keys that were neighbors before hashing might end up in totally different partitions. It’s like trying to find specific pieces of fruit after they’ve been blended — they’re all mixed up.

In systems like MongoDB, if you use hash-based sharding (a way of partitioning), any range query has to be sent to all partitions, because the original order of keys is lost.

Cassandra’s Compromise

Cassandra, a popular database system, offers a middle ground. It uses a compound primary key with several columns. The first part of the key is hashed to determine the partition, while the other parts are used as an index for sorting within that partition. This means you can’t do range queries on the first column, but you can efficiently scan ranges of the other columns.
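
Cassandra expresses this through its table schema; the following is only a toy Python model of the same idea, with made-up user IDs: the first key component is hashed to choose the partition, and rows inside the partition stay sorted by the remaining components, so time-range scans are cheap.

```python
import bisect
import hashlib

NUM_PARTITIONS = 4
partitions = [[] for _ in range(NUM_PARTITIONS)]  # sorted rows per partition

def partition_for(partition_key: str) -> int:
    """Hash the first key component to pick the partition."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") * NUM_PARTITIONS // 2**64

def insert(user_id: str, posted_at: str, post: str):
    """Keep rows sorted by (user, time) inside the partition."""
    rows = partitions[partition_for(user_id)]
    bisect.insort(rows, (user_id, posted_at, post))

def posts_between(user_id: str, start: str, end: str):
    """Efficient range scan on the sorted columns within one partition."""
    rows = partitions[partition_for(user_id)]
    lo = bisect.bisect_left(rows, (user_id, start, ""))
    hi = bisect.bisect_left(rows, (user_id, end, ""))
    return [(t, p) for _, t, p in rows[lo:hi]]

insert("user_17", "2024-01-14-09-00-00", "good morning")
insert("user_17", "2024-01-15-10-30-00", "hello world")
print(posts_between("user_17", "2024-01-15-00-00-00", "2024-01-16-00-00-00"))
```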

Real-World Example: Social Media Posts

This approach is perfect for something like social media, where you might want to see all the posts from a user within a certain time frame. By partitioning on the user ID and then sorting each user’s posts by timestamp within the partition, you can quickly find all the updates made by a user in a specific range.

Tackling Skewed Workloads and Hot Spots

Even with the best hashing techniques, some hot spots can still occur, especially in extreme cases where most read and write operations end up targeting the same partition. Let’s delve deeper into this issue and explore practical solutions.

The Celebrity Scenario

Imagine a social media platform with celebrities having millions of followers. Any action by these celebrities — a new post or a tweet — could trigger a storm of activity, concentrating a huge load of reads and writes on a single key, like the celebrity’s user ID or the post ID that fans are commenting on.

Beyond Automatic Solutions

While many databases can automatically adjust to some extent, they can’t always fully compensate for such intense activity spikes. It often falls upon the application itself to creatively manage these hot spots.

A Simple Yet Effective Technique

One straightforward approach is to append or prepend a random two-digit number to the ‘hot’ key. This technique effectively splits the load that would have gone to one key across 100 different keys. For example, instead of having all interactions tied to a key like ‘Celeb_Post123’, they get distributed across keys like ‘00_Celeb_Post123’, ‘01_Celeb_Post123’, and so on, up to ‘99_Celeb_Post123’. This disperses the activity more evenly across the system.
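
A small sketch of the technique (the key names and in-memory store are illustrative): writes prepend a random two-digit prefix, and reads fan out over all 100 variants and merge the results, which is exactly the bookkeeping cost discussed next.

```python
import random
from collections import defaultdict

SPLITS = 100
store = defaultdict(list)  # stands in for the distributed key-value store

def write_interaction(hot_key: str, interaction: str):
    """Prepend a random two-digit number so writes spread over 100 keys."""
    split_key = f"{random.randrange(SPLITS):02d}_{hot_key}"
    store[split_key].append(interaction)

def read_interactions(hot_key: str):
    """Reads must fetch and combine all 100 variants of the key."""
    combined = []
    for i in range(SPLITS):
        combined.extend(store.get(f"{i:02d}_{hot_key}", []))
    return combined

for n in range(5):
    write_interaction("Celeb_Post123", f"comment {n}")
print(len(read_interactions("Celeb_Post123")))  # 5, gathered from the split keys
```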

The Trade-Off: Read Complexity and Bookkeeping

While this method is effective in reducing hot spots, it introduces its own complexities. Now, to gather all interactions related to ‘Celeb_Post123’, the system has to retrieve and combine data from all 100 variations of the key. This not only complicates the reading process but also requires additional bookkeeping. Therefore, it’s a strategy best used sparingly, reserved for only a few keys that are known to be particularly ‘hot’.

Conclusion

As we conclude our exploration of database partitioning in distributed systems, we’ve navigated through the intricacies of key-range and hash partitioning, and tactics for mitigating skewed workloads and hot spots. This journey underscores the delicate balance between efficiently distributing data and avoiding performance pitfalls. While we’ve discovered methods like key modification to alleviate hot spots, they come with their own set of complexities, particularly in reading and managing data. This series on partitioning will continue, delving next into the realm of secondary index partitioning, an advanced and crucial aspect of modern databases. Our exploration is a testament to the ever-evolving nature of database technology, reminding us that in the world of distributed systems, the pursuit of knowledge is an ongoing and dynamic journey.

Ref: Martin Kleppmann, Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems (O’Reilly, 2017)
