What is Database Sharding?

Anvesh · Published in SilentTech · 2 min read · Jul 7, 2024

Sharding, also known as horizontal partitioning, is a database partitioning approach that divides a dataset's rows into smaller parts and distributes them across multiple instances or servers. Each part is faster and easier to manage than the whole.

Database sharding splits a single dataset into partitions or shards.
Each shard contains unique rows of information that you can store separately across multiple computers, called nodes.
All shards run on separate nodes but share the original database’s schema or design.

When a database is sharded, a replica of the schema is created on each node. Each row is then stored in exactly one shard, chosen according to its shard key.

Software developers use a shard key to determine how to partition the dataset.

Database sharding operates on a shared-nothing architecture. Each physical shard operates independently and is unaware of other shards.

Types of Data Sharding:

Range-based sharding
Hashed sharding
Directory sharding
Geo sharding

Range-based Sharding:
Range-based sharding, or dynamic sharding, splits database rows based on a range of values. The database designer then assigns a shard key to each range.
For example, the database designer can partition the data according to the first letter of the customer’s name, or by defining the maximum number of rows in a single shard.
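
The first-letter example can be sketched in a few lines of Python. This is only an illustration under assumptions: the three letter ranges, the shard names, and the function range_shard_for are all made up for this post and do not come from any particular database.

```python
import bisect

# Hypothetical layout: each shard owns an inclusive range of first letters.
# Upper bounds are kept sorted so the owning range can be found by binary search.
RANGE_UPPER_BOUNDS = ["H", "P", "Z"]                    # A-H, I-P, Q-Z
RANGE_SHARDS = ["shard_a_h", "shard_i_p", "shard_q_z"]  # illustrative names


def range_shard_for(customer_name: str) -> str:
    """Return the shard that owns the range containing the name's first letter."""
    first = customer_name.strip()[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"cannot place name {customer_name!r} in any range")
    # bisect_left finds the first upper bound that is >= the letter.
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, first)
    return RANGE_SHARDS[index]


print(range_shard_for("Alice"))  # lands in the A-H range
print(range_shard_for("Ivan"))   # lands in the I-P range
```

Ranges like these are easy to reason about, but they can become uneven if many names start with the same few letters.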

Hashed Sharding:
Hashed sharding assigns a shard key to each row of the database by using a mathematical formula called a hash function. The hash function takes a value from the row (typically one or more columns) and produces a hash value. The application uses the hash value as a shard key and stores the information in the corresponding physical shard.

Software developers use hashed sharding to evenly distribute information in a database among multiple shards.
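
A minimal sketch of hashed sharding, assuming four shards and a string customer ID as the hashed value. The names NUM_SHARDS and hash_shard_for are hypothetical, and SHA-256 is just one reasonable choice of stable hash function.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed shard count for this sketch


def hash_shard_for(customer_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard number using a stable hash and a modulo."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards


# Count how 1,000 made-up customer IDs spread across the shards.
counts = Counter(hash_shard_for(f"customer-{i}") for i in range(1000))
print(sorted(counts.items()))
```

The sketch uses hashlib instead of Python's built-in hash() because string hashes from hash() are randomized per process by default, so the same key could map to a different shard after a restart. Note also that with a plain modulo, changing the shard count remaps most keys.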

Directory Sharding:
Directory sharding uses a lookup table to match database information to the corresponding physical shard.
A lookup table is like a table on a spreadsheet that links a database column to a shard key.
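
As a rough illustration, the lookup table can be modeled as a dictionary. The product categories, shard names, and the function directory_shard_for are assumptions invented for this sketch; in practice the table usually lives in its own store so that every application server sees the same mapping.

```python
# Hypothetical lookup table: a column value (product category) -> shard name.
DIRECTORY = {
    "electronics": "shard_1",
    "clothing": "shard_2",
    "grocery": "shard_3",
}


def directory_shard_for(category: str, directory: dict = DIRECTORY) -> str:
    """Look up the shard registered for a category."""
    try:
        return directory[category]
    except KeyError:
        raise KeyError(f"no shard registered for category {category!r}") from None


print(directory_shard_for("clothing"))

# Moving a category to another shard only requires updating the table
# (the rows themselves would still need to be migrated separately).
DIRECTORY["grocery"] = "shard_1"
print(directory_shard_for("grocery"))
```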

Geo Sharding:
Geo sharding splits and stores database information according to geographical location.
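
A small sketch of geo sharding, assuming each row carries a two-letter country code. The region mapping, the shard names, and the fallback shard are all hypothetical.

```python
# Hypothetical mapping from country code to the shard hosted nearest to it.
REGION_SHARDS = {
    "US": "shard_us_east",
    "DE": "shard_eu_central",
    "IN": "shard_ap_south",
}
DEFAULT_GEO_SHARD = "shard_us_east"  # assumed fallback for unmapped countries


def geo_shard_for(country_code: str) -> str:
    """Pick a shard from the customer's country, falling back to a default."""
    return REGION_SHARDS.get(country_code.upper(), DEFAULT_GEO_SHARD)


print(geo_shard_for("de"))
print(geo_shard_for("BR"))  # not mapped, so the default shard is used
```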

Other Database Scaling Techniques:

Vertical Scaling
Data Replication

Vertical Scaling:
Vertical scaling increases the computing power of a single machine. For example, the IT team adds more CPU cores, RAM, or disk capacity to a database server to handle increasing traffic.

Data Replication:
Replication is a technique that makes exact copies of the database and stores them across different computers. Database designers use replication to design a fault-tolerant relational database management system. When one of the computers hosting the database fails, other replicas remain operational.
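
The fault-tolerance idea can be illustrated with a toy in-memory store. This is a simplified sketch, not how any real database replicates data: the class ReplicatedStore and its methods are hypothetical, and it assumes every write is copied synchronously to all live replicas.

```python
class ReplicatedStore:
    """Toy key-value store that keeps a full copy of the data on each replica."""

    def __init__(self, replica_count: int = 3):
        self.replicas = [dict() for _ in range(replica_count)]
        self.alive = [True] * replica_count

    def write(self, key, value):
        # Copy the write to every replica that is still running.
        for index, replica in enumerate(self.replicas):
            if self.alive[index]:
                replica[key] = value

    def fail(self, index: int):
        # Simulate the computer hosting this replica going down.
        self.alive[index] = False

    def read(self, key):
        # Serve the read from the first replica that is still running.
        for index, replica in enumerate(self.replicas):
            if self.alive[index]:
                return replica[key]
        raise RuntimeError("no replica available")


store = ReplicatedStore()
store.write("order:1", "paid")
store.fail(0)                 # the first replica goes down
print(store.read("order:1"))  # the read is still served by another replica
```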

Thank you for reading.