Database Sharding: Features and How It Works

Gagan Jain
2 min read · Mar 12, 2023

Database sharding is a technique that involves partitioning a large database into smaller, more manageable pieces called “shards”. Each shard is stored on a separate server or instance, and incoming requests are routed to the appropriate shard based on a shard key.

Image reference: https://www.notion.so/blog/sharding-postgres-at-notion

Here’s an example of how database sharding works:

Suppose you have a large e-commerce website that stores information about customers, orders, products, and reviews in a single database. As your website grows and the volume of data increases, you start to experience performance issues, such as slow query response times and high database loads. You realize that you need to scale your database to handle the growing workload.

To scale your database, you decide to implement sharding. You shard the customer data by country, so each shard contains the customer data for a specific country. The shard key is the customer’s country code, which is used to route incoming requests to the appropriate shard.
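As a minimal sketch of what a country-based shard key could look like in practice, the Python snippet below maps a country code to a shard. The shard names (`customers_us` and so on) and the `shard_for` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from country code (the shard key) to a shard name.
SHARD_BY_COUNTRY = {
    "US": "customers_us",
    "CA": "customers_ca",
    "GB": "customers_uk",
}


def shard_for(country_code: str) -> str:
    """Return the name of the shard that holds customers from this country."""
    key = country_code.strip().upper()
    try:
        return SHARD_BY_COUNTRY[key]
    except KeyError:
        # An unknown country code means no shard has been configured for it yet.
        raise ValueError(f"no shard configured for country code {key!r}") from None


print(shard_for("ca"))  # customers_ca
```

A plain lookup table like this keeps the routing rule easy to read and change, but every new country has to be added to the table before its customers can be stored.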

Here’s how the sharding process works:

  1. Partitioning: You partition the customer data into shards based on the shard key (country code). For example, you might create one shard for customers in the US, one for customers in Canada, one for customers in the UK, and so on. Each shard contains only the customer data for a specific country.
  2. Routing: You configure your application, a proxy, or the database's own routing layer to send incoming requests to the appropriate shard based on the shard key (country code). For example, if a user from Canada logs in, their requests will be routed to the shard that contains the customer data for Canada.
  3. Replication: You replicate each shard across multiple servers or instances to provide redundancy and improve availability. For example, you might replicate the Canada shard across three servers to ensure that the customer data is always available, even if one server fails.
  4. Management: You use a management tool or dashboard to monitor and manage the shards. For example, you might use a tool that allows you to add or remove shards, rebalance the workload across servers, and monitor the performance of each shard.
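The four steps above can be sketched with a toy in-memory model in Python. This is only an illustration under simplifying assumptions: the `ShardedCustomerStore` class, its method names, and the record fields are hypothetical, each "server" is just a dictionary, and a real deployment would rely on the database's own replication and tooling.

```python
REPLICAS_PER_SHARD = 3  # assumed replication factor, as in the Canada example


class ShardedCustomerStore:
    """Toy model: one shard per country, each shard kept on several replicas."""

    def __init__(self, replicas_per_shard: int = REPLICAS_PER_SHARD):
        self.replicas_per_shard = replicas_per_shard
        # country code -> list of replicas; each replica maps customer id -> record
        self.shards = {}

    def insert(self, customer: dict) -> None:
        # Partitioning and routing: the country code selects the shard.
        replicas = self.shards.setdefault(
            customer["country"],
            [{} for _ in range(self.replicas_per_shard)],
        )
        # Replication: write the record to every replica of that shard.
        for replica in replicas:
            replica[customer["id"]] = customer

    def get(self, country: str, customer_id: int, failed=frozenset()):
        # Unknown country: there is no shard to ask.
        if country not in self.shards:
            return None
        # Read from the first replica that is not marked as failed.
        for index, replica in enumerate(self.shards[country]):
            if index not in failed:
                return replica.get(customer_id)
        raise RuntimeError(f"all replicas of shard {country!r} are down")

    def shard_sizes(self) -> dict:
        # Management: a minimal per-shard metric (row count on the first replica).
        return {country: len(replicas[0]) for country, replicas in self.shards.items()}


store = ShardedCustomerStore()
store.insert({"id": 1, "country": "CA", "name": "Ada"})
store.insert({"id": 2, "country": "US", "name": "Grace"})
store.insert({"id": 3, "country": "CA", "name": "Linus"})

# The Canada shard still answers when its first replica is unavailable.
print(store.get("CA", 1, failed={0})["name"])  # Ada
print(store.shard_sizes())  # {'CA': 2, 'US': 1}
```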

By implementing sharding, you can improve the performance and scalability of your database: each shard holds a smaller subset of the data, so queries against it touch less data, and the load is spread across several servers. Replicating each shard additionally improves availability. You can also add or remove shards as the workload changes, which involves moving data between shards and updating the routing.
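To make the idea of adding a shard concrete, here is a small Python sketch that splits one country out of an existing shard into a new one. The `add_shard` helper, the shard names, and the starting layout (Canada and Mexico sharing a shard) are assumptions made for this example. A production migration would also need to update the routing and keep handling writes while the data moves.

```python
def add_shard(shards: dict, source: str, new_shard: str, country: str) -> int:
    """Create `new_shard` and move all rows for `country` out of `source`.

    `shards` maps a shard name to a list of rows. Returns the number of rows moved.
    """
    if new_shard in shards:
        raise ValueError(f"shard {new_shard!r} already exists")
    moved = [row for row in shards[source] if row["country"] == country]
    shards[source] = [row for row in shards[source] if row["country"] != country]
    shards[new_shard] = moved
    return len(moved)


# Hypothetical starting point: Canada and Mexico share one shard.
shards = {
    "us": [{"id": 1, "country": "US"}],
    "rest_of_americas": [
        {"id": 2, "country": "CA"},
        {"id": 3, "country": "MX"},
        {"id": 4, "country": "CA"},
    ],
}

moved = add_shard(shards, source="rest_of_americas", new_shard="ca", country="CA")
print(moved)  # 2
print({name: len(rows) for name, rows in shards.items()})
# {'us': 1, 'rest_of_americas': 1, 'ca': 2}
```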
