Partitioning in Azure Cosmos DB

chamathka maddugoda
Dec 29, 2021

Azure Cosmos DB is a globally distributed database service.

Why do we need a globally distributed database service?

Consider an online retail site: its customer base will not come from a single country, but will be spread all around the globe. We therefore need to make sure our application is available to customers all around the world without delay or downtime. To meet this requirement, we need to run instances of our application in the data centers closest to those customers, and those instances need a globally distributed database service. Azure Cosmos DB is a great choice for this!

What is Partitioning?

Partitioning is a crucial concept in globally distributed database services like Azure Cosmos DB. It determines how our data gets distributed. There are two main types of partitions.

  1. Physical Partitions
  2. Logical Partitions

Physical Partitions:

Physical partitions are the units of physical storage and compute, backed by servers that Azure manages, in which our documents are stored in Azure Cosmos DB.

Logical Partitions:

Unlike a physical partition, a logical partition is a virtual concept. It can be pictured as a virtual bucket that groups together all the documents sharing the same value of the partition key we select.

Our main goal is to make sure our documents are evenly distributed among the available physical partitions. Only then can we make sure our customers receive the service without downtime or delay. Azure Cosmos DB itself handles the distribution of data across physical partitions. However, before documents are distributed across physical partitions, they are first grouped into logical partitions, as can be seen below.

Figure: demonstration of physical partitions
Figure: demonstration of logical partitions
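
The two-level grouping described above can be sketched in a few lines of Python. This is only a rough model under stated assumptions: the real hash function and the number of physical partitions are internal to Azure Cosmos DB, so MD5, the modulo step, `NUM_PHYSICAL_PARTITIONS`, and the sample `productName` documents are all hypothetical stand-ins chosen for illustration.

```python
import hashlib
from collections import defaultdict

# Hypothetical value; the service decides the real number of physical partitions.
NUM_PHYSICAL_PARTITIONS = 3


def physical_partition_for(partition_key_value: str) -> int:
    """Map a partition key value to a physical partition index.

    MD5 plus modulo is only a stand-in for the service's internal hashing.
    """
    digest = hashlib.md5(partition_key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PHYSICAL_PARTITIONS


def place_documents(documents, partition_key):
    """Group documents into logical partitions, then place each on a physical one."""
    # layout[physical index][partition key value] -> list of document ids
    layout = defaultdict(lambda: defaultdict(list))
    for doc in documents:
        key_value = doc[partition_key]
        layout[physical_partition_for(key_value)][key_value].append(doc["id"])
    return layout


docs = [
    {"id": "1", "productName": "laptop"},
    {"id": "2", "productName": "phone"},
    {"id": "3", "productName": "laptop"},
    {"id": "4", "productName": "tablet"},
]

layout = place_documents(docs, "productName")
for physical, logical in sorted(layout.items()):
    print(f"physical partition {physical}: {dict(logical)}")
```

Documents with the same partition key value always share one logical partition, and a logical partition is never split across physical partitions; several logical partitions may share one physical partition.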

When we create a container in Azure Cosmos DB, we get to select something called a partition key. Based on this partition key, our documents are grouped into logical partitions.

Figure: partition key on Azure portal

Since the grouping into logical partitions is decided by the partition key we select, we should be very careful when choosing it. A poor choice of partition key can lead us into two kinds of traps.

  1. Hot partition
  2. Query fan-out

Hot Partition

If the workload is write-heavy (writes are more common than reads), we should make sure our data is distributed evenly across partitions rather than being directed to one partition all the time. If all the documents are written to one logical partition (that is, if all the documents contain the same partition key value), that partition will become a hot partition.

Similarly, if the workload is read-heavy (reads are more common than writes), we should make sure our requests are evenly distributed across all the partitions. If all the requests are directed to a single logical partition, that partition will become a hot partition for reads. In most scenarios, a date is a weak choice for a partition key and often leads to hot partitions, because the current date's partition receives most of the traffic on any given day.

We should avoid both of these types of hot partitions when performing read and write operations in Azure Cosmos DB.
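
To make the date example concrete, here is a minimal sketch that counts how writes spread across logical partitions for two candidate keys. The order data, the `orderDate` and `customerId` property names, and the helper functions are all made up for illustration; nothing here calls Azure Cosmos DB.

```python
from collections import Counter


def writes_per_logical_partition(documents, partition_key):
    """Count how many documents land in each logical partition."""
    return Counter(doc[partition_key] for doc in documents)


def hottest_share(counts):
    """Fraction of all writes taken by the busiest logical partition."""
    return max(counts.values()) / sum(counts.values())


# 1,000 hypothetical orders, all placed on the same day by 100 customers.
orders = [
    {"id": str(i), "orderDate": "2021-12-29", "customerId": f"customer-{i % 100}"}
    for i in range(1000)
]

by_date = writes_per_logical_partition(orders, "orderDate")
by_customer = writes_per_logical_partition(orders, "customerId")

# With the date as key, every write of the day hits a single logical partition.
print("share of busiest partition (orderDate): ", hottest_share(by_date))
# With the customer as key, the same writes spread over 100 logical partitions.
print("share of busiest partition (customerId):", hottest_share(by_customer))
```

In this toy data set the date key sends all writes of the day to one logical partition, while the customer key spreads them evenly, which is the kind of distribution we are aiming for.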

Query Fan-Out

This condition occurs when the partition key we selected differs from the property our queries filter on. For example, if we have partitioned our data by product name and the query also filters by product name, Azure Cosmos DB can efficiently locate the partition holding the required product name. But if the requirement changes and the query filters on a different property, the query has to hit each and every physical partition, drastically reducing performance. This is a query fan-out.
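
The difference between a targeted query and a fan-out can be sketched as a small routing function. This is a simplified model, not how the service is implemented: `NUM_PHYSICAL_PARTITIONS`, the MD5-based `route_to_partition` helper, and the `productName` and `category` properties are assumptions made for the example.

```python
import hashlib

# Hypothetical value; the service decides the real number of physical partitions.
NUM_PHYSICAL_PARTITIONS = 4


def route_to_partition(partition_key_value: str) -> int:
    """Stand-in for the service's internal hashing of a partition key value."""
    digest = hashlib.md5(partition_key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PHYSICAL_PARTITIONS


def partitions_to_query(filters: dict, partition_key: str) -> set:
    """Return the physical partitions a query has to visit.

    If the filter includes the partition key, the query is routed to a single
    partition; otherwise it must fan out to all of them.
    """
    if partition_key in filters:
        return {route_to_partition(filters[partition_key])}
    return set(range(NUM_PHYSICAL_PARTITIONS))


# The container is partitioned by product name.
targeted = partitions_to_query({"productName": "laptop"}, "productName")
fan_out = partitions_to_query({"category": "electronics"}, "productName")

print("filter on productName visits", len(targeted), "partition(s)")
print("filter on category visits", len(fan_out), "partition(s)")
```

The more physical partitions the container has, the more expensive the second kind of query becomes, because every one of them has to be checked.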

Hence, when selecting the partition key, we should think twice and select the key that best avoids these two traps. I hope to write a detailed description of how to select the best partition key. Until then, goodbye!
