Clean Architecture — DataTables Support using Partitioned Repository with Cosmos DB
Extend Microsoft doc on pagination, querying, and LINQ by demonstrating server-side processing for DataTables.
This article aims to complement the following Microsoft documentations by demonstrating server-side processing for DataTables in an ASP.NET Core API application using Cosmos DB .NET SDK V3 and Partitioned Repository Pattern. Here is an link to the GitHub project.
- Pagination in Azure Cosmos DB | Microsoft Docs
This article explains how the queries in Cosmos DB execute and split the query results into multiple pages. When working with Cosmos DB, you probably expect to have a large dataset and splitting query results certainly improve the performance.
- Tutorial: Query Azure Cosmos DB by using the SQL API
This article provides sample SQL queries to read data from Cosmos DB. For anyone with a SQL Server background, this article is pretty straightforward to follow. The only follow up question is: how do we write such queries in an ASP.NET Core API application? Since we will not use raw SQL directly in our code, we will use LINQ, which will translate our IQueryable into queries. For Cosmos DB SQL API, IQueryable will be translated into SQL queries.
- LINQ to SQL translation in Azure Cosmos DB | Microsoft Docs
This article explains how a LINQ query is translated into a Cosmos DB SQL query, how to properly use IQueryable, and the supported LINQ operators.
Why Server-Side Processing?
If you have not used DataTables before, check out Material UI DataTable or JQuery popular plugin called DataTables. A data table is basically an advanced table that supports pagination, sorting, searching, and export functionalities. The easiest way to run DataTables is to load all data to the client/browser, and the DataTables plugin shall take care of the rest. When working with small amount of data, this is amazing since you need almost no effort!
Imagine you have a business that is doing amazing and you have 10 million customers, now you want to build a data table for your customers data. This is when you can not load 10 million records to the client end and when you would want to use server-side processing. When working with server-side processing, the server-side will have to implement pagination, searching, and sorting. Fortunately, database is designed to perform such tasks anyways and will do it well!
First, in the IRepository.cs class, let’s make sure we have these two data access contracts defined. One will be used for searching, the other should return the total number of records for pagination purpose.
Second, implement the above contracts in our Cosmos DB repository, CosmosDbRepository.cs. A couple of notes:
- Within GetItemsAsync() method, we are checking if there are more results in a while loop because Cosmos DB may return the results in multiple pages. GetItemsAsync() shall return an IEnumerable of records, which shall be used to populate the rows in a data table.
- Within GetItemsCountAsync(), we define the IQueryable and let the database do the count at the database end, which means we should not need to retrieve all the records and do a count (a very dangerous approach). GetItemsCountAsync() shall return the total count, and the total count will be used for pagination purpose in a data table.
Now we have the building block repositories ready. We can create our search class, Search.cs. Notice:
- Search query requires the Start and Page Size, which shall translate to OFFSET LIMIT clause in Azure Cosmos DB.
- Search query also allows Sort Column and Sort Direction, which shall translate to ORDER BY clause in Azure Cosmos DB
- Search result will include the current page, total number of records, and the actual list of data records (data rows).
- FluentValidation is used to define the validation rules for the search query.
- When defining our IQueryable, we use specifications to separate the query logic into a shared file. But it is pretty much the same thing as how you would define an IQueryable directly:
var qry = from cust in customers where cust.Name = “Joe”.
The one benefit of using specification is code reuse. Notice two specifications are used, ToDoItemSearchSpecification and ToDoItemSearchAggregationSpecification. Both of them build the IQueryable, one contains sorting for pagination purpose, the other one has no sorting for counting purpose because you can not run a query like SELECT COUNT(*) FROM customer ORDER BY Name.
- MediatR Query/Command Pattern is used here. The handler returns the response to the calling Controller.
Specification for searching Todo Item, ToDoItemSearchSpecification.cs. All it does it to build an IQueryable, which only defines what needs to run against the database, but won’t run until it is cashed. E.g. ToList() is called or IQueryable is iterated using for-each loop. ToDoItemSearchAggregationSpecification.cs is just ToDoItemSearchSpecification.cs minus the sorting part since we can not run aggregations like SUM and COUNT along with ORDER BY.
Now we should have everything we need to build our API endpoint! Let’s see our ToDoItemController.cs. Notice the DataTableResponse class defines what is required by the data table. Depending on the exact data table you are using, the response may vary slight, but data and total count properties are pretty universal.
QA and TEST! Let’s see it in action in Swagger UI!
Side note — Partition Key Design in Cosmos DB
Cross partition queries are expensive. In order to make our queries single partition queries, one trick to store all related data in one single container is to add a meta type field to your data and use that as your partition key. For example, in an ecommerce database, all product category items can have a MetaType field with value “category”, then all you need to call is SELECT * FROM c WHERE c.MetaType = “category”. All product tags items can also be stored in the same container, and they would just have a MetaType field with value “tag”. This container now would contain product tag items and also product category items. Since each logical partition has 20G storage size, 20G is more than enough for either category data or product tags data.
Microsoft has a great video discussing Data modelling and partitioning in Azure Cosmos DB. If you have time, check it out!
Implementing server-side pagination and searching in an API backed by Cosmos DB is not too much different from an API backed by a SQL Server, thanks to the help from LINQ. Entity Framework support for Cosmos DB is still pretty new and limited, so I would recommend using the Cosmos DB .NET SDK, which gives you full access to all the functionalities.
Thanks for reading! Enjoy making the world a better place with clean code!