Clean Architecture — Azure Functions Using Partitioned Repository Pattern with Cosmos DB

Discussing an all-in-one starter project for Azure Functions that works with Cosmos DB using the Partitioned Repository Pattern, with features like Dependency Injection, structured logging, strongly-typed configuration, an email service, etc.

Shawn Shi
The Startup
7 min read · Nov 3, 2020

Screenshot by Author

Goal Overview

The goal of this article is to discuss a starter project that can be used to work with Azure Functions and Cosmos DB and be (almost) production ready, without having to worry about solution design. The project follows Clean Architecture and includes many of the common features you would expect in a production environment, such as structured logging using Serilog, strongly-typed configuration, Dependency Injection for Inversion of Control, an email service, etc.

Before we dive into the actual project, let’s do a quick overview.

Azure Cosmos DB — 10,000-Foot Overview

Azure Cosmos DB is the managed NoSQL database service provided by Microsoft. It supports several APIs, such as SQL, Cassandra, and MongoDB, which allows developers to use the query language of their choice. The most appealing feature to me as a Data Engineer is that Cosmos DB takes care of:

  • Database horizontal partitioning. This allows the developer to design the partition keys strategically so that related data lives in a single partition, which in turn results in fast reads and writes. Even though, according to Microsoft, each logical partition has a storage limit of 20 GB, horizontal partitioning allows the database to scale out to a practically unlimited number of partitions and store a practically unlimited amount of data without sacrificing performance. Compared to a traditional Azure SQL database, storage size alone is a big plus when working with big data, since Azure SQL has a limit of 1 TB for standard tier S12 and 4 TB for premium tier P15 as of this writing, according to Microsoft.
  • Database replicas. Each partition has multiple replicas that are kept in sync. When one replica fails, data is automatically recovered from the remaining replicas. When the account is replicated to more than one region, at least one replica lives in an entirely different geographic region, which reduces the chance of all replicas failing at the same time.
  • Database backups. A database backup is different from a database replica: the former is offline while the latter is online. A backup is usually taken at a scheduled time, as a snapshot of the database that is stored offline. Full database backup supports point-in-time rollbacks.

Cosmos DB in production application code

If you come from a SQL background, you are likely to use tools like Entity Framework (Core) and LINQ in production applications instead of opening SQL connections and sending SQL commands directly. Similarly, while Microsoft provides great documentation and example code for working with Azure Cosmos DB through the Data Explorer in the Azure portal, or with a specific container directly through the .NET SDK, you may want to use a design pattern, such as repositories and services, to abstract away the lower-level logic. I have published an article on how to use the partitioned repository pattern to work with Cosmos DB in an ASP.NET Core REST API project following Clean Architecture, which sets up a good foundation. If this is a mouthful, or if you want a refresher on these concepts, please see the article Clean Architecture — ASP.NET Core API using Partitioned Repository Pattern and Azure Cosmos DB.
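To make the idea concrete, here is a minimal sketch of what a partitioned repository could look like on top of the Cosmos DB .NET SDK v3. It is not the code from the starter project: BaseEntity, IRepository<T>, CosmosDbRepository<T> and ResolvePartitionKey are hypothetical names chosen for illustration. Only Container, PartitionKey, ReadItemAsync and CreateItemAsync come from the SDK.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;   // NuGet: Microsoft.Azure.Cosmos (v3 SDK)
using Newtonsoft.Json;          // default serializer of the v3 SDK

// Hypothetical base entity: Cosmos DB requires a lowercase "id" property.
public abstract class BaseEntity
{
    [JsonProperty("id")]
    public string Id { get; set; }
}

// Hypothetical abstraction that would live in the Core project.
public interface IRepository<T> where T : BaseEntity
{
    Task<T> GetItemAsync(string id);
    Task AddItemAsync(T item);
}

// Hypothetical base class that would live in the Infrastructure project.
public abstract class CosmosDbRepository<T> : IRepository<T> where T : BaseEntity
{
    private readonly Container _container;

    protected CosmosDbRepository(Container container)
    {
        _container = container;
    }

    // Each concrete repository decides how an entity id maps to a partition
    // key value. The value must match what the document stores at the
    // container's partition key path.
    protected abstract PartitionKey ResolvePartitionKey(string entityId);

    public async Task<T> GetItemAsync(string id)
    {
        // Point read within a single partition.
        // Throws CosmosException (404) when the item does not exist.
        ItemResponse<T> response =
            await _container.ReadItemAsync<T>(id, ResolvePartitionKey(id));
        return response.Resource;
    }

    public async Task AddItemAsync(T item)
    {
        await _container.CreateItemAsync(item, ResolvePartitionKey(item.Id));
    }
}
```

Callers only see IRepository<T>, so the partitioning strategy stays an Infrastructure detail and can change without touching business logic.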

Azure Functions — 10,000-Foot Overview

Azure Functions is the serverless compute service provided by Microsoft Azure, similar to products like AWS Lambda and Google Cloud Functions. Azure Functions really shines when you need a piece of code to do one single task and do it well in terms of scaling in and out, without having to worry about the underlying hardware or servers. I have used Azure Functions extensively for work like:

  • Processing time-series data from IoT devices streamed through Azure Event Hubs
  • Background ETL tasks, such as importing flat files from cloud storage into a database, generating PDFs, etc.
  • Real-time data fan-out

Azure Functions also has first-class integration with the Azure Cosmos DB change feed, which can be used for tasks like data propagation.

Goal

All of the Azure Functions projects I have created share some common functionality, such as structured logging using Serilog, strongly-typed configuration, Dependency Injection for Inversion of Control, an email service, etc. This motivated me to create an all-in-one project as a starting point for any time I need to start a new Azure Function, and I’d like to share it with the GitHub community.

The goal of this article is to quickly cover how these common features are set up inside the Azure Functions project, and how the project itself fits into the Clean Architecture solution. The solution also hosts a REST API that interacts with Cosmos DB.

It may be easier to follow along with the code open; please see the GitHub link to clone the repo.

Solution Architecture

The solution is organized into the layers of Clean Architecture:

  • Core, which defines the business entities, business logic abstractions, interfaces, etc.
  • Infrastructure, which provides the implementations, such as SQL or NoSQL data storage, EF Core, repository implementations, etc.
  • Web API, which is a sample API project.
  • Azure Functions, which is the highlight of this article.

Azure Functions project

This project targets Azure Functions 3.0 on .NET Core 3.1. There are only a few files in the project because the supporting code resides in the Core and Infrastructure projects.

  • host.json, which holds application-level runtime settings, such as the function timeout.
  • local.settings.json, which is similar to appsettings.json in an ASP.NET Core Web project and includes configuration settings for services like Serilog for logging, SendGrid for emailing, etc.
  • StarterFunction.cs, which is a timer-triggered function.
  • Startup.cs, which is similar to the Startup.cs class in an ASP.NET Core Web project and registers the services and their lifetimes for the application using Microsoft's built-in dependency injection container. Microsoft has a good article covering how to use dependency injection in Azure Functions, if you would like more details.
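As an illustration of the kind of setting that lives in host.json, the fragment below sets a function timeout. The five-minute value is only an example and is not taken from the repository; on the Consumption plan the default timeout is 5 minutes and the maximum is 10 minutes.

```json
{
  "version": "2.0",
  "functionTimeout": "00:05:00"
}
```

The "version": "2.0" entry is the host.json schema version used by the Functions runtime from v2 onward, including 3.0.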

Let’s explain some major pieces in the code.

Startup.cs

A few takeaways on this class are:

  • The FunctionsStartup assembly attribute specifies that the Startup class should run when the host starts, much like the Main() method is the entry point of a console application.
  • For configuration, the JSON file local.settings.json is added as one of the configuration providers.
  • SendGridEmailSettings.cs is used for strongly-typed configurations. In order to use it at runtime, it can be injected as IOptions<SendGridEmailSettings> to constructors. See example usage in SendGridEmailService constructor.
  • Serilog is added as the logging provider to replace the default logger. When you run the project, you should see log output in both the console and the file specified in code.
  • Email service is registered and can be injected in constructors when needed.
  • The database-related settings, such as connection strings, database name, container names and partition keys, are stored in local.settings.json and are read into the configuration object.
  • AddCosmosDB() is an extension method defined in the Infrastructure project so that it can be used in both the API project and the Azure Functions project. AddCosmosDB() registers a singleton instance of CosmosDbContainerFactory, which is a wrapper class for the CosmosClient provided by the Cosmos DB .NET SDK v3. According to the Microsoft documentation on CosmosClient, a single instance of CosmosClient should be used for the lifetime of the application. For more explanation of this setup, please see the article Clean Architecture — ASP.NET Core API using Partitioned Repository Pattern and Azure Cosmos DB.
  • Because we are using the repository pattern, we also register the ToDoItemRepository, which internally knows which Cosmos DB container to use and what the partition key value is.
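The points above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Startup.cs from the repository: the StarterApp namespace, the settings properties, the configuration keys ("SendGridEmailSettings", "CosmosDb:ConnectionString"), the log file path and the IEmailService shape are all hypothetical, and a plain CosmosClient singleton stands in for the project's AddCosmosDB() extension and CosmosDbContainerFactory. ConfigureAppConfiguration assumes version 1.1.0 or later of the Microsoft.Azure.Functions.Extensions package.

```csharp
using System.IO;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
using Serilog; // plus Serilog.Extensions.Logging, Serilog.Sinks.Console, Serilog.Sinks.File

// Tells the Functions host which class to run at startup.
[assembly: FunctionsStartup(typeof(StarterApp.Startup))]

namespace StarterApp
{
    // Hypothetical strongly-typed settings class.
    public class SendGridEmailSettings
    {
        public string ApiKey { get; set; }
        public string FromEmail { get; set; }
    }

    // Hypothetical email abstraction and implementation.
    public interface IEmailService { }

    public class SendGridEmailService : IEmailService
    {
        private readonly SendGridEmailSettings _settings;

        // Settings arrive as IOptions<T> through constructor injection.
        public SendGridEmailService(IOptions<SendGridEmailSettings> options)
        {
            _settings = options.Value;
        }
    }

    public class Startup : FunctionsStartup
    {
        public override void ConfigureAppConfiguration(IFunctionsConfigurationBuilder builder)
        {
            // Add local.settings.json as a JSON configuration provider.
            // It is optional because the file is normally not deployed to Azure.
            string root = builder.GetContext().ApplicationRootPath;
            builder.ConfigurationBuilder.AddJsonFile(
                Path.Combine(root, "local.settings.json"),
                optional: true, reloadOnChange: false);
        }

        public override void Configure(IFunctionsHostBuilder builder)
        {
            // Bind a configuration section to the strongly-typed settings class.
            builder.Services.AddOptions<SendGridEmailSettings>()
                .Configure<IConfiguration>((settings, configuration) =>
                    configuration.GetSection("SendGridEmailSettings").Bind(settings));

            // Serilog writing to both the console and a rolling file.
            var logger = new LoggerConfiguration()
                .WriteTo.Console()
                .WriteTo.File("logs/functions-.log", rollingInterval: RollingInterval.Day)
                .CreateLogger();
            builder.Services.AddLogging(logging => logging.AddSerilog(logger));

            builder.Services.AddScoped<IEmailService, SendGridEmailService>();

            // One CosmosClient for the whole application lifetime.
            builder.Services.AddSingleton(sp => new CosmosClient(
                sp.GetRequiredService<IConfiguration>()["CosmosDb:ConnectionString"]));
        }
    }
}
```

Registering the client through a factory delegate defers reading the connection string until the first resolution, by which point all configuration providers have been loaded.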

StarterFunction.cs

A few takeaways are:

  • This class and its entry point Run() method are no longer static.
  • The Run() method is now asynchronous, so that it can await asynchronous methods.
  • Required services, such as the logger, the email service, and the todo item repository, are injected through the class constructor. Each has its own lifetime (singleton, scoped, transient) as specified in the Startup.cs class.
  • ILogger log is no longer in the signature of the Run() method. Instead, it is a private class member supplied through dependency injection.
  • RunOnStartup is set to true for debugging purposes. Depending on your requirements, you may want to set it back to false before deployment.
  • Specification Pattern is used to query Cosmos DB.
  • For demonstration purposes, the function retrieves all the todo items that are not completed yet and sends an email to the specified email address.
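A stripped-down sketch of a function class with this shape is shown below. It is an approximation rather than the repository's StarterFunction.cs: ToDoItem, IToDoItemRepository, IEmailService, their methods, the schedule and the recipient address are hypothetical, and the GetIncompleteItemsAsync() call stands in for the specification-based query used in the actual project.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;       // FunctionName, TimerTrigger, TimerInfo
using Microsoft.Extensions.Logging;

// Hypothetical types that would normally come from the Core project.
public class ToDoItem
{
    public string Title { get; set; }
    public bool IsCompleted { get; set; }
}

public interface IToDoItemRepository
{
    Task<IReadOnlyList<ToDoItem>> GetIncompleteItemsAsync();
}

public interface IEmailService
{
    Task SendAsync(string to, string subject, string body);
}

// Non-static class, so dependencies can be constructor-injected.
public class StarterFunction
{
    private readonly ILogger<StarterFunction> _log;
    private readonly IEmailService _emailService;
    private readonly IToDoItemRepository _repository;

    public StarterFunction(
        ILogger<StarterFunction> log,
        IEmailService emailService,
        IToDoItemRepository repository)
    {
        _log = log;
        _emailService = emailService;
        _repository = repository;
    }

    // Six-field NCRONTAB expression: every 5 minutes.
    // RunOnStartup = true is convenient while debugging; consider false in production.
    [FunctionName("StarterFunction")]
    public async Task Run(
        [TimerTrigger("0 */5 * * * *", RunOnStartup = true)] TimerInfo timer)
    {
        IReadOnlyList<ToDoItem> items = await _repository.GetIncompleteItemsAsync();
        _log.LogInformation("Found {Count} incomplete todo items", items.Count);

        if (items.Count > 0)
        {
            // Hypothetical recipient address.
            await _emailService.SendAsync(
                "someone@example.com",
                "Incomplete todo items",
                $"You have {items.Count} incomplete todo items.");
        }
    }
}
```

Because the function depends only on interfaces, it can be unit tested by passing fake implementations to the constructor.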

Conclusion

We now have an Azure Functions project sitting in our Clean Architecture solution, where it can take care of long-running tasks or tasks that do not fit in the API workflow, such as data propagation when a record changes. The Azure Functions project includes popular functionality that is often required by a production-ready solution.

You may have also noticed that because we are using Clean Architecture and have set up the Core and Infrastructure projects, we did not have to repeat any code and are able to keep our Azure Functions specific code really DRY and clean.

Many thanks for reading!

Shawn Shi
The Startup

Senior Software Engineer at Microsoft. Ex-Machine Learning Engineer. When I am not building applications, I am playing with my kids or outside rock climbing!