Hybrid Multi-tenant SaaS application — System Design
What is a Multi-tenant SaaS application?
A multi-tenant application is nothing but a software application basically a web app, where a single application handles multiple tenants (organization/company). Software as a Service (SaaS) applications are usually multi-tenant in nature.
I have been developing applications using Microsoft stack for 7 years, so references which I will be sharing in this article will be based on ASP.NET Core. But similar kind of concepts is applicable in other tech stacks like JAVA, Golang, Ruby on Rails, etc.,
In this article, I will describe in detail how a hybrid multi-tenant system can be architected to support a single database per tenant or a single database for multiple tenants system design using a single database schema.
Different types of architectures in Multi-tenant system
There are different types of architecture possible while designing a Multi-tenant system. Depending on your domain needs, you can select the appropriate architecture that suits your application. I recommend having a look at an article by Microsoft on Multi-tenant SaaS design patterns.
Tenancy Model Comparison
Basically, 4 types of tenant models are commonly used
Standalone — Separate application and separate database per tenant.
Database per tenant — Single application but database is separate per tenant.
Shared Multi-tenant — Single application and single database shared across tenant through virtual separation by software. Usually, a cluster of databases and application is deployed for a set of tenants.
Hybrid Multi-tenant Model — Single application supporting both kinds of database architecture, database per tenant, or multiple tenants sharing the same database.
The following table summarizes the differences between the main tenancy models.
Hybrid Multi-tenant system architecture
A hybrid multi-tenant system is a system where a single web application handles multiple tenants. Application is designed in such a way that it can support Databases per tenant or shared databases across multiple tenants architecture.
The diagram below depicts the overall system design.
Terminologies
Some basic terminologies used in this article
SaaS Provider — A company which sells a SaaS product
Organization (Tenant) — A company or a client who has purchased a license from the SaaS provider for his/her organization.
Domain — The main domain of SaaS provider, in this blog it is assumed as issuetracker.com
Sub Domain — A subdomain, which a tenant has registered.
For Eg: org-a.issuetracker.com
Catalog Database — A central database containing information on who has registered and what are their registered domains. It is mainly used as a lookup database for finding tenants.
Application (Org) Database — This database is the main database where actual application data for a tenant is stored.
Use cases of hybrid Multi-tenant model
If you are developing a SaaS application, then your product might have different pricing plans. Say for example you offer two pricing plans for your product,
1. Freemium Plan — Uses a shared database between tenants to reduce operation cost
2. Enterprise Plan — Uses separate database per tenant to meet privacy compliance needs and a higher degree of data security requirements.
Usually, from freemium customers, you won’t get any revenue. So you would like to have operating costs at minimal to serve those customers. For this plan shared database is a good choice. In a single shared database, you can store data of 1000’s customers (tenants). In this way, you can avoid the operation cost of running separate databases for each tenant.
Enterprise customers, these are your actual paying customers. Usually, Enterprise may have several privacy compliances needs and require a higher degree of data isolation. For this type of customer, you might need to provide a separate database for each.
Microsoft Azure SQL database has a concept called Elastic Pool designed for this purpose. A single elastic pool can handle at most 500 separate databases.
So your application should be designed in such a way that a single codebase without any modifications should be capable of supporting both pricing plans.
Application design for handling multi-tenancy
At the application level, tenants are identified and virtually separated by domain. Usually, a SasS provider allows a customer to select a sub-domain for their organization at the time of registration. Some offer an option to map custom domains also.
For example, consider a SaaS provider selling an Issue Tracker SaaS product. The main domain of SaaS provider is issuetracker.com
Now, suppose an Organization named “ org-a” buys an Issue tracker. At the time of registration, customer-a chooses a domain
org-a.issuetracker.com. Similarly, one more organization org-b registers a domain org-b.issuetracker.com. Likewise, n number of customers can register. Each customer has a unique distinct domain.
In the multi-tenant system, the main identifier for identifying an organization (tenant/customer) for virtual separation is sub-domain. Using a sub-domain each tenant is identified at the application level and isolated data is served. In this way, each customer sees their own version of the site customized by their admin.
To explain this process further let us consider a scenario,
- A user loads a URL org-a.issuetracker.com in the browser.
- Once a URL is loaded a Web application mapped to the domain receives a request.
- The application then resolves a tenant, i.e., finds the location of the database where requested customer data is located.
- After finding data application then processes a request, and returns the requested data to the client.
- Here n no. of users will be requesting the same application.
- Based on where the request comes (URL) a tenant is resolved and data for a particular tenant is returned to the requested user.
- Note: Here a single application is handling all the requests of multiple tenants and serving isolated data to clients.
Tenant resolution using sub-domain
The tenant resolution process involves finding the location of the application database to process requests.
The flow of how tenant resolution works are explained below:
- A user from Org A requests a page named Tasks through browser
org-a.issuetracker.com/tasks - The request is received by the application.
- The application finds the domain using the requested URL. For URL org-a.issuetracker.com/tasks domain is org-a.issuetracker.com
- After finding the domain, the application looks up in the Catalog database for an entry whose domain value matches exactly org-a.issuetracker.com (I will explain later in this article, schema for Catalog Database)
- Once a matching entry row is found in the database data related to the tenant is retrieved.
Org ID — Unique Identifier for a tenant
Connection String — Database connection string (Location of actual application database which contains data for Org A) - The application now connects to the database using the connection string found in the previous step.
- Application process data and returns result back to the client. Now the client sees the data and version of the site customized for his/her organization.
- The same process is repeated every time a request is received.
Database design
The database schema explained below is just a sample schema to explain the concept. Based on application needs or domain in which you are working you may expand the schema.
You would need to create two databases for your SaaS application at a minimum.
- Catalog Database — This database is a centralized database that will help you in resolving tenants. It will hold data like who has registered, what are the tenants URLs, domain mapping, email mapping, etc.,
- Application / Organization Database — This database is a core database that will contain application-specific tables. Based on the type of user Freemium / Enterprise. A single database may have data of more than one tenant, virtually separated using Tenant Identifier ID (Org ID).
Catalog Database schema
A catalog database will have a table named “ Tenants” which will hold a collection of registered tenants and their respective database location. It is recommended to store the database connection string in encrypted form instead of plain text.
Tenant table details
org-id — A unique identifier for a tenant. Eg: OrgID = 1
org-name — A name of a registered organization. Eg: Organization A
org-domain — A subdomain / a custom domain mapped to the organization. Eg: org-a.issuetracker.com
db-connection-string — A encrypted connection string of application database (application database schema explained in next section).
pricing-plan-type — Freemium/Enterprise
Initially, just one centralized catalog database is enough. As your product grows you can deploy a cluster of databases in different regions. Each cluster may have its own centralized catalog database.
It is recommended that the catalog database is always at high availability.
Application (Org) Database schema
The application database is the main database where actual application data a tenant is stored.
The image depicts the schema of the Issue tracker database. Just one table is used to explain the concept of multitenancy. Actual Issue tracker database may have 100s of tables.
An application database name will be like org-1, org-2, org-3, org-freemium-1, org-freemium-2, etc., Each is a separate database with the same schema.
In the Catalog database explained in the previous section there was a connection string column. In the connection string column, we will store the server address of the application database connection string
For Eg: for Org 1 (Enterprise plan), the connection string value will be the server and database name of the org-1 database.
If later a need arises to move the database to a different zone/region. Then just a change in the connection string in the catalog database is required to switch.
For Organization 2 (Freemium plan), the connection string value will be the server and database name of the org-freemium-1 database.
For Organization 3 (Freemium plan), the Connection string value will be the server and database name of the org-freemium-1 database.
Both organizations 1 and 2 share the same database so both will have the same connection string. Later, if you plan to shard data and move org-3 to a separate database then just a change in the connection string is required to switch the database.
If your application has a separate database for reading and writing. Then in catalog DB, you can add two connection string columns one for the read and the other for write values of the connection string
Organization Table — This table holds the name and ID of the organization (Tenant). Note this table was present in the catalog database also with the name Tenants. In the application database, the organization table is maintained with limited information just to map foreign keys.
Catalog Database Tenants table will have a whole master collection list of tenants. But application database will have just the organization details whose data is maintained in the application database.
In the Freemium plan, tenants will be sharing the database, then the organization table will have all the organizations who are sharing the database in the application database. Say for example org-freemium-1 holds 100 organization data then just that 100 Organization data is maintained in the Organization table.
For enterprise customer data is stored in a separate individual database not shared with any organization, then the organization table will have just one entry of that enterprise.
Org-1 Database will have just Org 1 organization details
Since the model is a hybrid schema is common to support shared and single database architecture. This requires no change in code. Just you have to switch the connection string in the catalog database.
Brands Table — If your application supports a multi-brand concept for an organization. Then This table will have the brand information of an organization. The brand is nothing but a subdivision of an organization.
For Eg: A Organization may have different branches in different locations. Then that organization can create different brands for each branch to filter/isolate data. Then brands table holds a list of brands registered by an organization.
Tasks Table — The task table holds the data of issues/tasks in the Issue Tracker application. Note here each row has a field Org-id, brand-id, org-brand-id for virtual separation of data.
Org ID — It is the unique identifier key field which is used to virtually isolate data for a tenant. Each organization’s data is virtually separated using org-id.
In a freemium plan — a single database can hold multiple organization data. So while querying any data in where condition respective org-id filter is must to include. Using this filter virtual data separation is achieved.
SQL Query - Select * from tasks where org-id=1
Adding org id in where the condition is must, if it is missed then data of another tenant might expose to all tenants.
In an enterprise plan — a single database will hold only one organization’s data. So while querying it is not necessary to apply an org-id filter. Based on plan type filter can be added or removed.
SQL Query - Select * from tasks
Brand ID - If an organization has multiple brands then in query brand ID condition can be added to filter respective brand data.
SQL Query - Select * from tasks where org-id=1 and brand-id=1
OrgBrand ID - You may think we have Org ID, Brand ID then why we need this extra column. The main purpose of this column is to speed up the query.
Org Brand ID - It is a combination of Org ID and Brand ID. Eg: if your system supports a maximum of 1000 brands per tenant then you may use this format
org-brand-id = Org ID + 4 digit of Brand
Eg: org-brand-id = 10001 here First digit indicates Org ID, Last 4 Digits indicates Brand Id = 0001
How to use this in the query condition
Scenario 1: If you want to load all data of the organization you can use Org ID alone
SQL Query - Select * from tasks where org-id=1
Scenario 2: If you want to load data for a particular brand say Brand 1 of organization 1 then your query will be like
SQL Query - Select * from tasks where org-brand-id=1001
Note we can't use just brandid=1 omitting org-id, in a shared database (freemium plan) brandid=1 may belong to org-2, org-3... org -n
You might think we can query like below, SQL Query - Select * from tasks where org-id=1 and brand-id=1
There is nothing wrong with this. It is just for performance reason org-brand-id is included, in this query ( Select * from tasks where org-brand-id=1001
) you are applying just one filter in where condition (org-brand-id) in next query you are using two conditions ( Select * from tasks where org-id=1 and brand-id=1
).
Note: You should be very careful while querying data in a shared database, skipping to add org condition will expose data of other organizations.
In ASP.NET Core there is a possibility to add a Global Filter in Entity Framework, using that you can apply org-id condition at the global level to avoid the risk of developers missing to include org-id condition in each query.
Check out this blog to know more about Global filter- Global query filters in Entity Framework Core 2.0
- Depending on the nature of your application you can decide upfront how many organizations’ data a single database can handle by doing load tests, to avoid sharding of databases at later stages when your product grows.
- You can maintain a cluster of databases to load balance database loads. Your application design code needs no change for the support cluster. Just in the catalog database, you have to maintain the appropriate connection string of the database where a particular organization data resides.
- Use a middleware in the application for resolving tenants.
- Use caching techniques like storing the domain lookup information in Distributed / In-memory cache to speed up the tenant resolving process.
- If you are using a micro-services architecture for running batch/background jobs you can use a new centralized shared (orchestrator) database for storing time scheduled information/temporary data for batch processing jobs. If you store this data in the application (Organization) database then you might need to loop all application databases to find jobs to run. This will cause a performance overhead and unnecessary checking of an organized database that doesn’t have any jobs to process.
Hope, by now you might have got a basic idea of how Multi-tenant SaaS application works and how to design the core of it. In this article, I have described in detail how a hybrid multi-tenant model system can be designed with a flexible architecture to support a single database per tenant or a single database for multiple tenants using a single application codebase.
If you are going to develop a Multi-tenant SaaS using ASP.NET Core. Then I recommend reading this blog post for getting started to develop the core of your application.
SaaS Kit Multitenant framework by Ben Foster
Multi-tenant ASP.NET Core series by Gunnar Peipman
For ASP.NET Core open-source libraries checkout
Awesome curated collection of libraries to build Software as a Service (SaaS) Web application using ASP.NET Core
Originally published at https://blog.sharetechlinks.com on November 21, 2020.