Multi-Tenancy in Flask

Mahshooq Zubair
9 min readJan 24, 2024

--

Hello World,

Today, I would like to share a nice and challenging task that I worked on during my early days as a Junior Software Engineer — creating a multi-tenant-based Software as a Service (SAAS) application using Flask.

Disclaimer: The architecture detailed in this blog post stems from my experiences as a Junior Software Engineer. It is research-based and may have limitations. Your feedback and alternative approaches are welcome. This is a technical blog, thus there may be more code samples than in a typical content blog. Please be aware of this. Furthermore, the example code that is provided here is meant to be illustrative and might contain unhandled exceptions or security flaws. Before putting the code into a production environment, validate it.

The Challenge: Multi-Tenancy in a Flask Application

The primary requirement was to develop a SAAS-based application where tenant data needed to be meticulously isolated from other tenants and public data. Drawing an analogy from platforms like Shopify, each store represents a tenant. While general information such as tenant details and user login information could reside in the public database/schema, tenant-specific data like stored items and orders had to be stored in a dedicated tenant database/schema.

To summarize, the goal was to create a Flask application with a robust database structure, featuring a public schema for general information and a separate schema for each tenant’s specific data. This setup is needed to seamlessly support database migrations and other db operations.

And to put it briefly, the blog is all about how I overcame the following challenges while developing.

  • How to organize a Flask application for a multi-tenant SAAS architecture with isolated tenant data?
  • How to segregate the public schema for general information from tenant-specific schemas?
  • How to establish user authentication and link users with specific tenants?
  • How to handle database migrations separately for the public schema and tenant schemas?
  • How to dynamically create a new schema in the database for each new tenant during the registration process?
  • How to implement middleware to switch between the public and tenant schemas based on the route of incoming requests?
  • Why opt for schemas over databases for each tenant, and what performance improvements are observed?

Exploring Existing Solutions

Like any developer, I started by searching Google and reading through a few multi-tenancy-related blogs and publications. While some discussed similar topics, the architectures presented were diverse. Some utilized different databases for different tenants, while others employed different databases for specific tables only (a feature provided by many libraries and frameworks, such as Binds in SQLAlchemy and Django). What I needed was a specific set of tables in both the public and tenant schemas for every tenant.

Let’s Dive into the Code

For ease of understanding, I’ll reference a demo app repository https://github.com/MQ-xz/multi-tenancy-in-flask and its commit history. Imagine a SAAS project resembling Shopify, where a public schema holds User and Tenant tables, and each tenant has its schema with Product and Order tables.

Note: This blog will focus on addressing challenges faced during development. For a comprehensive understanding of the code, refer to the repository.

Project Setup [b7aa388]

I began by establishing a basic Flask REST API project and incorporating the necessary dependencies. To facilitate development, I included a Docker Compose file.

Flask basic API Project structure

The project setup follows a standard Flask configuration, and I won’t delve into it extensively here. For a detailed overview of the changes, please refer to the commit history.

Public Schema [a698765]

Initially, I defined the database model (Table) for the public schema, incorporating the User and Shop tables. This involves a standard SQLAlchemy model setup, typical in Flask projects.

public schema tables

Here, database operations are carried out using flask-sqlalchemy, and database migrations are handled via flask-migrate.

So I just configured flask-migrate and flask-sqlalchemy like I would for any other Flask project. However, in this case, the public schema was migrated using a specified directory. to facilitate the management of the migrations for the tenant schema and public schema independently.

The flask migrates command’s “-d migration_dir” argument can be used to accomplish this.

flask db migration

Authentication [1e9b227]

Thus, the public schema is now available. The user’s authentication must now be added. for us to verify and carry out the basic tasks, such as setting up a new shop (tenant), etc.

I therefore used flask-jwt-extended for authentication. and added the configuration that was needed for it. and included the registration and login endpoints.

Moreover, I updated the JWT token with an additional_claim, which is the shop_id as a tenant, for us to utilize it to determine the tenant for the request (The real usage will become apparent later.). you can also use other methods like in headers to store the id.

login endpoint

Initialize Tenant Schema [1479a2d]

Having prepared the public schema and authentication, the next step involves adding the tenant schema. To accomplish this, a new migration directory for the tenant schema was created, accompanied by essential configurations.

For the creation of the tenant schema, initial steps included defining table models (Product and Order) in a standard manner. Subsequently, a dedicated migration directory for the tenant schema was initialized in a separate location (in this case, migrations/tenant, as indicated in commit 1479a2d).

tenant migration

Before generating a migration version for the tenant schema, a couple of updates were implemented:

  1. The model for the public schema was enhanced by adding two additional attributes: __bind_key__ = ‘public’ and __table_args__ = {‘schema’: ‘public’}. Additionally, a new configuration was introduced in settings, namely SQLALCHEMY_BINDS
updated user model

- __bind_key__ is typically utilized to define multiple database connections, allowing different models to be associated with different databases. However, in this context, it serves the purpose of identifying the schema. For the public schema, __bind_key__ == ‘public’was specified, while nothing was defined for the tenant schema.

- SQLALCHEMY_BINDS is crucial for SQLAlchemy to identify the schema. In this case, it was configured SQLALCHEMY_BINDS = {‘public’: ‘postgresql://postgres:postgres@localhost:5432/public’} to specify the connection details for the public schema.

2. The get_metadata function in the env.py file within the public migration directory underwent an update. The modification involved changing target_db.metadatas[None] to target_db.metadatas[‘public’]. This adjustment ensures that only the metadata for the public schema is retrieved during migration, taking into account models with the __bind_key__ = ‘public’ attribute.

Customizing Migration Script [bae11bb]

When executing tenant migrations, a customized migration script is required to update each tenant schema individually. The following modifications were implemented:

  1. Introduced a list to store the schema names for each tenant (utilizing the shop ID as the schema name). This list is populated by fetching tenant information from the Shop table in the public schema. Running the migration script will subsequently update the migration for each tenant schema. Offline migrations for tenant schemas were blocked, as they depend on the public schema and necessitate running the application to retrieve data from the model.
customized tenant migration file

2. Updated the run_migrations_online function to customize the migration for each tenant schema.

The default function was completely removed and replaced with a custom function. In this custom function, a database connection named `connectable` was created. The script then connects to `connectable` and iterates through each tenant in the tenant list.

While looping, the schema is switched using the SET search_path TO <schema_name> query, followed by running the migration script for the respective tenant schema.

Additionally, to prevent the creation of multiple migration versions for the same tenant schema when running the flask db migrate command, a check was implemented to block the loop during migration file creation.

customized migration script

Initial Migration for Tenant Schema [6e5aec2]

When utilizing the tenant list method, where no tenants are initially available for migration, a manual approach was adopted to create the first migration version for a tenant schema. This involved creating a schema in the database via the psql shell, for example:

psql manual schema creation

Subsequently, this tenant was added to the tenant list, and the migration script was executed to generate the migration version for the tenant schema. Once the migration was complete, the test tenant was removed from the list.

Creating Schema while registering a new Tenant [c469181]

Now that the tenant schema is prepared, the next step involves creating a new schema for each tenant during the registration of a new shop. To streamline this process, a utility class Database was introduced, featuring functions like create_schema, get_engine, get_session, and others.

Upon the completion of the shop registration, the create_tenant_schema function within the Database class is invoked. This function executes the following steps:

1. Create a Schema in the Database:
The creation of the schema is achieved using the Sqlalchemy CreateSchema function.

schema creation function

2. Create Tables in the Schema:
Tables are created in the schema using the db.metadata.create_all function. This function generates all the necessary tables within the newly created schema.

3. Run Migration for the Schema:
To manage future migrations in the newly created database, the current migration version is added to the alembic_version table. To implement this, the alembic_version table is manually created within the schema, and the current migration version is obtained using Flask-Migrate’s command flask db heads. The retrieved version is then added to the alembic_version table.

migration updates

This process ensures that a new schema is seamlessly created for each tenant during the shop registration, complete with the necessary tables and migration version management.

The Middleware [30bd3db]

To handle user requests to both the tenant and public schemas, a middleware was implemented using the before_request function in Flask. This middleware runs before every request, allowing for the seamless management of requests to different schemas.

The flask routes blueprint for the public and tenant was already separated, enabling easy identification of whether the request was intended for the public or tenant. When a request is directed towards the public route, no additional actions are needed. However, for requests to the tenant route, the schema needs to be switched.

As previously mentioned, the shop_id is included additional_claim in the JWT token, facilitating the identification of the tenant for which the request is intended. The switch_schema function in the Database class is then utilized to switch to the corresponding tenant schema.

request middleware

Why do we choose schema per tenant instead of a database?

The decision to use schemas instead of separate databases for each tenant was driven by several factors, including performance improvements and ease of integration. Here are the key reasons behind choosing schemas:

1. Performance Enhancement:
Initially, we employed distinct databases for each tenant. However, after experimenting with schemas, we observed a substantial performance improvement. The reduced overhead associated with managing multiple database connections contributed to more efficient operations.

2. Flexibility with Models:
Schemas provide greater flexibility in integrating public models with tenant-specific models. This flexibility is particularly advantageous when common structures or entities (public models) need to be shared across all tenants while maintaining separate tenant-specific data.

3. Minimized Database Switching:
Managing different databases for each tenant often involves frequent switching between databases based on the tenant context. By utilizing schemas, the need for such frequent database switching is minimized. This streamlines data access and contributes to better performance.

4. Seamless Access to Public Schema:
Schemas allow for seamless access to the public schema without the need for explicit schema switching. This not only simplifies data access but also improves overall application efficiency.

5. Foreign Key References:
Using schemas facilitates foreign key references from tenant schemas to the public schema. This is particularly beneficial when establishing relationships between entities in tenant and public schemas, contributing to a more integrated and coherent data model.

In summary, the decision to use schemas over databases was driven by performance gains, increased flexibility with models, minimized database switching, and the ability to establish foreign key references between tenant and public schemas. This approach aligns well with the specific requirements and design considerations of the multi-tenant Flask application.

Wrapping Up

Ya, that’s it, it’s done. Explore the entire code in the repository (https://github.com/MQ-xz/multi-tenancy-in-flask), and refer to the commit history for a step-by-step understanding of the changes.

For any questions or suggestions, feel free to reach out.

Happy coding!

--

--

Mahshooq Zubair

A Software Engineer, who build things that breaks into systems.