Storage layer in General Purpose Azure SQL Managed Instance

Azure SQL Database Managed Instance is a cloud version of SQL Server Database Engine implemented on Azure cloud. Managed Instance has so-called General Purpose architecture where compute is separated from storage layers. There are two layers in General Purpose architecture:

  1. Compute layer that runs SQL Server Database Engine process and it is used for query processing and data caching. This is stateless compute VM that is re-initialized on another node if the process or underlying infrastructure crashes.
  2. Data layer where all database files (.mdf, ldf, .ndf) files are placed. The data layer is implemented using Azure Premium Disk storage that has built-in replication and redundancy that guarantee that the data will be preserved.

Every database file is placed on separate Azure Premium disk (one file per disk). Azure Premium disks that are used in storage layer have fixed sizes: 128 GB, 256 GB, 512 GB, 1 TB, 2 TB, and 4 TB, and Managed Instance uses minimal disk size that is required to fit the database file with some size. For example, database file with 200 GB will be placed on 256 GB disk, a database file with 800 GB will be placed on 1 TB disk, etc. Every disk automatically grows if the database file size reaches the disk storage limits.

Important thing that you need to understand is that file performance depends on the disk size. Bigger disks have bigger throughput and IOPS, as shown in the following table:

This might be surprising because the larger databases files are faster than the smaller ones. In some cases you might get better performance just by increasing the file size. Also, if you have multiple files in database and your workload can be parallelized, you might get great performance because the database files do not share IOPS and throughput. However, if you have multiple small files and your workload touches the same file, you might experience performance issues because multiple files are smaller and smaller IOPS and throughput.

Every Managed Instance has up to 35 TB of total internal storage reserved for Azure Premium disks storage layer. If you create many files and the total size of the underlying disks reaches 35 TB, you will get internal storage limit error. This means that once you provision Managed Instance you have two storage limits:

  1. Managed instance user-defined storage is the Managed Instance max storage size for your database files that you choose on portal and you pay for this amount of storage.
  2. Internal physically allocated azure premium disk storage that cannot exceed 35 TB.

As a result, you cannot have more than 280 files on the General Purpose instance because 280 files placed on the smallest 128GB disks will reach 35TB limit.

You can reach this 35 TB limit even with the smaller number of files (less than 280). For example, a Managed Instance could have one database file with 1.2 TB that is placed on a 4 TB disk, and 248 files each 1 GB size that are placed on separate 128 GB disks. In this example, the total disk storage size is 1 x 4 TB + 248 x 128 GB = 35 TB. However, the total reserved space for database files on the instance is 1 x 1.2 TB + 248 x 1 GB = 1.4 TB. This illustrates that under certain circumstance, due to a very specific distribution of files, a Managed Instance might reach the 35 TB reserved for attached Azure Premium Disk when you might not expect it to.

In this example existing databases will continue to work and can grow without any problem as long as new files are not added. However new databases could not be created or restored because there is not enough space for new disk drives, even if the total size of all databases does not reach the instance size limit. The error that is returned in that case is not clear.

You can use Transact-SQL scripts that can help you to see are you reaching this internal 35 TB storage limit on General Purpose Managed Instance.

First, we will create a schema and view that wraps standard sys.master_files view and returns the allocated disk size for every file:

CREATE SCHEMA mi;
GO
CREATE OR ALTER VIEW mi.master_files
AS
WITH mi_master_files AS
( SELECT *, size_gb = CAST(size * 8. / 1024 / 1024 AS decimal(12,4))
FROM sys.master_files )
SELECT *, azure_disk_size_gb = IIF(
database_id <> 2,
CASE WHEN size_gb <= 128 THEN 128
WHEN size_gb > 128 AND size_gb <= 256 THEN 256
WHEN size_gb > 256 AND size_gb <= 512 THEN 512
WHEN size_gb > 512 AND size_gb <= 1024 THEN 1024
WHEN size_gb > 1024 AND size_gb <= 2048 THEN 2048
WHEN size_gb > 2048 AND size_gb <= 4096 THEN 4096
ELSE 8192
END, NULL)
FROM mi_master_files;
GO

Now we can see the size allocated for the underlying Azure Premium Disks for every database file:

SELECT db = db_name(database_id), name, size_gb, azure_disk_size_gb
FROM mi.master_files;

Sum of the azure disk sizes should not exceed 35 TB — otherwise you will reach the azure storage limit errors. You can check total allocated azure storage space using the following query:

SELECT storage_size_tb = SUM(azure_disk_size_gb) /1024.
FROM mi.master_files

Using this information, you can find out how many additional files you can add on a managed instance (assuming that new file will be smaller than 128GB):

SELECT remaining_number_of_128gb_files = 
(35 - ROUND(SUM(azure_disk_size_gb) /1024,0)) * 8
FROM mi.master_files

This is important check because if this count became zero, you will not be able to add more files of database on the instance.

As a conclusion, Azure SQL Database Managed Instance might be very interesting option for hosting your SQL Server databases if you want to easily lift your data into the Azure cloud. However, since this is the platform that is different than your environment, you should be aware what are the constraints.