Multi-Instance GPU (MIG) of NVIDIA GPUs

Introduction and History

This post delves into the MIG feature of NVIDIA GPU devices. GPUs are the primary parallel general-purpose processors for tasks like rendering graphics and training deep learning models. If you want to learn more about how these processors work, read the following post:

NVIDIA introduced the MIG capability with its Ampere architecture, which powers the A100 40GB GPU, in May 2020. The feature is also supported on the A30 and H100 GPUs. Note that the H100 belongs to a newer architecture, called Hopper, which was introduced in March 2022.

MIG

The MIG feature makes it possible to split a large data center GPU like the A100 into smaller GPUs whose memory, cache, and compute cores are completely isolated from each other. The advantage of this feature is the high QoS it offers: since each GPU instance has its own isolated resources, there is no resource interference (competition) among applications. When we use a single GPU to run more than one application, those applications compete for GPU resources like memory, cache, and compute cores, which is known as resource contention. With MIG, however, a GPU instance is set aside entirely for a specific application. Note that a GPU instance can also host more than one application, since it behaves as an independent GPU, but that is not good practice unless the user knows exactly what she is doing.

To understand resource contention, consider a scenario with two applications: one with large memory and compute demands that can saturate a GPU for weeks, and a small program with a 2-hour execution time that uses half of the same GPU. Executing them together on one GPU causes a dramatic slowdown for the small one (finishing in 5 hours — the finish time is more than doubled!), which results in a bad user experience. Both programs slow down, but the slowdown is more noticeable for the smaller one, since the user who submitted it expects it to finish soon.

The following figure shows, from the highest-level perspective, how MIG technology can divide an A100 GPU into a number of smaller GPUs. By a smaller GPU, we mean a GPU with less memory and fewer computing cores than the original GPU.

MIG divides a GPU into smaller GPUs (image credit: NVIDIA MIG website)

In addition, the following figure shows how a GPU can be divided according to users' needs. Consider a scenario where we have an A100 40GB GPU and four students: A, B, C, and D. Student A needs 20GB of GPU memory for her project, while the other students just need GPUs of their own. We build four GPU instances: one with 20GB especially for student A, and three others with 10GB, 5GB, and 5GB of memory. We can also choose the number of computing cores per instance based on the A100 whitepaper.
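As a back-of-the-envelope check, each MIG profile consumes a number of GPU "slices", and an A100 exposes seven of them in total. The sketch below verifies that the four-student plan fits. The profile IDs are assumed from a typical A100 40GB table as printed by `nvidia-smi mig -lgip` (9 = 3g.20gb, 14 = 2g.10gb, 19 = 1g.5gb); verify them on your own system before creating any instances.

```shell
# Sanity-check sketch: does the four-student plan fit in the A100's 7 slices?
# Profile IDs below are assumed A100 40GB values; check `nvidia-smi mig -lgip`.
slices_for() {
  case "$1" in
    9)  echo 3 ;;   # MIG 3g.20gb -> student A (20GB)
    14) echo 2 ;;   # MIG 2g.10gb -> the 10GB instance
    19) echo 1 ;;   # MIG 1g.5gb  -> each 5GB instance
  esac
}
total=0
for p in 9 14 19 19; do
  total=$(( total + $(slices_for "$p") ))
done
echo "total slices: $total"   # 7 -> the plan fits on a single A100
# If the IDs check out on your system, the instances would be created with:
#   sudo nvidia-smi mig -cgi 9,14,19,19 -C
```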

(image credit: NVIDIA MIG use guide website)

A100 and H100 GPUs can be divided into a maximum of seven (7) instances, while the A30 can be divided into a maximum of four (4). To check the possible configurations, see the NVIDIA documentation here.

Some Commands to Ease Life with MIG

To check all available GPUs and GPU instances, the following bash command will list them. Note that the NVIDIA driver must be installed on the system first.

$ nvidia-smi -L

The output of the above command will look as follows.

Example output of “nvidia-smi -L” command
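The MIG entries in that listing can be picked out with ordinary text tools. Below is a minimal sketch using a made-up sample that only mimics the typical output format (the UUIDs are not real); on a real system, pipe the output of `nvidia-smi -L` itself into the `grep`.

```shell
# Count the MIG instances reported by `nvidia-smi -L`.
# The sample text below is an assumed/typical output shape, not a real capture.
sample='GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-aaaaaaaa)
  MIG 3g.20gb     Device  0: (UUID: MIG-bbbbbbbb)
  MIG 2g.10gb     Device  1: (UUID: MIG-cccccccc)'
# Lines for MIG devices start with "MIG <profile>"; UUID fragments like
# "MIG-..." do not match the pattern, so they are not counted.
mig_count=$(printf '%s\n' "$sample" | grep -c 'MIG [0-9]')
echo "MIG devices: $mig_count"
```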

To select a specific GPU with nvidia-smi, pass the index of the GPU, which appears in the output of the previous command. The following command shows the selected GPU's information.

$ nvidia-smi -i <index_of_the_GPU_visible_in_the_previous_command>
Example output of “nvidia-smi -i 0”

The first step in configuring a MIG-enabled GPU is to enable its MIG mode, which requires sudo privilege.

$ nvidia-smi -i <GPU_index> -mig 1

You may encounter a message saying “Warning: MIG mode is in pending enable state for GPU XXX:XX:XX.X:In use by another client …”. In this scenario, you need to stop the nvsm and dcgm services, then try again.

$ sudo systemctl stop nvsm
$ sudo systemctl stop dcgm
$ sudo nvidia-smi -i 0 -mig 1

The following command lists the configurations that users can choose from:

$ nvidia-smi mig -lgip
List of profiles with the above command

Note that if you do not have any MIG-enabled GPU on the system, the above command will report that no MIG-enabled devices were found.

To create instances based on the profile IDs, the following command splits a GPU into three instances with different resource sizes (the -C flag also creates the corresponding compute instances):

$ sudo nvidia-smi mig -cgi 19,14,5 -C
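What those three IDs stand for depends on the GPU's profile table; the mapping sketched below is assumed from a typical A100 40GB table as printed by `nvidia-smi mig -lgip`, so double-check it on your own hardware.

```shell
# Decode the profile IDs used in the command above.
# The names and sizes are assumed A100 40GB values; verify with
# `nvidia-smi mig -lgip` on your system.
decoded=$(for id in 19 14 5; do
  case "$id" in
    19) echo "$id -> MIG 1g.5gb  (1 slice,  5GB)"  ;;
    14) echo "$id -> MIG 2g.10gb (2 slices, 10GB)" ;;
    5)  echo "$id -> MIG 4g.20gb (4 slices, 20GB)" ;;
  esac
done)
printf '%s\n' "$decoded"
```

Under this assumed mapping, the three instances together use 1 + 2 + 4 = 7 slices, i.e., the whole A100.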

If you want more detailed explanations, read the NVIDIA documentation for MIG. You are very welcome to share your opinions or ask questions regarding MIG.
