vmstat explained

Sep 8, 2018

I was recently asked to explain the output of vmstat: each field and what that field tells you about the system. Let’s start with what vmstat is. vmstat (virtual memory statistics) is a system utility that collects and displays information about system memory, processes, interrupts, paging and block I/O. By giving vmstat a sampling interval, you can observe system activity in near-real time.
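To make the fields easier to work with, here is a minimal sketch in Python that runs vmstat with a sampling interval and turns each report into a dictionary keyed by column name. The function names are my own, and the parser assumes the default table layout (a banner line, a line of column names starting with r, then numeric rows); it is an illustration, not part of vmstat itself.

```python
import subprocess

def parse_vmstat(output):
    """Turn vmstat's default table into a list of {column: int} dicts."""
    columns = None
    reports = []
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":                      # the line naming the columns
            columns = fields
        elif columns and fields[0].isdigit():     # a data row
            reports.append(dict(zip(columns, map(int, fields))))
        # anything else (the "procs ---memory---" banner) is skipped
    return reports

def sample_vmstat(delay=1, count=5):
    """Run `vmstat <delay> <count>` and parse every report it prints."""
    result = subprocess.run(
        ["vmstat", str(delay), str(count)],
        capture_output=True, text=True, check=True,
    )
    return parse_vmstat(result.stdout)
```

Calling sample_vmstat(1, 5) takes five reports one second apart. Keep in mind that the first report is an average since boot, so it is usually the later rows that describe what the system is doing right now.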

Procs

The r and b values are calculated by traversing the /proc filesystem and reading the stat file of each process, i.e. /proc/<PID>/stat. Only one field is extracted from the stat file: field three (state). If the state is ‘R’ (running or runnable), the runnable counter is incremented; if the state is ‘D’ (uninterruptible sleep), the uninterruptible counter is incremented. All other states are ignored. For the full list of states, run man 5 proc.
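The traversal described above can be sketched in a few lines of Python. This is an illustration of the idea rather than vmstat's actual C code; the function names and the proc_root parameter are assumptions of mine. One detail worth showing is that field two (the command name) is wrapped in parentheses and may contain spaces, so the state is read after the last closing parenthesis.

```python
import os

def state_from_stat(stat_line):
    """Return field three (state) of a /proc/<PID>/stat line."""
    # Field two (comm) is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' before reading the state.
    return stat_line.rsplit(")", 1)[1].split()[0]

def count_r_and_b(proc_root="/proc"):
    """Count processes in state R (runnable) and D (uninterruptible sleep)."""
    runnable = uninterruptible = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():            # only numeric entries are PIDs
            continue
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                state = state_from_stat(f.read())
        except OSError:                    # the process exited mid-scan
            continue
        if state == "R":
            runnable += 1
        elif state == "D":
            uninterruptible += 1
        # every other state is ignored
    return runnable, uninterruptible
```

count_r_and_b() returns a (runnable, uninterruptible) pair. The script doing the counting is itself running, so expect the first number to be at least 1.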

What is an uninterruptible sleep state? An uninterruptible sleep state is one that will not handle a signal right away. The process resumes only if the waited-upon resource becomes available or if a timeout occurs during that wait (if the timeout is specified when the process is put to sleep). The uninterruptible sleep state is mostly used by device drivers waiting for disk or network IO. When the process is sleeping uninterruptibly, signals accumulated during the sleep are evaluated when the process resumes.

Memory

The swpd field indicates how much swap space is in use; this value increases when the system’s physical memory comes under pressure and the Linux kernel starts moving pages to the swap partition/file. When both physical memory and swap space have been exhausted, the Linux OOM (Out Of Memory) killer runs and kills one or more processes, favouring those that consume the most memory. The swpd value is derived from /proc/meminfo, as total swap minus free swap, rather than from the per-process stat files. The free field indicates how much memory is not in use at all. The buff field indicates memory used as temporary storage for raw disk blocks. The cache field shows the in-memory cache for files read from disk; the Linux kernel uses otherwise idle memory for disk caching and gives it back when a running program needs it. The free, buff and cache values are also read from /proc/meminfo.
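As a rough sketch of where these numbers live, the Python below parses /proc/meminfo text and picks out the fields behind the memory columns. The same file also exposes SwapTotal and SwapFree, whose difference is the amount of swap in use. The function names are hypothetical, and mapping cache to the Cached field alone is an assumption: some vmstat versions may add reclaimable slab memory on top.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its numeric value (KiB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values

def memory_columns(meminfo):
    """Derive vmstat-style memory columns, all in KiB."""
    return {
        "swpd": meminfo["SwapTotal"] - meminfo["SwapFree"],  # swap in use
        "free": meminfo["MemFree"],
        "buff": meminfo["Buffers"],
        "cache": meminfo["Cached"],  # assumption: Cached only, no slab
    }
```

Feeding it the contents of /proc/meminfo, for example memory_columns(parse_meminfo(open("/proc/meminfo").read())), should give values close to what vmstat prints on the same machine.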

Swap

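The si and so fields show how much memory is being swapped in from disk (si) and swapped out to disk (so) per second. Sustained non-zero values indicate that the system’s physical memory is under pressure and the swap partition/file is actively being used.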

IO

The bi and bo fields indicate the number of blocks read from a block device (bi) and written to a block device (bo) per second, as shown in the High IO Read Load and High IO Write Load examples.
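Both the swap columns and the bi and bo columns are rates, so they come from comparing two readings of cumulative counters. The sketch below assumes those counters are the pswpin, pswpout, pgpgin and pgpgout entries of /proc/vmstat, that the paging counters are already in 1 KiB units, and that pages are 4 KiB; all three are assumptions to check on your own system, and the function names are mine.

```python
def parse_counters(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, _, value = line.partition(" ")
        if value.strip().isdigit():
            counters[name] = int(value)
    return counters

def swap_and_io_rates(prev, curr, interval_s, page_kib=4):
    """Per-second si/so (KiB/s) and bi/bo (blocks/s) between two samples."""
    def per_second(name, scale=1):
        return (curr[name] - prev[name]) * scale / interval_s
    return {
        "si": per_second("pswpin", page_kib),   # pages swapped in  -> KiB
        "so": per_second("pswpout", page_kib),  # pages swapped out -> KiB
        "bi": per_second("pgpgin"),             # assumed to be 1 KiB blocks
        "bo": per_second("pgpgout"),
    }
```

Read /proc/vmstat twice, a known number of seconds apart, parse both readings and pass them in as prev and curr.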

System

The interrupt (in) and context switch (cs) values are read from the /proc/stat file, which holds counters accumulated since boot. vmstat converts these raw counters into per-second rates rather than printing the totals since the system booted:

duse= *cpu_use + *cpu_nic;  /* CPU USER + CPU NICE   */
dsys= *cpu_sys + *cpu_iow;  /* CPU SYSTEM + CPU WAIT */
didl= *cpu_idl;             /* CPU IDLE              */
Div= duse+dsys+didl;
hz=Hertz; /* get ticks/s from libproc */
divo2= Div/2UL;
printf(format,
...
...
 (unsigned)( (*inter                 * hz + divo2) / Div ),
 (unsigned)( (*ctxt                  * hz + divo2) / Div ),
...
...
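The arithmetic in that excerpt is easier to follow with numbers. Div is a tick count and hz is ticks per second, so multiplying a counter by hz and dividing by Div yields a per-second figure, and adding Div/2 before the integer division rounds to the nearest whole number instead of truncating. Here is a small Python sketch of the same calculation; the function names are mine and the sample numbers in the note are made up.

```python
def stat_totals(stat_text):
    """Extract cumulative interrupt and context-switch counts from /proc/stat."""
    totals = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("intr", "ctxt"):
            # On the `intr` line the first number is the grand total;
            # the numbers after it are per-interrupt counts.
            totals[fields[0]] = int(fields[1])
    return totals

def rounded_rate(count, hz, div):
    """Integer (count * hz) / div, rounded to nearest by adding div // 2."""
    return (count * hz + div // 2) // div
```

With a count of 1234567, hz of 100 and a Div of 987654, the exact quotient is just under 125. Plain integer division gives 124, whereas rounded_rate(1234567, 100, 987654) gives 125.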

So what is a context switch? A context switch is where the Linux scheduler takes the currently running task off the CPU and replaces it with another task that is ready to run; this is handled by the __schedule() function defined in kernel/sched/core.c.

What are interrupts? An interrupt is simply a signal that hardware sends when it wants the processor’s attention. A common example is a network card receiving a packet: the card raises an interrupt and the CPU then deals with the packet. You can list all the interrupts that have occurred on your system by viewing the /proc/interrupts file.
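/proc/interrupts has one column per CPU, which gets hard to read on a machine with many cores. As a rough sketch, the Python below sums the per-CPU counts for each interrupt. It assumes the usual layout (a header line of CPU names, then one row per interrupt with a label before the colon); the function name is mine.

```python
def interrupt_totals(text):
    """Sum the per-CPU counts for each row of /proc/interrupts."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())          # header line: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():        # reached the description text
                break
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals
```

Sorting the result by value is a quick way to see which device is generating the most interrupts.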

CPU

The CPU column provides an overview of where most of the CPU time is spent. These values are retrieved from the /proc/stat file.

The us field represents the CPU time spent in user space, where normal processes such as Nginx run. The sy field represents the CPU time spent in kernel space, for example by kernel threads such as kworker or by system calls made on behalf of user processes. The id field represents the CPU idle time. The wa field represents the CPU time spent idle while waiting for I/O to complete, usually disk or network I/O; a common case where this value climbs is an NFS mount that was mounted with the hard option and has since become stale. The CPU Waiting for IO example shows a high wa value. The final field, st, represents the percentage of time a virtual CPU waits for a real CPU while the hypervisor is servicing another virtual processor. Essentially, steal time counts the time during which your virtual machine was ready to run but could not, because other virtual machines were competing for the CPU.
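To show how percentages like these fall out of /proc/stat, here is a minimal Python sketch that takes two readings of the aggregate cpu line and reports the share of ticks in each bucket. The function names are mine, and the grouping (nice counted under us, irq and softirq counted under sy) is an assumption that may not match every vmstat version.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def cpu_ticks(stat_text):
    """Return the first eight counters of the aggregate `cpu` line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:9]]
            values += [0] * (len(FIELDS) - len(values))  # pad if fields are missing
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate cpu line found")

def cpu_percentages(prev, curr):
    """Percentage of ticks per vmstat CPU column between two samples."""
    delta = {name: curr[name] - prev[name] for name in FIELDS}
    total = sum(delta.values()) or 1       # avoid dividing by zero
    groups = {
        "us": delta["user"] + delta["nice"],
        "sy": delta["system"] + delta["irq"] + delta["softirq"],
        "id": delta["idle"],
        "wa": delta["iowait"],
        "st": delta["steal"],
    }
    return {name: round(100 * ticks / total) for name, ticks in groups.items()}
```

Read /proc/stat twice a second or so apart, pass both readings through cpu_ticks, and hand the results to cpu_percentages. Because each column is rounded separately, the five values may not add up to exactly 100.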

Examples

The majority of the examples shown below have been taken from the RedHat article [1].

CPU User Load

In this example, a standard audio file is encoded as an MP3 file using the lame encoder. This process is quite CPU intensive; vmstat running in parallel shows a user CPU time (us) of 97%.

CPU System Load

In this example, a file will be filled with random content using dd.

$ dd if=/dev/urandom of=500MBfile bs=1M count=500

For this, /dev/urandom will supply random numbers, which will be generated by the kernel. This will lead to an increased load on the CPU (sy – system time). At the same time, the vmstat executing in parallel will indicate that between 93% and 97% of the CPU time is being used for the execution of kernel code (for the generation of random numbers, in this case).

RAM Bottleneck (swapping)

In this example, many applications will be opened (including VirtualBox with a Windows guest system, among others). Almost all of the working memory will be used. Then, one more application (OpenOffice) will be started. The Linux kernel will then swap out several memory pages to the swap file on the hard disk, in order to get more RAM for OpenOffice. Swapping the memory pages to the swap file will be seen in the so (swap out — memory swapped to disk) column as vmstat executes in parallel.

High IO Read Load

A large file (such as an ISO file) will be read and written to /dev/null using dd.

$ dd if=bigfile.iso of=/dev/null bs=1M

Executed in parallel, vmstat will show the increased I/O read load (the bi value).

High IO Write Load

In contrast with the previous example, dd will read from /dev/zero and write a file. The oflag=dsync will cause the data to be written immediately to the disk (and not merely stored in the page cache).

$ dd if=/dev/zero of=500MBfile bs=1M count=500 oflag=dsync

Executed in parallel, vmstat will show the increased I/O write load (the bo value).

CPU Waiting for IO

In the following example, an updatedb process is already running. The updatedb utility is part of mlocate. It examines the entire file system and creates the database for the locate command (by means of which file searches can be performed very quickly). Because updatedb reads all of the file names from the entire file system, the CPU must wait to get data from the I/O system (the hard disk). For that reason, vmstat running in parallel will display large values for wa (waiting for I/O).