The impact of blocksize on Persistent Disk performance

Colt McAnlis
3 min read · Jul 12, 2018

NimbusDisk offers a filesystem product that runs in the cloud. They spend a lot of time indexing, reading, and streaming file data from their VM disks.

After looking through most of their disk setup, it seemed they had the right disk sizes, the right disk types, and the right CPU configurations for the performance they wanted, but they still weren’t hitting the throughput levels they were looking for.

And after digging around in some of our documentation, I stumbled upon this small gem:

“The default block size on volumes is 4K. For throughput-oriented workloads, values of 256KB or above are recommended.”

Unbeknownst to me, this single sentence had serious performance ramifications for NimbusDisk.

Why block size matters

Without diving deeper than necessary, a block is simply the unit of data transferred during an I/O operation. It could be 1 byte or 1 megabyte, but every I/O operation fetches these units in their entirety. This means that if your code only needs a subsection of a block, the disk/OS will typically fetch the whole block anyway, just in case future reads land in the contiguous area.

With this in mind, it’s worth observing the following formula for disk performance:

Throughput (MB/s) = IOPS * Block size

In practice, IOPS and block size tend to have an inverse relationship. As block size increases, it takes longer to read a single block, so the number of IOPS decreases. Conversely, smaller block sizes yield higher IOPS.
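To make that concrete, here’s a rough back-of-the-envelope example (the IOPS figures here are purely illustrative, not measurements):

    4 KB blocks x 7,500 IOPS ≈  30 MB/s
    1 MB blocks x   300 IOPS ≈ 300 MB/s

Even though the larger blocks complete far fewer operations per second, each operation moves so much more data that total throughput still goes up.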

Persistent disks and block sizes

To figure out how GCP’s Persistent Disk behaves as the block size changes, I set up a simple 32-core machine with a 1TB Standard Persistent Disk attached to it, ran a sequence of FIO tests at various block sizes, and charted the IOPS and I/O throughput:
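If you want to reproduce a sweep like this, each data point can be gathered with an fio run along these lines (the device path and job parameters below are illustrative, not the exact ones I used; change --bs for each block size you want to test):

    fio --name=bs-test --filename=/dev/sdb \
        --ioengine=libaio --direct=1 --rw=read \
        --bs=256k --iodepth=64 --numjobs=1 \
        --runtime=60 --time_based --group_reporting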

What we see is that as the block size grows, our IOPS go down and our throughput goes up.

But, just to make sure we’re kicking all the tires, let’s take a look at how this plays out with an SSD PD:

We see a similar reduction in IOPS as the block size increases; however, throughput hits a hard cap at around a 16k block size, which I’m pretty sure is more a function of vCPU and disk limits than of block size itself. By increasing our PD-SSD size, or moving to 64 cores, we should be able to remove that limit.

The fix is in

By default, GCP’s Persistent Disks use a block size of 4k, which is perfect for IOPS-heavy workloads like databases (SQL, NoSQL, MongoDB, etc.).

NimbusDisk, however, cared mostly about throughput, which meant that moving to a 1MB block size on their Standard PD would increase their throughput by about 30x.

Not bad for a simple command line flag…
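Exactly which flag depends on the tool issuing the I/O; with dd, for instance, it’s just the bs= argument (the device path and count here are illustrative):

    dd if=/dev/sdb of=/dev/null bs=1M count=1024 iflag=direct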
