GlusterFS quota accounting mismatch

Hari Gowtham
3 min read · Jun 9, 2020

It is recommended to read the previous part to understand how accounting works in quota. This post also assumes an advanced understanding of quota. The link is provided below.

Why an accounting mismatch can happen:

  1. The value sent by the bricks for aggregation might be wrong (the value that marker sends for quotad to aggregate is incorrect).

This can happen for a number of reasons:

  • The files haven’t updated their parent directories with their size (this update should have propagated throughout the filesystem).
  • The updates did happen, but with wrong values.
  • A race occurred between an update and the deletion or movement of a file.

  2. The aggregation itself might be wrong (the way quotad aggregates is wrong), or the size given by the backend filesystem for the file is wrong. Quota is stable in these scenarios, so this is the less likely cause.

So the focus has to be on the first case.

Files not having updated their contribution to the size of the directory is a common case when quota is enabled on a huge volume. Quota sets these size values through a crawl, which is done when quota is enabled.

Once quota is enabled, the crawl runs for a while. The time depends on the depth, breadth, and number of files, and on the resources the machines have; it can take hours on a very large setup. So it is necessary to wait for the crawl to finish before checking the accounting. If the crawl didn’t finish properly, there is a chance that the accounting goes wrong. This can be verified by making sure all the directories have the quota-related xattrs set. This is a tedious task, so we follow the steps mentioned in the next blog to fix it.
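To give an idea of what that check looks like, here is a minimal Python sketch that walks a brick's backend directory and lists the directories with no quota size xattr. It is an illustration under assumptions, not a GlusterFS tool: the function name `dirs_missing_quota_xattr` is hypothetical, and the xattr prefix `trusted.glusterfs.quota.size` should be verified against your GlusterFS version. Reading `trusted.*` xattrs normally requires root, and the walk has to run on the brick path, not on the client mount.

```python
import os

# Assumed xattr name prefix for the per-directory quota size;
# verify it against the GlusterFS version you are running.
QUOTA_SIZE_PREFIX = "trusted.glusterfs.quota.size"


def dirs_missing_quota_xattr(brick_root, list_xattrs=None, skip=(".glusterfs",)):
    """Return the directories under brick_root that carry no quota size xattr.

    list_xattrs is injectable so the logic can be exercised without root;
    by default it uses os.listxattr (Linux only).
    """
    if list_xattrs is None:
        list_xattrs = os.listxattr
    missing = []
    for dirpath, dirnames, _filenames in os.walk(brick_root):
        # Prune GlusterFS internal directories so we do not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        names = list_xattrs(dirpath)
        if not any(name.startswith(QUOTA_SIZE_PREFIX) for name in names):
            missing.append(dirpath)
    return sorted(missing)
```

An empty result on every brick suggests the crawl reached all directories. It does not prove that the stored values are correct.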

Wrong values being updated is another reason for an accounting mismatch.

This can happen when a combination of operations that affect the size, such as move, rename, rm, and create, are done in a certain order and race against each other. In such cases, the value updated could be wrong.

Let’s see what happens in the background to understand this.

Let’s assume you have a filesystem which is 5 levels deep, with 5 files present in each level, and there are writes happening to a file in the third level and a file in the fifth level. Every time we write something on Gluster, on the way back, once the write to disk has finished, we initiate a transaction from that file on the back-end server. This transaction updates the size on the parent, and so on up to the root. There is a chance that the writes from both files get merged and sent as a single update. Think of the same for multiple writes, deletes, and so on. In the same way, a few transactions can overlap and result in a wrong value. This is one way for a mismatch to happen. These issues are related to the marker translator in GlusterFS, which quota uses.
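The overlap can be pictured with a toy model in Python. This is only a sketch of the general lost-update pattern, assuming each transaction reads a directory's size and then writes back the old size plus its delta. It is not how the marker translator is implemented, and the names `Dir`, `build_chain`, `propagate_serial`, and `propagate_interleaved` are hypothetical.

```python
class Dir:
    """A directory holding an aggregated size and a link to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.size = 0


def build_chain(depth):
    """Build a chain of nested directories; index 0 is the top level."""
    levels = []
    parent = None
    for i in range(depth):
        parent = Dir("level%d" % (i + 1), parent)
        levels.append(parent)
    return levels


def ancestors(d):
    """Yield d and every directory above it, up to the top level."""
    while d is not None:
        yield d
        d = d.parent


def propagate_serial(start, delta):
    """One transaction at a time: every ancestor sees the delta."""
    for d in ancestors(start):
        d.size += delta


def propagate_interleaved(start_a, delta_a, start_b, delta_b):
    """Both transactions read before either writes, so on shared
    ancestors the second write overwrites the first (lost update)."""
    read_a = {d: d.size for d in ancestors(start_a)}
    read_b = {d: d.size for d in ancestors(start_b)}
    for d, old in read_a.items():
        d.size = old + delta_a
    for d, old in read_b.items():
        d.size = old + delta_b


serial = build_chain(5)
propagate_serial(serial[2], 100)   # write of 100 bytes in level 3
propagate_serial(serial[4], 40)    # write of 40 bytes in level 5
print(serial[0].size)              # 140

raced = build_chain(5)
propagate_interleaved(raced[2], 100, raced[4], 40)
print(raced[0].size)               # 40: the 100-byte update was lost
```

In the interleaved run the top-level directory ends up at 40 instead of 140, and nothing in the final state says which update was lost.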

It’s hard to find which operation resulted in such a mismatch, because we don’t have enough information: what the first update that the brick received was, what the next update that changed it was, and exactly when and in which part of the filesystem this happened.

These are the reasons why a quota mismatch happens in a Gluster volume. Unlike why a mismatch happens, it’s easy to understand how to fix it. I’ll explain how to fix it in the next blog post.


Hari Gowtham

A software engineer at Red Hat working on GlusterFS, a distributed filesystem. Here to write mostly technical blogs.