Metrics Shape Behaviors

Danilo Ruziska
Agile Content Teamwork
5 min read · Jul 3, 2018

The first time I heard the phrase in this article’s title was at an Extreme Programming course for software developers in São Paulo, about a year ago. One of the personalized gifts was a sticker with this phrase on it, and the course instructors repeated it at least five times.

I confess that, at the time, I didn’t give it due importance. I was more interested in learning unit testing practices, continuous integration, and so on.

Now, a year later, something curious happened.

I developed a module for my current project, a payment integration library that communicates with one of the available payment gateways. Since it was a new project, I was excited to apply all the best practices: lots of unit tests to cover 100% of the code, and so on.

Soon after I finished the project, I ran a tool available in Visual Studio (the software we use to develop) that analyzes the code and outputs some quality metrics. One in particular caught my attention:

Maintainability Index as reported by Visual Studio

Maintainability Index. My first thought was: “Nice, 84 seems like a good value; at least it’s green.” After a while, I started wondering what this value really meant, and what thresholds separate good code from bad code in terms of maintainability. I became obsessed with increasing this number, changing my code to optimize it as much as I could: almost an entire workday spent on it, with lots of new commits, for every class in the project.

Microsoft says: “Maintainability Index calculates an index value between 0 and 100 that represents the relative ease of maintaining the code. A high value means better maintainability. Color-coded ratings can be used to quickly identify trouble spots in your code. A green rating is between 20 and 100 and indicates that the code has good maintainability. A yellow rating is between 10 and 19 and indicates that the code is moderately maintainable. A red rating is a rating between 0 and 9 and indicates low maintainability.”[1]
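Just to make those bands concrete, the color coding amounts to no more than the thresholds in that quote. Here is a minimal sketch of them in C# (a hypothetical helper for illustration, not any real Visual Studio API):

```csharp
// Microsoft's documented color bands for the Maintainability Index.
// Hypothetical helper for illustration -- not a Visual Studio API.
enum Rating { Red, Yellow, Green }

static class MaintainabilityBands
{
    public static Rating Classify(int index)
    {
        if (index >= 20) return Rating.Green;  // 20-100: good maintainability
        if (index >= 10) return Rating.Yellow; // 10-19: moderately maintainable
        return Rating.Red;                     // 0-9: low maintainability
    }
}
```

By these bands my 84 was comfortably green, but so would a 21 be, and that is exactly what started to bother me.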

These thresholds seemed very far apart to me, especially the boundary between green and yellow. So I started to research the metric further.

Then I discovered that the Maintainability Index was first introduced in 1992, at a software conference. Long story short, it is a mathematical formula that describes how maintainable a piece of code is.

Maintainability Index Formula

I won’t dig into its details here, but it relies on four variables:

  • Halstead Volume: essentially a size measure based on the operators and operands your code uses
  • Cyclomatic complexity: measures how many paths (decisions) your code has, including if and loop clauses
  • Lines of code: how big your module is (in number of code lines)
  • Percentage of comment lines (yes, that’s true). This variable, in particular, was added later by the SEI (Software Engineering Institute)[2]
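For the curious, the four-variable form usually quoted (the 1992 formula plus the SEI comment term), as best I can reconstruct it from [2] and [3], is:

```latex
MI = 171 - 5.2\,\ln(\mathrm{aveV}) - 0.23\,\mathrm{aveG} - 16.2\,\ln(\mathrm{aveLOC}) + 50\,\sin\!\left(\sqrt{2.4 \cdot \mathrm{perCM}}\right)
```

where aveV, aveG, aveLOC, and perCM are the per-module averages of the four variables above. Halstead Volume itself is V = N·log₂(n), with N the total and n the distinct number of operators and operands.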

Note: Microsoft also changed this formula a little bit to fit it into a 0–100 scale.
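From what I could find around [1], Microsoft’s variant drops the comment term and linearly rescales the rest, clamping at zero:

```latex
MI = \max\!\left(0,\; \frac{171 - 5.2\,\ln(V) - 0.23\,G - 16.2\,\ln(\mathrm{LOC})}{171} \times 100\right)
```

If that rescaling is right, the original paper’s 65 and 85 cut-offs would land around 38 and 50 on this scale, nowhere near Microsoft’s 10 and 20, which is part of why the bands felt so far apart to me.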

I started to feel a little confused, because there is no consensus about the threshold values. The original paper mentions 65 and 85.[3] I understand that Microsoft fits the index into a different scale, but the distance between their good and moderate thresholds is very large.

Then I found the article “Think Twice Before Using the ‘Maintainability Index’”[4], and it really caught my attention. Essentially, the author questions how effective this metric is. Furthermore, he mentions a study[5] from the University of Oslo: the researchers hired four different companies to develop a system from the same requirements, then had them perform some maintenance tasks on it, measuring the time taken and the results against software quality criteria. A table in the study summarizes the outcome.

One of their conclusions: there is no direct relationship between the Maintainability Index and how efficiently the maintenance tasks were carried out. There is, however, a direct relationship between lines of code and task efficiency.

My conclusions

At first, I was incredulous: I had applied all the best practices to my code, including SOLID principles, unit testing, and inheritance and polymorphism instead of decision clauses (if, while, for). Should I now disregard all of this because it is, essentially, a waste of time and effort? Then I thought more about it and settled on a few points (there is a small sketch of the polymorphism idea after the list):

  • Metrics really do shape behaviors. My eagerness to reach the highest possible index was obscuring what is really important: keeping the code maintainable, but simple. Of course, that doesn’t mean I should write monolithic classes and forget best practices; it means applying them as I need them, when I need them, keeping them in a developer’s toolbox.
  • The Maintainability Index (among other metrics) is important, but it is more a guideline than a value you must push as high as possible at all costs. In my opinion, it is more beneficial to run it once, check the results, and run it again some time later, after a lot of modifications to your code. What matters is comparing your code against itself, regardless of the value first obtained; the absolute number is not that important. But I tend to agree that when there is a red flag, it is always worth considering a refactoring.
  • The study from the University of Oslo clarifies a lot of points, but some are questionable: the Maintainability Index values of the four projects are very close to one another. What if the results were compared across well-written and badly written code, further apart in quality? Also, to me, four projects are not enough to verify some of the results. For a comparison closer to market reality, the study could also consider older projects, written and rewritten by dozens of different people; that really makes a difference when you are about to do any maintenance task.
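Here is the sketch I promised about “polymorphism over decision clauses”, in C# since that is what we use. The gateway names and fee rates are invented for illustration; this is not code from my actual project:

```csharp
using System;

// Before: a decision clause that must be edited for every new gateway.
static class Fees
{
    public static decimal For(string gateway, decimal amount)
    {
        if (gateway == "AcmePay") return amount * 0.029m;
        if (gateway == "FastCharge") return amount * 0.034m;
        throw new ArgumentException("Unknown gateway: " + gateway);
    }
}

// After: each gateway carries its own rule, so adding one means adding
// a class instead of growing a conditional.
interface IPaymentGateway
{
    decimal Fee(decimal amount);
}

class AcmePayGateway : IPaymentGateway
{
    public decimal Fee(decimal amount) => amount * 0.029m;
}

class FastChargeGateway : IPaymentGateway
{
    public decimal Fee(decimal amount) => amount * 0.034m;
}
```

The second version scores better on cyclomatic complexity, but it also means more types and more indirection to maintain, which is exactly the kind of trade-off a single index can’t capture.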

References (accessed on 06/28/2018):

[1] — Code Metric Values — https://msdn.microsoft.com/en-us/library/bb385914.aspx

[2] — C4 Software Technology Reference Guide — https://resources.sei.cmu.edu/asset_files/Handbook/1997_002_001_16523.pdf

[3] — Using Metrics to Evaluate Software System Maintainability — http://www.ecs.csun.edu/~rlingard/comp589/ColemanPaper.pdf

[4] — Think Twice Before Using the “Maintainability Index” — https://avandeursen.com/2014/08/29/think-twice-before-using-the-maintainability-index/

[5] — Questioning Software Maintenance Metrics: A Comparative Case Study — https://www.mn.uio.no/ifi/personer/vit/dagsj/sjoberg.anda.mockus.esem.2012.pdf
