The ONE reason why commenting code goes wrong

Published in

CodistAI

5 min read · Apr 9, 2020

I commented my code…but I still don’t know what I did or why…

A few years back I was working as an IT project manager in a large corporation (no name shaming here 🤫). With the best intentions in the world, we wanted to implement software quality assurance over a 15-year-old codebase. This codebase supported the core business of one specific unit of that company and totaled about 10M lines of code.

So we conducted a study to assess the amount of work needed to comply with the Group software quality standards. Well… it was not pretty to look at; it was even scary. Implementing the full requirements would have taken 3 full-time employees over 5 years. The first big problem was test coverage: 3 years of work to reach 70% coverage. The second was code comments: 1 full year to comment the code properly and consistently.

How the hell could we have gotten our code documentation so wrong?

Why commenting?

When you write code, you write it for two main audiences: users and developers (including yourself). Both audiences are equally important. Commenting means describing your code for developers: the current team and the next maintainers of the code.

In general, and even more so when different developers work on the same code, well-written code plus good comments make for a program that is readable, easy to understand, and easy to maintain.

This is why, back then, my company decided to invest a good amount of money in re-commenting its entire codebase.

“It doesn’t matter how good your software is, because if the documentation is not good enough, people will not use it.” - Daniele Procida

And that’s true for a lot of companies. If onboarding onto your codebase is a nightmare, you can be sure that new developers will either quit fast or, at some point, start everything again from scratch. This is not a sustainable way to maintain or scale an IT system!

You can’t improve what you can’t measure!

It actually boils down to that well-known principle: you can’t correct and improve what you can’t measure.

Developers agree that comments are a main source of information for understanding source code during development and maintenance. Despite that, there is no consistent and obvious way to assess and measure the quality of comments in source code.

Today’s software quality analysis tools provide a binary measurement of comments in the code. They check whether a comment is present (yes/no) and then report the ratio of comment lines to the total number of lines of code.
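To make that concrete, here is a minimal sketch of such a ratio in Java. It is not taken from any particular tool: the class name `CommentRatio` and the way comment lines are detected are assumptions made for illustration only.

```java
import java.util.List;

/** Naive comment-density metric (hypothetical, for illustration only). */
class CommentRatio {

    /** Returns the share of non-blank lines that start with a comment marker. */
    static double ratio(List<String> lines) {
        int commentLines = 0;
        int totalLines = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // blank lines count for neither side of the ratio
            }
            totalLines++;
            // Simplification: only whole-line comments are detected, trailing ones are not.
            if (line.startsWith("//") || line.startsWith("/*") || line.startsWith("*")) {
                commentLines++;
            }
        }
        return totalLines == 0 ? 0.0 : (double) commentLines / totalLines;
    }

    public static void main(String[] args) {
        List<String> useful = List.of(
            "// Copy A into V before the in-place decomposition",
            "copy(A, V);");
        List<String> useless = List.of(
            "// asdf asdf asdf",
            "copy(A, V);");
        // Both snippets get the same score, although only one comment helps a reader.
        System.out.println(ratio(useful));
        System.out.println(ratio(useless));
    }
}
```

The two snippets in `main` score identically, because the metric only counts lines and never looks at what the comment says.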

Spoiler alert…this is ABSOLUTELY NOT enough.

And if you ever doubt it, here is a short story. I was talking with a lead dev who told me that his team was incentivized on how well they commented the code. Given the lack of time, the pressure to release, and no proper measurement of comment quality, that lead dev ended up asking his team to write random stuff in the comments to speed up the process. So the software quality analysis showed everything green, but you can be sure that six months down the line taking over that code would have been a nightmare.

So… here comes the next question. How can we actually measure the quality of comments in code?

A comment is not just a comment

Comments in code can take at least 7 different forms:

General instructions (e.g. readme.md) can sit at the beginning of a file or in a separate file.
Header comments are usually found at the beginning of each file. They give an overview of the functionality of the class and provide information such as the class author, the revision number, or the peer review status.
Section comments address several methods or fields that belong to the same functional aspect. A fictitious example looks like // ---- Getter and Setter Methods ---- and is followed by numerous getter and setter methods.
Function comments describe the functionality of a method or function.
Inline comments describe implementation decisions within a method body.
Code comments contain commented-out code, which is source code ignored by the compiler. Often code is temporarily commented out for debugging purposes or for potential later reuse.
Task comments are developer notes containing a remaining to-do, a note about a bug that needs to be fixed, or a remark about an implementation hack.
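To see how these forms look side by side, here is a small made-up Java file. The class `MarkerStore` and everything in it is hypothetical and exists only to carry one comment of each in-file category (general instructions such as a readme live outside the source file, so they are not shown).

```java
/*
 * Header comment: overview of the class.
 * MarkerStore keeps an in-memory list of map markers (hypothetical example class).
 */
import java.util.ArrayList;
import java.util.List;

class MarkerStore {

    private final List<String> markers = new ArrayList<>();

    // ---- Query methods ----   (section comment: groups the methods that follow)

    /** Function comment: returns the number of markers currently stored. */
    int count() {
        return markers.size();
    }

    // ---- Mutating methods ----

    /** Function comment: adds a marker unless an identical one already exists. */
    boolean add(String marker) {
        // Inline comment: a linear scan is fine here because the list stays small.
        if (markers.contains(marker)) {
            return false;
        }
        // markers.add(0, marker);   <- code comment: commented-out code kept "for later"
        // TODO: reject null or blank marker names   <- task comment
        markers.add(marker);
        return true;
    }
}
```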

Each category of comments contributes differently to the overall understanding of the code:

[Figure: How code comments impact your code understanding]

It’s important to distinguish between the different categories of comments. Today’s tools do not let us do that directly.

At Codist we are using Machine Learning on Code to automatically cluster comments by type. We will develop this in a separate article.
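To give a feel for the problem, here is a deliberately simple rule-based sketch in Java. It is not the Codist approach and involves no machine learning: the class `CommentClassifier`, its regular expressions, and the two boolean position hints are all assumptions made for illustration, and a real classifier would need far more context.

```java
import java.util.regex.Pattern;

/** Rule-based comment classifier (hypothetical heuristics, not an ML model). */
class CommentClassifier {

    enum Kind { HEADER, SECTION, FUNCTION, INLINE, CODE, TASK }

    // Typical task markers left by developers.
    private static final Pattern TASK = Pattern.compile("\\b(TODO|FIXME|HACK|XXX)\\b");
    // Commented-out code tends to end like a statement or a block.
    private static final Pattern CODE = Pattern.compile(".*[;{}]\\s*$", Pattern.DOTALL);
    // Section markers such as "---- Getters ----" or "==== Helpers ====".
    private static final Pattern SECTION = Pattern.compile("^[-=*#]{2,}.*[-=*#]{2,}$");

    /**
     * @param text         comment text without its delimiters
     * @param firstInFile  whether the comment precedes all code in the file
     * @param beforeMethod whether the comment directly precedes a method declaration
     */
    static Kind classify(String text, boolean firstInFile, boolean beforeMethod) {
        String t = text.trim();
        if (TASK.matcher(t).find()) return Kind.TASK;
        if (SECTION.matcher(t).matches()) return Kind.SECTION;
        if (CODE.matcher(t).matches()) return Kind.CODE;
        // The remaining categories are told apart by position only.
        if (firstInFile) return Kind.HEADER;
        if (beforeMethod) return Kind.FUNCTION;
        return Kind.INLINE;
    }
}
```

Rules like these break quickly (a function comment that happens to end with a semicolon would be labeled as code), which is one reason to look at learned approaches instead.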

Assessing the quality of your comments

Once you know what kinds of comments you have in your code, the next important step is to assess their quality. Four aspects of comment quality are important to point out and analyze:

Coherence between code and comments. Member comments should be related to the method name, as this is a strong indicator of an up-to-date comment and a meaningful method identifier. Further, developers expect member and inline comments to explain the non-obvious by providing information beyond the code that helps them understand implementation and design details. Member comments, in particular, should provide more information than just repeating the method name.

/**
 * Check for symmetry, then construct the eigenvalue decomposition.
 * @param A square matrix
 */
public void calc(double[][] A) {
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            V[i][j] = A[i][j];
        }
    }
    tred2(); // Tridiagonalize.
    tql2();  // Diagonalize.
}

Completeness of your comments across the codebase. Header comments should be present in each file to document the system design. Documenting every method and field with a member comment helps an API user select the right public methods and fields.

Usefulness of a comment. Comments should clarify the intent of the code. If you can remove the comment without making the code harder to understand, the comment is not useful. Clarifying, helpful comments make every activity that involves understanding and using the code easier.

/** remove all defined markers */ 
public void removeAllMarkers() {...}

Consistency of your comment formatting. For example, comments should all be written in the same language (e.g., English) to make the code easier to read. Each file should carry the same copyright notice in a consistent format, so that copyright holders and authors are easy to identify.
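The coherence and usefulness aspects above can be approximated with a simple word-overlap score between a member comment and its method name. The Java sketch below is one possible heuristic, not an established metric from a specific tool: the class `CommentCoherence`, its method names, and the interpretation of the score are assumptions, and any thresholds would need tuning on a real codebase.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Word overlap between a member comment and its method name (hypothetical heuristic). */
class CommentCoherence {

    /** Splits a camelCase name such as "removeAllMarkers" into lower-case words. */
    static Set<String> nameWords(String methodName) {
        String[] parts = methodName.split("(?<=[a-z0-9])(?=[A-Z])");
        Set<String> words = new HashSet<>();
        for (String part : parts) {
            words.add(part.toLowerCase(Locale.ROOT));
        }
        return words;
    }

    /** Share of comment words that also occur in the method name, from 0.0 to 1.0. */
    static double overlap(String comment, String methodName) {
        Set<String> name = nameWords(methodName);
        String[] words = comment.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
        int total = 0;
        int shared = 0;
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // produced by leading punctuation or whitespace
            }
            total++;
            if (name.contains(word)) {
                shared++;
            }
        }
        return total == 0 ? 0.0 : (double) shared / total;
    }
}
```

With this sketch, the comment "remove all defined markers" on `removeAllMarkers` scores 0.75: it mostly restates the name, so it adds little. The comment on `calc` scores 0.0: comment and identifier share no words, which hints at an outdated comment or, as here, a method name that says too little.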

Comments in code are the main source of system documentation and are key to understanding source code during development and maintenance. Maintaining a high standard of comment quality is important for the future scalability of the code. The question you need to ask now is: do I have the proper tool to measure that quality?

Written by Maeliza S.