Sergey Vasiliev
Apr 14, 2015 · 9 min read

A Few Words about Static Code Analysis

In this article, I’d like to briefly introduce readers to static analysis tools, using three products as examples: Cppcheck, PVS-Studio, and CppCat. It’s not a detailed comparison of these tools but rather a brief overview of their features and capabilities. Along the way, several sample bugs detected by the analyzers are cited, each with a brief commentary.

About static analysis

Software development involves a number of activities, such as coding, debugging, and bug hunting. The last two take up a considerable share of development time, and you may call yourself lucky if bugs are easily detected. Things get much worse with constructs that are syntactically correct yet logically wrong. Worst of all are errors that don’t show up on every run: these are especially costly to find and fix. And the later a bug is found, the more expensive it is to fix, so a bug that survives from an earlier product version into later ones takes even more effort.

Static analyzers are designed to fight exactly these troubles. Generally speaking, static analysis is a type of software analysis carried out, unlike dynamic analysis, without actually running the program. Static code analyzers are tools that aid programmers in this process by detecting errors and suspicious code fragments. This saves programmers plenty of time and effort and, consequently, reduces the overall cost of software development.

In this article, we are going to talk about a few representatives of this class of tools: PVS-Studio, CppCat, and Cppcheck. It is a brief overview and demonstration of their capabilities rather than a thorough analysis of their pros and cons.

Static analyzers

Cppcheck

It’s a very simple and easy-to-use tool which is a nice choice for getting started with static analyzers in general. Its major advantage is that it is open-source and free.

Cppcheck is not as powerful as its commercial counterparts. That is no surprise, but keep it in mind. If the development team really wants to get the most out of static analysis, they will certainly have to supplement Cppcheck with other tools.

But getting back to Cppcheck: it is a perfect choice for a first encounter with code analysis tools. The wonderful thing about it is that you can run it for the first time without setting up or adjusting anything. Just specify the folder with the project you want checked, and the analyzer starts working. Without customization, Cppcheck’s output won’t be as informative and useful as it could be, of course, but it’s really great that you can just install it and tell it “Check this folder with the source code!”, and it checks it and finds bugs there. This can make a strong impression on a person trying a static code analyzer for the first time. Once you have seen real bugs found in your own code, you will be willing to read manuals and customize static analysis tools, because you have clear evidence of how useful they can be. Tools that need to be configured before the first launch lose to Cppcheck in how easily they attract new users.

It won’t be an exaggeration to say that Cppcheck is a great popularizer of the static analysis methodology. Many thanks to those who help develop this project.

Below are a few bugs Cppcheck can detect.

static UStringPrepType
getValues(uint32_t result, int32_t* value, UBool* isIndex){
  ....
  }else{
    *isIndex = FALSE;
    *value = (int16_t)result;
    *value = (*value >> 2);
  }
  if((result>>2) == _SPREP_MAX_INDEX_VALUE){
    type = USPREP_DELETE;
    isIndex =FALSE;
    value = 0;
  }
  ....
}

The analyzer has suspected something wrong in the following lines:

isIndex =FALSE;
value = 0;

Indeed, why would one change the values of pointers passed as arguments? Pointers are passed by value, so these assignments only modify the function’s local copies and the caller never sees them. The mismatch is especially prominent for the ‘isIndex’ variable: it is a pointer to the ‘UBool’ type, yet it is assigned the FALSE value. The code compiles only by sheer luck, because FALSE is actually nothing but 0, which is a valid null pointer constant.

The code was surely meant to look like this:

*isIndex =FALSE;
*value = 0;

To inform the programmer about suspicious fragments like that, the analyzer displays the message “Assignment of function parameter has no effect outside the function”.
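The difference between assigning to a pointer parameter and writing through it can be shown with a minimal sketch. The function names below are hypothetical, and plain `bool` and `std::int32_t` stand in for the types of the original snippet.

```cpp
#include <cassert>
#include <cstdint>

// Buggy variant (hypothetical name): assigns to the local copies of the
// pointers, so the caller's variables are left untouched.
void resetBuggy(std::int32_t* value, bool* isIndex) {
    isIndex = nullptr;  // mirrors "isIndex = FALSE;" from the snippet above
    value = nullptr;    // mirrors "value = 0;"
    (void)isIndex;      // silence "set but not used" warnings
    (void)value;
}

// Fixed variant (hypothetical name): writes through the pointers,
// so the caller's variables really change.
void resetFixed(std::int32_t* value, bool* isIndex) {
    assert(value != nullptr && isIndex != nullptr);
    *isIndex = false;
    *value = 0;
}
```

After a call to resetBuggy() the caller’s variables keep their old values; only resetFixed() actually resets them.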

And now an example with mistakenly swapped arguments:

void CMeasureSection::SortSections( void )
{
  ....
  memset(sortarray,sizeof(CMeasureSection*)*128,0);
  ....
}

Cppcheck’s diagnostic message: memset() called to fill 0 bytes of ‘sortarray’. measure_section.cpp 180

The second argument of the memset() function is the value to fill the memory area with. The number of bytes to fill is given by the third argument. These two parameters are easy to mix up, and that’s exactly what happened here. From the viewpoint of the C/C++ language the call is valid, so the compiler is under no obligation to complain. But because of this mistake, zero bytes get filled and the array is never actually cleared.
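A small sketch makes the effect of the swapped arguments visible. The function names and the use of an `int*` array are assumptions made for illustration; only the element count of 128 is taken from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kSlots = 128;  // element count from the snippet above

// Swapped arguments (hypothetical name): the byte count is passed as the
// fill value and the length is 0, so no byte of the array is written.
void clearSwapped(int* (&slots)[kSlots]) {
    assert(slots != nullptr);
    std::memset(slots, sizeof(int*) * kSlots, 0);
}

// Correct order (hypothetical name): destination, fill value, byte count.
void clearCorrect(int* (&slots)[kSlots]) {
    assert(slots != nullptr);
    std::memset(slots, 0, sizeof(slots));
}
```

Filling pointers with zero bytes yields null pointers on mainstream platforms, which is what the original code relies on as well.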

And the last example. Static analyzers are generally not very good at detecting errors such as array overruns or resource leaks: this is rather the domain of dynamic code analyzers. Nevertheless, static analyzers can sometimes be of help here, too. Here’s an example of Cppcheck detecting that a file might be left unclosed:

int SmdExportClass::DoExport(....)
{
  ....
  FILE *pFile;
  if ((pFile = fopen(szFile, "w")) == NULL)
    return FALSE;
  ....
  if (!CollectNodes(pexpiface))
    return 0;
  ....
}

Cppcheck’s diagnostic message: Resource leak: pFile smdlexp.cpp 164

The file is opened and then closed somewhere further in the code. But notice the following fragment:

if (!CollectNodes(pexpiface))
  return 0;

If CollectNodes() fails, the function returns while the file is still open. The correct version of this code is as follows:

if (!CollectNodes(pexpiface))
{
  fclose(pFile);
  return 0;
}
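In C++ code, this class of leak can also be avoided by tying the file handle to an object that closes it automatically. The sketch below is one possible approach, not the fix used in the project: FileCloser, FilePtr, collectNodes() and doExport() are hypothetical names, and collectNodes() is only a stand-in for the real CollectNodes(), which the snippet does not show.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Deleter that closes the stream when the owning pointer goes out of scope.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept { std::fclose(f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical stand-in for CollectNodes(): succeeds or fails on request.
bool collectNodes(bool succeed) { return succeed; }

// Returns 1 on success and 0 on failure. The file is closed on every path,
// including the early return, without an explicit fclose() call.
int doExport(const char* path, bool nodesOk) {
    assert(path != nullptr);
    FilePtr file(std::fopen(path, "w"));
    if (!file)
        return 0;
    if (!collectNodes(nodesOk))
        return 0;  // FileCloser runs here and closes the file
    std::fputs("exported\n", file.get());
    return 1;
}
```

With this layout, adding another early return later cannot reintroduce the leak, because the cleanup no longer depends on each exit path remembering to call fclose().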

Conclusion. If someone asks you which static analyzer to pick for a start, tell them about Cppcheck. It’s simple and free, which are crucial qualities when you’re only learning something new.

CppCat

CppCat is in fact a lightweight version of PVS-Studio, with the features a Visual Studio programmer will hardly ever need removed. In addition, the authors have already decided for the user which diagnostic messages they need to see and which they don’t. This lets you start using the analyzer right away, without pondering over settings. Unfortunately, static analyzers tend to have over-complicated settings, which may scare off programmers who haven’t dealt with tools of this type yet. What’s even worse, the abundance of settings may lead the user to tick a wrong checkbox by mistake and get awful output. Unable to figure out whom or what to blame, they are very likely to be put off static analyzers for a long time.

Offering a simple and friendly interface and settings, CppCat is a perfect candidate for getting started with the static analysis methodology. I see no reason to describe CppCat’s diagnostic capabilities, for you can learn about them from the PVS-Studio description. CppCat carries all the best diagnostics implemented in PVS-Studio. For details on the difference between the two tools, see the article “Comparing Functionalities of PVS-Studio and CppCat Static Code Analyzers”.

PVS-Studio

This is a static code analyzer capable of detecting a wide range of bugs. It can be used both inside the Visual Studio IDE and standalone. It supports C, C++, C++11, C++/CLI, and C++/CX. The tool can be integrated into the development process in a number of ways: for example, it offers automatic analysis of recently modified files, MSBuild support, and a compiler monitoring mechanism that tracks compiler launches and collects the information necessary for project analysis.

The main emphasis in the development of this product is on detecting bugs the user doesn’t suspect. These are basically typos and cases where the compiler behaves in an unexpected way. Let me clarify this point.

As for typos, programmers know they exist. In practice, though, it is very hard to keep them in mind and guard against them. The programmer is preoccupied with making sure that pointers are checked for NULL before use or that all class members are initialized; the fact that one can easily make a typo while coding is usually overlooked. Unfortunately, eliminating exactly this type of bug takes an awful amount of time. So the analyzer can be of great help to the programmer by taking on the bulk of the bug detection job.

Here is an example of a typo detected by PVS-Studio. A closing parenthesis is in the wrong place:

template<typename Scalar> EIGEN_DEVICE_FUNC
inline bool isApprox(const Scalar& x, const Scalar& y,
  typename NumTraits<Scalar>::Real precision =
    NumTraits<Scalar>::dummy_precision())
template< .... >
void evalSolverSugarFunction(....)
{
  ....
  const Scalar psPrec = sqrt( test_precision<Scalar>() );
  ....
  if (internal::isApprox(
        calc_realRoots[i], real_roots[j]), psPrec)
  {
    found = true;
  }
  ....
}

The isApprox() function has two mandatory arguments and an optional one. And this is the source of the error.

The comma before the third argument is an operator in this case, while the ‘psPrec’ variable becomes its right operand.

A small note for those who are not familiar with the comma operator. It has the lowest precedence of all C++ operators and takes two operands. The left operand is evaluated first and its result is discarded; then the right operand is evaluated, and its value becomes the result of the whole expression.

So it turns out that isApprox() is called with only two arguments, and its return value is not used in any way. The comma operator yields the value of the ‘psPrec’ variable, which leads to the following execution logic:

internal::isApprox(calc_realRoots[i], real_roots[j]);
if (psPrec)

The correct version of the code:

if (internal::isApprox(
calc_realRoots[i], real_roots[j], psPrec))

PVS-Studio’s diagnostic message: V639 Consider inspecting the expression for ‘isApprox’ function call. It is possible that one of the closing ‘)’ brackets was positioned incorrectly. polynomialsolver.cpp 123
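The effect of the misplaced parenthesis can be reproduced in a few lines. This is a simplified sketch: the isApprox() below is a hypothetical stand-in that works on `double` with an arbitrary default precision, not the Eigen function, and matchesBuggy() and matchesFixed() are illustrative names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical simplified stand-in for isApprox(): two mandatory arguments
// and an optional precision with an assumed default value.
bool isApprox(double x, double y, double precision = 1e-12) {
    return std::fabs(x - y) <= precision;
}

// Misplaced parenthesis: isApprox() is called with two arguments, its result
// is discarded by the comma operator, and the condition is just "precision".
bool matchesBuggy(double a, double b, double precision) {
    if (isApprox(a, b), precision) {
        return true;
    }
    return false;
}

// Intended version: precision is passed as the third argument.
bool matchesFixed(double a, double b, double precision) {
    assert(precision >= 0.0);
    return isApprox(a, b, precision);
}
```

For any non-zero precision, matchesBuggy() reports a match no matter how far apart the two values are, which is how such a typo can silently pass tests that only check the positive case.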

But still, why do I claim that programmers do not suspect typos? Because typos are far more common in projects than you might think. See, for example, the article “The last line effect”. Note that it is based on code that had already been debugged and was generally working well. So what should we expect of freshly written code?

Another type of hidden error that causes a program to behave unexpectedly is a construct leading to undefined behavior. But we won’t talk about those now; there are more interesting issues to discuss.

The analyzer relies on an extensive knowledge base, which enables it to warn programmers about dangerous situations they may not have the slightest notion of. That is a really valuable aid. Few programmers are aware that in the following code the memset() call is at risk of being thrown away:

char* crypt_md5(const char* pw, const char* salt)
{
  MD5_CTX ctx,ctx1;
  unsigned long l;
  int sl, pl;
  u_int i;
  u_char final[MD5_SIZE];
  static const char *sp, *ep;
  ....
  /* Don't leave anything around in vm they could use. */
  memset(final,0,sizeof final);
  return passwd;
}

Before leaving the function, the buffer with private data needs to be cleared, and the programmer uses the memset() function to do that. That’s a mistake. The compiler can see that the ‘final’ buffer is no longer used after the memset() call, so it is entitled to remove the call as a dead store. Moreover, that is exactly what it is likely to do when building an optimized release version. To learn more about it, see the article “Overwriting memory — why?” and the description of diagnostic V597.

To prevent this from happening, you should use special functions the compiler is not permitted to remove.

PVS-Studio’s diagnostic message: V597 The compiler could delete the ‘memset’ function call, which is used to flush ‘final’ buffer. The RtlSecureZeroMemory() function should be used to erase the private data. md5.c 342
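Where RtlSecureZeroMemory() is not available, a commonly used fallback is to clear the buffer through a pointer to volatile, so that every store counts as an access the compiler is expected to keep. The sketch below shows that idea; secureZero() is a hypothetical helper name. It relies on how compilers treat volatile accesses in practice, so a platform-provided function (such as RtlSecureZeroMemory() on Windows, explicit_bzero() on glibc and the BSDs, or the optional C11 memset_s()) is preferable when one exists.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: zeroes `size` bytes starting at `data`.
// Writing through a volatile-qualified pointer discourages the optimizer
// from treating the stores as dead and removing them.
void secureZero(void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    volatile unsigned char* p = static_cast<volatile unsigned char*>(data);
    while (size--) {
        *p++ = 0;
    }
}
```

In the function above, the final cleanup would then read secureZero(final, sizeof final); instead of the memset() call.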

I hope you have found it interesting to learn about the PVS-Studio analyzer. To find out more, see the article “PVS-Studio for Visual C++”. For examples of bugs detected by the tool, see the page “Errors detected in Open Source projects by the PVS-Studio developers through static analysis”.

Conclusion

In conclusion, I’d like to remind you once again that this article is introductory and aims to familiarize readers with static analysis tools for C/C++ projects. I think it’s hard to overestimate the value of static analyzers, especially in large projects where fixing a single hard-to-find bug may take a great deal of time. Analyzers thus both save programmers’ time and effort and reduce the development cost. This brief overview of three representatives of the static analyzers’ world (PVS-Studio, CppCat, and Cppcheck) is intended to help you choose a product that fits your needs.

And please remember that the most important thing about static analysis is to use it regularly, not from time to time. That is where its real usefulness comes from.


Written by Sergey Vasiliev. Software developer. LinkedIn: https://www.linkedin.com/in/fotoshooter/
