PVS-Studio Team: Switching to Clang Improved PVS-Studio C++ Analyzer’s Performance
From the earliest days, we used MSVC to compile the PVS-Studio C++ analyzer for Windows. Back then, in 2006, it was known as Viva64, version 1.00. With new releases, the analyzer’s C++ core learned to work on Linux and macOS, and we modified the project’s structure to support CMake. However, we kept using the MSVC compiler to build the analyzer’s Windows version. Then, on April 29, 2019, the Visual Studio developers announced that they had included the LLVM utilities and the Clang compiler in the IDE. Only recently did we get around to trying it.
We chose SelfTester, our utility for the analyzer’s regression testing, as a benchmark. The utility analyzes a set of various projects and compares the analysis results with reference values. For example, if the analyzer’s core starts reporting new false positives or stops reporting valid warnings, the latest changes to the core caused a regression that needs to be fixed. To learn more about SelfTester, see the following article: “The Best is the Enemy of the Good”.
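The comparison against reference values can be pictured as two set differences. The sketch below is not SelfTester’s code: the RegressionDiff type, the diff_warnings function, and the idea of storing each warning as a single string are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical result of comparing one analysis run against the stored reference.
struct RegressionDiff {
    std::vector<std::string> added;    // reported now, absent from the reference
    std::vector<std::string> missing;  // present in the reference, no longer reported

    bool clean() const { return added.empty() && missing.empty(); }
};

// std::set keeps its elements sorted, which std::set_difference requires.
inline RegressionDiff diff_warnings(const std::set<std::string>& reference,
                                    const std::set<std::string>& current) {
    RegressionDiff diff;
    // Warnings that appeared: candidates for new false positives.
    std::set_difference(current.begin(), current.end(),
                        reference.begin(), reference.end(),
                        std::back_inserter(diff.added));
    // Warnings that disappeared: candidates for lost valid warnings.
    std::set_difference(reference.begin(), reference.end(),
                        current.begin(), current.end(),
                        std::back_inserter(diff.missing));
    return diff;
}
```

A run counts as regression-free in this sketch only when both lists are empty. Any entry in either list would need a human to decide whether the change is intended.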
The projects in the test base vary quite a bit in code volume. Normally, when the computer or test server running the tests is not overloaded, SelfTester takes the same time, within the margin of error, to test the same version of the core. If the analyzer’s performance degrades, the overall testing time increases significantly.
After we switched the C++ analyzer to the Clang compiler, SelfTester runs C++ core tests 11 minutes faster.
This means a 13% performance gain. That’s quite significant, considering the only change was the compiler, don’t you think?
Of course, there are disadvantages — but those are minor. The distribution’s build slowed down by 8 minutes, and the executable file’s size increased by 1.6 Mbytes — of those, 500 Kbytes come from static runtime linking.
Apparently, the better performance comes from a longer LTO stage, which takes up most of the build time, and from more aggressive loop unrolling and function inlining.
Now I’d like to talk more about issues we faced during the transition.
Generate a Build for Clang
Our CMake scripts allow us to build the code with all mainstream compilers for the operating systems we need.
First, we used Visual Studio Installer to install the Clang compiler’s components.
Clang-cl is a so-called “driver” that allows you to use clang with parameters from cl.exe. We expected clang-cl to interact with MSBuild transparently, almost like a native compiler.
Alternatively, we could have used official builds from the LLVM project. You can find them in their GitHub repository. However, they require an additional plugin so that Visual Studio can find the compilers. We chose the first route, so the toolset in the examples below is clangcl. If we had used LLVM, the toolset name would have been llvm instead.
We specified the toolset in the solution generation command for Visual Studio:
cmake -G "Visual Studio 16 2019" -Tclangcl <src>
Alternatively, we could have used the GUI:
Then we opened the resulting project, built it — and got all these errors.
Fix the Build
Although clang-cl looks and behaves like CL, under the hood it is a completely different compiler, with its own quirks.
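One way to see the difference from inside the code is through predefined macros: Clang in MSVC-compatible mode defines both __clang__ and _MSC_VER, so a plain _MSC_VER check cannot tell the two compilers apart. The following is a minimal sketch. The compiler_label function and its labels are hypothetical and are not part of PVS-Studio.

```cpp
#include <string>

// Returns a short label for the compiler that built this translation unit.
// The order of the checks matters: Clang must be tested before _MSC_VER and
// __GNUC__, because it defines one of them as well, depending on the mode.
inline std::string compiler_label() {
#if defined(__clang__) && defined(_MSC_VER)
    return "clang-msvc";  // Clang emulating MSVC, which includes clang-cl
#elif defined(__clang__)
    return "clang";
#elif defined(_MSC_VER)
    return "msvc";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "unknown";
#endif
}
```

Code that guards MSVC-only constructs with _MSC_VER alone is a likely place for surprises after a switch like this one.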
We usually don’t ignore compiler warnings, which is why we use the /W4 and /WX flags. However, Clang may generate additional warnings that, with /WX in effect, prevent the build from succeeding. That’s why we temporarily deactivated them:
if (CMAKE_CXX_COMPILER_ID MATCHES "Clang")
.... if (WIN32)
Now that’s better.
Unlike MSVC for Windows, the GCC and Clang compilers support the int128 type out of the box. This is why, a while ago, PVS-Studio received an Int128 wrapper for Windows. The wrapper is written as inline assembly code wrapped in ifdef, in the best C/C++ traditions. Since CMake also treats clang-cl as MSVC, we had to fix the preprocessor definitions. We replaced the code below
with the following:
if (MSVC AND NOT CMAKE_CXX_COMPILER_ID MATCHES "Clang")
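On the C++ side, the same choice between a native 128-bit type and a hand-written fallback can be sketched with preprocessor checks. This is not the analyzer’s Int128 wrapper: Product128 and mul_64x64 are hypothetical names, the fallback uses the MSVC _mul128 intrinsic rather than assembly, and whether a particular clang-cl setup takes the first branch is an assumption worth verifying.

```cpp
#include <cstdint>
#if !defined(__SIZEOF_INT128__) && defined(_MSC_VER)
#include <intrin.h>  // _mul128, available for x64 targets only
#endif

// Hypothetical holder for a signed 128-bit product split into two halves.
struct Product128 {
    std::uint64_t lo;  // low 64 bits
    std::int64_t hi;   // high 64 bits, carries the sign
};

// Multiplies two signed 64-bit values without losing the upper half.
inline Product128 mul_64x64(std::int64_t a, std::int64_t b) {
#if defined(__SIZEOF_INT128__)
    // GCC and Clang define __SIZEOF_INT128__ where __int128 is available.
    const __int128 p = static_cast<__int128>(a) * b;
    return { static_cast<std::uint64_t>(p), static_cast<std::int64_t>(p >> 64) };
#elif defined(_MSC_VER) && defined(_M_X64)
    // MSVC has no 128-bit integer type, so an intrinsic stands in for it.
    __int64 hi = 0;
    const __int64 lo = _mul128(a, b, &hi);
    return { static_cast<std::uint64_t>(lo), hi };
#else
#error "No 128-bit multiplication available on this target"
#endif
}
```

With a native type, the compiler may lower some 128-bit operations, division in particular, to calls into its runtime library of built-ins, so that library has to reach the linker.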
Usually, the compiler driver, be it clang.exe or clang-cl.exe, passes a library with built-ins to the linker (lld). However, in this case MSBuild controlled the linker directly and did not know that the library was required. Consequently, the driver had no way to pass flags to the linker. So we handled the situation manually.
if (CMAKE_GENERATOR MATCHES "Visual Studio") link_libraries("$(LLVMInstallDir)\\lib\\clang\\\
Yay! The build worked! However, when running tests, we encountered many segmentation faults:
The debugger showed a strange value in IntegerInterval, but the actual problem was a bit further on:
The data-flow mechanism’s structures actively used the Int128 type we discussed earlier. The structures employed SIMD instructions to work with this type. The crashes were caused by an unaligned address:
The MOVAPS instruction moves a set of packed floating-point numbers into a SIMD register. For the instruction to succeed, the memory address must be aligned to 16 bytes, that is, it must end with 0 in hexadecimal. However, the address ended in 8. Here we had to help the compiler by setting the correct alignment:
class alignas(16) Int128
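The effect of alignas(16) can be checked in isolation. The sketch below uses stand-in types, Int128Sketch and IntervalSketch, whose layouts are assumptions and do not reproduce the analyzer’s real classes. It shows that the specifier raises the alignment of the type itself and of every structure that contains it.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for a 128-bit integer stored as two 64-bit halves.
struct alignas(16) Int128Sketch {
    std::uint64_t lo;
    std::int64_t hi;
};

// Stand-in for a data-flow structure that embeds 128-bit bounds.
struct IntervalSketch {
    bool closed;        // 1 byte, followed by padding up to offset 16
    Int128Sketch low;
    Int128Sketch high;
};

static_assert(alignof(Int128Sketch) == 16, "must start on a 16-byte boundary");
static_assert(sizeof(Int128Sketch) == 16, "two 64-bit halves, no padding");
static_assert(alignof(IntervalSketch) == 16, "alignment propagates to the owner");
static_assert(offsetof(IntervalSketch, low) % 16 == 0, "member is padded into place");

// True when the address is a multiple of 16, i.e. its last hex digit is 0.
inline bool is_aligned_16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 0xF) == 0;
}
```

Without the specifier, a structure of two 64-bit fields only needs 8-byte alignment, so an address ending in 8 is perfectly legal for it. That matches the kind of address the crash showed.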
The last problem surfaced in Docker containers:
When generating builds for MSVC, we had always used static runtime linking, but we switched to dynamic linking for our Clang experiments. It turned out that the Microsoft Visual C++ Redistributables are not included in Windows images by default. We decided to switch back to static linking so that our users wouldn’t encounter the same challenges.
Although the project’s preparation took a while, we were satisfied that the analyzer’s performance grew by over 10%.
We will use Clang to build future releases of PVS-Studio for Windows.