Tool-Based Behavioral Code Analysis

Sebastian Krammer
Published in wdp-insights
Apr 21, 2021

Before investing in a company with self-developed software, a due diligence should clarify code quality, maintenance costs, and possible risks. Static code analysis, as commonly used in due diligence processes, provides only insufficient answers. At wdp, we go one step beyond the market standard in our analysis and thus deliver a much more accurate picture.

— You can also read this article in German.

Software-as-a-Service (SaaS) solutions and emerging technologies are experiencing a real boom as investment targets. The stock price increases of Google and Zoom during a global pandemic confirm investors' strong interest in software-based business models.

At wdp, we have accompanied many transactions with a corresponding due diligence. A software due diligence is part of a larger technology due diligence, in which aspects such as the IT organization and other framework conditions are analyzed in addition to the software itself. The analysis starts at the product perspective and ends with the details of individual lines of code.

In the following, we present our experience in auditing these assets and outline how to make your next software investment a success with tool-supported software due diligence.

Goal

Software products are the core of value creation for many modern companies. Developing and maintaining these products costs resources, which must be estimated before an investment as part of a software due diligence. By default, a purely static code analysis is used, meaning the code is only examined in its current state. This is sufficient for evaluating code quality, but it does not provide satisfactory answers to the following questions:

· How high is the actual maintenance effort?
· How dependent is the code on specific developers?
· Do measures need to be taken to improve the code quality?
· At which points in the source code is it worth improving the code quality?

In order to answer these questions, we include historical data from version management in the code analysis. A version management system such as Git is software for managing and versioning source code and is used in almost every software project. It makes it possible to track exactly who made which changes to the code and when. This reveals, for example, the spots in the code where work is currently being done.
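As a minimal illustration of the kind of metadata version management provides, the following Python sketch counts, per file, how often it was changed in a given time window using nothing more than git log. The twelve-month window and the top-ten cut-off are placeholder assumptions; the actual analysis tools compute far richer metrics.

import subprocess
from collections import Counter

# Per-commit change statistics from git history; the twelve-month
# window is an illustrative assumption, not a fixed rule.
log = subprocess.run(
    ["git", "log", "--since=12 months ago", "--numstat", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout

changes_per_file = Counter()
for line in log.splitlines():
    parts = line.split("\t")        # numstat lines: "<added>\t<deleted>\t<path>"
    if len(parts) == 3:
        changes_per_file[parts[2]] += 1

# The most frequently changed files show where work is currently being done.
for path, count in changes_per_file.most_common(10):
    print(f"{count:4d} changes  {path}")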

Building on this idea, this article discusses the results of an extensive tool-supported software due diligence that incorporates historical version management data, in comparison with the market standard of purely static code analysis.

Procedure and Method

A tool-supported analysis of the target company's software solution runs through four phases:

· Organization
· Data Collection
· Analysis
· Interpretation

The organization of a tool-based analysis should be planned early in the due diligence process. Code is an asset worth protecting and, as with other critical information, companies are reluctant to hand it out. We address these concerns during the project kick-off and define the framework for the analysis. We ensure the protection of the company's source code at all times during the audit.

Data collection is the basis of the analysis. From the code and other information in version management, metadata ("data about data") is generated using special analysis tools. This metadata is not human-readable on its own, but it can be evaluated by the appropriate tools. These tools reduce the effort for the target company to a minimum and, if required, leave the code inside the company network.

After collecting the data, we use the same tools to analyze it and visualize the results. With our experience from more than 40 due diligence processes in the software environment, we interpret the results, put them into the context of the investment, and estimate the expected costs for business planning.

Results

In this section, we discuss the results of a comprehensive tool-based software due diligence and put the individual findings into context to answer the questions posed in the Goal section. We rely on a combination of static code analysis, i.e. an evaluation of the current code quality based on various criteria, and behavioral code analysis, i.e. an analysis that includes historical data. Specifically, we deliver the following results as part of such a due diligence:

Legacy Code — “How much maintenance is really required?”
Old code is bad, right? Many automatically associate old code with legacy and technical debt. In fact, code that hasn't been changed in years is the best code: it hasn't demanded any effort for quite some time and runs without problems. So is new code that is changed frequently bad? No. Frequent changes demand effort, but they also come with efficiency gains, because developers know the code and can make adjustments quickly.

Code that is changed only occasionally is the problematic part. If the last adjustment was made several months ago, the developer first has to find their way around the corresponding part of the code again. Ideally, a code base should therefore consist of a large proportion of old, stable code that does not require any effort, a medium-sized part that is actively developed, and only a small portion of code that needs to be maintained occasionally.
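To make this distribution measurable, a simple sketch like the following can bucket every file by the date of its last change, again using only git metadata. The thresholds of one month and one year are illustrative assumptions rather than fixed rules:

import subprocess
from datetime import datetime, timedelta, timezone

def last_change_dates():
    """Return a mapping of file path -> date of its most recent commit."""
    out = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:%cI"],
        capture_output=True, text=True, check=True,
    ).stdout
    latest = {}
    current_date = None
    for line in out.splitlines():
        if not line.strip():
            continue
        try:
            current_date = datetime.fromisoformat(line)   # commit date line
        except ValueError:
            # git log lists newest commits first, so the first date seen
            # for a file path is the date of its last change.
            latest.setdefault(line, current_date)
    return latest

now = datetime.now(timezone.utc)
buckets = {"active (< 1 month)": 0, "occasional (1-12 months)": 0, "stable (> 1 year)": 0}

for path, changed in last_change_dates().items():
    age = now - changed
    if age < timedelta(days=30):
        buckets["active (< 1 month)"] += 1
    elif age < timedelta(days=365):
        buckets["occasional (1-12 months)"] += 1
    else:
        buckets["stable (> 1 year)"] += 1

for label, count in buckets.items():
    print(f"{label}: {count} files")

In a healthy code base, the "stable" bucket should dominate and the "occasional" bucket should stay small.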

Code Quality — “Do measures need to be taken to improve code quality?”
It is widely known that poor code quality increases maintenance and development efforts. Most of the time, the quality of the entire code base is assessed during a due diligence process. But is this the right approach? We at wdp clearly say no.

Code quality is most relevant where there is a lot of ongoing work, and almost irrelevant in places of the code that have not been changed for years.

As part of a software due diligence, we therefore evaluate two factors. What is the current code quality in the part of the code that is changed regularly? And what trend can be seen for the entire code base in terms of quality? The latter provides us with information as to whether measures have already been taken or whether they still have to be taken with the corresponding effort.
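The quality trend itself can also be read from history. The sketch below samples one commit per month over the last three years and counts TODO/FIXME markers as a deliberately crude quality proxy; both the time window and the proxy are assumptions chosen for illustration, while the real analysis relies on a much broader set of static metrics.

import subprocess

def run_git(*args):
    """Run a git command in the current repository and return its output."""
    return subprocess.run(["git", *args], capture_output=True, text=True).stdout

# One sample commit per month over the last three years (illustrative window).
history = run_git("log", "--since=3 years ago", "--reverse", "--pretty=format:%h %cs")
samples = {}
for line in history.splitlines():
    commit, date = line.split()
    samples[date[:7]] = commit                 # keep the latest commit of each month

for month, commit in sorted(samples.items()):
    # Crude quality proxy: number of TODO/FIXME markers in that revision.
    hits = run_git("grep", "-c", "-E", "TODO|FIXME", commit).splitlines()
    total = sum(int(h.rsplit(":", 1)[1]) for h in hits)
    print(f"{month}: {total} TODO/FIXME markers")

A rising curve suggests that quality measures are still outstanding; a falling one suggests they have already been taken.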

Key Person Risks — “How dependent is the code on specific developers?”
This addresses the question of risks concerning knowledge about the software. Often there are parts of the source code that are created and maintained almost exclusively by one person. This bears the risk that this specific part of the software can no longer be maintained, or only with considerable effort, should the corresponding developer be absent for a longer period of time or even leave the company. It is also possible that key people have already left the company and knowledge about certain parts of the code base must be rebuilt. This risk may be exacerbated by poor code quality and insufficient documentation.

Visualization of Knowledge Distribution amongst Developers in CodeScene

As part of a software due diligence, we identify these knowledge islands, assess the risk taking into account function, code quality, and documentation, and, if necessary, derive measures to counteract it. One possible measure, for example, is the targeted build-up of documentation.
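How such knowledge islands can be detected is easy to sketch: the snippet below attributes every change to its commit author and flags files where a single developer made the vast majority of all changes. The 80 percent threshold and the minimum of five changes are hypothetical values chosen for illustration; dedicated tools such as CodeScene use considerably more refined models.

import subprocess
from collections import Counter, defaultdict

# List commit authors together with the files they touched.
log = subprocess.run(
    ["git", "log", "--name-only", "--pretty=format:<%an>"],
    capture_output=True, text=True, check=True,
).stdout

authors_per_file = defaultdict(Counter)
current_author = None
for line in log.splitlines():
    if line.startswith("<") and line.endswith(">"):
        current_author = line[1:-1]            # author of the following file list
    elif line.strip():
        authors_per_file[line][current_author] += 1

# Flag files where one developer made the vast majority of all changes.
for path, counts in authors_per_file.items():
    total = sum(counts.values())
    main_author, main_count = counts.most_common(1)[0]
    if total >= 5 and main_count / total >= 0.8:
        print(f"{path}: {main_count / total:.0%} of {total} changes by {main_author}")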

Hotspots — “In which places is it worth improving code quality?”
Parts of the code with high complexity, i.e. extensive functions with many logical branches, are never good, because they demand a lot of effort from developers whenever changes are made. If changes are made with high frequency in a place with high complexity, we call it a hotspot. These hotspots rapidly increase maintenance and development efforts if left to grow unattended.

Visualisation of HotSpots in CodeScene

As part of a software due diligence, we identify hotspots, evaluate the effects and derive countermeasures. In this way, the code quality can be improved at the most important points and maintenance efforts can be efficiently reduced.
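A first approximation of such a hotspot ranking can be computed directly from the repository: the sketch below multiplies each file's change frequency over the last year by its current number of lines as a stand-in for complexity. Both the time window and the line-count proxy are assumptions for illustration; the tools used in practice work with proper complexity metrics.

import subprocess
from collections import Counter
from pathlib import Path

# Change frequency per file over the last twelve months (illustrative window).
log = subprocess.run(
    ["git", "log", "--since=12 months ago", "--name-only", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout
frequency = Counter(line for line in log.splitlines() if line.strip())

def hotspot_score(path):
    """Change frequency multiplied by a crude complexity proxy (lines of code)."""
    try:
        loc = len(Path(path).read_text(errors="ignore").splitlines())
    except OSError:
        return 0                               # file was deleted or is unreadable
    return frequency[path] * loc

# The highest-scoring files combine frequent changes with large, complex code.
for path in sorted(frequency, key=hotspot_score, reverse=True)[:10]:
    print(f"{hotspot_score(path):8d}  {path}")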

Conclusion

By combining static code analysis with historical data, you get much more accurate estimates of maintenance effort, key person risk, and code quality. This allows you to assess the value of your investment properly and to derive operational measures to address directly after the acquisition, thus ensuring the success of your investment.

As a Technology Analyst at wdp, Sebastian Krammer evaluates IT organizations and software products in the context of a Technology Due Diligence, as well as the implementation of the resulting recommendations after a successful transaction. As a mathematician, his focus is always on data-driven decisions. He has a special interest in artificial intelligence and cryptography.
