Software Architecture Analysis

Miguel Pessoa
Mar 31, 2021


🕵️‍♀️ Collect software architecture data and improve your software quality 🕵️‍♂️

Introduction

The main idea behind software metrics is to empower developers with data in order to control and propose new designs. It’s often said that we can’t control what we don’t measure, and that is certainly true when developing software.

Using metrics helps us see how the system is growing, spot places with higher complexity (relative to best practices of object-oriented programming or architecture), and get a hint of where maintainability will become chaotic due to package relationships.

So, according to [1], a software metric is defined as:

A function whose inputs are software data and whose output is a single numerical value that can be interpreted as the degree to which software possesses a given attribute that affects its quality.

To complement, the JDepend documentation states that:

“good” design quality metrics are not necessarily indicative of good designs. Likewise, “bad” design quality metrics are not necessarily indicative of bad designs.

Having tools to analyze our software is essential to characterize the design of what we are creating, evaluate its ideas, and detect quality aspects in order to keep our software as maintainable and extensible as possible.

The next sections cover some theory on metrics, the available tools, and the process of collecting design quality metrics.

Metrics

Commonly, we can separate metrics into two categories: static code analysis (object-oriented metrics, version control system data) and runtime analysis (for example, Big O behavior or method usage counts).

Static Code Analysis

Static code analysis is a method of debugging by examining source code before a program is run. It’s done by analyzing a set of code against a set (or multiple sets) of coding rules. [5]

Cyclomatic Complexity

The cyclomatic complexity of a code section is a quantitative measure of the number of linearly independent paths through it. In essence, it’s the number of different routes through a piece of logic. [6]
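As a rough illustration, cyclomatic complexity can be approximated by counting decision points and adding one. The sketch below does this for Python source using the standard `ast` module; it covers only the common branch constructs, so treat it as an approximation rather than a full-featured analyzer:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + the number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        # Each branch or loop adds one linearly independent path.
        if isinstance(node, (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds one extra path per extra operand.
            decisions += len(node.values) - 1
    return decisions + 1

snippet = """
def grade(score):
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"
    return "C"
"""
print(cyclomatic_complexity(snippet))  # two branch points -> complexity 3
```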

Cohesion

This refers to what a class (or module) can do. Low cohesion means that the class performs a great variety of actions: it is broad and unfocused on what it should do. High cohesion means that the class is focused on what it should be doing, i.e., it contains only methods related to the intention of the class.
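A tiny illustration with hypothetical classes: the first mixes unrelated responsibilities (low cohesion), while the second keeps only methods that serve its single intention (high cohesion):

```python
# Low cohesion: unrelated responsibilities live in one class (made-up names).
class Utility:
    def parse_invoice(self, text): ...
    def send_email(self, to, body): ...
    def resize_image(self, img): ...

# High cohesion: every method serves the single intention of the class.
class Invoice:
    def __init__(self, lines):
        self.lines = lines  # list of (price, quantity) tuples

    def add_line(self, price, qty):
        self.lines.append((price, qty))

    def total(self):
        return sum(price * qty for price, qty in self.lines)

invoice = Invoice([(10.0, 2), (5.0, 1)])
print(invoice.total())  # 25.0
```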

Coupling

It refers to how related or dependent two classes/modules are on each other. With loosely coupled classes, changing something major in one class should not affect the other. High coupling makes it difficult to change and maintain your code; since classes are closely knit together, making a change could require an entire system revamp.
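A small sketch of the difference, with made-up classes: in the tightly coupled version the report constructs its own concrete formatter, so swapping the output format forces a change inside the report; in the loosely coupled version the formatter is injected, so the report never changes when a new format is introduced:

```python
# Tightly coupled: TightReport builds its own concrete formatter.
class HtmlFormatter:
    def render(self, data):
        return "<p>" + ", ".join(data) + "</p>"

class TightReport:
    def publish(self, data):
        return HtmlFormatter().render(data)

# Loosely coupled: the formatter is injected through the constructor.
class PlainFormatter:
    def render(self, data):
        return ", ".join(data)

class Report:
    def __init__(self, formatter):
        self.formatter = formatter

    def publish(self, data):
        return self.formatter.render(data)

print(Report(PlainFormatter()).publish(["a", "b"]))  # a, b
```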

Efferent Coupling

The number of classes in other packages that the classes in a package depend upon is an indicator of the package’s dependence on externalities. Efferent couplings point outward.

Afferent Coupling

The number of classes in other packages that depend upon classes within the package is an indicator of the package’s responsibility. Afferent couplings point inward.

Complexity

Complexity implies being difficult to understand and describes the interactions between a number of entities. Higher levels of complexity in software increase the risk of unintentionally interfering with those interactions, and so increase the chance of introducing defects when making changes.

Instability and Abstractness

These measures apply to assemblies and can be used to determine an assembly’s distance from the main sequence, which is where the measures of instability and abstractness total 1. Assemblies that are far from the main sequence may be useless (if overly abstract) or painful to work with (if overly concrete and heavily depended upon).
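These measures can be computed directly from the coupling counts above. A small sketch using the usual definitions (instability I = Ce / (Ca + Ce), abstractness A = abstract classes / total classes, distance from the main sequence D = |A + I − 1|); the package numbers below are made up:

```python
def instability(ca: int, ce: int) -> float:
    """I = Ce / (Ca + Ce): 0 = maximally stable, 1 = maximally unstable."""
    return ce / (ca + ce)

def abstractness(abstract_classes: int, total_classes: int) -> float:
    """A = abstract classes / total classes."""
    return abstract_classes / total_classes

def distance_from_main_sequence(a: float, i: float) -> float:
    """D = |A + I - 1|: 0 means the package sits on the main sequence."""
    return abs(a + i - 1)

# Hypothetical package: 2 incoming deps, 6 outgoing, 1 abstract class of 10.
i = instability(ca=2, ce=6)  # 0.75
a = abstractness(1, 10)      # 0.1
print(round(distance_from_main_sequence(a, i), 2))  # 0.15
```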

Dynamic Analysis

The main idea behind runtime analysis is to collect data while the application is running.

The dynamic analysis tool modifies the source or binaries of the application to provide hooks for instrumentation; these hooks detect dynamic bugs, memory usage, code coverage, and other conditions. Dynamic analysis tools can also generate accurate stack trace information that allows debuggers to find the cause of an error. [7]

Some example metrics: response time, CPU usage, RAM usage, availability, request counts, etc. Examples of tools to collect dynamic metrics:

  • Kibana (track execution time for some methods/endpoints with a stopwatch)
  • Grafana
  • Prometheus
  • Instana
  • Honeycomb

Version Control Systems Metrics

The history of our system provides us with data we cannot derive from a single snapshot of the source code. Instead, VCS data blends technical, social, and organizational information along a temporal axis that lets us map out our interaction patterns in the code. Code-Maat
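As a small illustration of mining VCS data, the sketch below counts how often each file changes by parsing the output of `git log --name-only --pretty=format:`; the sample log and file names are made up. Files that change most often are good candidates for closer inspection:

```python
from collections import Counter

def change_frequencies(git_log: str) -> Counter:
    """Count how often each file appears in `git log --name-only` output."""
    files = (line.strip() for line in git_log.splitlines())
    return Counter(f for f in files if f)

# Sample output of `git log --name-only --pretty=format:` (hypothetical repo).
sample_log = """
src/billing.py
src/invoice.py

src/billing.py

src/billing.py
src/report.py
"""
print(change_frequencies(sample_log).most_common(1))  # [('src/billing.py', 3)]
```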


Measuring Instruments (Tools)

When dealing with data, tools are useful in helping us collect metrics. This section is an overview of some tools available in the industry.


JDepend

“JDepend traverses Java class and source file directories and generates design quality metrics for each Java package. JDepend allows you to automatically measure the quality of a design in terms of its extensibility, reusability, and maintainability to effectively manage and control package dependencies”.

With the data generated by JDepend, you can use some of the other tools in this list to save and process it later. Check the JDepend page below for a complete overview.

🖇️ https://github.com/clarkware/jdepend

Code-Maat

“Code Maat is a command-line tool used to mine and analyze data from version-control systems (VCS)”. With Code Maat you can understand some aspects of your software design by looking only at version control history (which classes tend to change together, which ones are revised most often, coupling, etc.). Check the official page below.

🖇️ https://github.com/adamtornhill/code-maat

CodeMR

“CodeMR analyses your source code on your machine and saves model and graph files on your local working directory. It does not share your source code with any of our or third-party servers”.

CodeMR is very handy and can help teams get a complete overview of the project structure and collect metrics such as coupling, cohesion, and complexity from it.

🖇️ https://www.codemr.co.uk/

IntelliJ

Yes, you’re not reading that wrong: IntelliJ can help us gather data about method and package usages. Besides, you can find some static code analysis plugins for it. Also, IntelliJ allows us to export data as TXT or CSV, making it possible to create visualizations over time.

Dynamic Analysis Tools

Prometheus, Kibana, Grafana, etc., are monitoring tools that can be used along with logs or other approaches to collect and visualize data. These tools can be used to collect runtime data such as method usage, IO, and request times. Depending on the analysis, this can be really useful.

Data Store / Visualization

In order to collect and analyze software metrics easily, your team could set up some automation to integrate tools like Jupyter Notebook, Code-Maat, JDepend, etc. Integrating tools helps to store data and create visualizations over time.

Saving the collected data is essential to understand the evolution of your software architecture and compare results.
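A minimal sketch of such a store, assuming a plain CSV file with one dated row per package snapshot (the file name, package name, and metric names are hypothetical):

```python
import csv
from datetime import date
from pathlib import Path

def append_snapshot(path: Path, package: str, metrics: dict) -> None:
    """Append one dated metrics row so snapshots can be compared over time."""
    new_file = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["date", "package", *metrics])  # header once
        writer.writerow([date.today().isoformat(), package, *metrics.values()])

store = Path("metrics_history.csv")
append_snapshot(store, "com.example.billing", {"coupling": 6, "complexity": 14})
print(store.read_text().splitlines()[0])  # date,package,coupling,complexity
```

Each run appends a new dated row, so the file accumulates a time series that a notebook or spreadsheet can plot directly.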

Measuring process

How do you know that you are measuring what you think you are measuring? [2]

Before starting to collect data, the following example questions can help clarify what to look for:

1) What is the purpose of this measure?

  • Evaluate an implementation status
  • Understand the level of coupling of some package
  • Evaluate the runtime of a specific service

2) What is the scope?

  • A single method from a class
  • A project class or a project package
  • The response time of some methods in a year.

3) What are we trying to measure related to the scope?

Given the scope, what exactly are we trying to measure, and should we use static code analysis or runtime analysis to measure it?

4) What are the metrics?

Related to 3), what are the corresponding metrics?

  • Coupling
  • Complexity
  • Instability
  • Counting, Sum, etc.

4.1) Measuring instrument, tools?

In order to measure the metrics listed in 4), what measuring instrument are we going to use?

For example, if we are trying to count the usage of some method:

  • (Static Analysis) Use IntelliJ to find the method’s usages and export the results to a file. Use some tool to read the file and plot the results.
  • (Dynamic Analysis) Use Prometheus to track occurrences of the method. If you’re counting it, create a dashboard in Grafana to see the results.
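If a full Prometheus setup is not at hand, the counting idea from the bullets above can be sketched with the standard library alone; `calls`, `counted`, and `lookup_user` are hypothetical names, and in production the same hook would increment a metrics client counter instead:

```python
from collections import Counter
from functools import wraps

calls = Counter()  # stand-in for a real metrics counter keyed by method name

def counted(func):
    """Increment a per-method counter on every call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        calls[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper

@counted
def lookup_user(user_id):
    return {"id": user_id}

for uid in (1, 2, 3):
    lookup_user(uid)
print(calls["lookup_user"])  # 3
```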

The measuring instrument does not necessarily need to be automated; the important thing to keep in mind is to have a tool or a process that can be repeated later on.

5) Analysis; Follow-up

At this point, we already have the purpose, the related metrics, and the tools we can use to start collecting data.

So we can start analyzing what is being collected. Data analysis tools such as R, MATLAB, or Jupyter can be really helpful for loading, storing, and transforming the data. Plotting the results as charts also helps to see the big picture.

With more data coming in, the analysis process should be repeated and the results compared to see the direction we are going. This step will surface action points, such as refactoring a service.

But what will good software quality bring to your company?

Good software quality gives the service and the business more growth potential.

Working on software with a bad design is not only painful for the engineering team but also creates problems for the organization related to costs, time to market, and user satisfaction. Mark Richards explains these topics very well on his YouTube channel [4]. Check a summary below:

📉 Reduce overall costs

  • Fewer bugs
  • Less testing

💡 The hours an engineer spends working on bugs, plus the testing hours, multiplied by the engineer’s hourly rate, can be pricey.

📉 Reduce Time to market

  • Less development time and testing hours
  • Reduced deployment time
  • Less time from development to release of the completed application

📈 Increase User Satisfaction

  • Fewer application errors
  • Fewer reported bugs
  • Increased application performance (for example, faster data loading reduces the time users spend waiting for a response)
