“People and Code” — Part 2— Behavioural Code Analysis

Ivan Houston
Sep 28, 2018


In part one of this mini-series we took a look at “The Code Review” and how it has been evolving alongside our development methodology and team structure.

A thought-provoking tweet from Neil Killick, posted since I published that blog:

“ Code reviews are generally inferior to pairing & mobbing because: — They introduce flow delay, even in the most collaborative of teams — Quality resides in the conversations and decisions leading to the code, not the code itself — You can’t see the why behind the code /not/ there”

Static Code Analysis

Moving on from the manual Code Review, I would like to share some thoughts on Code Analysis. Most people reading this will hopefully be familiar with Static Code Analysis. Rather than relying on human reviewers for certain well-understood classes of problem, you can use a variety of tools that analyse your code and highlight potential bugs, vulnerabilities and complexity. These allow you to concentrate on the logic itself when writing and reviewing the code.

Some tools will report the complexity in terms of “Technical Debt”.

These tools can report not just on your current code change but the entire codebase. I would encourage you and your whole team to have a good awareness of the health of your codebase using such tools.

Code Health

In doing so you can also learn a lot about the control and autonomy you have over your codebase from a team perspective.

Common Static Code Analysis tools include PMD and SonarQube.

Behavioural Code Analysis

This leads nicely into thinking around how your team structure can influence the code you write. Yes, you guessed it — Conway’s law:

“organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.”

So how can you best understand how your team structure, as well as your methodology and release cycle, could be impacting your code?

In his latest book, “Software Design X-Rays”, Adam Tornhill takes you on a journey into the wonderful world of “Behavioural Code Analysis”, particularly from the perspective of large legacy codebases and how you would even begin to identify the areas of Technical Debt worth investing the time in fixing.

In my review of his book I stated: “Adam encapsulates the challenges of a technical lead for a product in a large shared codebase. His social code-analysis techniques turn a dry static codebase into a living breathing ecosystem and chart its interactions over its lifetime, helping you to identify those areas worth refactoring.”

A little long, but I hope it shows the depth of thinking you get from the book. It is definitely a recommended read, and it covers a lot more than I touch on here.

Essentially Adam highlights the fact that we are sitting on a goldmine of data in our source configuration management systems with a history of file changes and commit information. Using Adam’s techniques you can turn that data into valuable insights.
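To give a feel for that raw data, here is a minimal sketch that parses a Git log extract into commits. It assumes a Git repository and a log produced with `git log --pretty=format:'--%h|%an|%ad' --date=short --name-only`. The header format, the `Commit` class, the `parse_log` function and the sample data are all illustrative choices for this sketch, not something taken from the book.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    rev: str                     # abbreviated commit hash (%h)
    author: str                  # author name (%an)
    date: str                    # author date as YYYY-MM-DD (%ad with --date=short)
    files: list[str] = field(default_factory=list)


def parse_log(text: str) -> list[Commit]:
    """Parse a log extract where each commit starts with a '--rev|author|date' line.

    Assumption: no file path in the repository starts with '--'.
    """
    commits: list[Commit] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines separate commits
        if line.startswith("--"):
            # Split from both ends so an author name containing '|' still parses.
            rev, rest = line[2:].split("|", 1)
            author, date = rest.rsplit("|", 1)
            commits.append(Commit(rev, author, date))
        elif commits:
            commits[-1].files.append(line)
    return commits


# Hypothetical extract, made up for illustration.
SAMPLE_LOG = """\
--a1b2c3d|Alice|2018-09-25
src/Order.java
test/OrderTest.java

--e4f5a6b|Bob|2018-09-26
src/Order.java
src/Invoice.java
"""

if __name__ == "__main__":
    for commit in parse_log(SAMPLE_LOG):
        print(commit.rev, commit.author, commit.date, commit.files)
```

Once the history is in this shape (who changed which files, and when), the measures discussed in the rest of this post are mostly a matter of counting.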

In their simplest form, these techniques let you identify the following:

Code Coupling: The degree to which two files are coupled, measured as the percentage of the time they are delivered together. A high degree of coupling could be a good sign for a class and its associated test class. Between otherwise unrelated files, however, it could be a design smell.

Code Ownership: Highlighting the authors of a file and how frequently they change it. This could indicate a high degree of ownership by a team, or potential domain knowledge silos if the file is only ever changed by one or two people.

Code Communication: Highlighting the strength of the relationship between authors and reviewers. If two people have a high degree of “communication”, that is itself a form of coupling. If someone always uses the same people to approve their changes, it could be a smell worth exploring further.
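The three measures above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the book's implementation: the sample commits are invented, coupling is defined here as "commits touching both files, as a percentage of commits touching either" (one of several reasonable definitions), and the reviewer is assumed to be available from your review tool, since Git itself does not record one.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample: (author, reviewer, files delivered together in one commit)
COMMITS = [
    ("alice", "bob",   ["Order.java", "OrderTest.java"]),
    ("alice", "bob",   ["Order.java", "OrderTest.java", "Invoice.java"]),
    ("carol", "alice", ["Invoice.java"]),
    ("alice", "bob",   ["Order.java", "Invoice.java"]),
]


def coupling(commits):
    """Percentage of commits touching either file of a pair that touch both."""
    revisions = Counter()   # commits per file
    shared = Counter()      # commits per pair of files
    for _, _, files in commits:
        unique = sorted(set(files))
        revisions.update(unique)
        shared.update(combinations(unique, 2))
    return {
        (a, b): 100 * n / (revisions[a] + revisions[b] - n)
        for (a, b), n in shared.items()
    }


def ownership(commits):
    """For each file, each author's percentage share of the commits touching it."""
    by_file = defaultdict(Counter)
    for author, _, files in commits:
        for name in set(files):
            by_file[name][author] += 1
    return {
        name: {a: 100 * n / sum(counts.values()) for a, n in counts.items()}
        for name, counts in by_file.items()
    }


def communication(commits):
    """How many commits each (author, reviewer) pair has worked on together."""
    return Counter((author, reviewer) for author, reviewer, _ in commits)


if __name__ == "__main__":
    for pair, pct in sorted(coupling(COMMITS).items(), key=lambda kv: -kv[1]):
        print(f"{pair[0]} <-> {pair[1]}: {pct:.0f}% coupled")
    print(ownership(COMMITS)["Invoice.java"])
    print(communication(COMMITS).most_common())
```

In this made-up data, `Order.java` and `OrderTest.java` come out most tightly coupled (the healthy case of a class and its test), `Order.java` has only ever been changed by one author, and one author always uses the same reviewer. These are exactly the kinds of signals worth a conversation with the team.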

Since then I have also tinkered with products such as eazyBI to get insights from our Git repo log extracts, such as:

Commit Frequency By Day: For a given repo, how many commits are delivered on each day of the week? If you are using a Scrum development methodology, could the day you start or end a Sprint, or hold other ceremonies, impact your efficiency? All the data I have looked at suggests that Tuesday to Thursday are the days on which most code is delivered.

File Change Frequency: This seems simple but is a very useful measure to have. Which files are changed most frequently? Are they becoming too complex? Should we redesign them? Do frequent changes result in conflicts? On the other side, could files and packages that are never changed be archived and pulled in as dependent resources?
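You do not need a BI product to try these two measures out. The sketch below counts commits per weekday and changes per file from a list of (date, files) records. The sample data and function names are hypothetical, and the dates are assumed to be in ISO `YYYY-MM-DD` form, as `git log --date=short` would give you.

```python
from collections import Counter
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical sample: (commit date, files changed in that commit)
COMMITS = [
    ("2018-09-24", ["Order.java"]),                   # a Monday
    ("2018-09-25", ["Order.java", "Invoice.java"]),   # a Tuesday
    ("2018-09-25", ["Order.java"]),                   # a Tuesday
    ("2018-09-27", ["Invoice.java", "README.md"]),    # a Thursday
]


def commits_by_weekday(commits):
    """Number of commits delivered on each day of the week."""
    return Counter(DAYS[date.fromisoformat(day).weekday()] for day, _ in commits)


def file_change_frequency(commits):
    """Files ordered from most to least frequently changed."""
    counts = Counter()
    for _, files in commits:
        counts.update(set(files))  # count each file once per commit
    return counts.most_common()


if __name__ == "__main__":
    by_day = commits_by_weekday(COMMITS)
    for day in DAYS:
        print(f"{day:<10}{by_day[day]}")
    for name, changes in file_change_frequency(COMMITS):
        print(f"{name:<15}{changes}")
```

Files that never appear in this list at all are the candidates for archiving mentioned above; to find them you would compare the output against a full listing of the repository.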

You can use these insights along with your existing Static Code Analysis to help make decisions. If a file shows high complexity but has a very low change frequency, is refactoring it worth the investment and risk, compared with lower-complexity files that are always changed together?
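One simple way to make that trade-off concrete is to multiply each file's change frequency by its complexity score and rank the results. This is only a sketch: the file names and numbers are invented, the complexity scores are assumed to come from whatever static analysis tool you already use, and the product is just one possible weighting rather than a formula from the book.

```python
# Hypothetical inputs: commits touching each file, and a complexity score per file.
CHANGE_FREQ = {"Order.java": 42, "Invoice.java": 15, "LegacyTax.java": 1}
COMPLEXITY = {"Order.java": 30, "Invoice.java": 8, "LegacyTax.java": 95}


def rank_hotspots(change_freq, complexity):
    """Rank files by change frequency multiplied by complexity, highest first."""
    scores = {
        name: change_freq.get(name, 0) * score
        for name, score in complexity.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_hotspots(CHANGE_FREQ, COMPLEXITY):
        print(f"{name:<16}{score}")
```

With these numbers `LegacyTax.java` is by far the most complex file, yet it ranks last because it almost never changes, while the moderately complex but constantly changing `Order.java` comes out on top.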


What insights do you currently have on your overall codebase?

Once you have the basics in place, why not take a look at Adam’s book and take your analysis to the next level?

Some of your People can tell you about your Code

Your Code history can tell you about your People

Can these insights help you to improve your way of working and the quality and speed of your delivery?

Turn the data you already have into knowledge and understanding you can act on.

In “People and Code — Part 3” we will take a look at a technique I have used called Mob Cleaning, to help tackle those areas of technical debt that we have deemed to be worth the investment.


