Maintainable source code repositories

Brett Uglow

Published in

DigIO Australia

6 min read · Nov 19, 2019

It’s 8:30am on Monday morning. Bob’s boss greets him.

“Mornin’ Bob! Would you be able to take a look at this project? We just need to make a small change to the logic.”

“No worries!”, says Bob, excited at the prospect of working on something new. He gets the repository URL from a colleague and clones the repo. He begins to try to understand what the project does and how it works. Slowly, his heart sinks.

There’s a README file, but it doesn’t explain what the software is supposed to do. There are no diagrams or images. Some packages don’t have a README at all. He opens up some source code files. Different formatting… different syntax… no comments… no tests!

Bob’s excitement turns to horror as it becomes clear that this software is going to be difficult to change effectively.

Bob’s next two weeks are not going to be fun at all…

Bob at 8:30am on Monday, then again at 9:00am after trying to read the repository. The stress of reading the repository combined with the realisation of how hard it was going to be to make *any* changes caused his beard to literally **fall off**.

Developers spend a lot of time reading code. That’s why one of the qualities we value in software engineers at DigIO is the ability to write maintainable code.

Indeed, the ratio of time spent reading [code] versus writing is well over 10 to 1.
- Robert C Martin

What are the ingredients of a maintainable codebase? Maintainable code is:

easy to read
easy to test
easy to change
well documented
well organised

Let’s look at each of these qualities to see how Bob’s Monday could have been a lot better by applying a few simple techniques.

Easy to read

The following techniques help make the source code easy to read (from a visual perspective):

Consistent file naming conventions: is there really a good reason to have files using lots of different file-naming standards, e.g. some-class.ext, AnotherClass.ext and a-third-class.ext? No, there’s not, and mixed-case file names can easily bite you when building code on a different operating system than the one you develop on, because some file systems are case-sensitive and others are not. So pick a single file-naming standard and use it everywhere.
Using a linter like ESLint: a code-linting tool helps the code conform to a particular written style. For example, many projects have linting rules that enforce the use of a particular function syntax, so that everywhere a function is defined, the syntax is consistent. This makes the code easier to read, as it should look the same from file to file.
Using a code formatter like Prettier: code formatters rewrite source code to follow a set of project-defined rules. This leads to consistent styling of the code even when there are multiple authors, which makes it easier to read.
Good names for “things” (e.g. variables, functions, classes): there are countless articles and videos about naming things in code. I would recommend Kevlin Henney’s video on the topic, as it also covers why code formatters (such as Prettier) are so important in terms of how our brain sees code.
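
A single file-naming standard can also be checked automatically. The snippet below is a minimal sketch, assuming the project has chosen a kebab-case convention (lowercase words separated by hyphens); the function name `findBadlyNamedFiles` and the file names are hypothetical and used only for illustration.

```javascript
// Assumed convention for this sketch: lowercase words joined by hyphens,
// followed by one or more extensions (e.g. date-utils.test.js).
const KEBAB_CASE_FILE = /^[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+)+$/;

// Returns the file names that do not follow the convention.
function findBadlyNamedFiles(fileNames) {
  return fileNames.filter((name) => !KEBAB_CASE_FILE.test(name));
}

const offenders = findBadlyNamedFiles([
  "some-class.js",
  "AnotherClass.js",
  "a_third_class.js",
  "date-utils.test.js",
]);
console.log(offenders); // [ 'AnotherClass.js', 'a_third_class.js' ]
```

A check like this could run alongside the linter so that naming drift is caught before it reaches the main branch.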

Easy to test

Which is easier to test? This function…

… or this function …?

Obviously, functions with fewer code-paths (branches) are easier to test than functions with a lot of code-paths. The term for the number of code-paths in a function is cyclomatic complexity. The higher the cyclomatic complexity of a function, the more tests are needed to ensure that every code-path has been tested.
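
To make the comparison concrete, here is a hypothetical pair of functions (the names, fields and prices are invented for this sketch). The first has a single code path, while the second has six, so it needs at least six tests to cover every path.

```javascript
// One code path: a single test covers the whole function.
function fullName(person) {
  return `${person.firstName} ${person.lastName}`;
}

// Six code paths: every branch below needs its own test.
function shippingCost(order) {
  if (order.items.length === 0) {
    return 0; // path 1: empty order
  }
  if (order.isExpress) {
    return order.total > 100 ? 10 : 25; // paths 2 and 3
  }
  if (order.country !== "AU") {
    return 40; // path 4: international
  }
  return order.total > 100 ? 0 : 15; // paths 5 and 6
}

console.log(fullName({ firstName: "Bob", lastName: "Smith" })); // Bob Smith
console.log(
  shippingCost({ items: ["book"], isExpress: false, country: "AU", total: 120 })
); // 0
```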

Code that is easy to test has:

lots of small modules and functions that do one thing (the ‘S’ in SOLID)
mostly pure functions

Both of these qualities lead to functions that are smaller (in terms of the number of lines and in terms of cyclomatic complexity).
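
As a rough illustration of why pure functions are easier to test, compare the two hypothetical greeting functions below (both are invented for this sketch). The impure version reads the system clock and writes to shared state, while the pure version receives everything it needs as arguments.

```javascript
// Impure: the result depends on the current time and the function modifies
// a module-level variable, so a test must control the clock and reset state.
let lastGreeting = "";
function greetImpure(name) {
  const hour = new Date().getHours();
  lastGreeting = hour < 12 ? `Good morning, ${name}` : `Hello, ${name}`;
  return lastGreeting;
}

// Pure: the same inputs always give the same output, and nothing outside
// the function is read or changed.
function greet(name, hour) {
  return hour < 12 ? `Good morning, ${name}` : `Hello, ${name}`;
}

console.log(greet("Bob", 8)); // Good morning, Bob
console.log(greet("Bob", 14)); // Hello, Bob
```

Testing `greet` needs nothing more than a call and a comparison, whereas testing `greetImpure` needs a way to fake the time of day.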

GUI components can be easy to test if all of the non-GUI logic is in a separate file. That means avoiding putting the following code directly into the GUI component:

data transformations (e.g. responses from API calls, or other data mangling)
business logic

Simpler GUI components lead to tests that verify that the GUI changes in response to different data-input and user-input, without needing to also test the data transformations and business logic.
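
The separation described above might look something like the following framework-free sketch. The function names, the shape of the API response and the markup are all assumptions made for illustration; in a real project the two functions would typically live in separate files.

```javascript
// Non-GUI logic: transforms a raw (hypothetical) API response into the
// shape the component displays. Testable without rendering anything.
function toOrderSummary(apiResponse) {
  const total = apiResponse.items.reduce(
    (sum, item) => sum + item.price * item.quantity,
    0
  );
  return {
    customer: apiResponse.customer.name.trim(),
    itemCount: apiResponse.items.length,
    total: total.toFixed(2),
  };
}

// GUI component: only turns already-prepared data into markup.
function renderOrderSummary(summary) {
  return `<p>${summary.customer}: ${summary.itemCount} item(s), $${summary.total}</p>`;
}

const summary = toOrderSummary({
  customer: { name: " Bob " },
  items: [
    { price: 2.5, quantity: 2 },
    { price: 10, quantity: 1 },
  ],
});
console.log(renderOrderSummary(summary)); // <p>Bob: 2 item(s), $15.00</p>
```

With this split, the tests for `toOrderSummary` cover the data handling, and the tests for `renderOrderSummary` only need to check that given data produces the expected output.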

Easy to change

There are a few signs that a repository is probably easy to change:

There are lots of commits by different committers.
There are recent commits.
There are tools (such as tests, test-environments) that provide feedback on whether the change is “correct” or not.
It is possible to access the repository (it’s not behind a paywall or read-restricted).
There is documentation describing how the software works (see below).

Documentation

The following files should exist in every repository:

README.md: this should explain to end-users what the software does, the motivation behind it, how to install it, how to use or run it, and its license information. There are lots of examples available.
CONTRIBUTING.md: this explains to developers how to contribute to the software. It should contain all the details for setting up a development environment, the overall software architecture, important files and configuration documentation.
docs/ (or some similar name): this should contain further repository-level documentation and diagrams, because you don’t want to clutter up the root directory of a project with technical documentation.

There is an idea in software engineering that source code should be self-documenting, which means it should be clearly understandable on its own without comments. While this is a lovely ideal, sometimes an elegant code-statement is not easily understandable, so comments to clarify the purpose of the statement are often helpful.

For example:

const DATE_REG_EX = /^(\d{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|02-(0[1-9]|1\d|2[0-8]))|(\d{2}(0[48]|[2468][048]|[13579][26])|([02468][048]|[13579][26])00)-02-29)$/;

… or …

// Matches valid dates in yyyy-mm-dd format, including 29 February in leap years
const DATE_REG_EX = /^(\d{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|02-(0[1-9]|1\d|2[0-8]))|(\d{2}(0[48]|[2468][048]|[13579][26])|([02468][048]|[13579][26])00)-02-29)$/;

Additionally, you need documentation to explain how the different modules & sub-systems work together.

However, avoid comments that state the obvious (e.g. i++; // Increment i).

Well organised

A well-organised repository groups code by its function (e.g. purpose, feature) rather than by type (e.g. controllers/, ui). There are exceptions to this principle, but it generally applies.

The folder hierarchy should:

exist (except perhaps for really small repositories).
not be too broad (i.e. everything is in its own folder, which makes it hard to identify how the code fits together)
not be too deep (which makes it harder to find code and move it around)

Conclusion

Bob’s week could have been a whole lot better if a little thought had been put into making the repository easy to read.

In summary, here’s a non-exhaustive list of things that help create maintainable source code repositories:

Consistent file naming
Code linting & automatic formatting
Meaningful variable naming
Tests
Lots of small modules that are responsible for doing one thing, including separating GUI code from non-GUI code
Tools that provide feedback on whether code-changes are correct (e.g. continuous integration (CI) tools)
An active development community
User & developer documentation
A well organised folder hierarchy

What makes code maintainable in your experience? Comment below if there’s anything we’ve missed!