Towards Maintainable Elixir: The Development Process
This is the first in a series of articles where we’ll be presenting our approach to building a maintainable Elixir codebase. To provide a bit of context, Very Big Things is a digital products agency with a dynamic software development tempo. New projects are started frequently and priorities sometimes change suddenly. An intensive project developed today might be put on hold tomorrow and need to be resumed after a few weeks or months. In the meantime, the developers will be assigned to other projects and once the development resumes, the new team might end up being composed of completely different people.
In such circumstances, it’s imperative to maintain some company-wide standards in order to speed up the onboarding process during the project switch. For this reason, all VBT projects are powered by the same technology stack. In particular, Elixir is used as the backend language, while the API is most often exposed through GraphQL. Libraries such as Ecto, Phoenix, and Absinthe are also present in pretty much every VBT backend project.
Going further, we try to keep different projects similar to each other with respect to code design and style. We have established a set of company-wide guidelines, conventions, and common practices for writing Elixir-powered backends. The goal is to obtain a relatively uniform code structure across projects and also to produce code that is clear and easy to read.
To make this happen, we agreed on some high-level desired code structure (which will be discussed in the follow-up articles) and a set of values and properties we want from our code. Roughly speaking, we want to produce the simplest solutions to our present challenges, and avoid building overly general and complex abstractions that aim to solve every possible future problem. Such code is usually easier to understand and remains open to moving in whatever direction the project ends up taking.
Achieving this requires some constant time investment but can be done without significantly slowing down the team’s progress. The key insight is that we’re trading maximum short-term speed for a stable and sustainable mid- and long-term pace. Instead of frantically banging on the keyboard as if there’s no tomorrow, we invest constant effort to maintain the internal quality of our codebase.
The process
A well-organized codebase is not very likely to happen by chance. Instead, we need to devise a process that can help us ensure code quality and keep the style consistent across different projects. While doing this, it’s essential to distinguish between the properties that should hold always (or in the vast majority of cases) and the properties that are better evaluated by humans. For example, we always want the program to work correctly, therefore we can rely on tools such as CI to ensure that all the tests are passing. On the other hand, assessing the quality of the tests is better left to humans. Aiming to automatically enforce this by e.g. mandating 100% test coverage will usually be insufficient and even counterproductive.
Our core strategy is to do all the work through pull requests. We make some branches protected, such as develop or release, so no one can push changes directly to them. Instead, such branches can only progress through pull requests, which can be merged only if all the automated checks on CI are passing and the PR is approved by at least one reviewer. That way we can ensure that some essential properties are always satisfied (e.g. the code compiles successfully and all the tests are passing).
Our CI build consists of the usual suspects. We install the prerequisites, fetch the library dependencies, compile the program, and run the tests. All of these steps must succeed; otherwise, the merge is disabled.
While compiling the program, we check for warnings by invoking mix compile --warnings-as-errors. Treating warnings as errors helps us keep the code clean, simplifies the migration to the next Elixir version, and can even detect some possible bugs (e.g. calling a non-existing function of another module). Since Elixir supports conditional compilation, we perform this check in all three mix environments (dev, test, and prod). This leads to slightly longer build times, but it can help detect subtle issues.
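To make this a bit more concrete, such a check could be wired up as a mix alias, shown in the minimal sketch below; the ci.compile alias name and the my_app project are our own inventions for this example, not something our setup prescribes.

```elixir
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      elixir: "~> 1.11",
      aliases: aliases()
    ]
  end

  defp aliases do
    [
      # CI runs this once per mix environment, e.g. `MIX_ENV=prod mix ci.compile`.
      # `--force` recompiles from scratch, so warnings in already-compiled files
      # aren't silently skipped.
      "ci.compile": ["compile --warnings-as-errors --force"]
    ]
  end
end
```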
Among some less usual checks, we also verify that database migrations are reversible by invoking mix ecto.rollback --all after the test step. Another thing we do is a naive smoke test of the OTP release. During the CI build, we create the OTP release (in the prod env) and verify that the application can be started successfully by invoking something like:
bin/my_app eval "{:ok, _} = Application.ensure_all_started(:my_app)"
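For a bit more context on the rollback check, the property being verified is that every migration can be undone, either because change/0 contains only reversible operations or because the migration provides an explicit up/0 and down/0 pair. A hypothetical migration that passes this check might look like:

```elixir
defmodule MyApp.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  # All operations here are reversible, so `mix ecto.rollback --all` can undo
  # them automatically. Irreversible steps (e.g. raw SQL via `execute/1`) would
  # require explicit `up/0` and `down/0` callbacks instead.
  def change do
    create table(:users) do
      add :email, :string, null: false
      timestamps()
    end

    create unique_index(:users, [:email])
  end
end
```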
Style guide
A style guide is a great example of something that can mostly be expressed with tool-enforceable rules. In particular, we rely on the Elixir code formatter and Credo to do most of the work.
The formatter helps us achieve a basic, consistent layout and style, while Credo allows us to improve higher-level code consistency. With the help of Credo, for example, we enforce consistent ordering of elements within each module. For this purpose, we wrote a custom Credo check, which has since been merged upstream and is available as Credo.Check.Readability.StrictModuleLayout.
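As a rough illustration, this kind of check nudges every module toward the same predictable shape, with each kind of construct appearing in a fixed (and configurable) order, something like the hypothetical schema module below:

```elixir
defmodule MyApp.Accounts.User do
  # @moduledoc comes first, followed by use, import, alias, require (none needed
  # here), module attributes, and finally the functions.
  @moduledoc "Schema for user accounts (a hypothetical example module)."

  use Ecto.Schema

  import Ecto.Changeset

  @required_fields [:email]

  schema "users" do
    field :email, :string
    timestamps()
  end

  @spec changeset(Ecto.Schema.t(), map) :: Ecto.Changeset.t()
  def changeset(user, params) do
    user
    |> cast(params, @required_fields)
    |> validate_required(@required_fields)
  end
end
```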
In addition, Credo helps us enforce some desired properties, such as mandatory type specifications for all public functions. Finally, we use Credo to detect discouraged patterns, such as module renaming through aliases (alias Foo, as: Bar), single pipes (foo |> bar()), or complex multiline expressions in a do: clause.
Essentially, we treat Credo as a style guide toolkit. We use it in strict mode (which turns on some additional checks), but we also disable some checks, such as TagTODO and ModuleDoc, which don’t make much sense for our projects. In addition, we may sometimes disable a check only in a specific part of the code. A good example of this is the typespec check. As mentioned, we want to enforce typespecs for public functions, but typespecs aren’t always very useful, for example in web modules such as Phoenix controllers and Absinthe resolvers, so the typespec check is disabled in those modules. The nice thing about Credo is that it is not very opinionated: it proposes a sensible set of default rules, provides a wider collection of checks we can opt in to, and allows us to turn off each check on a case-by-case basis (per line or per file).
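Put together, a trimmed-down .credo.exs along these lines gives the general idea; the paths and the exact selection of checks below are illustrative, not our actual configuration.

```elixir
# .credo.exs, a minimal sketch rather than the full configuration from our generator.
%{
  configs: [
    %{
      name: "default",
      strict: true,
      files: %{included: ["lib/", "test/"]},
      checks: [
        # opt-in checks we enable
        {Credo.Check.Readability.StrictModuleLayout, []},
        {Credo.Check.Readability.AliasAs, []},
        {Credo.Check.Readability.SinglePipe, []},
        # mandatory typespecs, relaxed for web modules (the path is an example)
        {Credo.Check.Readability.Specs, files: %{excluded: ["lib/my_app_web/"]}},
        # default checks we turn off
        {Credo.Check.Design.TagTODO, false},
        {Credo.Check.Readability.ModuleDoc, false}
      ]
    }
  ]
}
```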
As a consequence, we don’t need a style guide document at all, which reduces the team’s mental load and makes the code review process smoother. It also simplifies the onboarding of new team members.
Since configuring all these rules in a new project can be cumbersome and error-prone, we have created our own project generator, which sets things up the way we want them. Among other things, the generator creates the default Credo configuration, the basic CI/CD config, and various other pieces we want to have in all of our projects.
Code reviews
It goes without saying that tools alone can’t ensure code clarity. Code can be perfectly formatted and conform to all the style guide rules, yet still be an unreadable, incomprehensible mess, even to its authors. Clarity is a property that is best assessed by humans, and code reviews are the technique we use to make that happen.
When done properly, code reviews help us assess and improve the quality of the code, detect possible bugs and security issues, and spread knowledge of the project and the technology throughout the team. The last point is particularly important, though frequently overlooked. Reviews help us reduce the risk of individual code ownership and knowledge silos, which is equally beneficial to the team and to individual members. The team doesn’t end up depending on a single person who is the sole expert in some area of the code, and a single developer is less likely to get stuck forever maintaining some piece of legacy code just because no one else understands it.
These benefits are definitely worth the extra effort, which doesn’t even have to be significant. In my experience, reviewing a well-organized PR of about a day’s worth of work usually takes no more than 30 minutes. There will be occasional exceptions, typically when some significant changes are introduced (e.g. a large refactor, or implementation of a sophisticated algorithm) but these situations should be less common.
The key is to keep PRs small and well organized: split into a series of commits that tell the story in small incremental steps and that can be grasped by an average human. Making this work might require an initial adjustment period, but it’s worth the effort, because PR organization is the key enabler that allows us to reap the benefits of code review. Reviewing a PR that is too large or that packs too many changes into a single commit is hard, time-consuming, and energy-draining. Therefore we insist on reasonably small and clean PRs, and we don’t refrain from rejecting the ones that are not suitable for review. Of course, a PR rejection is done in a polite and thoughtful manner, by letting the author know that the given changes are too difficult to analyze and by providing some suggestions for improvement.
Let’s make this a bit more concrete. Suppose we’re adding the registration and login feature. The relevant PR could be split into the following commits:
- Add database table and Ecto schema (using the minimal set of fields)
- Add register context operation
- Add register API
- Add login context operation
- Add login API
If the registration form consists of a large number of fields, it’s preferable to add them and the associated handling logic (e.g. validation, normalization) in subsequent commits or separate follow-up PRs. By striving to present the solution in small incremental steps, we’re being empathetic to the reviewers, making it easier for them to understand the changes and spot potential issues.
It’s worth noting that once the PR is approved, we keep this fine-grained history and make an explicit merge commit. This preserves the full context of the change: the feature (as represented by the merge commit and the associated PR) and the reason for each local change (as represented by the individual commits). In other words, we avoid squashing and fast-forward merging.
During the review, we look for various potential issues, such as bugs, race conditions, missing database constraints (e.g. not null, uniqueness), lack of tests or weak test assertions, coupling of concerns (e.g. directly using Repo from the web layer, sketched below), and so on. In addition, we pay special attention to code clarity.
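To sketch the coupling concern with a hypothetical example (the module and function names here are made up for illustration): rather than a Phoenix controller reaching for Repo directly, we expect data access to live behind a context function, so only the context knows about persistence.

```elixir
# Hypothetical context module: the only place that talks to Repo.
defmodule MyApp.Accounts do
  alias MyApp.Accounts.User
  alias MyApp.Repo

  @spec fetch_user(pos_integer | String.t()) :: {:ok, Ecto.Schema.t()} | {:error, :not_found}
  def fetch_user(id) do
    case Repo.get(User, id) do
      nil -> {:error, :not_found}
      user -> {:ok, user}
    end
  end
end

# The web layer depends on the context and never touches Repo directly.
defmodule MyAppWeb.UserController do
  use MyAppWeb, :controller

  def show(conn, %{"id" => id}) do
    with {:ok, user} <- MyApp.Accounts.fetch_user(id) do
      render(conn, "show.json", user: user)
    end
  end
end
```

A reviewer seeing Repo aliased or invoked anywhere under the web namespace would typically flag it as exactly this kind of coupling.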
If the reviewer has trouble understanding the code, it is likely not clear enough, and this can be addressed in various ways. Typically, the author should make some changes to the code, which can be as simple as improving the naming or adding a brief comment, or as involved as a significant rewrite. It depends on the particular situation, and both parties might need to arrange a meeting to discuss the best course of action. Explaining the code through a PR comment is rarely a good idea, because it only helps one reader (the reviewer), leaving the team members who didn’t participate in the review in the dark.
Finally, it’s worth mentioning that reviews are usually not an urgent matter. The reviewer is free to wrap up their own work before looking at the assigned PRs. In the meantime, the author can start some new work, possibly on top of the pending changes if needed. If a change needs to be merged urgently, or if the review doesn’t arrive within a reasonable amount of time (e.g. one day), the author can ping the reviewers on the team’s chat.
Summary
Having a proper development process in place is the basic premise for producing maintainable code. At VBT we opted for a fairly lightweight process that is not riddled with complex bureaucratic procedures. We use tools such as CI and SCM to enforce as many of the rules as possible. This leaves less room for mistakes and reduces the mental strain on developers.
However, tooling is by itself not enough. Tools can help with mechanical tasks, but in the end, it is the job of the people, not machines, to evaluate the clarity of the code. This is why code reviews play a central role in our internal quality assurance process. This is the phase where we typically discover some of the more serious issues.
For reviews to be successful, the team needs to reach a shared agreement on what kind of code they consider maintainable. It is sometimes suggested that properties such as code clarity are too subjective to be meaningfully assessed. While it’s true that different people prefer somewhat different coding styles, the team should find some common ground, at least at the higher levels of code design and architecture. Otherwise, the team will be very inefficient.
Finally, it goes without saying that quality is not the only output of our development process. We’re in the business of building software that provides value to people, and it’s only natural that we want to make steady delivery progress. Finding a balance between progress and internal quality is essential. Such a balance leads to a long-term stable pace, allowing the team to remain equally responsive and fast as the codebase grows. This is a future we always plan for, since the only alternative is project cancellation.
Making this happen doesn’t require a significant time investment, past the initial adjustment period. Once everyone on the team adapts to the process and becomes used to the agreed coding style, the established practices become a part of the team culture and a force of habit. In the year and a half that I’ve been working with VBT, I’ve never heard anyone, developer or manager, complain that development is now slower. People seem to be generally happy with the process and the produced code, which is exactly as it should be. The purpose of the process and the code design is to help us, not to get in the way.
So is our code perfect now? Of course not, but clarity, like most other things in life, is not binary. It’s better to have some than none and it’s even better to have a lot of it. We feel that we have improved the code we produce but we don’t take it for granted. We constantly reevaluate our decisions, based on real-life experience, tweaking them if needed.