How to Do Good and Valuable Code Reviews
Today I want to talk about my experience of doing many code reviews over the past few years. I’ve done this for developers at the company I worked for as well as for private projects. I want to share what are, in my opinion, best practices for a good code review, one that adds value for both the source code and the developer requesting the review.
Reviewing the code of other developers is an important part of my daily work. For me, there are several reasons for reviewing changes made to the code base:
- There should be no merge to the master branch (which is equivalent to a release) without letting at least one other developer have a look at it.
- People who are new to the project need feedback on whether their new or changed code is consistent with the existing code base. Not all deviations can be detected automatically by tools like Detekt or Konsist.
- Less experienced developers need feedback about the quality and correctness of the code they write. A code review is also a good opportunity to share knowledge, because explaining things on concrete code examples is easier than constructing synthetic ones.
- Everyone working on the project should have the chance to keep an overview of the changes made in their domain. This is especially important to prevent tasks that only specific people can implement because parts of the code are unknown to the rest of the team.
Let’s start with an overview of the checklist that I go through when I’m doing a code review…
Review Process
All my code reviews follow a clearly defined procedure. This helps me not to forget any step.
Automatically Checked
Static code analysis handles some checks automatically for me, so I can expect that a minimal level of quality is already ensured.
- The code format is checked by ktlint so that the language defaults of Kotlin are used. This helps to prevent bikeshedding about where to put the braces and things like that.
- Detekt checks for best practices established by the Kotlin community. With this, the biggest problems and flaws are already covered.
- Konsist checks project-specific conventions like naming, architectural boundaries, or that every production class has a matching test class (see the sketch below).
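To give a flavor of what such a rule can look like, here is a minimal sketch of a Konsist check; the UseCase suffix and the package name are illustrative assumptions, not conventions from a real project:

import com.lemonappdev.konsist.api.Konsist
import com.lemonappdev.konsist.api.ext.list.withNameEndingWith
import com.lemonappdev.konsist.api.verify.assertTrue
import org.junit.jupiter.api.Test

class ArchitectureTest {
    @Test
    fun `use cases live in the domain package`() {
        Konsist
            .scopeFromProject()
            .classes()
            .withNameEndingWith("UseCase")  // illustrative naming convention
            .assertTrue { it.resideInPackage("..domain..") }
    }
}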
That was the automated part; now the manual steps follow:
Description
I always start the code review by checking the description. This should give me an overview of why the changes were made. It is important to know the reason for the changes; there should be no changes that are not comprehensible. Often the description contains a link to the corresponding task, like a Jira or YouTrack ticket, or a GitHub issue.
The description also contains information about what exactly should be reviewed. If the changes only rename files or move them to a different package, the scope of the review is different from a change to the general behavior of a core component of the application. This helps to focus on the right things.
Related Commits
After familiarizing myself with the description, the next step is to look at the commits that form the pull/merge request. The commit messages should give more detailed information about why the changes have been made. The information provided should be sufficient for a developer not involved in the implementation to understand, on a high level, what was done and why. It is not necessary to understand every detail from the commit message alone; for that, additional resources may need to be read (like documentation or the linked task in the ticket system). One important thing to mention: the commit message does not need to describe what was changed, because this information can already be extracted from the diff.
For the format of the commit messages, Conventional Commits has proven itself. According to the specification, a commit message should consist of the following parts:
<type>[optional scope]: <description>
[optional body]
[optional footer(s)]
For details, you can have a look at the specification: Conventional Commits.
The description is a short summary of the changes; it is shown as the title in the history of graphical Git viewers. After an empty line, the body contains a more detailed explanation. The last part is a footer that can contain a link to the corresponding task in the ticket system. A helpful blog post about formatting commit messages: https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
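Put together, a commit message following this format could look like the following (the scope, wording, and ticket id are made up for illustration):

feat(checkout): validate the voucher code before applying it

Previously an invalid voucher code was silently ignored, which
confused users. The code is now validated up front and a dedicated
error is returned to the caller.

Refs: PROJ-123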
Scope of Changes
After I have an overview of the purpose of the pull/merge request, the next step is to check the scope, or size, of the changes. You may wonder why this is relevant: the time I can spend reviewing code is limited. My span of concentration is somewhere between 30 and 45 minutes. When reviewing the changes takes longer than that, the risk of overlooking critical parts rises fast.
As a rule of thumb, there should be no more than 800-1000 changed lines. The acceptable amount depends greatly on the type of change: renaming or moving a class that is used across the whole code base quickly exceeds this limit. So changes that do not make it necessary to check every changed line (because renaming/moving should always be done with the corresponding IDE refactoring) are an exception to this general rule.
None of the steps so far require a detailed look at the code; I can do them in the GitLab or GitHub web interface for merge/pull requests.
For all the following tasks I need to work on the code level, so I check out the branch and open the code in the IDE.
Production Code
In the following, I will enumerate the topics I care about in the production code (the test code is considered separately later). The order is arbitrary and says nothing about importance.
TODO/FIXME
Are there new TODO/FIXME comments added to the code? If so, does the related description explain why this TODO/FIXME remains in the code? Generally, I don’t like keeping them in code that is merged to the master branch. Keeping them feels like something is not yet finished, and blind spots may occur (one TODO/FIXME magically attracts others).
There are exceptions to this rule.
- The comment is linked to a task in the ticket system that should solve the code area marked by the TODO/FIXME.
- The code under review is just part of a bigger change, but to keep the reviews small it is divided into multiple parts, and the TODO/FIXME marks the missing implementation. A good example of this is Kotlin’s TODO() function, shown below.
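Kotlin’s built-in TODO() throws a NotImplementedError when reached, so the missing part cannot be used accidentally. A minimal sketch; the domain types and the ticket id are made up:

import java.math.BigDecimal

data class Order(val total: BigDecimal)

fun calculateDiscount(order: Order): BigDecimal =
    // Implemented in a follow-up merge request (hypothetical ticket PROJ-124)
    TODO("Discount calculation follows in PROJ-124")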
I also check if existing TODO/FIXME comments that are part of the changed code can be removed, because they may be obsolete or the issue can be fixed in a reasonable time. This is part of the Boy Scout Rule that every developer should follow.
Compiler Warnings
Are new compiler warnings introduced? Which compiler warnings relate to the code that was changed? For me, the same rules as for TODO/FIXME comments apply. In general, there should be no compiler warnings at all. The compiler shows warnings for a good reason, so you should make sure not to introduce any new ones. As always, there is an exception to this rule:
- Deprecation warnings cannot always be fixed directly (e.g. because bigger changes are necessary). This is true both for warnings coming from external code and for our own deprecations.
I also check if the existing compiler warnings can be fixed in a reasonable time. This is again related to the good old Boy Scout Rule.
Suppressions
In all of my projects, static code analysis is used to keep the code base consistent and to get information about areas of the code that may have problems and should be refactored. The tool of choice is mainly Detekt. The configuration does not allow the introduction of any new rule violation: in the detekt.yml file, the maxIssues property is set to 0.
build:
  maxIssues: 0
  excludeCorrectable: false
  weights:
    ...
The problem with this is that it is still possible to make Detekt happy by adding suppressions to the code:
@Suppress("LongMethod")
So during the review I check whether new suppressions were added. For a new suppression, there are only two acceptable cases:
- The violation will never be fixed because it is an accepted one. In this case, instead of a suppression, the violation is moved to the baseline file, so it no longer clutters the code.
- It should be fixed, but not with the current code change. In this case there should also be a linked task in the ticket system.
In all other cases, no suppressions are accepted.
The linking of TODO/FIXME comments and suppressions to a task in the ticket system can be enforced by writing a Konsist rule, so a successful build is impossible without such a link.
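To illustrate the idea without relying on a specific Konsist API, here is a minimal sketch written as a plain JUnit test that scans the Kotlin sources; the source path and the PROJ-123-style ticket pattern are assumptions:

import java.io.File
import org.junit.jupiter.api.Assertions.assertTrue
import org.junit.jupiter.api.Test

class TodoLinkTest {
    private val marker = Regex("TODO|FIXME")
    private val ticket = Regex("[A-Z]+-\\d+")  // e.g. PROJ-123 (assumed ticket format)

    @Test
    fun `every TODO or FIXME references a ticket`() {
        val offenders = File("src/main/kotlin").walkTopDown()
            .filter { it.extension == "kt" }
            .flatMap { file ->
                file.readLines().mapIndexedNotNull { index, line ->
                    if (marker.containsMatchIn(line) && !ticket.containsMatchIn(line))
                        "${file.path}:${index + 1}"
                    else null
                }
            }
            .toList()

        assertTrue(offenders.isEmpty(), "TODO/FIXME without ticket link:\n" + offenders.joinToString("\n"))
    }
}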
Up to this point, the checklist tasks can be finished without looking too deeply into the code. The following tasks are not feasible without reading and understanding the code.
Separation
Starting from the domain model, types that are exposed to the outside (e.g. in a REST adapter or database repository) should not be the domain types themselves; instead, a mapping layer should sit between them. This makes it possible to change the domain model without directly changing the externally visible behavior.
To check if the code complies with this, I focus on the public code related to the application’s interfaces to external systems, like the mentioned REST adapters or repository implementations.
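A minimal sketch of such a mapping layer; all type and function names are illustrative:

// Domain model, free to evolve internally
@JvmInline value class UserId(val value: String)
data class User(val id: UserId, val name: String)

// Separate type for the REST adapter, decoupled from the domain
data class UserResponse(val id: String, val name: String)

// The mapping layer: domain changes stay invisible to API consumers
fun User.toResponse() = UserResponse(id = id.value, name = name)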
Exception Handling
There are two different kinds of exception-handling strategies that I normally deal with.
- Classical exception handling using try-catch-throw: exceptions are thrown in all layers, and some outer layer catches them and translates them into a failure that is returned to the caller.
- Functional error handling: a type like Either is used, and in some outer layer the left type is translated into a failure that is returned to the caller (e.g. an HTTP status code).
For both cases, I check that the newly added and changed code does not change the behavior related to exception handling. This covers:
- A new exception is thrown.
- An existing exception type is changed.
- When Either is used, no exceptions may be thrown anywhere (neither in our own code nor in 3rd party code that is called), or the corresponding code locations must use Either.catch to prevent it (see the sketch below).
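A minimal sketch of the Either.catch pattern, assuming Arrow’s arrow.core.Either; the types and the failing lookup are made up:

import arrow.core.Either

data class User(val id: String, val name: String)
data class DbFailure(val reason: String)

// A call that may throw, e.g. backed by a JDBC driver (illustrative)
fun findUser(id: String): User =
    throw IllegalStateException("database not available")

// Either.catch turns the thrown exception into a Left value,
// so the failure travels through the layers as data instead of a throw
fun findUserSafely(id: String): Either<DbFailure, User> =
    Either.catch { findUser(id) }
        .mapLeft { DbFailure(it.message ?: "unknown error") }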
Changes in the handling of exceptions can have far-reaching consequences, so it is important to take a detailed look at this area. In particular, the changes should be covered by high-quality tests.
Immutability
Kotlin is a programming language that is designed to make the use of immutable data structures very convenient. Using immutable data structures reduces the risk of side effects, especially in a multi-threaded environment.
Every mutable variable (var keyword) needs a good reason. A data class should always be immutable; to change its state, the copy(…) function should be used.
// Never
data class User(var name: String)

val user = User("John Doe")
user.name = "Max Doe"

// Use instead
data class User(val name: String)

val user = User("John Doe")
val renamedUser = user.copy(name = "Max Doe")
The same applies to collections. In most cases an immutable collection should be used, and to change the content a new collection is created, as the following sketch shows. If there are real reasons for using a mutable collection, its scope should be kept to a minimum and it should never be returned from public functions.
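A small sketch of the immutable style for collections:

val names = listOf("John Doe")       // immutable list
val extended = names + "Max Doe"     // creates a new list, the original stays untouched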
Static code analysis can help me here, e.g. by automatically flagging the usage of var properties in data classes, because this is never a good idea. In all other cases I need to check the usage manually and decide case by case.
Nullable Types
Another topic that I need to care about is the occurrence and handling of nullable types. In Kotlin it is necessary to specify explicitly whether a given type is nullable (except when dealing with Java libraries). Dealing with nullable types always requires a null check before the value can be accessed safely. This leads to more complexity in all areas that work with the nullable type.
So the first question I ask myself is: is nullability necessary here? Does the absence of a value really represent a valid state?
There are cases where it is possible to replace a nullable type with a sealed hierarchy.
// No good solution
data class State(...)

data class MyObject(
    val state: State? = null
)

// Better
sealed interface State
data class ExistingPreviousState(...) : State
data object NoPreviousState : State

data class MyObject(
    val state: State = NoPreviousState
)
Another option to avoid nullability is to create two different types that model the two states, e.g. an unsaved variant without an id property and a saved variant that contains it.
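A minimal sketch of this two-type modeling; the names are illustrative:

// Instead of User(id: Long?), the lifecycle states become explicit types
data class UnsavedUser(val name: String)
data class SavedUser(val id: Long, val name: String)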
The check for nullable types also depends on the specific case so there is no general rule that always applies.
Complexity
The evaluation of complexity is done in two steps. The first step is done by the Detekt task, which analyzes classes and functions regarding their complexity. Several kinds of complexity rules exist:
- the length of the structures (class and function).
- the number of structures in a class.
- the cognitive complexity.
- the cyclomatic complexity.
For this, the defaults provided by Detekt are used, which already filter out a lot of code changes that would introduce one of these types of complexity. But a subjective complexity remains that I check myself: Is a function or class doing only one job? Does it stay on one level of abstraction (e.g. is a class doing high-level orchestration and low-level file operations at the same time)? It should be possible to grasp the functionality of the code without scrolling (because the code does not fit on the screen) and without jumping into all the low-level functions it uses. The check for complexity is one of the topics on my checklist that most often leads to discussion.
Visibility
As a rule of thumb, the visibility of all elements should be as limited as possible (ideally private). This is not always possible because functionality needs to be accessed from outside, but I check every newly introduced element for its visibility.
- Classes/Functions that are only accessed from inside a file should be private.
- Classes/Functions that are only accessed from inside a single module should be internal.
- Functionality that should only be available for derived classes should be protected.
This also includes extension functions: to avoid cluttering the namespace, they do not always need to be public.
Only the remaining cases may be public, and only if necessary.
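A small sketch of these visibility levels; the functions are made up:

// Visible only inside this file
private fun normalize(name: String): String = name.trim().lowercase()

// Extension function visible only inside this Gradle module
internal fun String.toSlug(): String = normalize(this).replace(' ', '-')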
Consistency
A very individual metric is the consistency of the code. Every project is different, and a structure establishes itself over time. This concerns several aspects:
- naming conventions.
- package structure.
- error handling.
- usage of 3rd party libraries.
- testing strategy.
- …
All changed or added code should fit into the existing structure, to make it easier for everyone to find their way around. The more consistent the code is, the simpler it is to make changes, because there are not several competing ways of doing things. How difficult this category is to review depends on whether coding conventions exist that the team has written down and everyone accepts. If there is no written coding convention, it may be difficult to argue why code changes are not acceptable, because it becomes a matter of personal taste.
Those are the checkboxes on my list related to the production code. Let’s have a look at the tests and the related code. Tests are as important as the production code, so I don’t skip them just because the CI pipeline successfully executed all of them.
Test Code
There are two checks that can be done very easily:
- If new functionality is added, is there at least one new test, or has an existing one been extended?
- If the behavior of existing functionality changed, is there at least one test change that reflects the production code change?
Besides checking for the existence of tests, I also look at the coverage to get a hint as to whether the current tests cover the use cases. It is important to note that test coverage per se is not a metric for good or sufficient tests, but it can help to identify parts of the code that are not exercised at all.
Disabled Tests
In general, there should be no disabled tests at all. A test either has a reason to exist or it can be removed. But as always, there are exceptions to this rule:
- There is a bug in the production code that needs to be fixed before the test can be enabled. It is a good approach to first write a failing test before fixing the bug. In this case, there should be a reference to the task that will enable the test again.
- The test is flaky and therefore meaningless. It is also only accepted with a reference to the task that enables or removes it.
Disabled tests without a reason are not acceptable because they tend to stay in that state permanently.
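An acceptable disabled test could look like this with JUnit 5; the ticket id is a made-up example:

import org.junit.jupiter.api.Disabled
import org.junit.jupiter.api.Test

class ImportTest {
    @Disabled("Flaky due to timing issues, re-enable with PROJ-456")  // hypothetical ticket
    @Test
    fun `import finishes within one second`() {
        // ...
    }
}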
Naming
In Kotlin it is possible to put test method names in backticks, which makes it possible to describe them in an understandable way using plain English. The test name should explain exactly what the purpose of the test case is. If I cannot determine what a test is doing just by reading its name, others will struggle the same way.
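A short contrast to illustrate the point; the scenario is made up:

import org.junit.jupiter.api.Test

class RegistrationTest {
    // Hard to tell what is verified here
    @Test
    fun testUser2() { /* ... */ }

    // The purpose is clear from the name alone
    @Test
    fun `registration fails when the e-mail address is already taken`() { /* ... */ }
}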
Mocking
To be honest, I’m no big fan of mocking (there are several reasons for this), but there are legitimate reasons for using mocks. As soon as mocks are used in the added tests, I check if they are really necessary or if using the real behavior or some lightweight test implementation would be an equal alternative (e.g. keeping data in a list instead of a database, or providing some exemplary data instead of a full dataset). What is the main problem with mocking for me? Using mocks needs great care, because setting too broad conditions (e.g. using the any() matcher) can lead to unexpected false positives: the test runs successfully even when the behavior changes. To be fair, this can also happen with “real” functionality, but there it is more explicit.
A good starting point for me when dealing with mocks is to simply comment them out to see if they are really relevant to the test result. Additionally, I can change the return value or the matcher to see how specific they are to the test case. Generally, a mock should either use the any() matcher for setting the conditions and then verify after the test that it was called with the correct parameters, or it should use the expected parameter directly as the matcher.
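Both variants sketched with MockK as an example mocking library; the repository interface and types are made up:

import io.mockk.every
import io.mockk.mockk
import io.mockk.verify

data class User(val name: String)

interface UserRepository {
    fun save(user: User): User
}

fun main() {
    val expected = User("John Doe")

    // Variant 1: broad any() matcher, but verify the exact parameter afterwards
    val repository1 = mockk<UserRepository>()
    every { repository1.save(any()) } returns expected
    repository1.save(expected)
    verify { repository1.save(expected) }

    // Variant 2: use the expected parameter directly as the matcher
    val repository2 = mockk<UserRepository>()
    every { repository2.save(expected) } returns expected
    repository2.save(expected)
}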
Happy Path vs Exceptional Cases
In my experience, the test cases written for new or changed production code often cover mainly the happy path. That means the covered branches relate to the behavior expected when everything is successful. This is a good starting point to verify that the code does the expected things, but the exceptional cases are just as important. The exceptional cases include, as the name already suggests, behavior resulting from an exception (e.g. a file cannot be read, the database is not available, …). Running the tests with coverage helps to verify that this kind of test exists: branches that are not covered at all are immediately visible. All other cases need to be checked one by one to identify potential false positives (the exception is not thrown, but the test result is still green).
Mutation Testing
A tool that helps me with reviewing the tests is mutation testing (see Mutation Testing for details). By running the mutation tests for the areas of code that have changed (running them on the whole project takes a lot of time, depending on its size), I can get hints about which parts of the code are not covered by qualitatively good tests. This helps me to focus my review on specific areas instead of all tests, which saves time.
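As a sketch of how the scope can be narrowed, assuming the PIT Gradle plugin (info.solidsoft.pitest); the version numbers and the package are illustrative:

plugins {
    id("info.solidsoft.pitest") version "1.15.0"  // version is illustrative
}

pitest {
    // Limit mutation to the area touched by the change instead of the whole project
    targetClasses.set(listOf("com.example.checkout.*"))
    junit5PluginVersion.set("1.2.1")  // illustrative
}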
Conclusion
Doing code reviews for other developers is an essential part of my daily work, so it is important to have a personal checklist of the topics to focus on. Having a checklist also helps me not to forget parts of the review process (e.g. when it is necessary to switch context during the code review). Over time I’ve created the checklist described above, and it helps me a lot to deliver high-quality code reviews that benefit both the developer, who gets valuable feedback, and the source code, which stays in good shape.