Clean Functions

Clean Code @ Borda: Volume II

Dawood Muzammil Malik
Published in Borda Technology
11 min read · Jul 29, 2022

This article is the second in the Clean Code series. In it, we look at how to write clean functions.

The Computer Science department at the University of Utah defines functions as “self-contained modules of code that accomplish a specific task.” The keyword here is task: not tasks, but a single task. This is consistent with the basic principle of Clean Code:

Functions should do one thing. They should do it well. They should do it only.

— Robert C. Martin

Unfortunately, that’s not the only ingredient needed for a good function. Writing a good function is like writing a chapter in a novel. The chapter (function) must be relevant to the book (class). The chapter’s name (function name) must be descriptive, and its content must deliver what the name promises. The chapter (function) must be kept as concise as possible, including only the details (logic) relevant to that chapter (function).

Several guidelines help in writing functions well. If you follow them, you will likely end up with neat, concise, well-named, and organized functions. The critical ones are as follows:

  • Small
  • Don’t Repeat Yourself
  • Use Descriptive Names
  • Arguments
  • Boolean flags
  • Organization
  • Structure of the Function
  • Blocks and Indenting
  • Have no Side Effects
  • Command Query Separation
  • Error Handling
  • Dead Code

At Borda, we try to follow the teachings of Uncle Bob as much as possible. However, when we decide so as a team, we are not afraid to take a slightly different path, as long as it does not violate the fundamentals of Clean Code. Below, I discuss each rule in detail and, where applicable, explain how we do it differently at Borda.

Keep the Functions Small

Uncle Bob reiterates:

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.

The question is, how big is too big? Some say that if you have to scroll your code pane to read a function completely, you have done it wrong. However, font sizes and screen resolutions vary so much that this check cannot be applied consistently, so the answer is subjective. A commonly cited rule of thumb is that any function with more than 20 lines is too big and should be broken down into smaller functions.

However, at Borda, we are a bit lenient with this rule. We believe that even a small function becomes hard to read when simple formatting is sacrificed, for example when blank lines are stripped out just to squeeze the function under the line limit. Therefore, as a team, we agreed upon 30 lines as the upper limit for a function, provided that we keep leaving blank lines wherever they help.

Don’t Repeat Yourself

Duplication may be the root of all evil in software.

This quote from Uncle Bob is a testament to the effect of duplicate code.

Code is considered to be duplicate if:

  • It looks the same as some other function
  • It does the same thing as some other function

Any software project with more than one developer can easily fall into this trap. Developers working on different tasks may be unaware that a teammate has already implemented similar functionality, and so they end up writing the same code again.

Extract duplicated code — Source: Refactoring Guru

At Borda, we mainly rely on extraction to get rid of duplicate code:

  • If similar code is found inside two or more methods of a class, we extract it into its own method and replace the original instances with a call to the new method
  • If duplicate code is found in two or more classes in a project, we extract it and place it in a new service class as static methods
  • If duplicate code is found in two or more projects, we move it to the most relevant base library shared by all the projects in Borda’s product suite
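The first case, extracting duplicated code into its own method, might look like the following Python sketch. The `Invoice` class, its fields, and the 18% tax rate are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    """Hypothetical class; the names and the tax rate are illustrative assumptions."""

    net_amount: float
    tax_rate: float = 0.18

    def format_for_email(self) -> str:
        return f"Total due: {self._gross_amount():.2f}"

    def format_for_pdf(self) -> str:
        return f"TOTAL: {self._gross_amount():.2f}"

    def _gross_amount(self) -> float:
        # Before the extraction, this calculation was repeated in both formatters.
        return self.net_amount * (1 + self.tax_rate)
```

If the tax calculation ever changes, only `_gross_amount` has to be touched.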

Use Descriptive Names

At Borda, our philosophy is the same as Uncle Bob’s: explain yourself in code.

The purpose of a function should be very clear just by reading its name. The reader should instantly recognize what a function does and what output is expected from it. Unfortunately, coming up with a suitable name for a function is not as easy as it seems. Sometimes we spend more time thinking of a descriptive enough name for a function than we do implementing the function.

“There are only two hard things in Computer Science: cache invalidation and naming things.”

— Phil Karlton

Ambiguous function names usually stem from our eagerness to think of a short, clever name. And that’s the problem. Don’t be afraid of long function names: a long descriptive name is better than a short unclear one. Not only does it clarify the purpose of the function, but it also eliminates the need for unnecessary comments.

As a rule of thumb, if a comment is needed to understand the purpose of a function, you’ve done it wrong. Start over and make sure this time your function is self-explanatory. If you feel the need to comment on a piece of code inside a function, it is better to extract that code into a new function. The name of the new function can be inspired by the comment you were about to add.

“Code is like humor. When you have to explain it, it’s bad.”

— Cory House

In our experience, when our functions are named appropriately, we never need comments to understand their purpose. The code snippets in this section are an example of how descriptive function names simplify the design of the module in the reader’s mind:

The same function but with a descriptive function name
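As a hypothetical illustration in Python (the names are invented, not taken from Borda’s codebase), compare two functions with identical bodies and very different readability:

```python
from datetime import date


# Unclear: "check" says nothing about what is checked or what True means.
def check(d: date, today: date) -> bool:
    return d < today


# Descriptive: the name states the question the function answers,
# so no comment is needed at the call site.
def is_warranty_expired(warranty_end: date, today: date) -> bool:
    return warranty_end < today
```

A call such as `if is_warranty_expired(device_warranty_end, today):` reads almost like a sentence, whereas `if check(d, today):` sends the reader to the implementation.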

Arguments

Keep the number of arguments low

Uncle Bob says that the ideal number of arguments for a function is zero. But let’s be honest, we’re not living in an ideal world. It is impossible to write even a simple application where all methods take exactly zero arguments. In the real world, the consensus is that a function should have two or fewer arguments; the fewer, the better. Three or more arguments should be avoided whenever possible.

Monadic functions (one-argument functions) are easier to understand and test than dyadic functions (two-argument functions), and dyadic functions are easier to understand and test than triads (three-argument functions).

Function with more than three arguments

Testability is another reason to prefer fewer arguments. A comprehensive test strategy covers every possible workflow in a function, and each additional argument multiplies the number of argument combinations to test, which makes full test coverage harder to achieve.

In cases where a function seems to need more than two or three arguments, the best practice is to wrap some of those arguments in a class of their own.

All related arguments are encapsulated in a class of their own
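A minimal Python sketch of this refactoring follows. The device-registration scenario and every name in it are assumptions made for illustration.

```python
from dataclasses import dataclass


# Before: four positional strings that are easy to pass in the wrong order.
def register_device_verbose(name: str, serial: str, building: str, room: str) -> str:
    return f"{name} ({serial}) @ {building}/{room}"


@dataclass(frozen=True)
class Location:
    building: str
    room: str


@dataclass(frozen=True)
class Device:
    name: str
    serial: str
    location: Location


# After: one argument that carries the related values together.
def register_device(device: Device) -> str:
    location = device.location
    return f"{device.name} ({device.serial}) @ {location.building}/{location.room}"
```

Grouping the values also gives them a name (`Location`, `Device`) that can be reused by other functions.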

Avoid using flag arguments

A flag argument is a boolean argument that is passed to a function. In practice, a flag argument alters how the function works: the function does one thing if the flag is true and another if it is false, which means it does more than one thing. Flags also make the function signature confusing. No matter how descriptive the function name is, you will likely have to revisit the implementation to check what the boolean value means in the context of the function logic.

At Borda, we ensure no flag arguments exist in our production code. However, we are a little more lenient regarding our test code, specifically when mocking data for our unit testing environment. The rationale is to generate mock data with slight variations without needing separate methods that would contain almost duplicate logic. So far, it has worked for us, but we are open to changing our strategy as soon as it starts getting in our way.
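One common way to remove a flag is to split the function in two, as in this Python sketch. The temperature-formatting functions are hypothetical examples.

```python
# Flag version: a call such as format_temperature(21.5, True) does not say
# what True means, and the body does two different things.
def format_temperature(celsius: float, in_fahrenheit: bool) -> str:
    if in_fahrenheit:
        return f"{celsius * 9 / 5 + 32:.1f} F"
    return f"{celsius:.1f} C"


# Split version: each function does one thing and the call site explains itself.
def format_as_celsius(celsius: float) -> str:
    return f"{celsius:.1f} C"


def format_as_fahrenheit(celsius: float) -> str:
    return f"{celsius * 9 / 5 + 32:.1f} F"
```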

Flag arguments help mock domain services

Keep them organized: The Stepdown Rule

We want our code to read like a newspaper article. The name should be simple but explanatory. We want the most important and high-level concepts of the algorithm at the top and intricate details as we move downwards. We want the caller functions at the top, with the callees following them. This maps out a logical flow of the code in the reader’s mind and makes it much easier to understand.

The newspaper metaphor is a well-known practice in programming

A class typically has multiple functions, some public and some private. Some say that any given function should immediately be preceded by the function that calls it. However, a private function is often called by more than one other function, so where should it be placed? We decided to follow a slightly different approach: we group our functions by accessibility rather than by interdependency. The public functions are placed at the top, followed by the private ones, all in the order in which they are first called.

All the private functions follow public functions
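The layout could be sketched in Python as follows. The `MaintenanceReport` class is hypothetical, and Python’s leading-underscore convention stands in for private methods.

```python
from typing import List


class MaintenanceReport:
    """Hypothetical class laid out public-first; readings are assumed non-empty."""

    def __init__(self, readings: List[float]) -> None:
        self._readings = readings

    # Public functions come first.
    def summary(self) -> str:
        average = self._average()
        return f"avg={average:.1f} status={self._status(average)}"

    def peak(self) -> float:
        return max(self._readings)

    # Private functions follow, in the order in which they are first called.
    def _average(self) -> float:
        return sum(self._readings) / len(self._readings)

    def _status(self, average: float) -> str:
        return "ALERT" if average > 50.0 else "OK"
```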

Structure Your Function

Every function must have one entry and one exit point. Even though Uncle Bob says it’s okay to have multiple return statements as long as the function is small, we at Borda try to keep our functions consistent by ensuring that the return statement is always the last statement in non-void functions and that it is the only place where something is returned.

Function with return as the last statement

The above function is an example of how we follow this rule at Borda. It could also have been written in the way shown below, but the use of two return statements would have violated the code structure we have decided upon.

Function with multiple return statements
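To make the contrast concrete, here is a hypothetical Python pair with the same behaviour. The shipping-cost rule and its numbers are invented for illustration.

```python
# Single exit: the result is assigned in the branches and returned once, at the end.
def shipping_cost_single_exit(weight_kg: float) -> float:
    if weight_kg <= 1.0:
        cost = 5.0
    else:
        cost = 5.0 + (weight_kg - 1.0) * 2.0
    return cost


# Multiple exits: equivalent result, but two return statements.
def shipping_cost_early_return(weight_kg: float) -> float:
    if weight_kg <= 1.0:
        return 5.0
    return 5.0 + (weight_kg - 1.0) * 2.0
```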

Blocks and Indenting

A function should not be large enough to hold nested structures. To be more precise, the indent level of a function should not be greater than one or two. If it exceeds that, the function should be refactored and shortened by separating the logic into one or more helper functions.

Complex conditions for conditional expressions should be encapsulated in separate functions instead of inserting all of the clauses between the parentheses. This gives more meaning to the set of clauses as the function can be named, making it more understandable.

According to Uncle Bob, the code blocks inside if, else, and while statements should be kept minimal, preferably one line long, and that one line should preferably be a function call. While we follow this rule consistently at Borda, we are a little flexible about how many lines an if block can contain. As long as the actions performed are trivial and the block does not exceed three lines, we are okay with putting that code inside the if block.
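The following Python sketch combines both ideas: a complex condition moved into a named function, and an if block whose body is a single function call. The `Sensor` type and the thresholds are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sensor:
    battery_percent: int
    days_since_calibration: int
    is_active: bool


def needs_service(sensor: Sensor) -> bool:
    # A named predicate replaces an inline three-clause condition.
    return sensor.is_active and (
        sensor.battery_percent < 20 or sensor.days_since_calibration > 365
    )


def create_ticket(index: int, sensor: Sensor) -> str:
    return f"sensor-{index}: battery {sensor.battery_percent}%"


def schedule_service(sensors: List[Sensor]) -> List[str]:
    tickets = []
    for index, sensor in enumerate(sensors):
        if needs_service(sensor):
            # Indent level two, and the block is a single function call.
            tickets.append(create_ticket(index, sensor))
    return tickets
```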

No Side Effects

Functions with side effects are a code smell. Side effects refer to unintended consequences of our code, meaning that our function promises to do one thing but also does other things under the hood. Code with side effects can be a leading factor in some nasty bugs, which can take a long time to debug. Therefore, special attention should be paid to ensure that each function does one thing, and one thing only.
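A hypothetical Python sketch of a hidden side effect, and one way to remove it, is shown below. The `Session` class is invented, and plain string comparison is used only to keep the sketch short; it is not how real credentials should be verified.

```python
class Session:
    def __init__(self) -> None:
        self.initialized = False

    def initialize(self) -> None:
        self.initialized = True


# The name promises a check only, but a match also initializes the session.
def check_password_with_side_effect(stored: str, attempt: str, session: Session) -> bool:
    is_match = stored == attempt
    if is_match:
        session.initialize()  # hidden side effect
    return is_match


# Side-effect free: the function only answers the question its name asks.
def is_password_correct(stored: str, attempt: str) -> bool:
    return stored == attempt
```

With the second version, the caller decides explicitly when to call `session.initialize()`, so the state change is visible at the call site.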

Avoid side effects, but not to this extent

Separate Command and Queries

Let’s see what Uncle Bob says about command query separation:

Functions should either do something or answer something, but not both. Either your function should change the state of an object, or it should return some information about that object. Doing both often leads to confusion.

The code above is an example of a program not following the Command Query Separation (CQS) pattern. The AddItem method returns the Item entity, which breaks the rule that commands should not return anything. Similarly, the query GetItemById updates the Count property, which means that this operation is not just returning information about an entity but also updating its state, hence breaking the rule that a query should only return information.

A class following the command query separation pattern
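A compliant class could be sketched in Python as follows. The `Inventory` class is a hypothetical stand-in; only the method names mirror the AddItem and GetItemById methods mentioned above.

```python
from typing import Dict


class Inventory:
    """Hypothetical class whose methods are either commands or queries."""

    def __init__(self) -> None:
        self._items: Dict[int, str] = {}

    # Command: changes state and returns nothing.
    def add_item(self, item_id: int, name: str) -> None:
        self._items[item_id] = name

    # Query: returns information and changes nothing.
    def get_item_by_id(self, item_id: int) -> str:
        return self._items[item_id]

    # Query: the count is derived from state, never updated by another query.
    def count(self) -> int:
        return len(self._items)
```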

Handle Errors the Right Way

Returning error codes from command functions violates Command Query Separation, because the command now also answers a question. Instead, we should use try-catch blocks and throw exceptions to handle run-time errors. This practice prevents deeply nested structures and enables us to separate the error processing code from the happy path code.

The first rule of writing a good function is that a function should do one thing and one thing only. Error handling is also one thing. Therefore, functions that handle errors should not do anything else. If the keyword try exists in a function, it should be the first word in the function, and there should be nothing after the catch/finally blocks.
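In Python, where the equivalent of try-catch is try/except, the rule might be sketched like this. The device registry and the exception type are hypothetical.

```python
from typing import Dict, List


class DeviceNotFoundError(Exception):
    """Raised when a serial number is not in the registry."""


def remove_device(registry: Dict[str, str], serial: str) -> None:
    # Happy path only: failure is signalled by raising, not by an error code.
    if serial not in registry:
        raise DeviceNotFoundError(f"no device with serial {serial}")
    del registry[serial]


def try_remove_device(registry: Dict[str, str], serial: str, errors: List[str]) -> None:
    # Error handling is this function's one job: `try` comes first
    # and nothing follows the handler.
    try:
        remove_device(registry, serial)
    except DeviceNotFoundError as error:
        errors.append(str(error))
```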

Remove Dead Code

Requirements are constantly evolving in software projects. As a result, changes and corrections are made, and the old code, which is now obsolete, is often ignored or forgotten about. Any piece of code, be it a variable, argument, method, or class that is no longer used in the project, is called dead code.

Keeping dead code is as bad as keeping duplicated code. Functions that are not referenced anywhere should be removed. As developers, we tend to keep dead code in case the need for it arises again in the future. We stay in denial even though we know that the version control system will remember the code, so there is no harm in deleting it.

The quickest way of detecting dead code is using a good IDE. In .NET projects, a custom EditorConfig file can be added to the codebase to enforce consistent coding practices. At Borda, we use this feature to raise warnings in case any dead code is found in the project.

Visual Studio can raise an error or warning when there is dead code in your project
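A minimal sketch of such a configuration follows, assuming the built-in .NET code-style analyzers. Which rules and severities Borda actually uses is not stated, so the selection here is an assumption.

```ini
# Hypothetical .editorconfig fragment; rule selection and severities are assumptions.
[*.cs]
# IDE0051: remove unused private members
dotnet_diagnostic.IDE0051.severity = warning
# IDE0052: remove unread private members
dotnet_diagnostic.IDE0052.severity = warning
# IDE0060: remove unused parameters
dotnet_diagnostic.IDE0060.severity = warning
```

Raising the severity to `error` would turn the same findings into errors instead of warnings.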

Conclusion

Functions are fundamental building blocks for any software project. It is imperative that our functions are easy to comprehend and revise.

One thing is for sure: writing clean functions is not a skill you can acquire overnight. It is a skill that needs to be developed by keeping these principles in mind and using them whenever you write code. It’s still okay if you end up writing long, messy functions. Just don’t forget to go back to the functions and refactor them as soon as they work!

We at Borda go through major refactoring spells almost every month just because we realize that we can make our code cleaner.
