Compared with actually delivering a working product, delivering a “clean” product seems much less significant, especially since end users never see a product’s code in production. While this is somewhat true in the real world, it misses some points about a sustainable software engineering process. In the previous article, we saw the role of Test-Driven Development (TDD) in maintaining a sustainable software engineering process by emphasizing thorough planning, consistency, and maintainability. Clean Code emphasizes the same traits through different strategies, putting it at the same priority as TDD. However, the question remains: what constitutes “clean” code?
Note: This article provides some examples of clean code practices from the Crowd+ development project by the Nice PeoPLe team. Since the backend codebase is written in Python, this article adheres to Python’s standards and best practices. Some practices may not be relevant to other languages.
What We All Can Agree On: Clean Code == Readable Code
For software expected to be maintained long term, one of the most important considerations is that the code base must be readable by the development team so that any member can easily contribute to it. While this is a subjective standard, code can be made more readable by adhering to the domain-driven design (DDD) principle. This principle requires the code to reflect the intended business purpose of the implementation so that a consistent “terms glossary” can be maintained across all stakeholders, not just a particular developer or even a developer team.
Let’s take a look at the following classic example of the effect of clean code using domain-driven design:
For context, the above code generates a unique code for a user from a hashed combination of the user’s email and the request date. The unclean code gives its variables generic names such as code, which are too general and difficult to understand without context (whose email? what date? what code?). As a consequence, developers intending to refactor or reference these variables may need to ask the original developer or, worse, may accidentally or deliberately scrap the code and write new code in its place. The refactored clean code, which simply renames the variables to more descriptive names such as generated_code, is significantly more readable.
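A minimal sketch of this kind of rename might look like the following. Only the name generated_code comes from the project. The function names, the other variable names, and the use of SHA-256 are assumptions made purely for illustration.

```python
import hashlib
from datetime import date


def generate_unclean(email, d):
    # Generic names: whose email? which date? what code?
    code = hashlib.sha256(f"{email}{d}".encode("utf-8")).hexdigest()
    return code


def generate_user_code(user_email: str, request_date: date) -> str:
    """Return a code derived from the user's email and the request date."""
    # Hypothetical names chosen to mirror the business terms.
    hash_input = f"{user_email}{request_date.isoformat()}"
    generated_code = hashlib.sha256(hash_input.encode("utf-8")).hexdigest()
    return generated_code
```

Both functions compute the same value. The only difference is how much a reader can learn from the names alone.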
Readable (and, by extension, explainable) code is taken seriously in Python and is reflected in The Zen of Python (PEP 20), which can be seen as Python’s set of guiding principles:
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
However, as mentioned before, readability is subjective, and it is hard to enforce readable code automatically without conscious effort from the developers themselves.
Slightly Debatable: Clean Code == Consistent Code
In most cases, there are multiple ways to express the same solution in code. While it can be tempting to use some or all of them at once, doing so leads to inconsistency and complicates refactoring should the need arise. The advantage of consistent code is less visible than that of readable code, since readable but inconsistent code can still pass as acceptable to a developer who understands it.
In the case of Python, while the language recommends a single obvious solution to a problem, there are some cases where plurality is baked into the language itself. One such case is the use of quotes for strings:
The above code snippet shows three different ways to enclose a string in Python, each with its own primary purpose. Having both single (') and double (") quotes reduces the number of escape characters needed when a quote must appear inside the string itself. For example, if we want to include single quotes (') inside the string, we can use double quotes as the string enclosure. Triple double quotes, on the other hand, are used to enclose strings that span multiple lines.
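A small illustration of the three enclosures, with made-up variable names and values:

```python
# Single quotes: convenient when the text itself contains double quotes.
single_quoted = 'She said "hello" to the team.'

# Double quotes: convenient when the text itself contains single quotes.
double_quoted = "It's a readable string."

# Triple double quotes: the string may span several lines.
triple_quoted = """First line
second line"""

# The enclosure does not change the value, only how much escaping is needed.
escaped_equivalent = 'It\'s a readable string.'
```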
As we can see above, there is no readability penalty from using multiple string enclosure types in a single file. However, when nothing requires a specific enclosure, it helps to standardize on one, especially double quotes ("), which are the standard string enclosure in many programming languages.
Some consistency checks can be automated using linters such as Pylint. String enclosure consistency is covered by the W1405 (inconsistent-quotes) rule.
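To make the idea behind such a check concrete, here is a rough sketch that counts which quote character delimits each plain string literal in a piece of source code. It is not Pylint’s implementation. The function name count_quote_styles is hypothetical, and the sketch deliberately skips triple-quoted strings.

```python
import io
import tokenize
from collections import Counter


def count_quote_styles(source: str) -> Counter:
    """Count how often ' and " are used as string delimiters in source."""
    counts = Counter()
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type != tokenize.STRING:
            continue
        # Drop any prefix such as r, b, u, or f before the opening quote.
        literal = token.string.lstrip("rRbBuUfF")
        if literal.startswith(('"""', "'''")):
            continue  # triple-quoted strings are left out of the count
        counts[literal[0]] += 1
    return counts


sample_source = 'a = "x"\nb = \'y\'\nc = "z"\n'
quote_counts = count_quote_styles(sample_source)
is_consistent = len(quote_counts) <= 1
```

In this sample, two literals use double quotes and one uses single quotes, so is_consistent ends up False.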
The Elephant in The Room: Miscellaneous Rules and Clean Code Standard Suites
This is usually the most debated part of clean coding, since such rules often seem to be created arbitrarily, with no apparent purpose. Usually, these rules do have a purpose: the issue they target may not directly affect readability, consistency, or maintainability, but it may indirectly cause other problems that do.
Let’s take the line-length rule in PEP 8 (Python’s default style guide) as an example. The guide recommends a maximum of 79 characters per line, which may seem arbitrary and overly restrictive. The number traces back to legacy terminals, which were typically 80 columns wide: staying within 79 characters avoids wrapping and horizontal scrolling. The rule has become less relevant with the advent of GUIs and resizable windows, and PEP 8 itself lets teams agree on a limit of up to 99 characters (Google’s Python style guide, for comparison, sets the limit at 80). However, we do need to see the intent behind this rule:
- This rule discourages the use of complicated one-liner code
- This rule indirectly discourages the creation of functions with a large number of arguments
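As an illustration of the first point, the sketch below compares a one-liner that only fits by ignoring the limit with an equivalent version that stays well within it. The data and names are invented for the example.

```python
orders = [
    {"customer": "ana", "total": 120, "paid": True},
    {"customer": "budi", "total": 80, "paid": False},
    {"customer": "ana", "total": 40, "paid": True},
]

# One long line (well over 79 characters) that filters, groups, and sums.
totals_one_liner = {c: sum(o["total"] for o in orders if o["paid"] and o["customer"] == c) for c in {o["customer"] for o in orders if o["paid"]}}

# The same logic split into short, named steps that fit within the limit.
paid_orders = [order for order in orders if order["paid"]]
totals = {}
for order in paid_orders:
    customer = order["customer"]
    totals[customer] = totals.get(customer, 0) + order["total"]
```

Both versions produce the same dictionary, but the second one can be read and changed one step at a time.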
The next issue is coding style standard suites. Language maintainers and some professional developer teams develop their own clean coding rules and often publish them for other developers to use, for example:
- PEP 8 (Python) from Python
- Styleguide (Python) from Google
- eslint-config-airbnb (JS) from Airbnb
While these suites are a nice start for developers who have no baseline for clean code practices in a particular language, some rules may become a nuisance since personal preference is also a factor in clean coding. For example, Google’s Styleguide enforces strict documentation rules under which nearly every function should have its own docstring.
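For reference, a docstring in the sectioned layout that Google’s Styleguide describes (Args, Returns, Raises) might look like the sketch below. The function itself is a made-up example.

```python
def calculate_average_score(scores: list[float]) -> float:
    """Computes the arithmetic mean of a list of scores.

    Args:
        scores: The scores to average. Must not be empty.

    Returns:
        The arithmetic mean of the scores.

    Raises:
        ValueError: If scores is empty.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)
```

Whether this level of documentation is worth requiring for short helper functions is exactly the kind of preference a team may want to adjust.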
At the end of the day, both self-defined rules and clean code standard suites are just sets of rules to be followed by the developer and configured into automated tools such as linters. The enforcement of clean code depends on the developer’s understanding of a sustainable software engineering process. Rules can be adapted to suit the developer’s needs, and tools can be reconfigured, as long as there is a strong reason to add or remove a rule.
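One lightweight way to adapt a rule, assuming Pylint is the linter in use, is an inline disable comment placed on the offending line. The sketch below silences Pylint’s missing-function-docstring message for a single trivial function. The function is hypothetical.

```python
def ping():  # pylint: disable=missing-function-docstring
    return "pong"
```

Keeping the exception next to the code it applies to makes the reason for the deviation easier to review than a project-wide change.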
Bonus: Configuring Pylint with Google Styleguide for Python Projects
This bonus guide will walk you through configuring Pylint with Google’s Styleguide for Python projects, including ways to exclude rules at will.
Step 1. Installing Pylint
Pylint is an external library that needs to be installed through a package manager. If you are using Python’s default package manager (pip), you can use the following command to install Pylint:
# You may need to replace pip with pip3 for macOS and Linux
pip install pylint
Some large projects may use more advanced package managers such as Poetry. The following command installs Pylint using Poetry and updates the dependency list:
poetry add pylint
Step 2. Configuring Google Styleguide
Pylint uses the pylintrc file to configure itself for a specific project. Google’s Styleguide provides a preconfigured version of this file that you can drop straight into your project’s root directory. The file can be downloaded through the style guide’s official site (link).
In a Python project, it is usually better to list what to exclude rather than what to include, to ensure maximum project coverage. Some directories do need to be excluded, such as virtual environment folders or folders related to other languages (e.g., React). Pylint can be configured to ignore these files and directories by listing them in pylintrc as follows (Pylint accepts base names only, not paths):
# Example: ignore=env,migrations,tests,backend_datalyst,manage.py
Step 3. Running Pylint
Pylint is a CLI application that produces a report on the code’s quality against the configured rulesets. To execute Pylint, run the following in the project’s root directory (alongside the pylintrc file):
pylint module-name-1 module-name-2 module-name-3 ...
# Example: pylint api packages repository server usecases
In addition to listing the violations found during the scan, Pylint also gives an overall code score based on the number of violations. The score can be used to judge overall code quality and can be parsed for repository badges. Pylint can also be integrated into various IDEs for automatic checking upon save or commit.
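As a sketch of how the score could be parsed for a badge, the snippet below pulls the number out of the summary line that Pylint prints at the end of its text report (“Your code has been rated at …/10”). The function name and sample report text are illustrative, and the pattern assumes the default text output.

```python
import re
from typing import Optional

# Matches the score in Pylint's summary line, e.g. "rated at 8.75/10".
SCORE_PATTERN = re.compile(r"rated at (-?\d+(?:\.\d+)?)/10")


def extract_pylint_score(report_text: str) -> Optional[float]:
    """Return the overall score found in a Pylint text report, if any."""
    match = SCORE_PATTERN.search(report_text)
    return float(match.group(1)) if match else None


sample_report = "Your code has been rated at 8.75/10 (previous run: 8.50/10, +0.25)"
score = extract_pylint_score(sample_report)
```

The resulting number can then be passed to whichever badge service the repository uses.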