You’ve just recruited a new developer. Congrats! You’ve helped them set up their computer, got the codebase up and running, and written the first, second, and maybe even third feature together. Slowly but surely, they become more productive and more independent.
Now imagine that instead of just one developer, you have 20 new developers. Suddenly you and your team find yourselves under a constant bombardment of questions, bugs created by “noob mistakes”, and a lot of frustration for both your new and experienced developers.
The code you write today will be refactored eventually. Ask yourself: what can we do, on an architectural level, to ease the next person in?
So how do we make onboarding more effective and less time-consuming? I believe we should start with our codebase.
Let’s take a look at your codebase. Can you send your repository to another developer (one who does not work with you; that would be cheating) and have them tell you what it does, or how the application works in general terms? How about having them develop a simple feature?
We’re used to thinking about the user when we create our products. But what if we consider the newest member of our team as we build our codebases? The best teams share an understanding of where each bit of code lives, what it does, and why. They can then spend their time discussing how that code should work. Working on ‘how’ instead of ‘where’, ‘what’ or ‘why’ is essential to productivity.
There are five main characteristics that make a codebase intuitive: taxonomy, topology, terminology, context, and local cohesion.
The most important aspect of an intuitive codebase is a coherent, extensible structure. This structure brings order to what could otherwise be a chaotic environment, and helps create a clear mental model so developers can understand where they are, what they can do there, and where they can go next. It also allows the codebase to change and grow gracefully as new features arrive.
You can think of taxonomy as a map to the business functionality of your code: the specific capabilities it offers to the end user. For example, for a blog system, a common taxonomy might be Users, Articles, and Tagging. The sign-in screen is located in the Users area of our codebase and only makes sense there.
Start building your taxonomy by thinking about what your code does, rather than how it does it. Make it explicitly clear where a certain business functionality lives, so when someone starts hunting around, they’re spending more time thinking about how a chunk of code works and less about where it lives.
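As a minimal sketch of this idea, the blog taxonomy could be written down as data and used to answer the "where does this live?" question. The folder names, capability names, and the `locate` helper are all hypothetical, chosen only for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical taxonomy for the blog example: each top-level area is named
# after what the code does for the user, not after how it is implemented.
TAXONOMY = {
    "users": ["sign_in", "sign_up", "profile"],
    "articles": ["editor", "publishing", "comments"],
    "tagging": ["tag_search", "tag_suggestions"],
}


def locate(capability: str, root: str = "src") -> PurePosixPath:
    """Return the folder where a business capability is expected to live."""
    for area, capabilities in TAXONOMY.items():
        if capability in capabilities:
            return PurePosixPath(root) / area / capability
    raise KeyError(f"{capability!r} has no home in the taxonomy yet")


print(locate("sign_in"))  # src/users/sign_in
```

If a capability raises `KeyError` here, that is a hint the taxonomy needs a new area or a clearer name, not that the code should be dropped wherever it happens to fit.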
When you look at your taxonomy, you’ll notice that even though each section is different in the functionality it offers, all sections are similar in the way they do things.
This self-similarity between different sections of the code is called topology.
Topology is a term that comes from mathematics: It’s the study of the properties that are preserved when something undergoes non-destructive transformations such as twisting, stretching, and scaling. Think of a circle. If you stretch the circle, you create an ellipse. The ellipse and the circle are topologically equivalent. However, if you tear the ellipse in half, it would no longer be topologically equivalent to the circle.
For many systems, even though different sections of the code carry different business meaning, they share a common technical structure that makes them topologically equivalent.
This arrangement has an important effect on the way engineers can understand and navigate our codebase. For one thing, the taxonomy is made clearer because it’s presented as narrative variations on top of an invariant, predictable structure. For another, it also makes the codebase easier to learn, since the affordances provided by one section of our code are the same in all others. Should a developer decide to introduce themselves to a new part of the code, it should feel like “more of the same”.
The challenge of maintaining an effective topology is keeping it consistent. Because our codebase is usually in a state of eternal refactoring, our topology might develop disparity over time, and this disparity creates confusion.
If you hear the following sentence in a code review: “Yeah, you should probably fix that. That’s how we did it in the past; for our new code, we don’t do X anymore”, then your topology is probably not as effective as it used to be. Focus on managing your refactoring efforts so that you reduce this disparity.
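One way to notice this kind of drift early is to compare the layout of each area against the layout most areas share. The sketch below assumes a hypothetical snapshot of module names per area; in a real project you would collect them from the file system, and the "most areas have it" rule is just one possible definition of the expected structure.

```python
# Hypothetical snapshot of which modules each taxonomy area contains.
AREAS = {
    "users": {"models.py", "services.py", "views.py"},
    "articles": {"models.py", "services.py", "views.py"},
    "tagging": {"models.py", "helpers.py", "views.py"},
}


def topology_disparity(areas: dict[str, set[str]]) -> dict[str, dict[str, set[str]]]:
    """Report, per area, the modules that deviate from the most common layout."""
    # Treat a module as part of the shared structure if most areas have it.
    all_modules = set().union(*areas.values())
    expected = {
        module
        for module in all_modules
        if sum(module in mods for mods in areas.values()) > len(areas) / 2
    }
    report = {}
    for area, modules in areas.items():
        missing, extra = expected - modules, modules - expected
        if missing or extra:
            report[area] = {"missing": missing, "extra": extra}
    return report


print(topology_disparity(AREAS))
# {'tagging': {'missing': {'services.py'}, 'extra': {'helpers.py'}}}
```

An empty report means every area follows the same shape; a non-empty one points at the areas where a refactoring was started but not finished.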
Many software terms can be vague or overloaded and, without explanation or context, can mislead the reader. Make sure your terminology is clear and unambiguous.
When ambiguity might arise, provide context in the name. For example, while the term Segment might mean just about anything, Market Segment is clear, unambiguous, and most importantly, searchable.
Once a term is established, make sure it’s consistent internally, but more importantly, be consistent with how this terminology is commonly used externally. It will allow your team to leverage prior knowledge and prior mental models on how things work.
Another mistake we make all too often is what I call pun names. We developers like cool names for our libraries, from Kafka and Spark to Angular and Meteor. And while those names are indeed cool, they also require explaining and memorizing, which you might be able to afford with a well-known library, but less so with your internal code. So for your codebase, stick with clear but boring names that don’t require this type of mental mapping.
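To illustrate the difference, here is a small sketch contrasting an ambiguous name, a pun name, and a contextual one. The class `Hydra` and the fields of `MarketSegment` are invented for this example and do not come from any real codebase.

```python
from dataclasses import dataclass

# Ambiguous: "Segment" could be a line segment, a memory segment, a video chunk.
# class Segment: ...

# Pun name: memorable, but a newcomer has to be told what it does.
# class Hydra: ...


@dataclass(frozen=True)
class MarketSegment:
    """A group of customers targeted together (hypothetical fields)."""

    name: str
    min_age: int
    max_age: int

    def includes(self, age: int) -> bool:
        """Return True when a customer of this age belongs to the segment."""
        return self.min_age <= age <= self.max_age


students = MarketSegment(name="students", min_age=18, max_age=25)
print(students.includes(21))  # True
```

Searching the repository for `MarketSegment` returns only relevant results, while searching for `Segment` or `Hydra` would either return too much or tell you nothing.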
Our code tells a story, and it’s more than just the sum of all of its functional requirements. It’s also an accumulation of design decisions, security concerns, performance considerations, business restrictions, and eventually our core values as an engineering team.
Strive to make this context visible and clear through your code. Lead your developers to the right decisions by intentionally making those decisions clear and easy to follow. This can be done through README files that explain your main concepts and decisions, through naming, and through an opinionated API design that suggests the correct usage of your code.
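As one possible sketch of an opinionated API, suppose the blog has a business rule that an article must be reviewed before it is published. The rule, the types, and the function names below are assumptions made for illustration; the point is that the types carry the decision, so a newcomer cannot easily skip it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Draft:
    title: str
    body: str

    def approve(self, reviewer: str) -> "ApprovedArticle":
        """Business rule made visible: only a review produces publishable content."""
        if not reviewer:
            raise ValueError("an article needs a named reviewer before publishing")
        return ApprovedArticle(title=self.title, body=self.body, reviewer=reviewer)


@dataclass(frozen=True)
class ApprovedArticle:
    title: str
    body: str
    reviewer: str


def publish(article: ApprovedArticle) -> str:
    """Accept only ApprovedArticle, so the signature documents the workflow."""
    if not isinstance(article, ApprovedArticle):
        raise TypeError("publish() expects an ApprovedArticle; call Draft.approve() first")
    return f"Published '{article.title}' (reviewed by {article.reviewer})"


draft = Draft(title="Intuitive codebases", body="...")
print(publish(draft.approve(reviewer="dana")))
# Published 'Intuitive codebases' (reviewed by dana)
```

Nobody has to remember the review rule or find it in a wiki: trying to publish a `Draft` directly fails with a message that says what to do instead.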
Local cohesion is a characteristic of a system in which strongly related code sits together. Code with good local cohesion is easier to grasp because its functional boundaries correspond to its structural boundaries. This means a developer who wants to understand a certain piece of functionality doesn’t need to search through and untangle your entire codebase, because the full boundary of the feature is clear without diving deep into the code itself.
A good way to assess your local cohesion is to ask the following questions:
- How easy is it to add a tracer bullet feature to your codebase? Is the resulting code spread all over the place?
- How easy is it to delete a feature from your code? Do you need to meticulously search for all possible dependencies and potential dead code, or can you simply delete a folder?
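A rough way to put a number on these questions is to count how many folders a single feature touches. The sketch below uses two hypothetical file lists for the same feature; in practice the list might come from the files changed in the feature's pull request. The `feature_spread` helper and the `depth` parameter are illustrative assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath


def feature_spread(changed_files: list[str], depth: int = 2) -> Counter:
    """Count touched files per folder, looking at the first `depth` path parts."""
    return Counter(
        "/".join(PurePosixPath(path).parts[:depth]) for path in changed_files
    )


# Hypothetical file lists for the same tracer-bullet feature in two codebases.
cohesive = [
    "src/tagging/models.py",
    "src/tagging/services.py",
    "src/tagging/views.py",
]
scattered = [
    "src/models/tag.py",
    "src/controllers/tag_controller.py",
    "src/helpers/tag_utils.py",
    "src/views/tag_list.py",
]

print(len(feature_spread(cohesive)))   # 1 folder: delete it and the feature is gone
print(len(feature_spread(scattered)))  # 4 folders to hunt through
```

The lower the count, the closer you are to being able to add or remove a feature by adding or removing a single folder.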
The sharp-eyed among you have probably already noticed that an intuitive codebase shares many ideas and practices with earlier concepts such as clean code, domain-driven design, and microservices. And while the practices themselves are not new, the difference is in the goal: building an intuitive codebase is about helping your team grow.
And once you focus on the goal and not the practices, you understand that the most effective way to create an intuitive codebase is to listen to your developers, understand where their confusion lies, and instead of fixing it locally by just explaining it, fix it at scale by removing the root cause of the confusion.