KonMarize your code

Or how to use cohesion for a more harmonious, more feng shui code that will bring you joy

Mathieu Lemoine
The Startup
11 min read · Dec 22, 2019

You have just arrived on a new project. First step: take a little stroll around the code, just to get a glimpse of how all of this works. It’s a classic Symfony project, and it looks like shooting fish in a barrel: in src/Controller, thirty-ish controllers. That’s fine. You open one of them at random: it’s a simple CRUD with 4 actions. You open another: aww, this one is 2500 lines long… “Let’s have a look at the data model instead”, you tell yourself, hastily opening src/Entity.

250 entities. You skim through the list to get the idea, open one of them, or two, three, 70. As the sun sets, you’re wondering how long it will take you to really understand this project. Fortunately, it’s well organized!… or is it? What would Marie Kondo think if she could read code?

Kiwis with kumquats

This little experiment shows us the value of grouping all the entities together, all the controllers together, etc. It’s a simple idea, it doesn’t need any thinking, and it allows us to easily find any class, provided that we know exactly what we are looking for. This way of organizing code has a name: logical cohesion. Because everything is filed by type, it gives us a simple structure that stays familiar from one project to the next. We could also call it the supermarket method: cheese with cheese, meat with meat, vegetables with vegetables.

Oops, no milk for my cereals

It’s convenient in a way, but it doesn’t help us much in understanding the project, just like the aisles of a supermarket don’t help us understand how people eat. And while customers will easily find their way through the store, they are unlikely to grab everything they need without walking through most of the aisles. Good, because that’s exactly what the supermarket aims for: make customers spend as much time as possible inside to push them to buy more, whilst keeping frustration to a minimum by giving them the impression of a well-thought-out system.

Maybe we can do things differently with our code? But what do we really expect from code organization?

A tide of tidying I’m dying to do

A codebase is merely a set of classes calling one another and depending on each other. When classes that call one another are grouped together, we call that cohesion. Conversely, when there are dependencies between classes from two different groups, we call that coupling. The two concepts are complementary: for a given set of dependencies, the more of them we keep inside groups (stronger cohesion), the fewer are left to cross group boundaries (weaker coupling), and vice versa.

2 groups of classes: the red group is coupled to the blue one

The concepts of cohesion and coupling in code were popularized by Larry Constantine in the late 1960s.

The goal when organizing our code is to maximize cohesion and minimize coupling. So we are going to try to split our code into small namespaces (call them modules, domains, bundles, thingies, whatever), with high internal cohesion and low coupling between them. This yields code that is easier to maintain and that makes its dependencies more visible, but it also leads to code that is more modular and easier to evolve. So, while there are different ways of organizing code, some of them yield stronger cohesion, and thus weaker coupling.
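To make the two notions a bit more tangible, here is a minimal Python sketch (not Symfony code) that counts how many dependencies stay inside a group and how many cross from one group to another. The class names A to E, the "red" and "blue" groups and the list of calls are all made up for illustration, and counting edges is only one rough way to look at cohesion and coupling.

```python
def cohesion_and_coupling(groups, dependencies):
    """Return (internal, external) dependency counts.

    internal: dependencies between two classes of the same group (cohesion)
    external: dependencies that cross a group boundary (coupling)
    """
    # Map every class name to the name of the group that contains it.
    group_of = {cls: name for name, members in groups.items() for cls in members}
    internal = sum(1 for a, b in dependencies if group_of[a] == group_of[b])
    external = len(dependencies) - internal
    return internal, external


# Hypothetical classes and calls: "A depends on B" is written ("A", "B").
groups = {
    "red": {"A", "B", "C"},
    "blue": {"D", "E"},
}
dependencies = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "E"), ("C", "D")]

internal, external = cohesion_and_coupling(groups, dependencies)
print(internal, external)  # 4 1
```

Only the C to D call crosses the boundary, so in this toy setup the red group is coupled to the blue one by a single dependency, while the four other dependencies contribute to cohesion.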

The basic idea is quite simple: group together classes which are related. The only problem is defining what that means! You have probably stumbled more than once upon the dilemma of having several “logical” solutions to organize your code. To decide, we often rely on well-known architectures which seem to bring us clear and plain answers, like religions. For example, with hexagonal architecture, we will split our code into 3 parts: Application, Domain and Infrastructure. So, if we want to implement a user management system, we will put a bit of code in each part. Or maybe we should start with a User namespace, and break it down into User/Application, User/Domain and User/Infrastructure. Both are logical, and in the end we are not sure. What does the doc say?

But when we ask ourselves what kind of cohesion corresponds to each solution, everything makes more sense. The different possible forms of cohesion have been theorized and ranked from worst to best, which will greatly help us in making our choice.

On a scale from 1 to atomic

Accidental cohesion

That’s putting classes in random namespaces, with no logic whatsoever. It’s obviously the worst kind of cohesion, but who would do that?

Logical cohesion

That’s the one we discussed before, which consists of grouping classes based on their type. It’s better than nothing, but the nothingness shouldn’t be ashamed either. Logical cohesion often leads to ironic situations: for example, grouping together all the controllers means grouping the only classes that have zero chance, in any scenario, of calling one another, since a controller can potentially call anything BUT another controller. Disqualified.

When you really think with your head

Temporal and procedural cohesions

Here we group classes together because they are used at the same time or consecutively, even though they are not directly related.

Example: people ride the tube to go to work, that’s about the only relationship there is between the tube and their work (unless they work for the tube…). Grouping RideTheTubeService and WorkService because they are used sequentially in our boring life simulator app is not completely dumb, but not completely smart either.

Informational cohesion

The idea here is to group classes together because they handle the same data, or to put it another way: they deal with the same entities. That’s the idea behind the aggregates (and their aggregate roots) used in Domain-Driven Design: you create groups of entities that work together and constitute a business domain.

Example: everything that more or less relates to the User entity (and as such, the data that it contains) will be grouped in a User namespace. Noice.

Informational cohesion is approved by the king of DDD

Functional cohesion

This one is about grouping classes that work together to provide a single feature.

Example: the user can create an account on our app. So we are going to group all the code associated with this feature in a common UserRegistration namespace: controller, forms, DTOs, validators… everything, as I said.

Atomic cohesion

It’s the perfect form of cohesion. Each feature is perfectly encapsulated in a single class, so there is no need to group them together because they are all perfectly independent. It’s schwifty, but not viable in practice (classes as long and DRY as the Amazon river).

Example: going back to the previous case, we create a single class called UserRegistrationAction, which will contain exactly all the code needed to create a user account, without relying on any other class. Each class is completely independent, so there is no coupling at all.

When the science of code goes too far

Theorize and fall, practice and gum

As reaching atomic cohesion is kind of vain and impractical, we will try our best to aim for functional cohesion, or at least informational cohesion, both being vastly superior to the logical cohesion we are used to, which happens to be almost the worst possible.

All of this is quite theoretical, and I guess you are worried that all this chit chat, like the thousands of other “architecture best practices” articles you have read so far, will just boil down to bovine poo. But fear not.

The overall idea to follow is that the stronger the relationship between two classes (considering that a functional relationship is stronger than an informational relationship, and so on), the closer these two classes should be physically. There is no absolute answer that will work for every scenario to organize all our classes, but now we have a tool at our disposal to evaluate if a given solution is qualitatively better than another.

One example with Symfony 4 or 5, in 2 shakes and 3 jiffies

Now let’s try to apply all this ancient wisdom to a Symfony project. We will start from the following basic project:

We notice that our project has basically 2 parts: some user management, notably with 2 forms to create and update users, and a blogging system which allows us to edit articles. We can also identify 3 use cases: user creation, user update and article editing.

To reach functional cohesion, we are going to group classes according to these 3 use cases. We will also try, as well as possible, to group these use cases according to which data they relate to: users or blog articles.

Following these principles to aim for higher forms of cohesion, we get something like this:

“Now these are impressively expressive namespaces”, my ex presses me to express.

Here we can clearly see that the classes which work together are grouped together. It becomes very easy to visualize all the features of the code, without even looking at all the classes: the first namespace level indicates the different business domains (informational cohesion), and the second one describes all the existing features in each business domain (functional cohesion). When we want to read or modify the code for a given functionality, no need to browse, everything is here in a single place.
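As a rough check of that claim, here is a small Python sketch comparing a by-type layout with a by-feature layout of the same classes. Everything in it is an assumption made for illustration: the class names (built from the use cases above and the suffix style used in this article), the namespace paths, and the list of dependencies are hypothetical, not the actual project.

```python
def count_dependencies(namespace_of, dependencies, depth=None):
    """Return (internal, external) dependency counts for a given layout.

    namespace_of: class name -> namespace path such as "User/Create"
    depth: compare only the first `depth` namespace levels (None = full path)
    """
    def key(cls):
        parts = namespace_of[cls].split("/")
        return tuple(parts if depth is None else parts[:depth])

    internal = sum(1 for a, b in dependencies if key(a) == key(b))
    return internal, len(dependencies) - internal


# Hypothetical dependencies: ("X", "Y") means "X uses Y".
dependencies = [
    ("UserCreateAction", "UserCreateFormType"),
    ("UserCreateAction", "UserCreateService"),
    ("UserCreateService", "User"),
    ("UserUpdateAction", "UserUpdateFormType"),
    ("UserUpdateAction", "UserUpdateService"),
    ("UserUpdateService", "User"),
    ("ArticleEditAction", "ArticleEditFormType"),
    ("ArticleEditAction", "Article"),
]

# Layout 1: logical cohesion, one namespace per class type.
by_type = {
    "UserCreateAction": "Controller", "UserUpdateAction": "Controller",
    "ArticleEditAction": "Controller",
    "UserCreateFormType": "Form", "UserUpdateFormType": "Form",
    "ArticleEditFormType": "Form",
    "UserCreateService": "Service", "UserUpdateService": "Service",
    "User": "Entity", "Article": "Entity",
}

# Layout 2: business domain first, then feature.
by_feature = {
    "UserCreateAction": "User/Create", "UserCreateFormType": "User/Create",
    "UserCreateService": "User/Create",
    "UserUpdateAction": "User/Update", "UserUpdateFormType": "User/Update",
    "UserUpdateService": "User/Update",
    "ArticleEditAction": "Blog/ArticleEdit", "ArticleEditFormType": "Blog/ArticleEdit",
    "User": "User", "Article": "Blog",
}

print(count_dependencies(by_type, dependencies))              # (0, 8)
print(count_dependencies(by_feature, dependencies))           # (5, 3)
print(count_dependencies(by_feature, dependencies, depth=1))  # (8, 0)
```

With these made-up dependencies, the by-type layout keeps none of them inside a namespace, the feature namespaces keep 5 out of 8, and the 3 that remain (services and actions reaching for their entity) never leave their business domain.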

If it’s broken, suffix it

The trend nowadays among hip developers is to remove suffixes from class names, as they are deemed useless and redundant. But it’s not that simple, as elbow developers will tell you. While this simplification of class names makes sense in some cases, don’t get carried away. “No need to call my action FooAction, it’s already in the Action namespace”. But that’s not the case anymore. Now, FooAction is in Foo, with FooFormType and FooFormData. Now that we have stopped namespacing by object type, we can’t rely on namespaces anymore to determine the type of an object. So we keep this information in the class suffix, which actually makes more sense.

Hard cases in a nutshell

While this model is quite straightforward, it is not always easy to identify which feature a class belongs to. Sometimes, a class covers multiple use cases and we don’t want to tie it to one use case more than another, so we don’t really know where to put it.

If you have looked carefully at the previous example, you have noticed that UserService has been replaced with 2 smaller services: UserCreateService and UserUpdateService. At least now that you have scrolled back there, you have noticed. In fact, the cohesion principle is also valid inside of the classes. Let’s imagine that our UserService contains the createUser and updateUser methods: these 2 methods handle the same data (informational cohesion), but they constitute 2 distinct and independent features.

So we have chosen the stronger form of cohesion by splitting our service into 2 smaller services, each having functional cohesion. Actually, if createUser and updateUser don’t interact, there is no real benefit in them being together in the same class. Each one may have its own dependencies, which would all be mixed up in a common class. By having 2 separate services, we can see the dependencies of each feature much more clearly.
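Here is a minimal sketch of that split, written in Python rather than PHP to keep it short. Only the names UserCreateService and UserUpdateService come from this article; the repository, the password hasher, the User fields and the method names are hypothetical stand-ins chosen to make each service’s dependencies visible.

```python
from dataclasses import dataclass


@dataclass
class User:
    email: str
    password_hash: str
    display_name: str = ""


class InMemoryUserRepository:
    """Hypothetical stand-in for a real persistence layer."""

    def __init__(self):
        self._users = {}

    def save(self, user):
        self._users[user.email] = user

    def find(self, email):
        return self._users[email]


class FakePasswordHasher:
    """Hypothetical hasher, NOT secure: it only makes the dependency visible."""

    def hash(self, plain_password):
        return "hashed:" + plain_password


class UserCreateService:
    """User creation needs a repository AND a password hasher."""

    def __init__(self, repository, hasher):
        self._repository = repository
        self._hasher = hasher

    def create_user(self, email, plain_password):
        user = User(email=email, password_hash=self._hasher.hash(plain_password))
        self._repository.save(user)
        return user


class UserUpdateService:
    """User update only needs the repository."""

    def __init__(self, repository):
        self._repository = repository

    def update_display_name(self, email, display_name):
        user = self._repository.find(email)
        user.display_name = display_name
        self._repository.save(user)
        return user


repository = InMemoryUserRepository()
UserCreateService(repository, FakePasswordHasher()).create_user("ada@example.com", "secret")
updated = UserUpdateService(repository).update_display_name("ada@example.com", "Ada")
print(updated)
```

Reading the two constructors is enough to see that, in this sketch, only the creation feature depends on the hasher. A single UserService would have received both dependencies, whichever method ended up being called.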

That being said, if there really is shared code between user creation and update, then there is a benefit in factoring this code out into a single service. In that case, this common service will be placed in the parent namespace, a.k.a. the first common ancestor of our 2 use cases in the namespace family tree:

Actually, we decide to go back to a common UserService: we pull it up to the parent namespace

If you ever doubt that 2 methods are related, create 2 classes. If you are unsure that 2 classes are related, create 2 namespaces. It is much easier and faster to merge classes or namespaces together than to split a class or namespace in two afterwards.

Cogito Exec Sum

To sum things up, organizing code is always a tricky task, and we are never really sure we are doing it well. But that was before. We now realize that the logic that we choose to follow is never a simple “preference” choice: it can be categorized and compared objectively on a simple scale. So there really are good and bad ways of doing it, and we usually realize that when a project grows too big and/or when we need to split it or refactor it. By using cohesion, we can make sure from the very beginning that we are building on good foundations.

A code organization based on functional cohesion also helps us better visualize and question relations and dependencies between classes, and sometimes better split our code to be able to put the right methods into the right features.

Finally, a “drawback” is that we must be much more careful with naming classes so that our code stays clear and so that we can easily browse it, but it’s actually a good thing, because good naming is the most important attribute of good code. As a matter of fact, I have written this article as a tribute to this attribute.

Using cohesion, even in small doses, will help you better structure your code and increase its lifespan.

Epilogue: code is life, life is code

I hope I have helped you better understand the logic that comes into play when organizing code, and thus more easily make the right choices that will benefit your projects in the long term. But actually this logic is the same for any kind of organization and any structure; it is not specific to code at all.

Let’s take for example the organization of a company: we create projects, teams, and we try to make it so that all those people work together in an efficient manner, even though the needs change constantly. We are very quickly confronted with a classic dilemma: build silos for each activity (a marketing team, a dev team, …) or build teams per project. Even if there are a few advantages to silos by activity, like better skill sharing, projects don’t run very smoothly, and it’s difficult to make all the teams work together. Conversely, project teams have demonstrated their superiority in bringing comprehensive solutions to customer needs, and in working more fluently thanks to a better team cohesion around the project. Here we have yet another problem of logical cohesion (cohesion by activity, even though people doing the same activity don’t work together) versus functional cohesion (around a project).

A final example would be the cohesion that exists between every one of us: interpersonal cohesion. If we look at our scale, accidental cohesion would be the people you pass on the street without knowing them. Logical cohesion would be the people who do the same job as you. Temporal cohesion, the people with whom you do activities. Procedural cohesion, the ones you interact with. Informational cohesion would be the ones you share your life with. Functional cohesion, the ones with whom you achieve things. And atomic cohesion… well, that’s you.

Having been a lot of things from developer to CTO, scrum master and designer, I started writing here to share the knowledge I’ve gained so far.