Refactoring Code Smells: Data Clumps

Can Kayı
Published in inventiv
4 min read · Apr 22, 2022

A code smell is source code that compiles and runs correctly but is, from a subjective point of view, poorly maintained. It usually causes no problems when it is first written; the problems appear later, as new code is added. Developers may introduce code smells without noticing, and if the smells are not fixed they can grow into bigger problems as development continues.

Introduction to Bloaters

We can categorize code smells into five different branches:

  • Bloaters
  • Object-Orientation Abusers
  • Change Preventers
  • Dispensables
  • Couplers

This article focuses on the Bloaters category. In British slang, “bloater” means a “fat or greedy person”. In code, Bloaters show up as long methods, large classes (god objects), data clumps, long parameter lists and primitive obsession.

Long methods and large classes cause problems in the long run: they reduce readability and increase the complexity of the method or class. It becomes harder for developers to understand what the method actually does or why the class was created. The method or class also becomes harder to reuse as new code is added.

A long parameter list is a sign that related values passed to a method should be grouped into a readable, reusable object. For example, if a method takes a start date and an end date as parameters, they can be passed together as a single object such as a date range.
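The start date and end date example can be sketched as follows. This is a minimal illustration rather than code from the article: the `DateRange` class and the `book_room` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """Hypothetical value object that keeps a start and an end date together."""
    start: date
    end: date

    def days(self) -> int:
        # Number of whole days between the two dates.
        return (self.end - self.start).days


def book_room(room_number: int, period: DateRange) -> str:
    # One parameter carries both dates, so the signature stays short.
    return f"Room {room_number} booked for {period.days()} days"


print(book_room(12, DateRange(date(2022, 4, 22), date(2022, 4, 25))))
```

Any other method that needs the same two dates can accept a `DateRange` too, so the pair is defined in one place only.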

Primitive obsession is the habit of categorizing objects by primitive values instead of creating classes for the concepts involved. For example, imagine a class named Employee that holds the properties of both an Engineer and a Salesman, plus a type field that says which of the two the Employee is. If a third role is needed, we must add it to the class along with a new type value. To reduce this complexity, we can create three classes: Employee, Engineer and Salesman. Engineer and Salesman inherit an EmployeeType property from the Employee class and set it in their constructors.

Primitive Obsession Solution
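One possible shape of this solution is sketched below. The class names follow the text, while the `name` field and the enum values are assumptions added only to make the example runnable.

```python
from enum import Enum


class EmployeeType(Enum):
    # Assumed enum values; the article only names the property.
    ENGINEER = "engineer"
    SALESMAN = "salesman"


class Employee:
    """Base class holding what every employee shares."""

    def __init__(self, name: str, employee_type: EmployeeType) -> None:
        self.name = name
        self.employee_type = employee_type


class Engineer(Employee):
    def __init__(self, name: str) -> None:
        # The subclass sets its own type in the constructor.
        super().__init__(name, EmployeeType.ENGINEER)


class Salesman(Employee):
    def __init__(self, name: str) -> None:
        super().__init__(name, EmployeeType.SALESMAN)


staff = [Engineer("Ada"), Salesman("Sam")]
print([employee.employee_type.name for employee in staff])
```

With this layout, a third role becomes a new subclass and a new enum member, and the existing classes stay untouched.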

Finally we arrive at our main subject, Data Clumps. A data clump is a group of variables or parameters that appears together, identically, in several places and could be grouped into an object for easier reuse and readability. The problem often arises from “copypasta programming”, that is, copying and pasting code.

To prevent Data Clumps, there are a few refactoring methods we can apply to our code. As mentioned before, these methods are applied on a subjective basis, so if a developer feels that the code is hard to read, reach or reuse, using them is recommended. Code should be simplified early, because the longer development continues, the harder the complexity is to reverse. As the saying goes, if you can’t explain something simply, you may not understand it well enough.

Extract Class

The first refactoring method for removing Data Clumps is Extract Class. It is needed when a class contains methods that actually belong to another concept. For example, suppose we have a class named Shape that implements two methods, Draw() and LogError().

The problem is that LogError() has nothing to do with the Shape class. Logging an error is useful in many situations besides shapes, and if we want to log an error somewhere else we must either define the method again or reach into Shape for it, which makes no sense for unrelated code.

The solution is to define another class named Logger and use it inside the Shape class to log errors. This change separates the responsibilities of our classes and follows the Single Responsibility Principle (SRP).
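A minimal sketch of the extracted classes might look like this. The method names are written in Python style (`draw`, `log_error`), and the in-memory `errors` list and the empty-name check are assumptions made so the example can run on its own.

```python
class Logger:
    """Extracted class: owns error logging and nothing else."""

    def __init__(self) -> None:
        # Messages are kept in memory so the sketch stays self-contained.
        self.errors = []

    def log_error(self, message: str) -> None:
        self.errors.append(message)


class Shape:
    """Shape only draws; it delegates logging to the Logger it is given."""

    def __init__(self, name: str, logger: Logger) -> None:
        self.name = name
        self.logger = logger

    def draw(self) -> str:
        if not self.name:
            # Hypothetical failure case used to show the delegation.
            self.logger.log_error("cannot draw a shape without a name")
            return ""
        return f"Drawing {self.name}"


logger = Logger()
print(Shape("circle", logger).draw())
Shape("", logger).draw()
print(logger.errors)
```

Because `Logger` no longer lives inside `Shape`, any other class can receive the same logger without knowing anything about shapes.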

Introduce Parameter Object

Introduce Parameter Object means replacing a group of related parameters with a single parameter object. Imagine we have a Mail class that implements two SendEmail() methods.

The problem is that these methods take too many parameters that are related to each other.

To solve this, we can create a struct containing the properties we need as parameters to send a mail. This refactoring prevents code duplication, reduces the complexity of the parameter list and makes the code more readable.
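The idea can be sketched as follows, with a frozen dataclass standing in for the struct. The `MailMessage` fields and the second method, `send_reminder`, are hypothetical; the article does not list the actual parameters, and Python has no method overloading, so the second method gets its own name.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MailMessage:
    """Hypothetical parameter object grouping the values an email needs."""
    sender: str
    recipient: str
    subject: str
    body: str


class Mail:
    def send_email(self, message: MailMessage) -> str:
        # A real implementation would talk to a mail server; this sketch
        # only formats the message so it can run anywhere.
        return (
            f"From: {message.sender}\n"
            f"To: {message.recipient}\n"
            f"Subject: {message.subject}\n\n"
            f"{message.body}"
        )

    def send_reminder(self, message: MailMessage) -> str:
        # The second method reuses the same object instead of repeating
        # four separate parameters in its signature.
        return self.send_email(replace(message, subject="Reminder: " + message.subject))


message = MailMessage("a@example.com", "b@example.com", "Hello", "See you soon.")
print(Mail().send_reminder(message))
```

If sending a mail later needs another value, such as a CC address, it is added to `MailMessage` once and both method signatures stay the same.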

Preserve Whole Object

This smell takes many forms, but the common pattern is reading several properties or method results from an object only to pass them as parameters to another method, instead of passing the whole object.

The issue in code like this is that the caller defines two extra variables just to pass their values to WithinRange(), instead of passing the Room object itself as a parameter.

Once WithinRange() accepts the Room object, any other Room value it later needs is already available, with no extra parameters to add.
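A small sketch of the refactored call is shown below. The article only names Room and WithinRange(); the temperature fields and the `HeatingPlan` class that owns `within_range` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Room:
    # Assumed properties; the article does not say what Room holds.
    lowest_temperature: float
    highest_temperature: float


@dataclass
class HeatingPlan:
    """Hypothetical owner of the within_range() check."""
    minimum: float
    maximum: float

    def within_range(self, room: Room) -> bool:
        # The method reads what it needs from the whole object, so the
        # caller no longer copies two values into temporary variables.
        return (
            self.minimum <= room.lowest_temperature
            and room.highest_temperature <= self.maximum
        )


plan = HeatingPlan(minimum=18.0, maximum=24.0)
print(plan.within_range(Room(19.5, 23.0)))
print(plan.within_range(Room(16.0, 23.0)))
```

The trade-off is that `within_range` now depends on the `Room` type, which is usually acceptable when the two classes already belong to the same part of the code.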

Martin Fowler says:

“Whenever two or three values are gathered together, turn them into an object.”

In conclusion, developers judge and measure code smells differently, but keeping complexity under control is part of our job. Code smells can appear in any code we write; experience is what teaches us to prevent them.
