Unused code detector — How to make your PHP code cleaner

Andrew Pogulailo
6 min read · Apr 30, 2023


Photo by Manolo Chrétien

Hi, folks! I want to share a quote to begin this intriguing article.

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”

― Martin Fowler, author of Refactoring

Unfortunately, our world is not ideal, so you will often encounter less-than-ideal, overcomplicated code. That alone is not so bad, but if nobody takes it seriously and tries to write concise code, it will eventually lead to unused code. Imagine a very hot project, a vast code base, and a shortage of people who know how a specific part of the code works. If you need to change that code’s logic and refactor it significantly, it can be a nightmare.

Unused code is a widespread problem both for new projects that develop and change rapidly and for old projects that have existed for a long time and have already gone through many changes. Usually, such code appears because there was no time to understand the existing code well, or simply because of a lack of deep knowledge of the domain. I think you will agree that it happens very often.

But today, we do not aim to learn to write concise code. Instead, we aim to learn how to find and remove unused code. So make yourself a coffee, and let’s figure out how I got rid of such code.

Chapter 1: The Quest for a Solution

Here is the code we will work with today.

It looks horrible, but this code works and makes money for the company, so we can’t just remove it and forget. So we’ll have to figure out how this method works and refactor it. And keep in mind that there is a lot of such code in the project.

This method can contain a lot of unused code. We don’t want it to remain after refactoring. Therefore, we will have to find it before starting work.

The first thing I did was look for a ready-made solution. I found a package called PHP Dead Code Detector, or PHPDCD, and at first I was happy that something already existed to help me with the refactoring. But then I read the description and realized it is a static code analyzer. The problem with static analyzers is that they help only in straightforward cases that do not depend on external circumstances, such as data that arrives at runtime. In addition, almost every IDE now has one built in. Therefore, we need a different solution.
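To illustrate the kind of case a static analyzer struggles with, here is a minimal sketch. The class and function names are hypothetical and are not taken from the project discussed in this article. The method to call is chosen at runtime from external input, so the source code alone cannot tell whether `exportXml()` is ever used.

```php
<?php
declare(strict_types=1);

// Hypothetical class, for illustration only.
final class ReportExporter
{
    public function exportCsv(): string
    {
        return 'csv';
    }

    public function exportXml(): string
    {
        return 'xml';
    }
}

// The method name is assembled at runtime from external input
// (a request parameter, a database row, a message from another service).
function export(ReportExporter $exporter, string $format): string
{
    $method = 'export' . ucfirst($format);

    if (!method_exists($exporter, $method)) {
        throw new InvalidArgumentException("Unsupported format: {$format}");
    }

    return $exporter->{$method}();
}

echo export(new ReportExporter(), 'csv'), PHP_EOL;
```

Whether `exportXml()` is dead code depends entirely on which values of `$format` reach the application in production, and that can only be observed while it runs.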

I started analyzing the code and noticed that some values could be of both string and array types. I ran some tests and saw that we currently send only string values to the method. I asked my colleagues whether they knew anything about it, but it turned out I was the lucky one who got to be a pioneer. So I needed to learn about every way this method is used, especially the ones that depend on other services.

I did not find a better solution than collecting a code coverage report, exactly as you would when running tests, but while the application is running.

Chapter 2: The Search for Code Coverage

Having data about the unused code would be great. Moreover, such data would be indispensable for keeping the code clean.

Unfortunately, PHP does not offer this out of the box. So first, we must figure out what we can use to collect such data. The most widespread and popular choice is the Xdebug extension, which has a code coverage feature. This solution has some problems, which we will discuss in the next chapter of our journey. However, there is also a less popular extension that specializes in code coverage: PCOV.
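As a rough idea of what collecting coverage with PCOV looks like, here is a minimal sketch. It uses the functions the extension exposes in the `pcov` namespace (`start`, `stop`, `collect`, `clear`); the `runWithCoverage()` helper is a hypothetical name introduced for this example and is not part of any package. The sketch falls back to an empty report when the extension is not loaded.

```php
<?php
declare(strict_types=1);

/**
 * Runs $work and returns [result, coverage].
 * Coverage is PCOV's raw array (file => [line => marker]),
 * or an empty array when the pcov extension is not loaded.
 */
function runWithCoverage(callable $work): array
{
    $enabled = extension_loaded('pcov');

    if ($enabled) {
        \pcov\clear(); // drop data left over from a previous run
        \pcov\start();
    }

    try {
        $result = $work();
    } finally {
        if ($enabled) {
            \pcov\stop();
        }
    }

    $coverage = $enabled ? \pcov\collect() : [];

    return [$result, $coverage];
}

[$result, $coverage] = runWithCoverage(fn (): int => array_sum([1, 2, 3]));
echo $result, ' / files in report: ', count($coverage), PHP_EOL;
```

In a web application, the same start and stop calls would wrap the handling of a request rather than a single closure.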

We figured out how to get code coverage data, but another problem appears: where do we store it? It is much easier to handle coverage data during a test run than during the actual operation of the application. Tests run locally for a short time, so we don’t need to think hard about where the data goes. We cannot do the same while the application runs, because there can be many instances working at the same time. Since we need a stateless solution, we will store our coverage data in a database.

So I wrote a package called Unue that makes it very easy to get code coverage data. You only need to choose the driver to cover the code and the transport you will use to transfer data for processing and storage.

As you can see, I used the RabbitMQ transport. But I said that we would store code coverage data in the database. So why did I add another level of complexity to our solution?

Chapter 3: The Race for Execution Speed

I have a lot to tell you here, because application speed is a very controversial topic. The faster, the better. But what are you willing to sacrifice for speed? So let’s look at what I did to keep the application’s execution time as close to the original as possible.

I’m using PCOV as my code coverage driver because it is fast. Xdebug is primarily a debugging tool and is very slow by comparison, so it is not the best choice for collecting code coverage while the application is serving users.

Also, we can’t stop the application and make the user wait more by saving code coverage data. The execution of a request to the database takes time. And what if we also want to process the data before that? Horror. So we must split our implementation into “hot” and “cold.” That’s why we send data to RabbitMQ.

We cannot perform any complex logic in the “hot” part of the application, because that would delay the user. Its only job is to get the code coverage data quickly and send it to the “cold” part. In the “cold” part we can do anything, because it works asynchronously. That is where we process, store, and display our data.
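The split described above can be sketched roughly as follows. Everything here is hypothetical and written only for illustration: the `CoverageTransport` interface, the in-memory transport that stands in for a message broker, and both function names. It is not the API of the package mentioned in this article. The sketch also assumes the PCOV convention that a positive marker means the line was executed and `-1` means it was not.

```php
<?php
declare(strict_types=1);

// Hypothetical interface: stands in for whatever transport is used (e.g. a queue).
interface CoverageTransport
{
    public function send(string $payload): void;
}

// In-memory stand-in so the sketch runs without a broker.
final class InMemoryTransport implements CoverageTransport
{
    /** @var string[] */
    public array $queue = [];

    public function send(string $payload): void
    {
        $this->queue[] = $payload;
    }
}

// "Hot" part: no processing, just serialize the raw report and hand it off.
function publishCoverage(CoverageTransport $transport, array $coverage): void
{
    $transport->send(json_encode($coverage, JSON_THROW_ON_ERROR));
}

// "Cold" part: runs in an asynchronous consumer and merges raw reports into
// file => line => true for every line executed at least once.
function mergeExecutedLines(array $payloads): array
{
    $executed = [];

    foreach ($payloads as $payload) {
        $report = json_decode($payload, true, 512, JSON_THROW_ON_ERROR);

        foreach ($report as $file => $lines) {
            foreach ($lines as $line => $marker) {
                if ($marker > 0) { // assumption: positive means executed
                    $executed[$file][(int) $line] = true;
                }
            }
        }
    }

    return $executed;
}

$transport = new InMemoryTransport();
publishCoverage($transport, ['/app/Order.php' => [10 => 1, 11 => -1]]);
publishCoverage($transport, ['/app/Order.php' => [10 => 1, 12 => 1]]);

$executed = mergeExecutedLines($transport->queue);
print_r(array_keys($executed['/app/Order.php']));
```

The point of the design is that the request path only pays for one serialization and one send, while merging, storage, and reporting happen later in the consumer.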

And finally, the most exciting thing in this chapter: I measured the tool’s impact on the application’s speed in a real example. This is not a synthetic test but an actual application that queries the database and calls other services. Your results may vary, because they depend heavily on the project.

As you can see, the execution time increased by 38%. That is a significant difference, so I recommend using this tool not on the project as a whole but only on the parts of the code that interest you. Then the difference in execution time will be much smaller. My goal was to show you an interesting solution to a problem.
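One way to narrow the scope is to keep only the files you are investigating before anything is sent for processing. The sketch below is an assumption about how that could look, not the tool’s actual configuration: `filterCoverage()` and the example paths are hypothetical, and it requires PHP 8 for `str_starts_with()`. Starting and stopping the coverage driver only around the code under investigation is another option.

```php
<?php
declare(strict_types=1);

/**
 * Keeps only the report entries whose file path starts with one of $prefixes.
 *
 * @param array<string, array<int, int>> $coverage raw report: file => [line => marker]
 * @param string[]                       $prefixes path prefixes of interest
 */
function filterCoverage(array $coverage, array $prefixes): array
{
    return array_filter(
        $coverage,
        static function (string $file) use ($prefixes): bool {
            foreach ($prefixes as $prefix) {
                if (str_starts_with($file, $prefix)) {
                    return true;
                }
            }

            return false;
        },
        ARRAY_FILTER_USE_KEY
    );
}

$report = [
    '/app/Billing/Invoice.php' => [5 => 1, 6 => -1],
    '/app/vendor/lib/Helper.php' => [3 => 1],
];

print_r(array_keys(filterCoverage($report, ['/app/Billing/'])));
```

Filtering like this reduces the amount of data shipped and stored; how much of the runtime overhead it removes depends on the project and should be measured.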

Chapter 4: The Outcome

And look what beauty we got.

As a result, after about a week of running the tool and more than 2 thousand executions of the method, we can see the following picture. We no longer pass data as arrays, so we can safely remove the parts of the code that handle them. Keep in mind that this tool is needed only when no one can say precisely what happens in a specific part of the code.

In this part, I talked about how the “hot” part of the tool works. If I see that you like this article, I plan to write another story where I will talk about the “cold” part and the problems I encountered with data processing and storage.

There are still many exciting things to tell you.

Andrew Pogulailo

I specialize in developing high-load systems. I'm passionate about using my skills to create innovative web solutions that drive business success.