Deleting unused code with Ruby-on-Rails and Git

Camille Drapier
Wantedly Engineering
8 min read · Sep 16, 2020

One of my recent projects was to try to identify unused pieces of code and remove them safely. This can be quite difficult in Ruby, as there is rarely a way to be certain what calls what.

One way to identify such code is to record all calls on your production server over a relatively long period, and then see what has not been called. For this purpose, we developed a custom solution on top of the oneshot_coverage gem; but after taking a step back, I think I would recommend the coverband gem instead (it has a polished UI, is maintained, is based on standard tools such as Redis, etc.).

Assuming you now have a list of suspiciously unused methods, you can either be satisfied with that, delete them at once, and move on (you probably should not); or try to dig a bit more. Digging is the topic I would like to cover in this post, and by reading it, I hope you will learn more about this process and the tools I used to follow it.

If you have numerous tests and excellent test coverage, this whole process might seem excessive to you, as you would simply say that if the tests continue passing after deleting the code, then everything is good. You would be right, but things are not always that simple!

I will try to use a real example from our application to illustrate my point. Here, I found a method called related_posts in our PostArticle model that seemed to have been unused for a while on our production platform.

The method is loaded by the code loader (green), but the code inside the method is never executed, which means no call for this method was recorded.
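
To make the loaded-versus-executed distinction concrete, here is a minimal sketch using Ruby’s built-in Coverage module in its oneshot_lines mode (available since Ruby 2.6). It is not our production setup: the PostArticle class below is a stand-in written to a temporary file, its method bodies are made up, and the executed_lines helper is hypothetical.

```ruby
require "coverage"
require "tmpdir"

# Stand-in for a model file; the class and method bodies are hypothetical.
MODEL_SOURCE = <<~RUBY
  class PostArticle
    def title
      "title".upcase
    end

    def related_posts
      [title].reverse
    end
  end
RUBY

# Loads the source with oneshot line coverage enabled, runs the given block
# (our pretend "production traffic"), and returns the executed line numbers.
def executed_lines(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "post_article.rb")
    File.write(path, source)

    Coverage.start(oneshot_lines: true)
    load path
    yield
    # Look the file up by basename so a symlinked temp directory does not matter.
    _file, data = Coverage.result.find { |file, _| File.basename(file) == "post_article.rb" }
    data.fetch(:oneshot_lines).sort
  end
end

# Only `title` is ever called, so `related_posts` stays "unused".
lines = executed_lines(MODEL_SOURCE) { PostArticle.new.title }

puts "def related_posts line executed? #{lines.include?(6)}"
puts "related_posts body executed?     #{lines.include?(7)}"
```

The `def` line runs when the file is loaded, which is why it shows up as covered, while the line inside the method is only recorded if something actually calls the method.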

Current references

The very first thing you should do, before deleting the method blindly, is check whether the method name appears anywhere else in your code or in potential dependents of your code. For a simple project, you can just search for the method’s name in your favourite IDE; you can also use ripgrep in a folder containing your project and its dependents. This first check is fundamental because your method could still be referenced by other code that simply did not run during the recording period, for understandable reasons (very rarely used feature, randomness, code specific to some platform, code not actively used yet, problem in the coverage tool, etc.).

One thing to be aware of at this point is that some meta-programming could be calling your method, so you might want to search for a sub-string of your method name. Usually in these cases, your method’s name follows a pattern shared with other methods declared close by, so you would want to search for the discriminating part of the name. For example, if you have two methods process_carrots and process_potatoes, you might have code somewhere calling .send("process_#{food_item}") inside some sort of iteration over the food items. In that case, you would like to know whether a particular food item is still in use, and thus you would search for that name (i.e. carrots or potatoes respectively).
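
As a sketch of why the full method name can be invisible to a search, here is the carrots and potatoes example spelled out. The Kitchen class, the FOOD_ITEMS constant, the return values, and the matching_lines helper are invented for illustration; only the process_* method names and the send call come from the paragraph above.

```ruby
# Hypothetical class: only the process_* names and the send call mirror the text.
class Kitchen
  FOOD_ITEMS = %w[carrots potatoes].freeze

  def process_carrots
    "carrots peeled"
  end

  def process_potatoes
    "potatoes mashed"
  end

  # The method names are assembled at runtime, so neither "process_carrots"
  # nor "process_potatoes" appears literally in the calling code.
  def process_all
    FOOD_ITEMS.map { |food_item| send("process_#{food_item}") }
  end
end

# The calling code as plain text (single-quoted heredoc: no interpolation),
# so we can imitate what a text search would see.
CALLING_CODE = <<~'RUBY'
  FOOD_ITEMS = %w[carrots potatoes].freeze
  FOOD_ITEMS.map { |food_item| send("process_#{food_item}") }
RUBY

# Returns the lines of `source` that contain `needle`, like a basic text search.
def matching_lines(source, needle)
  source.lines.map(&:chomp).select { |line| line.include?(needle) }
end

p Kitchen.new.process_all
p matching_lines(CALLING_CODE, "process_carrots") # search for the full method name
p matching_lines(CALLING_CODE, "carrots")         # search for the discriminating part
```

Searching for the full name finds nothing in the calling code even though both methods are called, whereas searching for the food item leads you to the list that drives the dynamic call.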

In my case, there are many matches, so I need to check each of them manually to see if they are related. Most of them are not, but I noticed another model (that inherits from the same base model) that has a method with the same name. I checked with our coverage tool, and as suspected it is also apparently unused.

Checking references with VSCode

Note that in this case, checking dependent projects is not necessary as our models are not used outside of our project. For the sake of completeness, here is the command, assuming you are in a folder containing all your relevant projects.

rg "related_posts"

The method’s origin

The second thing you should ask yourself is when the method was implemented, and whether there are any hints about how it was used at the time. This will point you to other code you might want to delete (code that is linked to the feature, but that the production code coverage tools cannot detect because it is outside of their scope, for example: views, styles, JS, scripts, configurations, etc.). It might also give you more information on how the code is called and ultimately make you confident that you are deleting something that is indeed unused.

To do this, I like using the GitLens extension for VSCode, which gives me a link to the commit on GitHub, but you can also do a normal git blame on your file. If the methods you are looking at have been modified since they were introduced, you might need to chain git blames to find their original commit (you could also use git log at this point).

Using GitLens, hover over the grayed text at the right of the code to trigger a toolbox with more options, then click on the top-right pointing arrow to open the commit on your git service (here GitHub).
git blame app/models/post_article.rb -L 54,54
Using git blame
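
Chaining blames means re-running git blame starting from the parent of the commit that the previous blame reported (the `^` suffix). The sketch below only builds the argument lists rather than running git; blame_command is a hypothetical helper and "abc1234" is a placeholder hash, not a commit from our repository. Line numbers can shift between revisions, so the -L range usually needs adjusting at each step.

```ruby
# Builds the argv for `git blame` on a single line, optionally starting from
# the parent of a known commit (the "^" suffix) to look further back in time.
# blame_command is a hypothetical helper, not part of any library.
def blame_command(path, line, before_commit: nil)
  command = ["git", "blame", "-L", "#{line},#{line}"]
  command << "#{before_commit}^" if before_commit
  command + ["--", path]
end

first = blame_command("app/models/post_article.rb", 54)
# "abc1234" stands for whatever hash the first blame printed.
second = blame_command("app/models/post_article.rb", 54, before_commit: "abc1234")

puts first.join(" ")
puts second.join(" ")
# To actually run one of them inside a repository: system(*first)
```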

If I have a "commit hash" from running a command-line git command, I usually just copy it, insert it in a GitHub URL to show the commit, and from there retrieve the PR, but there might be better approaches depending on the tools you use.

Using GitHub to display a commit and retrieve the related Pull Request

In this case, the method was not originally introduced by this commit but was transferred from the parent model to the child models with a slightly different implementation. At this point there are two things I need to do:

  • Check the original branch where the code was introduced, and see how it was called.
  • Check the branch where the method was split between models and see if the calling code was also changed.

In this case, the code was originally used in only one place, and this calling code was changed in the “split branch” at the same time as the related_posts method itself.*

One additional thing that should be checked is whether there could be some additional resources (JS, CSS, etc.) to delete: resources that were introduced at the same time as these prior modifications and that might also not be used anymore. In the present case, there were none I could think of while reading the original code.

Tracking the calling code

Now that we know how the code was introduced, I would like to know more about how the calling code was changed for the related_posts method which seems not to be called anymore. Was it changed to meta-programming at some point? Was the code just removed and the underlying method forgotten? Is this code still called, but somehow I missed it with my checks so far?

By checking the “splitting branch” code changes, I know that the method I want to delete was called in a file called _post_footer.html.haml. Unfortunately, this file does not exist any more. To know when the code was removed from that file, I use the following command (note that this also works if the file still exists):

git log -S "related_posts" -- app/views/post_articles/_post_footer.html.haml
Use of “git log -S”
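
For intuition about what -S reports: git documents it as looking for changes in the number of occurrences of the given string, so a commit that only edits around the call is skipped, while the commit that adds or removes the call is reported. The following is a toy imitation of that rule over a made-up history, not how git implements it; the hashes, the file contents, and the pickaxe helper are invented for illustration.

```ruby
# A made-up history of one file, oldest first: [short hash, file content].
# A nil content means the file does not exist at that commit.
HISTORY = [
  ["a1", "%h1= post.title\n"],
  ["b2", "%h1= post.title\n= render post.related_posts\n"],
  ["c3", "%h2= post.title\n= render post.related_posts\n"],
  ["d4", "%h2= post.title\n"],
  ["e5", nil]
].freeze

# Counts occurrences of needle, treating a missing file as empty.
def occurrences(content, needle)
  content.to_s.scan(needle).size
end

# Returns the hashes of commits where the occurrence count differs from the
# previous commit, imitating the rule `git log -S` applies.
def pickaxe(history, needle)
  previous = 0
  history.each_with_object([]) do |(sha, content), hits|
    current = occurrences(content, needle)
    hits << sha if current != previous
    previous = current
  end
end

p pickaxe(HISTORY, "related_posts")
```

Here the commits that add and remove the call are reported, while the one that only changes the heading is not. Note that git log lists the newest commit first, so the removal is the first result you see.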

I have seen this command fail a few times for deleted files. In that case, I fall back to a broader command that shows every commit that touched the file, with its diff, starting from the most recent. It is then usually quite fast to find the commit that removed the call to the code you want to delete.

git log -p -- app/views/post_articles/_post_footer.html.haml

By looking at the branch where the commit 995ed4d7 happened (same process as when we got the commit hash before), I can see that the related feature was in fact refactored: the implementation was changed and the original method was simply not deleted.

In this particular case, I found out that another call to the original method was deleted at the same time in that last “refactoring branch”, so I get a bit suspicious (but not too much!), quickly check when that call was introduced, and make sure no other calls were left unchecked. Finally, I am quite confident that deleting this code is now safe*.

To sum up, in this archaeological journey we understood:

  • That the code appears not to be used.
  • For what reasons it was introduced and modified at some point.
  • When it stopped being called, and what replaced it.

So we can create a “delete commit” for this code, create a PR where we sum up our findings, and have/display the confidence* that it is a safe/legitimate change.

*NB: This is by no means a way to be absolutely certain that your code is not called somewhere by some weird meta-programming. It is always possible that such meta-programming code was introduced between the creation/use of the method and its apparent “un-call”; I do not believe there is a way to be absolutely sure about this. Even if you git log the whole repository with a discriminating part of the method name, you could still miss something that you are about to break. In the end, this is a compromise: with the help of automated testing, reviews, production code coverage, etc., you can be fairly confident the deletion is safe.

I hope you have learned a bit more about the Ruby and git tools you can use to delete your old code with confidence. Some might argue that this is too complex and time-consuming just to delete some old unused code, and they might be right, but establishing this process can save your company some time: during the review process for example, if you show that you investigated the matter thoroughly, the reviewer can check it easily without spending too much of their own time. Also, being able to detect additional code/resources that can be deleted can save you time in the long run.

As for me, during the course of my code-deletion project, I have successfully deleted thousands of lines of Ruby (and thousands more of JS, CSS, translations, configuration) by following this method without breaking anything (so far!), so I wanted to share this method as I haven’t seen much literature on the subject. 🤗
