Managing TODOs in a codebase

Sakis Kaliakoudas
Babylon Engineering
7 min read · May 7, 2020


Every codebase out there has some “TODO” comments left by engineers. It’s a tool in every software engineer’s arsenal that helps them keep track of small tasks that they need to work on before completing their current task. Quite often it is also used on a more permanent basis, to point out problems or improvements that can be made at some point in the future. While TODOs are useful in some cases, if they are left unmanaged, software entropy will eventually turn them into noise that only clutters the source code. In this article, we will review how we approached them in the Babylon Android project over the last few years, how they spiraled out of control, and the things we tried to get to where we are today, where things seem to be much more manageable.

Why do we use TODOs?

There are several reasons why software engineers use TODO comments:

  • They sit somewhere in the middle between just making a mental note about a small action that needs to be performed in a codebase and creating a ticket in a project management tool. Making a mental note is a recipe for disaster, while creating a ticket in a separate project management tool might be considered too much overhead.
  • Some engineers prefer creating multiple small pull requests as part of a task instead of a single big pull request and so might mark the remaining areas of work for that task in the code with some TODO comments.
  • Engineers might use them because they are aware that more work is coming but the backend isn’t ready yet, the ticket isn’t specified yet, or the time isn’t right, and they are leaving ‘tips’ for the person that will pick this work up.
  • In small personal projects, they can be found acting as a project management tool themselves. Not great, but gets the job done for your weekend pet project.
  • They are quite established in software engineering, with IDE support to highlight them, list them, etc.
  • They live much closer to the code, so they will likely get out of date less easily compared to tickets from a project management tool (but that doesn’t mean TODOs don’t get out of date — in fact, they get out of date all the time!)

Fun fact: The Linux kernel GitHub repo has more than 4,000 TODOs!

The problem with TODOs

Unfortunately, the flexibility of TODOs also comes with a set of problems:

  • They share a similar fate to code comments — they get out of date easily.
  • There’s no clear ownership around them. Someone would have to use git blame, but that is not a reliable indicator of who wrote the TODO, as code refactorings can make someone else appear as the owner.
  • People rarely address them.
  • They tend to only provide enough information to make sense for the person who wrote them.
  • In many cases they are overused, for things that should really live in a project management tool instead.

How we used TODOs in Babylon Health

The in-house Babylon Health Android team was created around spring 2016. When we created the team, there were already 50 or so TODOs from the previous maintainers of the project that we inherited. From that point on and over the next 4 years the in-house team grew from 4 engineers to about 40, and the number of TODOs grew with it, from 50 to about 200. As we saw the number of TODOs growing, we attempted to go through them and clean them up, but it was clear that as the team got bigger we would need to find a better, more decentralized way that could ensure that everyone was taking care of their own TODOs.

The first idea that we had was around putting some rules in place around TODOs:

  • TODOs should only be used when we want to define a relatively small action that needs to exist very close to the code, to provide further context.
  • Bigger pieces of work should only exist in Jira, so that they can be specced out in detail, and planned as part of a delivery team.
  • TODOs should have defined owners, so that it’s easier to get more context about them if required, and it’s easier to transfer them to another individual when someone leaves the business.
  • TODOs should be associated with a due date to make sure that they don’t stay in the codebase forever. We also decided to not allow any due dates more than 6 months into the future.
  • TODOs should have a consistent format. The format that we agreed on looked like this: // TODO <slack username> <due date> <details>, for example // TODO @sakis JUN-2020 remove this hack once backend is done.

We tried to enforce the last 3 points by creating a lint rule that can be found here. This allowed us to have some consistency in our TODOs, and to easily identify those that have been lurking in the codebase for a very long time.
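To make the three enforced rules concrete, here is a minimal sketch of the checks in Python. It is not the actual lint rule (which is an Android lint detector); the regular expression, the allowed characters in a Slack username, and the name check_todo are assumptions made for illustration.

```python
import re
from datetime import date

# Assumed shape of a valid TODO: // TODO @<slack username> <MMM-YYYY> <details>
TODO_PATTERN = re.compile(
    r"//\s*TODO\s+@(?P<owner>[\w.\-]+)\s+"
    r"(?P<month>[A-Z]{3})-(?P<year>\d{4})\s+(?P<details>\S.*)"
)
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")
MAX_MONTHS_AHEAD = 6  # due dates further out than this are rejected


def check_todo(line, today):
    """Return None if the line is fine, otherwise a short error message."""
    if "TODO" not in line:
        return None  # nothing to check on this line
    match = TODO_PATTERN.search(line)
    if match is None:
        return "TODO must look like: // TODO @owner MMM-YYYY details"
    if match["month"] not in MONTHS:
        return "unknown month in due date"
    # Compare dates at month granularity, since the format has no day.
    due = int(match["year"]) * 12 + MONTHS.index(match["month"])
    now = today.year * 12 + (today.month - 1)
    if due - now > MAX_MONTHS_AHEAD:
        return "due date is more than 6 months in the future"
    return None
```

Running a check like this on every changed line would flag TODOs without an owner, without a due date, or with a due date too far in the future.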

As soon as this lint rule was introduced, all the TODOs in the project were converted to adhere to this new format, but unfortunately, we didn’t notice any substantial change to the number of TODOs. About a year later, the number of TODOs in the project had grown to about 200. At that time we decided to give this problem another go.

The idea that we came up with was to create a script that would surface how many TODOs each person owns, and how many of those are past their due dates. Given that all TODOs have the same format, it was quite easy to create a regular expression to capture the information required for this report. The report would then be posted on Slack once a month, highlighting the people that needed to take action.
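The core of such a report can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our actual script: the regular expression, the function names summarize_todos and format_report, and the rule that a TODO counts as expired once its due month is over are all hypothetical, and posting the result to Slack is left out.

```python
import re
from collections import defaultdict
from datetime import date

# Assumed pattern for: // TODO @<slack username> <MMM-YYYY> <details>
TODO_PATTERN = re.compile(
    r"//\s*TODO\s+@(?P<owner>[\w.\-]+)\s+(?P<month>[A-Z]{3})-(?P<year>\d{4})\b"
)
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def summarize_todos(lines, today):
    """Count total and expired TODOs per owner."""
    counts = defaultdict(lambda: {"total": 0, "expired": 0})
    now = today.year * 12 + (today.month - 1)
    for line in lines:
        match = TODO_PATTERN.search(line)
        if match is None or match["month"] not in MONTHS:
            continue  # not a well-formed TODO, skip it
        due = int(match["year"]) * 12 + MONTHS.index(match["month"])
        entry = counts[match["owner"]]
        entry["total"] += 1
        # Assumption: a TODO is expired once its due month has passed.
        if due < now:
            entry["expired"] += 1
    return dict(counts)


def format_report(summary):
    """Render one line per owner, most expired TODOs first."""
    ordered = sorted(summary.items(),
                     key=lambda item: (-item[1]["expired"], item[0]))
    return "\n".join(
        f"@{owner}: {c['total']} TODOs, {c['expired']} past due"
        for owner, c in ordered
    )
```

In practice the lines would come from walking the source tree and reading every file; the per-owner summary is also what the individual follow-up messages would be built from.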

The main TODO message going into the Android team Slack channel once a month. Note that we only highlight the TODOs that are particularly old with 💣 emojis, so that people can focus on those.

At the same time, the script would message each person individually, giving them more details about their expired TODOs and providing them with suggestions on how to fix them:

  • If a TODO is no longer valid then the engineer is asked to create a pull request to remove it.
  • If a TODO is valid and can now be addressed, then the engineer is asked to create a pull request to fix the issue and remove the TODO.
  • If a TODO is valid, but cannot be addressed yet, then the engineer is asked to create a pull request to update the TODO due date to something in the future.

The individual message that is sent privately on Slack to every engineer for each TODO they own that is past its due date.

As soon as this report started getting posted in the Android team Slack channel, people started actioning the expired TODOs straight away. The number of TODOs in the project went down, and within six months only about half of them were left. The aim here is not to go down to zero, but to make sure that everyone is aware of the TODOs under their name and that they are actioned whenever there is a chance.

An interesting insight here is that it’s not enough to have all these rules about TODO ownership and TODO format. The key ingredient for success was for this information to be surfaced to the team, monitored, and reviewed regularly.

Diagram showing the number of TODOs in our project over time. We did a big clean up around the start of 2017, added the strict TODO formatter mid-2018, and added the TODO reporting bot around the end of 2019.

The future

We are very happy with our current setup, but as with anything in software engineering there are always more improvements that we can make:

a) Our current TodoFormatDetector lint rule only scans Java/Kotlin files. We should expand it further to any file in the project, keeping in mind that different files might have a different comment format. For example, XML files use the <!--<message>--> format for comments.
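One way to approach this, sketched here in Python rather than as a lint detector, is to keep the TODO body pattern shared and only vary the comment delimiters per file extension. The extension table, the regular expressions, and the name find_todo are assumptions for illustration.

```python
import re
from pathlib import PurePath

# Shared body: TODO @<slack username> <MMM-YYYY> <details>
TODO_BODY = (r"TODO\s+@(?P<owner>[\w.\-]+)\s+"
             r"(?P<due>[A-Z]{3}-\d{4})\s+(?P<details>.+?)")

# Assumed mapping from file extension to comment syntax.
LINE_COMMENT = re.compile(r"//\s*" + TODO_BODY + r"\s*$")
PATTERNS = {
    ".kt": LINE_COMMENT,
    ".java": LINE_COMMENT,
    ".xml": re.compile(r"<!--\s*" + TODO_BODY + r"\s*-->"),
}


def find_todo(filename, line):
    """Return the TODO fields found on the line, or None."""
    pattern = PATTERNS.get(PurePath(filename).suffix)
    if pattern is None:
        return None  # file type not covered by this sketch
    match = pattern.search(line)
    return match.groupdict() if match else None
```

Supporting another file type then only means adding one more entry to the table.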

b) We are only parsing the first line of multi-line TODO comments.
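A possible approach, sketched in Python under the assumption that a multi-line TODO is a `//` TODO line followed directly by further `//` comment lines, is to join those lines before parsing. The name collect_todo_blocks is hypothetical, and trailing comments on code lines are not handled.

```python
def collect_todo_blocks(lines):
    """Join each TODO comment with the comment lines that directly follow it."""
    blocks = []
    current = None  # parts of the TODO being collected, if any
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("//"):
            text = stripped[2:].strip()
            if text.startswith("TODO"):
                # A new TODO starts; close the previous one first.
                if current is not None:
                    blocks.append(" ".join(current))
                current = [text]
            elif current is not None and text:
                current.append(text)  # continuation of the current TODO
        elif current is not None:
            # A non-comment line ends the current TODO.
            blocks.append(" ".join(current))
            current = None
    if current is not None:
        blocks.append(" ".join(current))
    return blocks
```

Each joined block can then be fed to the same format check and report as a single-line TODO.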

c) To address the lack of context in many TODOs, we might require a Jira ticket to be created for each of them, where more information about a particular change can be provided.

At Babylon, we believe it’s possible to put an accessible and affordable health service in the hands of every person on earth. The technology we use is at the heart and soul of that mission. Babylon is a Healthcare Platform, currently providing online consultations via in-app video and phone calls, AI-assisted Triage and Predictive Health Assistance.

If you are interested in helping us build the future of healthcare, please find our open roles here.

Follow us on Twitter @Babylon_Eng
