What Is Technical Debt? How to Manage (and Reduce) Tech Debt
Technical debt is pretty much any code anyone wrote last week. It’s all around us, binds us, and basically duct-tapes-and-glues the tech world together. Tech Debt is a part of life and, while we all want our code to be perfect and to stand the test of time, you will live with Tech Debt. Some Tech Debt needs to be managed and reduced immediately, and some does not.
Managing Tech Debt
Knowing the categories of technical debt and how to think about them can make or break careers and companies. Not all Tech Debt is created equal and not all of it deserves equal attention. In this blog post, I will share some general categories that tech debt can fall into and some rules of thumb we use at Conductor when managing Tech Debt: what to address, what to ignore and for how long.
While we don’t get everything right all the time, we have been at it for a while now. In that time we have not only slain a few dragons and gotten rid of some bogeymen, we have also delivered on our full product roadmap and kept a sustainable, pride-worthy pace of innovation.
There is a game-changing outcome of addressing the right tech debt at the right time that really gets my creative and competitive juices flowing. Building features and addressing underlying platform/tools/quality needs are absolutely necessary requirements for any technical manager. But there is a lot more to see under the covers here.
As a leader of a tech team I consider it my mission to ensure that the team is a competitive advantage for the organization. A key driver of leapfrogging all competition is to see obstacles that are still miles away. A deep knowledge of the technical stack along with an investment in understanding the business goals of the company, the product needs, and the competitive space puts me in a unique position to see what will block us from getting to where we want to be as a company from the tech side.
Identifying, articulating and lobbying to remove these obstacles in time seriously raises the bar for any new competitors entering the market and leaves the old competition in the dust. The results of these efforts have been clearly visible in the growth rate we have enjoyed, the morale across the organization and, most importantly, the customer satisfaction ratings we have earned.
Without further ado, here are some high-level categories we loosely use. The magic is in recognizing which category of debt you are dealing with and making a decision informed by as much data as you can gather.
Technical debt that kills your business
If, during a renewal conversation, anyone hears, “Your solution was great for when we were small. I would like to use more of it and give you a lot more money. But you just can’t handle the load, so I need to cancel,” you have a serious tech debt problem. Your prize for successfully bringing a solution to the market and growing the market in the process should not be that you get thrown out of the playground by new players on the field. New entrants in the market can learn from your mistakes and build tech that essentially sidesteps all the baggage you’ve been carrying around.
This is the easiest type of debt to prioritize, but usually the hardest to resolve. You probably have something that has been around for years and if it was easy to address you would likely have addressed it already. Resolving this type of tech debt will require you to sacrifice significant portions of your product roadmap. So, while this one might be the easiest to build up, as a tech leader who wants to make money, you want to take steps in advance to avoid this kind of technical debt like the plague.
As you build new features, you should not forget what is wrong with the older features you are carrying around. Pay attention to the things that are holding you down. As an example, for us at Conductor, our old monolithic application was built in Java 6, and the intricate web of dependencies that evolved over time made it a drag for us to upgrade, address scale issues, improve performance, etc. We are explicitly applying the strangler pattern (building new functionality outside the monolith and gradually routing traffic away from it) to all new development, with the ruling mantra of being able to independently build/deploy/test/scale anything new we build. Hard-earned learnings from the past will explicitly be part of our success in the future.
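To make the strangler pattern concrete, here is a minimal sketch of the routing idea. It is not how our system is actually built: the handler functions, the `StranglerFacade` class and the `/reports` path are all hypothetical names chosen for illustration. The facade sends a request to a new service once that service has claimed a path prefix, and everything else still falls through to the monolith.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def legacy_monolith(path: str) -> str:
    """Hypothetical stand-in for the old monolithic application."""
    return f"monolith handled {path}"


def reporting_service(path: str) -> str:
    """Hypothetical stand-in for a new, independently deployable service."""
    return f"reporting-service handled {path}"


class StranglerFacade:
    """Routes to a migrated service when one claims the path, else to the monolith."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        """Register a new service as the owner of every path under `prefix`."""
        self._migrated[prefix] = handler

    def handle(self, path: str) -> str:
        # The longest matching prefix wins, so nested routes can move independently.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](path)
        # Anything not yet migrated is still served by the monolith.
        return self._legacy(path)


facade = StranglerFacade(legacy_monolith)
facade.migrate("/reports", reporting_service)
```

Each call to `migrate` moves one slice of traffic, so the monolith shrinks one route at a time instead of in a single risky rewrite.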
Technical debt that kills your team
Does your team deliver on the roadmap on a good schedule? Are they all tired because they are all literally “working all the time”? Do you have data to figure out why they are working all the time? Is it because the deadlines for new features are unreasonable? Or is it because the time/resource cost of supporting your existing service went up while you weren’t looking?
This is the kind of tech debt that creeps up on you. Maybe one of your clients grew and is using your service a little bit more than they were before. Maybe they tripped a threshold wire. Maybe one new client won’t be so painful, but the next one or the next 10 will be. When this type of technical debt is creeping up on you, a keen manager will observe that the sprint velocity is good, but teams are displaying telltale signs of stress, and being on-call is met with dread.
These are the “soft” signs that this type of debt is at play and needs to be addressed before it graduates to the killing-your-business kind. Here are some not-so-soft signs you can track on an ongoing basis:
- # of incidents that had to be addressed in any given week (this should be a stable number)
- # of repeat incidents (did the same thing happen last week?)
- # of sprints bloated with on-call work items
- Rate of incoming blocking bugs
- Gradual decrease in sprint velocity
To manage technical debt of this nature, it is important to gather hard data around the debt, diligently conduct retrospectives or post mortems after every critical issue in production, identify what went wrong and break it down into bite-size chunks that you can address along with the rest of the sprint work. For some short period of time, you may have slower progress on the feature roadmap, but progress should not stop and the delay should still be within reasonable schedule variance. Addressing this type of tech debt will yield value immediately and will set up future success.
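A few of the signals listed above can be computed from data most teams already have. The sketch below is only an illustration under assumptions: the `Incident` record, its `cause` label (taken from a retrospective) and the function names are hypothetical, and a real setup would pull this from an incident tracker rather than an in-memory list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Incident:
    week: int   # week number in which the incident occurred
    cause: str  # short root-cause label assigned in the retrospective


def incidents_per_week(incidents: List[Incident]) -> Dict[int, int]:
    """Number of incidents addressed in each week."""
    return dict(Counter(i.week for i in incidents))


def repeat_incidents(incidents: List[Incident], week: int) -> int:
    """Incidents in `week` whose cause was also seen in the previous week."""
    previous = {i.cause for i in incidents if i.week == week - 1}
    return sum(1 for i in incidents if i.week == week and i.cause in previous)


def velocity_slope(velocities: List[float]) -> float:
    """Least-squares slope of sprint velocity; a negative value means a decline."""
    n = len(velocities)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(velocities) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(velocities))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den
```

A slope that stays negative over several sprints, combined with a rising repeat-incident count, is the kind of hard data that makes the case for pulling debt work into the sprint.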
Technical debt that will kill your business next year
Some tech debt can be a blocker for growth for your business. Your product may be working fine right now but the future needs of the business may qualify some portions of your solution as debt. This debt is only debt if you foresee it blocking your growth in the future. That future tech debt might be portions of features that were never implemented because no one got around to it for whatever reason. Over time, that not-yet-implemented functionality can become a sinkhole that many things might fall into. Some examples:
- A new feature got added many years ago for the domestic market, but there wasn’t high demand to make it available in the international market, so it never got added there. Over time, this gulf just grew to be so large that it started to affect business. In addition, the cost of addressing it became exponentially larger.
- An ETL pipeline that gets kicked off on a schedule works really well until you change the time zones and address customers who are awake when you are asleep or vice versa. Since the time zones were not something you had to think about before, going back and addressing this now can be costly or may require a new implementation.
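To illustrate the time zone example, here is a small sketch of scheduling a job against each customer's local clock instead of a single server clock. The function name `next_run_utc` and the fixed UTC offsets are assumptions made for illustration. Where the tz database is available, a `zoneinfo.ZoneInfo` object such as `ZoneInfo("Asia/Tokyo")` could be passed instead, which would also account for daylight saving time.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo


def next_run_utc(now_utc: datetime, local_run_time: time, customer_tz: tzinfo) -> datetime:
    """Next UTC instant at which the customer's local clock reads `local_run_time`."""
    local_now = now_utc.astimezone(customer_tz)
    candidate = datetime.combine(local_now.date(), local_run_time, tzinfo=customer_tz)
    if candidate <= local_now:
        # Today's slot has already passed for this customer, so use tomorrow's.
        next_day = local_now.date() + timedelta(days=1)
        candidate = datetime.combine(next_day, local_run_time, tzinfo=customer_tz)
    return candidate.astimezone(timezone.utc)


# Hypothetical customers, using fixed offsets so the example runs anywhere.
tokyo = timezone(timedelta(hours=9))
new_york = timezone(timedelta(hours=-5))
now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

tokyo_run = next_run_utc(now, time(2, 0), tokyo)
new_york_run = next_run_utc(now, time(2, 0), new_york)
```

The same "2 a.m. nightly job" lands at two different UTC instants, which is exactly the detail a single-schedule pipeline never had to consider.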
Not all of this debt needs a technical solution, and it can be a discussion between multiple departments (e.g. can we focus on selling in markets where English is the primary language first?). Do a lot of due diligence and ensure you absolutely have to have a technical solution. If you need it, build it into the roadmap.
Technical debt that keeps you from blasting off
This type of debt is the hardest to make a great business case for. It’s the type of debt that I believe tech leaders struggle with the most, and for good reason. This is the kind of debt that results from thinking, “if we just had a standard platform/api/pattern, we would be able to move so much faster.”
The technologist in all of us wants to build a platform that is sufficient and robust and scales for all the features that are built using it. This is an eternal struggle, and whether to add another layer of robustness to this platform is never a clear-cut decision. It is always partly supported by empirical data and partly by extrapolation. This is the type of debt that is hardest to quantify in dollars and cents because it isn’t really killing your business now, it isn’t killing your team, and it isn’t going to kill your business next year. As a leader, this is what you spend your political capital on.
Start by being your own biggest critic and convincing yourself. It might include calculating the opportunity cost and doing a SWOT analysis. You really have to be sure that the future benefit of addressing this debt now is greater than the future benefit of adding another feature or fixing more bugs.
An example from our practice here: We have one database that holds important configuration information for our customer accounts. This database has been around for over ten years. It has the classic marks of evolution: it has codified the data modeling decisions prevalent at the time they were made, and it reflects the priorities and propensities of the business and the people at that time. We live in a very different world now, and the database is hard to evolve given how far down the dependency tree it is. There are also some early indications that this is not a wave we can ride for the next 10 years. However, the business is either not currently suffering or the suffering of the business can be resolved by throwing unreasonable amounts of money at the problem or poorly utilizing our engineers (and turning this into the kind of debt that kills your team in time).
Those are all good reasons to build a plan and start executing on addressing the issues, but there was something specific that added urgency for me. The lack of flexibility in this configuration will significantly limit our ability to innovate next year in feature areas that are already in early discussions in the Product team, with the industry leaders and with our expert users. As a software company, curtailing innovation that is clearly needed next year by not addressing this now, when we can deal with it, is a fatal flaw.
This is the category of tech debt that has the least amount of objective data and requires a gut call. This is where you choose the blue pill or the red pill.
Managing Other Technical Debt
Everything else shouldn’t affect your roadmap commitments. That doesn’t mean you shouldn’t pick up and address small/medium things as they arise, but doing so shouldn’t affect your committed deliverables and product roadmap. If the debt is important enough, it will reveal itself in time. There is no magic formula to apply here. You have to let time be your guide, analyze all new pieces of information and apply intuition.
Be critical, be your own devil’s advocate, and get comfortable living with some debt: it will always be with you in some form. Every once in a while you will get a surprise present from an engineer because they just couldn’t take it anymore and fixed something that you didn’t ask for but that is the right thing to do. I don’t take those for granted. I say a thank you for those moments at Thanksgiving and use them as my happy place to escape to when someone at Amazon runs the wrong script, cutting off access to S3 and blacking out our whole service the day before our biggest conference day of the year.
Building software is an interesting and satisfying mix of art and science. Deciding which tech debt to address and which to ignore tickles the same parts of my brain. The above rules of thumb have been a huge contributing factor in building happy, productive, profitable tech teams for the past few decades.