How to Dig Your Way out of Tech Debt

A framework to understand and prioritize debt reduction work in 4 steps

Published in The Startup · 9 min read · Nov 8, 2019

Every high-growth SaaS company has technical debt, and the successful ones eventually have to take deliberate steps to deal with it. Tech debt often comes from early choices to go faster at the expense of better designs. This is especially common at later-stage startups because early on, when time and capital were scarce, they often had to pivot multiple times to find product-market fit. Whatever the origin, tech debt affects most software companies, and as tempting as it is to put it off, neglecting it for too long can be perilous.

Technical debt usually manifests itself as poor product quality, a sluggish user experience, slow development of new features, weak security, and poor scalability in one or more elements of the software stack. There are also hidden costs in the form of low morale in engineering, PM, customer support, customer success, and other teams. It can even make it more difficult to recruit the best talent because good engineers often don’t want to work on debt-laden systems. All this tends to creep up and worsen slowly over several years rather than arriving as a sudden jolt, so it can be easy to miss.

Once the symptoms become hard to ignore, the debates inevitably begin about how much engineering capacity should be dedicated to paying down the debt versus building new products and features. Sometimes these discussions can get quite heated! Most often they take place between product management, who feel the pressure to keep the product competitive in the market, and engineering, who feel the pain of unreliable and complex systems that are difficult to maintain.

Over the years, I’ve worked with many excellent engineers and architects with whom I’ve “enjoyed” this debate. Although there were probably times when we were close to throwing things, ultimately we successfully partnered together to work through it, and (to my knowledge) I have managed to stay friends with all of them. The key to solving the problem — like most hard problems in life, it turns out — was communication.

What is technical debt? Is it necessarily a bad thing to incur tech debt?

A software organization incurs technical debt when a team chooses an expedient path to building software over a more expensive but higher quality one. Ideally, a team makes this decision together with eyes wide open about the long-term versus short-term trade-offs, but all too often it is not fully understood until after the fact when quality problems emerge or development slows down.

The reason technical debt is such an apt concept is that, just like financial debt, it can be a useful tool to help you achieve goals more quickly than simply “paying as you go.” For example, when looking for product-market fit, you often need to make numerous pivots, and moving quickly means everything. Investing too much time and too many resources in something that ultimately doesn’t meet a market need is inefficient.

But eventually, you have to pay down your tech debt. And if it gets too large it can act like a ball-and-chain that stops you in your tracks at very inconvenient times, like when you have finally achieved product-market fit and are trying to scale, or right before a highly anticipated liquidity event. Like financial debt, the key is to monitor and manage it proactively.

What makes the technical debt debate so difficult?

Fundamentally, paying down debt today takes away from feature development today. The argument for doing it anyway is that it will eventually speed up roadmap delivery and put you ahead in the long run.

But how long is the long run? Is it so long that the market will pass you by? Or so long that you’ll lose customers to competitors who will beat you to launching that killer feature everyone has been asking for? On the other hand, if you don’t pay down the debt, might you end up getting passed by anyway because your ability to deliver even the table-stakes features becomes so impeded? Or will your user experience, speed, or quality frustrate your customers so much that they churn?

The debate is made even more challenging because both sides have imperfect information about the opportunity costs. Engineers often struggle to articulate how much debt there is and how long you can go without addressing it, and they don’t always agree on the best way to pay it down. Meanwhile, PMs don’t have a crystal ball to predict exactly what will happen in the market and when, or how much patience current customers have left before they churn. The uncertainties on both sides make it very challenging to reach an agreement on how to invest the company’s limited resources.

Finally, unlike financial debt, which is more or less fungible, technical debt is complex and has many forms and many different remedies. Teams usually lack a framework and common language to help organize their discussions and decision making, which can lead to mistrust that each side is myopically arguing for their case. At its worst, you get roadmap gridlock — a breakdown in the organization’s ability to agree on any path forward.

A Framework for Understanding & Prioritizing Debt Reduction

Following many hours of debate, we finally developed a framework to help our teams to get out of gridlock, communicate more clearly with each other and our executive stakeholders, and help make sure we were neither under- nor over-investing in paying down our technical debt.

The framework describes five primary types of engineering investments and how they act to increase product quality, engineering output, and ultimately customer satisfaction. Think of it as a translator between the basic business desire for more product of better quality and the kinds of non-feature investments that are needed in engineering to get out of debt.

Prioritizing Debt Reduction Projects in 4 Steps

Step 1: Classify the candidate projects according to the categories shown in the blue circles. Typically these candidate projects will come from the engineering team itself, since they are best positioned to know where and in what form the debt exists.
Step 2: Make sure everyone understands how each candidate project acts, through the mechanisms described in the grey circles, to improve the business outcomes shown in the green circles. This is important for at least two reasons: a) it adds credibility with business stakeholders, who often don’t understand the candidate projects or how they really benefit customers, and b) deconstructing the source of the benefit helps differentiate projects, which otherwise can be difficult to prioritize because they all result in the same generic benefits (i.e., “higher engineering output” or “higher quality”).
Step 3: Use as much data as you have, combined with team judgement, to rank each project according to its expected impact on the business outcomes. Note that it can be very difficult to quantify exactly how much a given project will improve your chosen KPIs (e.g., Net Promoter Score, bug counts, etc.). That’s OK, because often it is enough to know a project’s benefit relative to the other candidates. There are many variations on how to do this, but even a simple business impact rating on a scale of 1–5 can be effective for ranking project benefit on a relative scale.
Step 4: Estimate the cost of each project you are considering and compare that to the benefit you expect. Again there are many variations on how to do this, but the key is to find the projects with the highest ratio of benefit to cost. For example, you could divide the score from Step 3 by your cost (whether you measure cost in story points, person-days, or sprints), and rank according to this “ROI” score. Alternatively, you could set up a two-by-two as below, which helps prioritize projects at a more conceptual level and is useful when your data does not afford a high degree of precision.
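To make Steps 3 and 4 concrete, here is a minimal Python sketch of the “ROI” ranking and a two-by-two bucketing. Everything in it is an illustrative assumption rather than part of the framework itself: the project names, the impact scores, the person-day estimates, and the cutoffs that split “high” from “low” are all hypothetical, and your team would substitute its own.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    category: str      # one of the five investment types
    impact: int        # Step 3: relative business impact, 1 (low) to 5 (high)
    cost_days: float   # Step 4: estimated cost, here in person-days

    def __post_init__(self):
        if not 1 <= self.impact <= 5:
            raise ValueError("impact must be between 1 and 5")
        if self.cost_days <= 0:
            raise ValueError("cost_days must be positive")

    @property
    def roi(self) -> float:
        # Benefit-to-cost ratio: impact score divided by estimated cost.
        return self.impact / self.cost_days


def rank_by_roi(candidates):
    """Return candidates sorted from highest to lowest ROI score."""
    return sorted(candidates, key=lambda c: c.roi, reverse=True)


def quadrant(c, impact_cutoff=3, cost_cutoff=20):
    """Place a candidate in a two-by-two; both cutoffs are assumed values."""
    impact = "high" if c.impact >= impact_cutoff else "low"
    cost = "low" if c.cost_days <= cost_cutoff else "high"
    return f"{impact} impact / {cost} cost"


# Hypothetical candidate projects, for illustration only.
candidates = [
    Candidate("Automate regression suite", "Automated Test Coverage", 5, 40),
    Candidate("Replace home-grown queue", "Common Technologies", 3, 10),
    Candidate("Retire legacy export feature", "Code Reduction & Refactoring", 2, 5),
    Candidate("Release automation", "Engineering Operations", 4, 20),
]

for c in rank_by_roi(candidates):
    print(f"{c.name}: ROI={c.roi:.3f}, {quadrant(c)}")
```

Note that a pure ROI ranking can favor small, cheap projects over large, high-impact ones (here the regression suite has the highest impact but the lowest ratio), which is one reason to look at the two-by-two alongside the score.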

Primary Types of Debt Reduction Investments

Code Reduction & Refactoring: Investments in eliminating unneeded end-user features and in rewriting existing code to be simpler and more efficient, for example rewriting libraries or queries, or redesigning databases. Ideally, it will also mean turning integrated functions into formally separated common services, as found in a service-oriented architecture (SOA).
Common Technologies: Investments in reusable and, where possible, publicly available technology to replace proprietary or single-purpose libraries. As technology evolves, what once required a home-grown solution may now be publicly available. These investments can also result in the creation of formal common services under an SOA approach.
Automated Test Coverage: Investments in increased automated test coverage. Often teams have some portion of their software covered by automation, but not all. Increasing the coverage will decrease the time required to perform a full regression test and decrease the chance of defects being released.
Engineering Operations: Investments in continuous integration, release automation, and automated monitoring, as well as team-oriented investments in communication, documentation, and collaboration (sometimes collectively referred to as DevOps).
Continuous Improvement: Investments in the reduction of an existing backlog of defects alongside the other more preventive strategies mentioned above.

Measuring Business Benefits

As a best practice, you should be measuring the business benefits of your investments with clearly defined key performance indicators (KPIs). But it is also important to recognize that the effect of any single investment on the engineering side may be difficult to see in the numbers. Often a sustained debt reduction effort is required for users to really take notice.

Higher Platform Quality: There are many well-documented measures of quality, but two of the common ones are defect counts by severity (with new defects tracked separately from regressions) and the responsiveness of the end-user interface.
Increased Customer Satisfaction: This comes from both higher platform quality and engineering’s ability to produce more new product capabilities. Common KPIs for customer satisfaction include Net Promoter Score, Customer Satisfaction Score, Customer Effort Score, and Retention/Churn Rates.
Decreased Support Costs: This comes from a higher quality platform. KPIs should be normalized to reflect customer volume, such as tickets per customer per day, perhaps grouped by severity. (With support costs, avoid KPIs that look better simply because support is underfunded, such as “total investment per customer,” which can be deceiving.)
Increased Engineering Output: Measuring software engineering productivity is notoriously difficult, if not impossible, and I do not attempt it in this article (just google “how to measure software productivity” and browse for a while to see what I mean!). The fundamental reason is that while the input is clear (hours of engineering time), there are no common units of output that you can measure and compare over time or across teams, because no two features or products are the same. But we all know intuitively that productivity is not uniform over time or across teams. It turns out you can often more easily measure how much waste there is, such as how long it takes to run a full regression test or how long it takes to deploy code to production. So setting KPIs around these things and aiming to reduce waste is a pretty good proxy for increasing engineering productivity, if you are willing to believe the team will not simply take long lunches with the time savings!
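As an illustration of the normalization suggested under Decreased Support Costs, the sketch below computes tickets per customer per day, grouped by severity, for two periods. The function name, the severity labels, and all of the numbers are hypothetical; the point is only that a raw ticket count can rise while the normalized KPI falls as the customer base grows.

```python
from collections import Counter


def tickets_per_customer_per_day(ticket_severities, customers, days):
    """Normalize ticket counts by customer volume and period length.

    ticket_severities: one severity label per ticket opened in the period.
    Returns a dict mapping each severity to tickets per customer per day.
    """
    if customers <= 0 or days <= 0:
        raise ValueError("customers and days must be positive")
    counts = Counter(ticket_severities)
    return {sev: n / (customers * days) for sev, n in counts.items()}


# Hypothetical data: the customer base doubles between the two periods.
period_a = ["high"] * 60 + ["low"] * 240   # 300 tickets, 200 customers
period_b = ["high"] * 50 + ["low"] * 400   # 450 tickets, 400 customers

rates_a = tickets_per_customer_per_day(period_a, customers=200, days=30)
rates_b = tickets_per_customer_per_day(period_b, customers=400, days=30)

for sev in ("high", "low"):
    print(f"{sev}: {rates_a[sev]:.4f} -> {rates_b[sev]:.4f}")
```

In this made-up example the total ticket count grows from 300 to 450, yet both severities improve on a per-customer basis, which is the trend the normalized KPI is meant to surface.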

Remember, better communication between engineering and product management is key to digging your way out of technical debt without completely sacrificing your competitiveness. Using this framework will help you improve communication, whatever your particular tech debt situation may be, and in turn, you will start making better, faster, and more transparent decisions.