Published in Geek Culture

The Tech Debt Playbook


The biggest challenges to addressing tech debt will be cultural, not technical.

Rally leadership around tech debt

The system has grown too complex and haphazard to be manageable

  1. Acknowledge that tech debt is serious and real. We will need to invest considerable resources to make our systems more manageable. These tasks must be prioritized alongside product features, and sometimes we will choose tech debt over feature work.
  2. We want work to be deliberately planned. Last-minute scrambles do happen, but we want to avoid operating that way. As part of this, sales teams and the CEO will support a process for planning work and let go of directly controlling day-to-day engineering assignments.

Establish best engineering practices

  • All engineering work should be recorded in an issue-tracking system like Jira. Other departments should use this system for requesting engineering work. Engineers working on even modest tasks should track that work so there is visibility.
  • Tasks are prioritized by the engineering team. Usually the product manager and the engineering manager define the relative priority and sequence of detailed tasks, consulting with stakeholders regularly about the higher-level priorities and major initiatives. It is bad when engineers “go rogue” and choose to work on projects that aren’t a team priority. And it’s even worse when someone outside the team “swoops in” to insist that a particular task be done without working with the team to prioritize the work in context and ensure it’s well thought through.
  • All code changes require a peer code review.
  • Commit messages and pull request descriptions should be thoughtfully written to give context and describe the change. A code reviewer should reject descriptions that are cryptic or only one line long.
  • All pull requests must include automated tests, or the comments should address why tests didn’t make sense.
  • Beyond automated tests, developers are responsible for doing a QA pass to double-check a feature end-to-end.
  • Projects have a definition of “done.” The developer is responsible not just for the code but also for any data migrations, rollout, and so on.
  • Developers should only work on one task at a time. Limiting work in progress helps teams maintain focus and deliver work more reliably.
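The pull request rules above (no cryptic or one-line descriptions, and tests or an explanation of why there are none) can be partly automated. The following is a minimal sketch, not a definitive tool: the `PullRequest` type, the word and line thresholds, the `tests/` and `test_` naming convention, and the "no tests" phrase are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumed thresholds; tune them to your team's taste.
MIN_DESCRIPTION_LINES = 2   # fewer non-empty lines counts as a one-liner
MIN_DESCRIPTION_WORDS = 15  # fewer words counts as cryptic


@dataclass
class PullRequest:
    """Hypothetical stand-in for whatever your code host exposes."""
    description: str
    changed_files: list[str] = field(default_factory=list)


def is_test_file(path: str) -> bool:
    # Assumption: tests live in a "tests" directory or are named test_*.
    parts = path.split("/")
    return "tests" in parts[:-1] or parts[-1].startswith("test_")


def review_problems(pr: PullRequest) -> list[str]:
    """Return the reasons a reviewer might push back; empty means none found."""
    problems = []
    lines = [ln for ln in pr.description.splitlines() if ln.strip()]
    words = pr.description.split()
    if len(lines) < MIN_DESCRIPTION_LINES or len(words) < MIN_DESCRIPTION_WORDS:
        problems.append("description is too short to give context")

    has_tests = any(is_test_file(p) for p in pr.changed_files)
    explains_no_tests = "no tests" in pr.description.lower()
    if not has_tests and not explains_no_tests:
        problems.append("no tests included and no explanation given")
    return problems
```

A check like this only catches the obvious cases. A human reviewer still has to judge whether the description actually explains the change.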

Metrics

  • Cycle time: how much time elapses between a developer starting work on a task and that task being done? Finding ways to reduce cycle time aligns well with changes that improve engineering as a whole: easier deployments, smaller task sizes, and, of course, less technical debt bogging down development.
  • Regression rate: what percentage of bugs are caused by recent code changes? A high regression rate means either that developers are being sloppy or that a part of the code is dangerously complex or tricky to work with.
  • Sprint velocity: a count of stories (or points) finished in a sprint. This is used to predict the team’s capacity for the next sprint, and to see if capacity is increasing over time as tech debt is cleaned up. Note that velocity is a relative measure with no inherent meaning; it is not a way to score how good a team or person is, and it’s meaningless to compare two teams’ velocities because their planning and nature of work are different.
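To make the three metrics concrete, here is a rough sketch of how they could be computed from exported issue-tracker data. The `Task` and `Bug` records and their field names are hypothetical, and using the median for cycle time is an assumption; a mean or a percentile would serve as well.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Task:
    """Hypothetical record of one finished task."""
    started: datetime
    finished: datetime
    points: int = 1


@dataclass
class Bug:
    """Hypothetical bug record; the flag is set during triage."""
    caused_by_recent_change: bool


def median_cycle_time_days(tasks: list[Task]) -> float:
    """Cycle time: days from starting a task to finishing it."""
    return median(
        (t.finished - t.started).total_seconds() / 86400 for t in tasks
    )


def regression_rate(bugs: list[Bug]) -> float:
    """Percentage of bugs attributed to recent code changes."""
    if not bugs:
        return 0.0
    return 100 * sum(b.caused_by_recent_change for b in bugs) / len(bugs)


def sprint_velocity(tasks: list[Task]) -> int:
    """Points finished in a sprint; only meaningful within one team."""
    return sum(t.points for t in tasks)
```

Tracked sprint over sprint, the trend in each number matters more than any single value.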

Build a list of tech debt projects

Commit capacity to tech debt

  • Ongoing support: how much engineering time to dedicate to bugs and operations. This is analogous to the interest payment on your tech debt.
  • Tech debt initiatives: capacity dedicated to clean-up. This is paying down the principal on tech debt in order to free up future team capacity.
  • Features: adding new customer value to the product.
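As a small worked example of committing capacity, the sketch below splits a sprint's capacity across the three buckets. The function name and the shares used in the usage note are purely illustrative assumptions; the right split depends on the team and on how heavy the debt is.

```python
def split_capacity(total: float, shares: dict[str, float]) -> dict[str, float]:
    """Divide total capacity (points or engineer-days) by fractional shares."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must add up to 1.0")
    return {name: round(total * share, 1) for name, share in shares.items()}
```

For instance, a team with 40 points of capacity and an assumed split of 25% ongoing support, 25% tech debt initiatives, and 50% features would call `split_capacity(40, {"ongoing_support": 0.25, "tech_debt": 0.25, "features": 0.5})` and plan 10, 10, and 20 points respectively.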

Start the clean-up with DevOps-ing

  • Deployments are automated (“push a button”) and reliable.
  • Monitoring is in place to know immediately when something goes wrong, and to quickly diagnose the root cause.
  • There is automation to regularly run unit tests and end-to-end tests. Even running a small handful of tests once a day is better than nothing.
  • Database schema changes and data migrations follow a well-controlled process, rather than people making ad-hoc changes.
  • You have some form of configuration management, with settings checked into source control.

On-call rotation and bug rotation

Putting it together — an example

  • One engineer a week is on-call
  • One engineer a week is on bug rotation
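One way to generate such a schedule is a simple round-robin, sketched below. The function and the engineer names are hypothetical. Offsetting the two rotations by one person is an assumption, chosen so that nobody holds both duties in the same week.

```python
def rotation_schedule(engineers: list[str], weeks: int) -> list[dict]:
    """Assign one on-call and one bug-rotation engineer per week."""
    if len(engineers) < 2:
        raise ValueError("need at least two engineers to fill both rotations")
    n = len(engineers)
    schedule = []
    for week in range(weeks):
        schedule.append({
            "week": week + 1,
            "on_call": engineers[week % n],
            # Offset by one so the same person never holds both roles.
            "bug_rotation": engineers[(week + 1) % n],
        })
    return schedule
```

With this offset, whoever is on bug rotation one week is on-call the next. On a small team that means back-to-back duty weeks, so a larger offset may be kinder.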

Don’t let the approach become the purpose

Other Resources

  • Accelerate is a must-read book for any technical leader. It’s especially useful for choosing which metrics to track and to justify investing in DevOps.
  • Martin Fowler has a great post about the types of tech debt, and whether design flaws or sloppy code count as debt. I acknowledge I’ve been a bit loose in my definition. And I didn’t even get into product debt, where the feature requirements have stacked to the point of staggering complexity.
  • Steve Rabin has a very thoughtful post explaining what tech debt is and how to address it.

Chuck Groom

Consulting CTO open to projects. I’m a serial entrepreneur, software engineer, and leader at early- and mid-stage companies. https://www.chuckgroom.com