How we tackled technical debt at Wikipedia
Talking a bit helped us write off several years of technical debt
A big challenge for any software engineer is explaining an important technical change to an audience who may not have the appropriate context. “Technical debt” is a phrase that will perplex a product owner if not articulated correctly. If you work in a product-centric team, you’ll likely find that many technical tasks are brushed aside in favor of more tangible, visible outputs. That said, technical debt is very real and needs to be addressed.
Back in January 2018, Wikimedia engineer Joaquin Hernandez and I pitched a one-year project with a scoped and measurable goal: increasing code coverage and investing in our ageing but strategically important codebase. We did so in the belief that this could lead to working with code we liked much more.
Developers, designers and product owners often speak different languages and have very different, sometimes conflicting, desires. Writing the project proposal and talking together about problems, solutions and benefits was well worth the time and resulted in a shared understanding. I believe that this work, and the shared understanding that came from it, made it easier to get the project onto our annual goals.
The problem statement
We’ve likely had every discussion every developer in every other organization has had, whether it’s been “Mustache vs. Handlebars”, “Should we use TypeScript?” or “Which is better? Vue.js or React.js?”. What we’ve found is that it’s really hard to get consensus on these big questions in an open-source community with no benevolent dictator to make the decisions.
This aside, mobile usage is growing. We are interested in adopting technologies such as service workers which provide offline support and better availability, but enabling one in a codebase such as ours is risky business.
After discussion within our engineering team, we settled on iteratively refactoring and revising the existing code, and we built proofs of concept to show it could work. Our project proposal, which is public, explains why we feel strongly that we should invest in mobile’s front-end architecture.
We drew our proposal from a recent experiment with our page previews feature and a write-up of that experience. Before the project, our front-end assets were managed with a MediaWiki-specific system called ResourceLoader, and our proposal was to move off it and lean more on modern, widely used front-end tooling (for example, but not necessarily, Webpack). With this achieved, we would refactor, improve and modernize our neglected component library.
Importantly, the project proposal made no specific promises; instead, we presented tangible problems and measurable outcomes, which after six months we have now partially achieved. Those include:
- Increased test coverage (our code coverage was less than 50% and, more alarmingly, 45 of our 81 files had 0% coverage)
- Possible performance improvements (so far we’ve not seen any change here but we see the potential for change)
- More reusable standardized components (we’ve started making headway on this!)
And some less measurable outcomes (which we hope can be recognized later):
- Quicker on-boarding of new hires
- Quicker development cycles and estimations for product work
- More future-proof code
Essentially, for a project with limited time and resources (Wikipedia’s mobile site is maintained by a team of just six paid people), we pitched a refactor, not a rewrite. Rather than replacing a complicated system, we would slowly and iteratively improve it. This was important because it promised iterative development while keeping the site up and running, and improving, at all times. While this slowed down development, it kept our work visible and kept us accountable to our product team: our development team could not hold our product owners hostage by telling them we couldn’t build new things until the refactor was complete. It also allowed us to work on new products in parallel with this important work (we shipped several projects during this time), and it guaranteed that we would achieve something!
If you are interested, there is a technical write-up of what we have done and what we have achieved so far on our internal blog: Migrating code from ResourceLoader to Webpack.
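The coverage figures in the list of measurable outcomes above combine two different numbers: overall coverage and the count of files with no coverage at all. As a rough sketch of how both can be derived from per-file data, the snippet below summarizes a list of coverage records. The `FileCoverage` shape and the `summarizeCoverage` name are assumptions made for illustration; they are not the output format of any particular coverage tool, nor the tooling the team used.

```typescript
// Hypothetical per-file coverage record. The field names are assumptions
// for this sketch, not the schema of a real coverage reporter.
interface FileCoverage {
  path: string;
  coveredLines: number;
  totalLines: number;
}

interface CoverageSummary {
  fileCount: number;
  uncoveredFileCount: number; // files with no covered lines at all
  uncoveredFileShare: number; // uncoveredFileCount / fileCount, as a fraction
  lineCoverage: number; // covered lines / total lines, as a fraction
}

function summarizeCoverage(files: FileCoverage[]): CoverageSummary {
  const totalLines = files.reduce((sum, f) => sum + f.totalLines, 0);
  const coveredLines = files.reduce((sum, f) => sum + f.coveredLines, 0);
  // A file counts as uncovered when none of its lines are covered.
  const uncoveredFileCount = files.filter((f) => f.coveredLines === 0).length;

  return {
    fileCount: files.length,
    uncoveredFileCount,
    // Guard against division by zero for an empty input.
    uncoveredFileShare: files.length === 0 ? 0 : uncoveredFileCount / files.length,
    lineCoverage: totalLines === 0 ? 0 : coveredLines / totalLines,
  };
}
```

Given 81 records of which 45 have no covered lines, `uncoveredFileShare` comes out at 45/81, or roughly 56%. Tracking that number alongside overall line coverage makes it easier to see whether new tests are reaching previously untested files.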
Halfway through this project, my team has made great headway, and I feel I am at a good point to reflect on what’s worked well for us.
1. Stop coding every now and again
If your engineering team is going from sprint to sprint without stopping, this is likely a problem. Our team found respite from coding during our company all-hands and used this valuable time to talk strategically and write up the project proposal over the course of three days. This activity wasn’t all talking and writing: we also built an important proof of concept!
Make sure your team sets aside time where everyone can stop coding, so that the team is free to think about and describe the problems they are facing. Talk to your team, find out what drives them mad, and make sure you rally around a single problem statement.
In my opinion, a team offsite in a foreign city is the best environment for this to happen.
2. Be problem-focused, not solution-driven
It really helped my team to keep the project problem-focused rather than solution-driven. While many of us were tempted to be more ambitious and say we wanted to use React.js or TypeScript, being problem-focused gave us the flexibility to do whatever we felt mattered most to the project’s success at any given time.
In addition to this, being forced to look at the code has led us to notice ways to improve it and prepare it for a modern future. For instance, we’ve been reducing our reliance on jQuery. While we’re not removing jQuery from our stack just yet, we’ve found inspiration in similar efforts elsewhere, such as GitHub’s, and our work at least makes removal a real possibility.
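To give a flavor of what reducing reliance on jQuery can look like, here is a minimal sketch that swaps a few jQuery utility calls for native equivalents. The `Section` type and the function names are hypothetical and chosen only for illustration; the source does not say which jQuery calls the team actually replaced.

```typescript
// Hypothetical data shape used only for this illustration.
interface Section {
  title: string;
  collapsed: boolean;
}

// jQuery style: $.grep(sections, function (s) { return !s.collapsed; })
// Native equivalent: Array.prototype.filter
function expandedSections(sections: Section[]): Section[] {
  return sections.filter((s) => !s.collapsed);
}

// jQuery style: $.inArray(title, titles) !== -1
// Native equivalent: Array.prototype.indexOf
function hasTitle(titles: string[], title: string): boolean {
  return titles.indexOf(title) !== -1;
}

// jQuery style: $.trim(raw)
// Native equivalent: String.prototype.trim
function normalizeTitle(raw: string): string {
  return raw.trim();
}
```

Utility calls like these are usually the cheapest place to start, because the native replacements need no DOM and can be covered by plain unit tests.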
3. Consider refactoring rather than rewriting
Many rewrites have been proposed and attempted at Wikimedia, in particular for the mobile site. Proofs of concept exist in the form of a React mobile clone, Weekipedia, and project Marvin, but the results of these have never materialized in production.
A legacy system is a battle-tested system. While a rewrite is sometimes necessary (bridges are a good example!), refactoring is a great way to ensure that something good comes of the work you do and that product sees the value of what you are working on.
We’ve been working on a refactor in a living, breathing codebase for six months now. That project is still running and, hopefully, you have noticed no difference. If I do say so myself, I think that’s a pretty remarkable achievement by my team.
4. Make talking a routine
The most important output for me was the conversations my development team had. Having a big technical project and the time and freedom to oversee it empowered the team to justify a dedicated hour every week to talk about the problem statement.
This project, while not giving us shiny technologies, has allowed us to have many conversations about the patterns popular libraries encourage, such as composition over inheritance, higher-order components and dumb components, to name a few. It allowed us to understand the history of the project and why things are the way they are, and to talk about what we like in other codebases and what we’d love to learn. It was essentially a focused brown-bag session that gave us all ideas about where we were heading and how we might get there.
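For readers unfamiliar with the patterns named above, here is a small framework-free sketch of a dumb component, a higher-order component and composition. All names (`Component`, `Button`, `withWrapper`, `Toolbar`) are hypothetical and are not taken from the Wikipedia codebase; components are modeled as plain functions that return markup strings, and HTML escaping is left out to keep the example short.

```typescript
// A component is modeled as a pure function from props to a markup string.
type Component<P> = (props: P) => string;

interface ButtonProps {
  label: string;
}

// A "dumb" component: no state and no data fetching; it only renders its props.
// Note: labels are not HTML-escaped here, which real code would need to do.
const Button: Component<ButtonProps> = ({ label }) =>
  `<button>${label}</button>`;

// A higher-order component: takes a component and returns a new component
// that adds behavior (here, a wrapping element) without modifying the original.
function withWrapper<P>(className: string, inner: Component<P>): Component<P> {
  return (props) => `<div class="${className}">${inner(props)}</div>`;
}

// Composition over inheritance: ToolbarButton is assembled from existing
// pieces rather than declared as a subclass of some Button base class.
const ToolbarButton = withWrapper("toolbar-item", Button);

interface ToolbarProps {
  labels: string[];
}

// A larger component composed from smaller ones.
const Toolbar: Component<ToolbarProps> = ({ labels }) =>
  labels.map((label) => ToolbarButton({ label })).join("");
```

Because each piece is a pure function, it can be tested in isolation, which fits the goal of raising test coverage during a refactor.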
If you work in a technical team and you are spending most of your time coding in some form, I highly encourage you to find at least one hour a week to talk about the higher-level goals of your work as a team. What one person is struggling with or excited about is likely a larger topic in need of a discussion.
Technical debt is a real problem — recognize it!
Wikipedia’s technology stack is not the most appealing for front-end developers. We don’t use any well-known frameworks. In my experience, new hires can take many months to become fully effective, and existing staff can easily get frustrated with it.
That said, I strongly believe that a decade from now, much of software engineering is going to be about fixing codebases that were built in this era by teams understandably building irresponsibly in order to hit deadlines or create that first minimum viable product. So I think getting exposure to mature codebases and finding ways to refactor them responsibly is an enlightening and rewarding experience.
While hard, it’s our duty as engineers and product owners to explain and understand why cutting a corner is a bad idea and technical debt is a problem.
We should all talk more, both to our existing team members and to future team members. Good engineers should explain the problems they are solving before building (the problem might already have been solved!) and take time to document complicated or hacky code in great detail! We shouldn’t be afraid to write lengthy commit messages about what we are doing, no matter how obvious it might seem at the time.
We should all strive to ask each other for help and opinions in code reviews and outside code reviews. We should communicate about what we don’t like and what we do like. We should listen to the frustrated junior developers in the team and the conservative senior engineers that have seen it all before and find problems and solutions.
Someone is going to have to live in our code and suffer our exact same problems when we are long gone and ultimately it's the user who suffers from that. Let's remember that next time we decide to speed up a software delivery.
If you want to learn more about what we did with Webpack and why we were doing it, please read my other article, Migrating code from ResourceLoader to Webpack.