I contributed to the fear of messiness in software (ataxophobia is the fear of disorder for those of you playing Anxiety Bingo). When I formulated Extreme Programming I chose as values communication, feedback, courage, and simplicity. Simplicity. I hoped that better development would result in neat, tidy software.
“Neat, tidy” is not one of the three options. The three states of a software project are near disaster, disaster, and unused.
The mess isn’t random. The mess goes through predictable stages. (You might have wondered when 3X would wander in.) We can respond better and worse to those stages. This, then, is what 3X predicts about messy software.
The seeds of messiness are planted while exploring. We don’t know what we’re doing, so we do whatever just so we can get feedback from real users. How users behave is different from what we assumed, so whatever we built is guaranteed to be different from what we wished we had built. It’s a mess.
One of those messes took off. Now the system is failing in new & novel ways every week as we scale. We only have time to tape it together and get it back on the road. It’s a messier mess, and getting bigger (that’s the good part) and messier (the bad part) by the day.
Whew! Now that growth is predictable, we have time for regret and shame to set in. If only we had taken the time to do X back then, we wouldn’t have such a steaming pile today. Now, however much we tidy, we don’t seem to make any headway on the mess.
But, But, But…
I submit that perpetual mess is the ideal state of a software product. 20-years-ago me dreamed of software tucked in clean and shiny each night ready to awaken fresh, rested, and ready for the morrow’s challenges. It just doesn’t work like that. Here’s why.
If we want to observe how users behave, the one-day version that we then discard is better than the one-week version we’re proud of that we then discard. At scale, several explorers waiting for the One Tidy Solution are all burning daylight. Better to get out there & improve our chances of survival.
Say we got lucky with our clean-albeit-slow exploration and now we’re expanding. If we take the time to fully eliminate mess, then we lose precious growth opportunities while we’re stalled at a bottleneck. The time we waste is time we don’t have to prepare for the bottleneck after that.
Say we got lucky and survived, despite our efforts, to the extract phase. Our efforts to tidy are always going to compete with efforts to continue growing and to invest in further exploration. Operation always teaches lessons, so yesterday’s tidy is today’s meh and tomorrow’s yecch.
All is not hopeless. We can use “tidy” as a verb, even if we must reluctantly let it go as an adjective. Some forms of tidying make sense in spite of the shifting tradeoffs of 3X:
- Implementation patterns — we can write code that clearly expresses today’s intent at the small scale. Practicing and applying implementation patterns is pure win — better results sooner.
- Tidy first — we aren’t forced to pile mess on mess. Sometimes tidying first is the shortest path to success.
- TDD/TCR — when we can define expected value ahead of time, integrating testing with the programming workflow yields better results sooner.
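To make that last bullet concrete, here is a minimal sketch in Python. The `shipping_cost` function and its pricing rule are made up for illustration and come from no real system. The point is only the order of work: the expected values in the test were written down first, and the implementation was then written to meet them.

```python
def shipping_cost(weight_kg: float) -> float:
    """Hypothetical rule: flat 5.00 up to 1 kg, then 2.00 per extra kg."""
    if weight_kg < 0:
        raise ValueError("weight must be non-negative")
    extra_kg = max(0.0, weight_kg - 1.0)
    return 5.0 + 2.0 * extra_kg


def test_shipping_cost() -> None:
    # These expectations were decided before the implementation existed.
    assert shipping_cost(0.5) == 5.0
    assert shipping_cost(1.0) == 5.0
    assert shipping_cost(3.0) == 9.0


# Run the test as part of the normal programming loop, not as a later phase.
test_shipping_cost()
```

When the expected value can't be pinned down ahead of time, as is common while exploring, this workflow has much less to offer.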
Big Extract Tidying
The most satisfying (and profitable) tidying comes in Extract. Facebook once had 5 key/value stores in production at scale at the same time. This made sense because 5 (probably 50) explore projects needed a key/value store and there wasn’t one so they all rolled their own. Waiting for the blessed corporate standard key/value store would have forfeited the opportunities being explored. While expanding, the 5 surviving key/value stores were all under strain. Merging them under extreme time pressure would have been suicide. Once at scale, though, time pressure eased and it made sense to tackle the big project of moving from 5 to 4. And then 4 to 3. 2. 1.
I call this process “retroactive infrastructure”. What is the infrastructure you wish you had started with? The resulting unified key/value store was superbly engineered, ready for scale, and contained no extraneous features. Most of the profitability of tidy infrastructure still accrues, because those 5 projects (and all the projects to come) operate efficiently for most of their lives. The early duplication is a sacrifice to the gods of efficient exploration.
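The one-store-at-a-time shape of that consolidation can be sketched in a few lines of Python. Everything here is hypothetical (`UnifiedStore`, `LegacyStoreA`, `LegacyStoreB`, `migrate`, and the sample data) and says nothing about how Facebook actually did it. It only illustrates retiring explore-era stores one by one into the store we wish we had started with.

```python
from typing import Dict, Iterable, List, Optional, Protocol, Tuple


class Dumpable(Protocol):
    """Anything that can enumerate its contents as (key, bytes) pairs."""

    def dump(self) -> Iterable[Tuple[str, bytes]]: ...


class UnifiedStore:
    """Hypothetical target: the infrastructure we wish we had started with."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value


class LegacyStoreA:
    """Hypothetical explore-era store that kept string values in a dict."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self.rows = rows

    def dump(self) -> Iterable[Tuple[str, bytes]]:
        for key, value in self.rows.items():
            yield key, value.encode("utf-8")


class LegacyStoreB:
    """Hypothetical explore-era store that kept raw bytes in a list of pairs."""

    def __init__(self, pairs: List[Tuple[str, bytes]]) -> None:
        self.pairs = pairs

    def dump(self) -> Iterable[Tuple[str, bytes]]:
        yield from self.pairs


def migrate(legacy: Dumpable, target: UnifiedStore, prefix: str) -> int:
    """Copy one legacy store into the target under a namespace; return the count."""
    count = 0
    for key, value in legacy.dump():
        target.put(f"{prefix}:{key}", value)
        count += 1
    return count


unified = UnifiedStore()
remaining: Dict[str, Dumpable] = {
    "a": LegacyStoreA({"user:1": "kent"}),
    "b": LegacyStoreB([("photo:9", b"\x00\x01")]),
}

# Retire one store at a time: 2 stores, then 1, then none.
for name in list(remaining):
    migrate(remaining.pop(name), unified, prefix=name)
```

Each pass through the loop is a separate project in real life, which is why it only makes sense once time pressure has eased.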