Beware the Gremlins Lurking in Your Code!
During WWII, servicemen of the British Royal Air Force sometimes joked that sudden, mysterious malfunctions in their airplanes or equipment were attributable to mischievous little critters called gremlins, from which, incidentally, the eponymous film series takes its name. In case you were wondering, there were not in fact any diminutive beasties throwing wrenches into the engines of RAF fighter planes. The malfunctions were in every case traceable to a sufficient physical cause: a mechanical misconfiguration, worn or broken parts. But there is a certain fanciful aptness in anthropomorphizing what goes wrong in a machine. In the machinery of a complicated vehicle such as an airplane, every component must play its role with precise positioning and timing for the delicate balance of energy transfers to remain stable and keep the craft from scattering into pieces mid-flight. Such balancing acts are bound to have breakdowns. “Gremlin,” phonetic cousin of “glitch,” is the minion of entropy and the enemy of engines. In both cases the word stands in for the unknown, as-yet-unidentified variables that cause mechanical or software systems to go haywire.
Anyone in the software field can speak to their own encounters with gremlins. Software systems are pristinely logical constructs; no guesswork enters into the execution of their internal operations. They are completely deterministic in the sense that every term in every expression in every statement of every line of code in every program serves a definite role and represents an absolute relationship between its relata and the structure it is embedded in. Despite the orderly lawfulness of programming, developers often experience quite a different world when working, one that is like wrestling an unruly lion into submission. Nontrivial software often seems to have “a mind of its own.” No doubt this is a projection of ignorance in the face of complexity; any nontrivial app can be expected to have tens of thousands of lines of code, and even the most bug-resistant, self-documenting, and performant code can behave unpredictably given the slightest change.
Exponential complexity means that even if the “micro” relations between code pieces are predictable, the “macro” relations can exhibit intractable emergent behaviors. Often fixing one thing will break another, in a game of whack-a-mole that never ends.
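To make the point concrete, here is a minimal sketch (the functions and scores are hypothetical): each piece behaves exactly as written and seems fine on its own, yet composing them produces a surprise, because one of them quietly mutates shared state.

```typescript
// Two "micro" pieces, each reasonable in isolation.

// Finds the highest score by sorting. Array.prototype.sort mutates
// the array in place, a detail invisible from the call site.
function highestScore(scores: number[]): number {
  return scores.sort((x, y) => y - x)[0];
}

// Reports scores in the order they were recorded.
function scoreLog(scores: number[]): string {
  return scores.join(", ");
}

const scores = [3, 9, 1];

// The "macro" behavior emerges only from the composition:
console.log(highestScore(scores)); // 9, as expected
console.log(scoreLog(scores));     // "9, 3, 1": the recorded order is gone
```

Each function would pass a unit test written for it alone; the bug exists only in the pair.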
Despite the best attempts to minimize unpredictability through unit testing and modular design, something almost always slips through the cracks. And one is continually humbled by the inadequacy of one’s predictions, which is why I personally believe estimates in software project management are, more often than not, futile formalities. Estimates assume a steadiness and smoothness to the development process that will be met with stubborn defiance at every turn. Despite appearances, software is never as rational as we hope. Programs may be logical systems, but the people who design them, in spite of what we tell ourselves, are imperfectly rational. The discrepancy between how we think a program will behave and how it actually behaves once it goes live can be significant.
In my own work as a developer I can recall days on a project where suddenly everything would break, and it did indeed almost feel as if tiny demons had taken up residence inside the computer, intent on sowing chaos. The lesson, then, is to never underestimate how much you can underestimate the wackiness of complex systems. Murphy’s law is absolutely in play, and you should never expect the development experience to be as rational and clear-cut as the medium under development. Attempts to streamline development will eventually run afoul of the gotchas of imperfect human reasoning.
I’ve lost count of how many times I expected development to run smoothly only to have it explode in my face. And this was on account not of shoddy engineering but of the intractable messiness of emergent systems. Every working program is a set of interdependent logics that rely on one another through various degrees of separation. You may have two modules, a and b, that rely on a dependency c, which in turn relies on dependency d, which in turn draws from e, f, and g, and so on. A new piece of code interjected into the middle of this chain must be compatible with the whole circuit, or the sequence will break. And if some element far down the chain breaks, the entire chain will follow suit, and it may be quite hard to figure out what happened and where. Break g and a will break. Any change or addition might require refactoring at several points distributed throughout the dependency chain. Even the most well-designed, extensible code will run up against these difficulties. The point to stress is that it’s virtually impossible to know ahead of time what these perturbations will be when a change is made to a codebase.
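A minimal sketch of this transitive fragility, with hypothetical module names: a never touches g directly, yet a seemingly harmless change to g breaks a, and the error surfaces far from where the change was made.

```typescript
// g.ts, the bottom of the chain. During a refactor, someone changes
// the return type from number to string.
export function g(): string {
  return "42"; // was: return 42;
}

// e.ts depends on g and is still written against the old contract.
import { g } from "./g";
export function e(): number {
  return g() * 2; // compile error: a string cannot be multiplied
}

// c.ts depends on e.
import { e } from "./e";
export function c(): number {
  return e() + 1;
}

// a.ts sits at the top of the chain. It never imports g, yet the
// build now fails underneath it.
import { c } from "./c";
console.log(c());
```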
The philosopher Martin Heidegger took a keen interest in technology and highlighted how we often only become aware of a tool or entity when part of it breaks or goes counter to expectation. We only pay attention to the hammer in our hands, says Heidegger, when the hammerhead falls off the handle. Otherwise the tool functions as an intuitive extension of the body. The same can be said for how we ignore the pancreas until experiencing a sharp pain in the abdomen. And so it is with the construction of software systems: a discrepancy is only ever discovered retroactively — when something breaks. It’s usually the case that we don’t have to learn how some module works until it stops working. Only then is the code highlighted in awareness and the mole whacked.
As it happens, the best learning experiences come when things go wrong. I recall one project that was exceptionally gremlin-plagued. The details surrounding the case must remain confidential, but the overall shape of the debacle can be communicated as follows. Gremlins appear in those dark, damp spaces that have escaped oversight. As with many things in life, a project’s future is often doomed by poor decisions made early on; the seeds of disaster are sown by poor project management. On this project, the trouble came right as we were approaching launch. Information about the production deployment process was siloed in one architect, who subsequently left the project. Knowledge of deployment should have been accessible, if not to everyone involved on the project, then at least to the senior members. Thus this one engineer’s departure had an outsized ripple effect that cast a pall over the entire launch and delayed it for weeks while their replacement scrambled to piece together the mystery.
To make matters worse, it was assumed (wrongly) that the deployment process for the production environment would be the same as for the test environment. Nobody had ever bothered to validate production deployment; it had been used just once, to launch an older version of the app, so it was simply assumed to work the same way as a test deployment. When it turned out otherwise, the team rushed to rediscover the undocumented production process, knowledge that had been lost in the churn of a generation of engineers. Such was a dark, damp space for gremlins to breed.
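A cheap defense, sketched below with hypothetical environment names and settings, is a preflight script that makes environment differences explicit rather than assumed: each environment declares what a deployment to it requires, and the script refuses to proceed if anything is missing.

```typescript
// deploy-preflight.ts: a hypothetical sanity check run before any
// deploy. The setting names are illustrative, not a description of
// the project discussed above.
type Env = "test" | "production";

const requiredSettings: Record<Env, string[]> = {
  // Writing both lists down forces the test/production difference
  // to be documented rather than assumed.
  test: ["API_URL", "DB_HOST"],
  production: ["API_URL", "DB_HOST", "CDN_BUCKET", "RELEASE_SIGNING_KEY"],
};

function preflight(env: Env): void {
  const missing = requiredSettings[env].filter((key) => !process.env[key]);
  if (missing.length > 0) {
    throw new Error(`Cannot deploy to ${env}: missing ${missing.join(", ")}`);
  }
  console.log(`Preflight passed for ${env}.`);
}

preflight(process.env.DEPLOY_ENV === "production" ? "production" : "test");
```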
On top of these deployment woes, the data schema for the API was changed without the frontend team’s knowledge, leading to an extended period in which the frontend had to be refactored to accommodate it. The data model on the backend was out of sync with what the frontend was prepared to accept. Trouble came in even greater force, and the gremlins cheered, when in a last-minute rush the client changed their mind about certain application features, throwing the developers into a tailspin and encouraging shoddy, rushed work. Software developers abide by agile DevOps practices precisely to uncertainty-proof the process; only by systematizing development so that as little as possible is left to assumption can gremlin infestations be prevented.
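One guard against this sort of silent schema drift, sketched here with hypothetical endpoint and field names, is to validate API responses at the boundary instead of trusting compile-time types the backend may no longer honor.

```typescript
// A hypothetical frontend boundary check; the User shape and the
// endpoint are illustrative.
interface User {
  id: number;
  email: string;
}

// Inspect the raw response instead of blindly casting it, so a
// backend schema change fails loudly at the boundary rather than as
// a vague error deep inside the UI.
function parseUser(raw: unknown): User {
  const obj = raw as Partial<User> | null;
  if (obj === null || typeof obj.id !== "number" || typeof obj.email !== "string") {
    throw new Error(`API schema drift: unexpected user shape: ${JSON.stringify(raw)}`);
  }
  return { id: obj.id, email: obj.email };
}

async function fetchUser(id: number): Promise<User> {
  const res = await fetch(`/api/users/${id}`);
  return parseUser(await res.json());
}
```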
Just as dark, damp, unattended places invite household mold, so do poorly planned software projects beget bugs. It’s common sense to consider bugs isolated incidents, perhaps the sign of a particular logical flaw introduced by a particular engineer at a particular point in time. More truthfully, badly organized code is entomologenic — bug-inducing. Even a skilled engineer working on a bad project will bump into more bugs than they would otherwise. The structural legacy of the codebase channels code production down error-prone lanes, and the poor organization makes bad practices seem like the standard to follow. The conventions set early in a codebase determine how the code will be developed in the future: set up guidelines that promote bad practices and the app will accrete bug-prone, inefficient code.
Chances are there are gremlins lurking in your code as we speak. Untested, unthought-of assumptions lying in wait to create problems. An outdated dependency ticking off the minutes before it sets something off. A brittle line of code that will come into contradiction with some future change and trigger a vague error message that makes the fault tricky to trace. We may never know exactly what will go wrong until hindsight. The best course is to be humble before the complexity of modern software and round up when calculating estimates.