But it worked before — stability and assumptions

Peter Naulls
Dec 27, 2019 · 3 min read
Will it fall over?

Never underestimate the role of luck in correct software operation

Yes, I made that up, and yes you can quote me.

There’s a dirty secret in software development — much of it’s made with shortcuts, assumptions, incomplete testing and bad design. With time and effort, all that can be polished, but new software can do weird things. There’s a reason we have prototypes — to test ideas.

And as usual, prototypes rapidly become the real software. Or at least, begin the journey towards real software. Along the way are demos, reworks, fixes or workarounds for design mistakes, and so on.

Did I mention it was a prototype? That means that the software tends to work in a very specific way, with a good wind behind it, and a good deal of luck. Prototypes have a way of working perfectly the very first time they are tried, and then quickly hitting the real world.

This can give the impression that something is working famously, when it’s not.

How things are

In operating systems like Windows, there’s a lot of separation between systems, necessarily so, since so many people work on it. APIs between drivers (software which talks directly to hardware) and the OS remain stable for many years, and even if the driver is updated or the OS has fixes, you can, by and large, expect things to keep working.

On smaller systems, not so much. Systems are developed by a smaller team of people (sometimes only a couple), who have cross-system responsibility. Often, adding a feature or driver means a rework of interfaces. This is unfortunate, but hard to avoid.

Indeed, there is an essay in the Linux kernel documentation that talks about the follies of such a “stable” API.

The point of this rant is that the Linux kernel, like so many other things, is a living thing, with ongoing development, and a “stable” interface is sometimes not possible or desirable.

Narrow Design

Many well-designed and seemingly robust products balance upon a knife edge. Software is written to work within the confined constraints of a system, and hardware is designed to meet specific needs. If things are changed due to new features, a rework for a new product, or hardware changes, well, sometimes things can fall apart quickly as a cascade of assumptions is challenged, or old design flaws have to be reworked.

Even the most robustly designed and broad-ranging system eventually has to be changed to meet new demands, and can come face to face with this.

Luck

And finally, of course, luck can play an outsized role. Sometimes during a prototype or test, nothing that could go wrong does. It happens, and I’d suggest it happens quite a lot. Sometimes long-standing bugs are simply not noticed in the field, because users didn’t push the system in just the right way (or didn’t notice). Sometimes the network conditions are just good enough that degrading performance goes unnoticed. And so the Jenga tower continues to stand.
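To make the luck point concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the function names, the latency samples, the idea that the demo network always answered); it only illustrates how a hidden assumption can survive a demo and then fall over in the field.

```python
def average_latency_ms(samples):
    """Average of latency samples. Silently assumes at least one sample."""
    return sum(samples) / len(samples)


def average_latency_ms_checked(samples):
    """Same calculation, with the assumption made explicit."""
    if not samples:
        return None  # "no data yet" is a normal state in the field
    return sum(samples) / len(samples)


# Hypothetical data: during the demo the network always answered.
demo_samples = [12.0, 15.0, 9.0]
# In the field, sometimes nothing comes back at all.
field_samples = []

print(average_latency_ms(demo_samples))  # works, so it looks finished

try:
    average_latency_ms(field_samples)
except ZeroDivisionError:
    print("fell over on an empty sample list")

print(average_latency_ms_checked(field_samples))  # handles the unlucky case
```

The first function was never correct; it was only never asked the awkward question.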

Where does this leave us?

And so it has to be said, to the casual observer and sometimes to engineers who probably should know better: just because something works once doesn’t mean it will work a second time.

And in the case of that driver setup, just because the driver worked against a previous version of the system software doesn’t mean it will continue to do so. And just because something worked before doesn’t mean it doesn’t need a lot more work to be robust.
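One way a small system might at least make the driver assumption visible is to have the driver declare which interface version it was built for, and refuse to load on a mismatch. The sketch below is an illustration only: `Driver`, `load_driver` and `SYSTEM_INTERFACE_VERSION` are invented names, not taken from any real OS or kernel.

```python
from dataclasses import dataclass

# Hypothetical: bumped every time the driver interface is reworked.
SYSTEM_INTERFACE_VERSION = 3


@dataclass
class Driver:
    name: str
    built_for: int  # interface version the driver was written against


def load_driver(driver, system_version=SYSTEM_INTERFACE_VERSION):
    """Load a driver only if it was built for the current interface."""
    if driver.built_for != system_version:
        raise RuntimeError(
            f"{driver.name}: built for interface v{driver.built_for}, "
            f"system is v{system_version}"
        )
    return f"{driver.name} loaded"


print(load_driver(Driver("uart", built_for=3)))

try:
    load_driver(Driver("old_uart", built_for=2))
except RuntimeError as err:
    print(err)
```

This does not make the old driver work, but it turns "it worked before" into an explicit failure instead of a mystery.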

But good luck explaining all this to a lay person who saw the demo working.

