Pathologies of Computer Programmers

Andre Masella
Jun 18, 2017


There was a period where I was dealing almost exclusively in bioinformatics and not with computer programmers. Since switching back from that trip to a different world view, I’ve had many enlightening moments about computer programmers, including myself. Now, I don’t apply this to computer scientists as researchers; they are mathematicians and should do many of the things that regular programmers shouldn’t. I’m talking about people who write code for a living. The why of that is pretty straightforward: a civil engineering researcher gets to experiment with weird new materials and figure out how they perform, while a civil engineer should build unimaginative bridges using tools, techniques, and materials that have proven to be stable.

The first, and most severe, is solving the meta-problem. This comes from mathematics and, in mathematics, it is a very good thing, but here, it is a very bad thing. The keywords that come with it include “general”, “all”, “future uses”, and “arbitrary”. You want to build a thing X, but rather than do that, you build a “framework” that lets you build X-like things. Then you can build all arbitrary X-like things in a general way, so you’ve already covered any future uses of your system.
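To make the shape of this concrete, here is a minimal sketch in Java; the tax rate, the names, and the whole “rule engine” are invented for illustration:

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// What was asked for: one boring, specific calculation.
final class Direct {
    static BigDecimal ontarioSalesTax(BigDecimal subtotal) {
        return subtotal.multiply(new BigDecimal("0.13")); // hypothetical rate
    }
}

// What gets built instead: a "general" engine for arbitrary X-like
// things, ready for "future uses".
interface Rule<T> {
    boolean appliesTo(T input);

    BigDecimal compute(T input);
}

final class RuleEngine<T> {
    private final List<Rule<T>> rules = new ArrayList<>();

    RuleEngine<T> register(Rule<T> rule) {
        rules.add(rule);
        return this;
    }

    // Sum the results of every rule that applies. Someone still has to
    // write and maintain every rule; the engine computes nothing itself.
    BigDecimal evaluate(T input) {
        BigDecimal total = BigDecimal.ZERO;
        for (Rule<T> rule : rules) {
            if (rule.appliesTo(input)) {
                total = total.add(rule.compute(input));
            }
        }
        return total;
    }
}
```

The engine, which is most of the code, solves no part of the original problem by itself.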

And it’s a trap. It’s a trap on many levels.

The first level is that the future uses are never going to come, and if they ever do, they will inevitably involve changing your framework as much as the actual things you built with your framework.

The second trap is that building good frameworks is hard, so most people build crappy frameworks that are fragile and impossible to debug. People do this because building frameworks is more interesting than solving the problem when the problem is boring. This is how computer programmers entertain themselves in boring jobs: they create more interesting problems to solve. This is the third trap. Don’t worry, this theme will recur.

One of the goals a programmer has in building a framework is to shovel the boring work onto the customer. Yes, the customer. It usually starts off like this: some business department wants a program to make their lives easier and their work involves a lot of tedious rules, which all interact with each other. Usually, these rules are stuff like how to compute sales tax when your company is in London and you’re shipping a package from Chicago to Toronto. It’s something amazingly boring and complicated, and your customer may not even really be able to tell you if you’re doing it right once you’ve finished; by the end, it makes you want to pry out your eyeballs with a spork. But, but, but, if you created a framework by which the customers (a.k.a. suckers) could enter and maintain the rules themselves…you’d earn your pay-cheque and shuffle off the terrible work onto someone else. Excellent plan…except the customers never do. This is the final trap. Frameworks also have a tendency to gain features forever until they become ad hoc programming languages, which leads nicely into the next pathology.

Programmers love to reinvent the wheel. Google is really bad at this. Nothing is “Google scale” until it’s been rewritten as part of the giant incestuous mass of code that is Google. In fact, there’s an expression: “At Google, we not only reinvent the wheel, we vulcanise our own rubber.”

And how.

The problem with reinventing wheels is that reinvented wheels are rarely round. Usually, there is a tool that does mostly what you want, but not everything. So, rather than draw the rational conclusion of “I’m going to suck it up and miss out on feature X so that I can have this 90% done in 10 minutes”, or even “I’ll just use this and bolt on the one feature that I need, even though it might be a little ugly”, we go for the big guns: “I’ll write my own version that does exactly what I need”, which slowly evolves into “I’ll write my own version that does what I need, what I think I might need in the future, and what I think other people might want”, borrowing from the above.

Now, sometimes, you do need to reinvent the wheel. I’m not going to dispute that. If you’re building a vehicle to traverse glaciers, you might not be able to get wheels from Canadian Tire. Now, if you live in California and think you have to reinvent wheels so that you can drive in Toronto…well, no. How do you know you need to reinvent something? You should be crying and sleepless. You should have scoured the Internet, tried dozens of alternatives, and checked whether each had a fatal flaw and whether that fatal flaw could be fixed; after weeks, exhausted and tired of reading forum posts, you should collapse in a demoralised heap with the realisation that you have to reinvent the wheel. Certainly, the world does not need another command line argument parser; in fact, given getopt(3) is in the C library, you should never need to write one. There’s also some consideration for your users here: getopt might be crappy, but it’s used by a huge collection of command line programs core to UNIX, so even if your users have never heard of getopt, they’ll find the interface familiar. Novelty is overrated. Also, it saves you oodles of code, so you can go solve the problem you actually care about.

There is educational value in reinventing the wheel. That’s less about creating a useful final product and more about understanding the internals of the technology we already use. This is one of the idiosyncrasies of school: you spend all your time in school reinventing wheels so you know how they work, not so you can spend your career reinventing wheels.

Inventing a tool is something programmers seem to do a lot. What we don’t make is the distinction between a tool and a jig. In the manufacturing world, when you want to make lots of identical objects, there’s a strong incentive to make tooling to help. A jig is something that holds pieces in the correct orientation so that an existing machine can shape them easily, precisely, and repeatably. A jig can be a simple thing: a board with some nails in it that hold the work piece, and some arrows and marks showing where to cut. It’s the kind of thing that holds two pieces of wood so your saw can make a matched mitre joint. You don’t build a new saw to make that cut; you make something that extends the power of the tool you have for the task you need.

A jig holds pieces in the correct orientation, but it might also be able to hold them in an incorrect orientation. A jig requires that the operator be fully aware of how to use it. In fact, tools also require the operator to know what they are doing. When programmers create tools, they don’t make the distinction between simple tools (jigs) and complicated tools. There are lots of simple tools that don’t, and shouldn’t, prevent the user from doing stupid things; it’s beyond their scope. Once you know what you are doing, they let you repeat that action many times with relative ease. They are mechanical amplifiers of your work. Tools get the option to be smarter. Tools have features to protect you and your work from “bad things”.

I see this a lot in Java’s type system. People want to use generics to try to ensure some matching or correctness. All generics in Java can do is prevent casting problems; they lack the expressive power for anything else. You can build a very useful jig using Java’s generics that, if someone specifies stupid type bounds, will produce stupid results. Just like the mitre joint jig won’t stop you from putting the pieces of wood in upside down.
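Here is a minimal sketch of such a jig, assuming only the standard library; the class and both uses are invented for illustration:

```java
import java.util.Comparator;

// A jig: holds two values in a "sorted" orientation using whatever
// comparator the operator supplies. The generics guarantee there will
// be no ClassCastException; they cannot check that the comparator is sane.
final class OrderedPair<T> {
    final T min;
    final T max;

    OrderedPair(T a, T b, Comparator<? super T> cmp) {
        if (cmp.compare(a, b) <= 0) {
            min = a;
            max = b;
        } else {
            min = b;
            max = a;
        }
    }

    public static void main(String[] args) {
        // Used correctly, the jig does its job.
        OrderedPair<Integer> ok =
                new OrderedPair<>(5, 3, Comparator.naturalOrder());
        System.out.println(ok.min + " <= " + ok.max); // prints 3 <= 5

        // Given a stupid comparator, it compiles, type-checks, and
        // faithfully produces a stupid result: the wood is clamped in
        // upside down, and the saw cuts anyway.
        OrderedPair<Integer> bad =
                new OrderedPair<>(5, 3, Comparator.reverseOrder());
        System.out.println(bad.min + " <= " + bad.max); // prints 5 <= 3
    }
}
```

The type system did everything it could: both uses are type-correct. Whether the result makes sense is the operator’s problem, which is exactly what makes it a jig.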

Programming language inventors like to prevent programmers from doing bad things. The reality is that you can’t. Solving Turing-complete problems requires creating a tool with modes of failure that cannot be detected; this is the halting problem. That’s okay. A saw isn’t useful unless it cuts through things harder than human flesh, so it will always be dangerous. Programmers want to believe it is possible to create a tool that will make the problem “easy” and perfect every time. It can’t be done. Good tools will prevent you from making common mistakes, but they have to let you make mistakes to be useful.

This leads to the final problem when designing large systems, and here is where biology has been most instructive: our puny human brains. Most problems are sufficiently large in scope that we can’t conceive of how to solve them all in one go. We have to divide the problem into ever smaller pieces until they are eventually small enough to solve, and then reassemble the pieces into a final solution with some predictable behaviour.

That is impossible. The larger the system, the more likely it is to have emergent behaviour, and that behaviour is probably counter-productive. This can’t be fixed in any rational way. The best one can do is try to mitigate the effects. One usual approach is to “hermetically” seal each component: the less coupled the components are, the less likely they are to have unpredictable behaviour. It’s a nice thought, but only partially true. More concerning, it leads to the thinking that the more optimal each sealed component in the system is, the more optimal the system as a whole will be.

When large systems exhibit failures or inefficiencies, programmers believe that their piecewise-optimal solutions fail due to poorly defined interfaces. That is, they think the problems are a lack of communication at the boundaries that they defined.

Anyone who has read Zen and the Art of Motorcycle Maintenance knows that the divisions of a system, even one that was designed, are arbitrary. If I can forever redefine the interfaces without changing the system, how can it be the case that the problems in the system are due to poorly defined interfaces?

Biological systems provide interesting answers because they often take advantage of emergent behaviours and can have very stable activity despite having very high degrees of coupling between components, with no discernible interfaces beyond electron orbitals. Yes, even ecosystems are put together in a way that relies on particular quantum states. Think about that. It’s not even that surprising. The best systems we’ve designed, like the Internet, have all kinds of weird emergent behaviour, relying not only on the technology but on the humans involved.

Emergent behaviour is difficult to understand and debug. I’m not necessarily advocating trying to build a system based on emergent behaviour; I’m pointing out that it becomes inevitable.

One other little biological gem: 100% efficiency is a bad thing. Yes, we all like to believe that efficiency is the greatest thing we can pursue: making a system more efficient will always make it better, and the only way to do this is to make the individual components more efficient. Yet we’ve seen lots of biological systems with sub-optimal components where improving the efficiency of a component makes the efficiency of the system worse. The enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), a component in plants’ photosynthesis mechanism, is a common example. RuBisCO is supposed to capture carbon dioxide, but occasionally captures oxygen instead, leading to photorespiration, which wastes energy. Making RuBisCO more efficient leads to less photosynthetic product out the end of the system.

If you’re going to build a system that will eventually become emergent and inefficient, then piecewise-optimal is a losing strategy. It’s much better to go for piecewise-understandable.

I admit to being guilty of the above sins, but I’m hoping to repent. I’m getting better at recognising them.
