The Single Biggest Mistake Programmers Make Every Day
Eric Elliott

Great and informative piece as always!

It reminds me of this from the Zen of Python:

Simple is better than complex.
Complex is better than complicated.

It was not until recently, when I got to work with some very complicated code, that I finally experienced the difference between complex and complicated.

This is just my interpretation, and may not be totally correct.


complex: less straightforward

In my understanding, a complex function still does one thing, but does it in a less straightforward way, in order to gain some benefit. (Most commonly it is for performance gain.)

For example, the heap sort algorithm is more complex than bubble sort.

A dynamic programming solution for the longest common substring problem is more complex than the naïve version.

As you can see, I am using the naïve version in my project.

That’s because simpler code is easier to understand and debug, and it works fast enough for my use case (and I can’t find any longest common substring module on npm that works on a plain String).
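
To make "naïve" concrete, here is a minimal sketch of a brute-force longest common substring on plain strings. It is not the code from my project, and the name `longestCommonSubstring` is only illustrative. It tries every pair of start positions and extends the match as far as it goes.

```javascript
// Naive longest common substring (illustrative sketch).
// Tries every start position in `a` and `b`, then extends the match.
// Roughly O(n * m * min(n, m)) time, but easy to read and debug.
function longestCommonSubstring(a, b) {
  let best = '';
  for (let i = 0; i < a.length; i++) {
    for (let j = 0; j < b.length; j++) {
      let k = 0;
      // Extend the match while both strings agree.
      while (i + k < a.length && j + k < b.length && a[i + k] === b[j + k]) {
        k++;
      }
      if (k > best.length) best = a.slice(i, i + k);
    }
  }
  return best;
}
```

A dynamic programming version would avoid re-comparing the same characters, at the cost of a table to maintain and more code to reason about.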

I’d prefer simple code over complex code.

  • I try to keep my functions as small as possible.
  • I process my data using a chain of `.map` and `.filter` instead of a for loop, even though that would create a lot of intermediate data structures.
  • I just perform a linear search even when I could do a binary search.
  • I’d not write a `shouldComponentUpdate`, even though the component only uses immutable data structures.
  • That is, unless it is simply not fast enough.
  • Since I don’t know whether I will hit a performance bottleneck or not, I design things in such a way that the slow parts can easily be replaced with faster parts without affecting the rest of the system.
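
As a rough sketch of "replaceable slow parts", assume a sorted array of unique numbers and two hypothetical functions with the same signature. The names `indexOfLinear`, `indexOfBinary`, and `indexOfNote` are made up for illustration. Callers only ever see `indexOfNote`, so the swap is a one-line change.

```javascript
// Simple version: linear scan. Works on any array.
function indexOfLinear(sortedNumbers, target) {
  return sortedNumbers.indexOf(target);
}

// Faster drop-in replacement with the same signature: binary search.
// Only valid when the input is sorted ascending; assumes unique values.
function indexOfBinary(sortedNumbers, target) {
  let lo = 0;
  let hi = sortedNumbers.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (sortedNumbers[mid] === target) return mid;
    if (sortedNumbers[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}

// The rest of the system depends only on this name.
// Start simple; switch to indexOfBinary if profiling says so.
const indexOfNote = indexOfLinear;
```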

complicated: involved in too many affairs

I once wrote a musical score file parsing routine that goes through all the headers and events at once, all while keeping track of the timing information, and generates an image from it.

The code does it in a very straightforward way. No fancy abstractions, just pure and straight code. That PHP script doesn’t even contain the word ‘function.’ Of course, the script also handles file uploads.

But it is hard to test, because you can only test the final result, and it is not reusable, because you can’t just lift part of it into another use case.

In my new version (written in JavaScript for a rhythm game I’m developing), each part of the process becomes its own module.

  • A module for decoding a stream of bytes.
  • A module for parsing the decoded data.
  • A module to represent the musical score events as a data structure.
  • A module that works with time signatures.
  • A module that converts time in the music to time in the real world.
  • A module that interpolates numbers.
  • A module that works with actual notes.

And it happens that…

  • It becomes much easier to test. Each module is testable on its own.
  • It becomes much easier to reuse. I can pick the relevant parts, and use them in entirely different situations (e.g. rendering the musical score as a .wav file, or indexing songs inside a package) with little to no modification.
  • It becomes much easier to extend. Since the code that handles the music is separate from the code that does the parsing, I can easily make my game compatible with another musical score file format.
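
To give a feel for how small such modules can be, here is a hypothetical sketch of two of them. These are not the actual modules from my game: the function names are made up, and the time conversion assumes a single constant tempo, which a real score format would not guarantee.

```javascript
// Hypothetical "interpolates numbers" module:
// linear interpolation, where progress runs from 0 to 1.
function interpolate(from, to, progress) {
  return from + (to - from) * progress;
}

// Hypothetical "time in the music to time in the real world" module.
// Assumes one constant tempo, given in beats per minute.
function beatsToSeconds(beats, bpm) {
  return (beats * 60) / bpm;
}
```

Each function can be tested with a couple of plain assertions, with no file upload, parser, or image renderer involved.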

While the code is more complex, it is less complicated, and it is so easy to work with.

I’d prefer complex code over complicated code.

  • I’d try to make my functions do only one thing as much as I can, even though this means more round-trip API/database calls. At least things don’t become entangled. Perhaps you will find a more elegant way to optimize them later!
  • I’d keep my data model as normalized as possible, so that I don’t have to go through the hassle of keeping them all in sync or preventing inconsistent data (which is, IMO, very error-prone). However, this means the querying logic may become slightly more complex, but that can easily be abstracted away. Denormalization is my last resort.
  • I would not use things like memcached right away; I’d just use an in-memory cache. However, should the time come, I should be able to swap that implementation with a memcached implementation without affecting other parts of the software.
  • Even with a caching facility, I wouldn’t cache anything at first, for the sake of simplicity, and then introduce caching in the place where it’d be most beneficial.
  • Caching should be as transparent as possible. This means as much as possible, the users of your code should be able to benefit from the cache without knowing it is there.
  • Browserify is an excellent example: You can pass an object for it to use as a cache to speed up rebuilds. The rest of the API remains unaffected; no other adjustment is needed to utilize that cache. React’s `shouldComponentUpdate` is another great example.
  • Don’t optimize, but make room for optimization.
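
The caching points above could be sketched like this. Everything here is hypothetical: `createMemoryCache` and `withCache` are names I made up, and the cache interface is kept synchronous for brevity, whereas a real memcached client would be asynchronous and would need a promise-based variant of the same idea.

```javascript
// Hypothetical in-memory cache. Anything exposing the same get/set
// methods could be swapped in later.
function createMemoryCache() {
  const store = new Map();
  return {
    get: (key) => store.get(key),
    set: (key, value) => { store.set(key, value); },
  };
}

// Wraps a slow single-argument function. Callers use the result exactly
// like the original function and never see the cache.
// Assumes slowFn never returns undefined.
function withCache(slowFn, cache) {
  return function cached(key) {
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const value = slowFn(key);
    cache.set(key, value);
    return value;
  };
}
```

Usage stays the same with or without the cache: `const lookup = withCache(slowLookup, createMemoryCache())`, where `slowLookup` stands for whatever function turned out to be the bottleneck.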