Engineering Fundamentals 101

Software architecture and design principles.


REST is the new Linux Way

The pieces should be minimalist, easy to put together, powerful and simple to use.
The Linux way accomplished this goal ten years ago.
REST accomplishes it today.
A REST-ful interface operating over a compact wire format like Thrift or Protocol Buffers is a modern high-performance solution.
Allowing the notion of publish-subscribe as part of a REST-ful API completes the picture.
Adding a beautification engine for human-to-computer interactions on top of the above provides a well-rounded set of interfaces to build upon.

Things should work out of the box

“Clone that instance and start using it” should become the standard.
The absolute lowest bar is:
$ git clone $module ; cd $module ; ./run.sh

Zero-configuration is king

Configuration should be as simple and straightforward as possible.
There is no logical reason to not have the module do its job right away.
When fine-tuning requires changing a single field, that field should be settable through the Web interface or with one curl call.
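As a minimal sketch of what "one curl call" could mean on the receiving side, the function below applies settings carried in a request target's query string to a configuration dictionary. The field names, the `/config` path and the `apply_setting` helper are all hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical defaults; a real module would define its own fields.
DEFAULTS = {"max_connections": 100, "log_level": "info"}


def apply_setting(config, request_target):
    """Return a copy of config updated from a target like '/config?log_level=debug'."""
    query = parse_qs(urlsplit(request_target).query)
    updated = dict(config)
    for key, values in query.items():
        if key not in updated:
            raise KeyError(f"unknown setting: {key}")
        # Coerce to the type of the existing default so integers stay integers.
        updated[key] = type(updated[key])(values[-1])
    return updated
```

Wired behind an HTTP handler, a call along the lines of `curl -X POST 'http://localhost:8080/config?max_connections=250'` would then be enough to change one field. The host, port and path here are assumptions, not part of any existing tool.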

Live systems should be transparent

Here is a brief list of parameters that should be accessible within one curl request or one click in the Web interface:
* CPU/RAM/Network/Disk usage.
* Recent QPS.
* Uptime.
* Anomalies discovered in input or generated data.
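Two of the parameters listed above, uptime and recent QPS, can be tracked with very little code. The sketch below is one possible shape for it, assuming a sliding time window. The `StatusTracker` class and its field names are hypothetical. CPU, RAM, network and disk usage are left out because they need OS-specific calls or a third-party library.

```python
import json
import time
from collections import deque


class StatusTracker:
    """Tracks uptime and recent QPS for a hypothetical status endpoint."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self._clock = clock
        self._started = clock()
        self._window = window_seconds
        self._requests = deque()  # timestamps of recent requests

    def record_request(self):
        """Call once per handled request."""
        self._requests.append(self._clock())

    def snapshot(self):
        """Return the current status as a plain dictionary."""
        now = self._clock()
        # Drop timestamps that fell out of the sliding window.
        while self._requests and now - self._requests[0] > self._window:
            self._requests.popleft()
        return {
            "uptime_seconds": now - self._started,
            "recent_qps": len(self._requests) / self._window,
        }

    def to_json(self):
        """Serialize the snapshot, ready to be returned to curl or a browser."""
        return json.dumps(self.snapshot(), sort_keys=True)
```

Returning `to_json()` from a single status URL would make these numbers reachable with one curl request. The clock is injectable so the behavior can be tested without waiting.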

Tools should be convenient

When functionality and efficiency are there, little things matter.
It should take under five minutes to go from an endpoint that returns a list of numbers to a dashboard showing a chart of them.
A distributed lock filesystem should support ls / cat / cp / mv / chmod / chown right away, both from the command line and via a simple Web interface.
Every single module that consumes, generates or works with data should have respective development endpoints exposed, along with beautifiers to examine them visually from the browser, including a mobile one.
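A beautifier does not have to be heavy. As a rough illustration of how short the path from a list of numbers to a chart can be, the sketch below renders values as a one-line text chart. The `sparkline` function is a hypothetical helper, and a real dashboard would likely draw something richer.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a list of numbers as a one-line text chart."""
    if not values:
        return ""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # All values equal: draw a flat line at mid-height.
        return BARS[len(BARS) // 2] * len(values)
    # Map the smallest value to the lowest bar and the largest to the tallest.
    scale = (len(BARS) - 1) / span
    return "".join(BARS[round((v - low) * scale)] for v in values)
```

Because the output is plain text, it reads the same in a terminal, a desktop browser and a mobile one.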

Complex should be made simple

Software engineering and architecture is largely, if not exclusively, about keeping the complexity of the systems manageable.
The key is to wrap complex internals into simple interfaces.
The internals of file systems, pipes and sockets are extremely complex. A POSIX-like external interface hides all the complexity and is easy to use.
The internals of an operating system are overwhelmingly nontrivial. The *nix-style interface, however, became an established standard a long time ago.
Revision control may be nontrivial. git solves it.
Transferring and updating large files is hard. rsync solves it on a technology level and Dropbox does it as a product.
Dealing with dependencies and build order can be tricky. npm / pip / gem, as well as make and other build tools, take that burden away and allow engineers to build upon them.
There is nothing wrong with the internals of the system being very complex. What matters is how predictable, useful and straightforward their usage is.

Everything should be tested

The internals of every module must have a bulletproof whitebox test that runs locally.
The external interface of every module should have a bulletproof blackbox test that runs from a remote machine.
Reported performance numbers should be monitored automatically and always be up to date.

Performance is to be designed for

The right design decisions early on largely eliminate the need to fight performance bottlenecks in the later stages of the project.
There is no reason for the performance of a log store to be bounded by anything but kernel limitations.
There is no reason for data insights that can be derived from a few gigabytes of data to require spawning a separate job on a cluster.

The codebase should be easy to work with

SOLID and DRY are more than just good-looking acronyms.

Leverage, not rebuild

Part of building on strengths is not competing in fields where existing solutions have earned their place.
A much better alternative is to wrap those solutions to adhere to the above philosophy and move forward empowered by them.
That said, plenty of challenges are either not yet resolved or not yet well connected. Getting the pieces to play well together is often more important than designing the individual pieces well.

Open Source is the future

This follows from my vision of open source and data liberation.