IT Systems that Scale

Kevin Cox
Oct 9, 2017 · 5 min read

IT systems are unnecessarily complicated because of the way they evolved. Once upon a time, storage, computation and communications were expensive. Early computing systems were optimised to make use of these scarce physical resources. Moore’s Law has meant we are able to create large systems using techniques developed when hardware was expensive. However, the techniques used to optimise small systems do not scale well to large systems with millions of applications, billions of people, and trillions of objects in the Internet of Things.

Today, storage, computation and communications are inexpensive. Our constraints are different. We need to handle much greater complexity and have the ability to scale, integrate and reuse large complex systems.

One way to scale is to add a new layer to IT systems to increase their functionality, reliability, and security. If we add a new layer it should enhance existing systems without changing them. Somewhat like the prefrontal cortex of the brain, it should coordinate applications and data to extend capability without changing what already exists.

Semantics through the use of data

In early systems, it made sense to have one place to store each item of data and to have a single source of truth. When systems were small, it was useful to embed semantics in the name of the objects and their data values. (Figure 1)

Figure 1 — meaning stored in the data

Alternatively, we can give data meaning when an application uses it. (Figure 2).

Figure 2 — Meaning comes when the application is run

Systems become complicated when we have many applications accessing the same data items. (Figure 3)

Figure 3 — Same data many meanings

The meaning of data can change depending on the application. When many applications access the same data item, a change in meaning made for one application can break the others. It also becomes difficult to add new meanings and uses of the data.
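The problem can be sketched in a few lines. In this hypothetical example, two applications read the same shared record and attach different meanings to the same field, so a change made for one silently affects the other:

```python
# Illustrative sketch (all names are hypothetical): two applications
# interpret the same shared field in incompatible ways.
shared_record = {"status": 1}

def billing_app(record):
    # Billing interprets status 1 as "invoice paid".
    return "paid" if record["status"] == 1 else "unpaid"

def shipping_app(record):
    # Shipping interprets status 1 as "dispatched".
    return "dispatched" if record["status"] == 1 else "pending"

# If billing later changes its encoding (say 1 comes to mean "pending"),
# shipping silently breaks, because both depend on one shared value.
print(billing_app(shared_record))   # paid
print(shipping_app(shared_record))  # dispatched
```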

Complexity is reduced if we isolate data by giving each application its own space for the same data item (Figure 4).

Figure 4 — Reduce complexity by making copies of the data for different meanings

Here each application has its own storage space for each data item. The application using the data provides the meaning of each data item. Isolating the data reduces problems from unforeseen interactions. It modularises the system and makes it easy to reuse an application in a different IT system. Instead of reusing data, we reuse applications across different systems. The data used with the applications is tied to the application and has no meaning independent of the application.
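One way to picture this isolation is a store keyed by (application, item) rather than by item alone. This is a minimal sketch, not an implementation; the names are hypothetical:

```python
# Hypothetical sketch: each application writes the "same" data item into
# its own isolated space, so no application can disturb another's meaning.
store = {}  # keyed by (application, item) rather than by item alone

def put(app, item, value):
    store[(app, item)] = value

def get(app, item):
    return store.get((app, item))

put("billing", "status", "paid")
put("shipping", "status", "dispatched")

# Each application sees only its own copy; changing one copy has no
# effect on the other.
print(get("billing", "status"))   # paid
print(get("shipping", "status"))  # dispatched
```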

This approach makes for secure systems. An intruder examining the data has no idea what the data means unless they also have access to the application operating on the data. Examining the data values does not reveal the meaning of the data. This means that massive security breaches of data become less likely and systems are able to quarantine security breaches and isolate intruders.
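As a sketch of why this helps security, consider a store that holds bare values with no names, units, or schema. The decoding below is hypothetical; the point is only that the values alone reveal nothing without the application that wrote them:

```python
# Illustrative sketch: stored values carry no labels or schema, so an
# intruder who reads the store alone cannot tell what the values mean.
raw_store = [7, 1, 0, 3]  # opaque values: no names, no units, no types

# Only the application holds the decoding. Here it (hypothetically)
# knows that position 0 is a loyalty tier and position 1 a paid flag.
def decode(values):
    return {"tier": values[0], "paid": bool(values[1])}

print(decode(raw_store))  # {'tier': 7, 'paid': True}
```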

Client Server and Distributed Systems

Most computer systems are client/server systems. (Figure 5)

Figure 5 — one copy of the application multiple threads

We do this to reduce the storage used for programs and the computational load on the system as a whole. Using this approach means we put the meaning of the data with the data. Control of the data then gives control over the use of the data, which makes data mining and the extraction of data viable business models. Distributing the data means the data has to be the same wherever it is stored: to have distributed ledgers, we have to know that the data elements are identical in every ledger. We achieve this with technologies like blockchain, but the cost of keeping the copies consistent grows rapidly with each added instance of the data.

By separating the meaning from the data, we can distribute systems by distributing the applications (Figure 6). Distributing multiple copies of the application, along with distributed storage, gives us modular systems. More importantly, we distribute control over meaning; while this does not stop data mining and third-party exploitation of data, it provides for alternative business models.

Figure 6 — Many copies of the same application

Now the system scales, because we only need to make sure that the copies of each application, the application that gives each data item its meaning, are the same.

Once we have this structure, we can deploy the same application across multiple entities. We can represent the connections as in Figure 7.

Figure 7 — Applications connect entities when the application runs

Figure 8 shows a truly distributed system, connected by the same applications, rather than by data.

Figure 8 — Distributed modular system with meaning emerging when application runs

Within a distributed system, all entities have equal standing. New applications with new meaning for data can be introduced incrementally without changing the existing systems. Different organisations and people may use the same applications, with the data protected in silos. The meaning of the data is only known when we can retrieve it using the application which placed it there.

To ensure the applications A, B and C are the same, we make copies of the code and give the copies to the entities. We only allow application A, for example, to communicate with another instance of application A if we can prove that the application is a clone of every other A.
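One plausible way to implement this clone check (the article does not specify a mechanism, so this is an assumption) is for each instance to exchange a cryptographic digest of its code and refuse connections when the digests differ. The function names here are hypothetical:

```python
# Hedged sketch of the clone check: before two instances of application A
# communicate, each proves it runs identical code by exchanging a
# SHA-256 digest of that code.
import hashlib

def code_fingerprint(code: bytes) -> str:
    return hashlib.sha256(code).hexdigest()

def may_communicate(local_code: bytes, remote_fingerprint: str) -> bool:
    # Allow the connection only if the peer's digest matches our own.
    return code_fingerprint(local_code) == remote_fingerprint

app_a = b"def run(): ..."           # the distributed application's code
peer = code_fingerprint(app_a)      # a genuine clone reports this digest
tampered = code_fingerprint(b"def run(): steal()")

print(may_communicate(app_a, peer))      # True
print(may_communicate(app_a, tampered))  # False
```

In practice such a scheme would also need to authenticate the digest itself (for example by signing it), since a dishonest peer could report a digest it does not actually run.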

With this new structure, the required amount of computation, storage and communications may be one or two orders of magnitude greater than with a client/server application. However, the computing power required for a given operation does not grow with the number of entities, applications or data items in the system. Taking meaning out of the data and putting it in the execution of applications is scalable, maintainable, secure, easily extended and, for humans, private.

Importantly, it opens the way for alternative business models in which the applications that give meaning to data are paid for the work they do. In contrast, today's dominant business model is to monetise the data itself: third-party controllers of data charge rent for access to it.
