With a team made up of people from various universities and professional backgrounds (a few even majored in entirely different subjects), Cermati's engineering team is quite diverse. Because of this diversity, not everybody on the team arrives with the same common body of knowledge beyond the required computer science and software engineering skills. This drives discussions on various topics among us, through which we help each other understand new concepts and grow, filling the gaps in our knowledge.
This article is inspired by a discussion I had with some of my fellow Cermati engineers when we were talking about how modern digital infrastructure works and things that happen under the hood.
To understand how modern digital infrastructure works and what we need to plan when building one, we first need to understand a bit about the physical concepts of the hardware and networks, and how to manage them. Paraphrasing a comment by Thomas Dullien on an article about the Spectre vulnerability earlier this year: computer science loves to be in proximity to pure mathematics, but it actually lives between electrical engineering and mathematics; there's physics involved in the process.
Here’s the full comment, for those unwilling to leave this article for that one yet.
I think this is too dark a post, but it shows a useful shock: Computer Science likes to live in proximity to pure mathematics, but it lives between EE and mathematics. And neglecting the EE side is dangerous — which not only Spectre showed, but which should have been obvious at the latest when Rowhammer hit.
There’s actual physics happening, and we need to be aware of it.
This article isn't exactly about hardware security, or security in general, the way the article I referenced earlier is, although security is part of what we're going to talk about. Rather, we're going to talk about distributed systems and some of the principles behind them; some are physical, but this article isn't a physics lecture, so let's skip the details.
Our infrastructure is distributed among several cloud service providers and some physical infrastructure. A part of our job in Cermati’s infrastructure team is to ensure our distributed infrastructure works reliably in unison to support our organizational needs.
While Cermati's system at the moment isn't at the point where we literally manage machines placed at many points on the map above (at the time of writing it's just a few locations in Singapore and Jakarta), understanding some of the basic principles helps us design our system's architecture to make the most of the resources we have at hand.
Managing all of this requires some understanding of how each part of the system works and how to best utilize it. Every part of a computing system is composed of electrical circuits and components on the physical side, and of code on the virtual infrastructure and software side.
In a distributed system, the components can be physically separated by a great distance. Similar to a computer’s architecture — with different electrical circuits and components engineered to work together in sync — a distributed system consists of multiple machines working across a networked system. The networked system can be different hosts in the same data center, or even physically separated by oceans and continents.
Host Capacity Planning
Software needs to run on top of other layers. Exactly how many layers we have under our software depends on what kind of software we have in the given context, which we will explain later. But despite all that layering, we always have hardware underneath on which our software ultimately runs.
Our commodity computing hardware consists of electrical and mechanical components: generally CPU, memory, and storage. These components are assembled into a computing device capable of serving our computational needs.
On top of the hardware with the actual components, we might have virtual machines with virtual CPU, memory, storage, and so on. These virtual machines work just like their hardware counterparts, but with software logic at their core instead of hard-wired circuitry. This software acts as another layer in the system, interacting with the physical host's operating system, which has the actual access to the bare-metal hardware.
A host machine comes with resources we can get according to our budget. Choosing which resource to maximize for which host should depend on what kind of operations we expect the host to perform the most. Understanding a bit of computer architecture will help us determine what kind of machines we’re going to need for our system.
For machines that perform heavy computational operations, getting optimal CPU power for the code running on the machine is a good thing. The CPU is used whenever an instruction is executed on the host, but execution can go idle while waiting for memory, disk, or network I/O; during that idle time, the CPU is largely unused.
CPU power is usually measured by clock rate, expressed in Hertz: 1 Hz is one cycle of clock state change per second, while 5 Hz is five cycles per second.
A higher clock rate means the CPU changes state more frequently, which generally translates to faster execution of instructions, although actual speed also depends on how many cycles each instruction takes.
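As a rough illustration of the relationship between clock rate and execution time, we can model it as cycles divided by frequency. The figures below are illustrative assumptions, not benchmarks, and real CPUs complicate this with pipelining, caches, and varying cycles per instruction.

```python
# Back-of-the-envelope model: execution time = total cycles / clock rate.
# Instruction counts and cycles-per-instruction here are made-up examples.

def execution_time_seconds(instruction_count: int,
                           cycles_per_instruction: float,
                           clock_hz: float) -> float:
    """Estimate how long a stream of instructions takes to execute."""
    total_cycles = instruction_count * cycles_per_instruction
    return total_cycles / clock_hz

# One billion simple instructions at ~1 cycle each:
t_3ghz = execution_time_seconds(1_000_000_000, 1.0, 3.0e9)  # ~0.33 s
t_2ghz = execution_time_seconds(1_000_000_000, 1.0, 2.0e9)  # ~0.50 s
```

The same workload finishes proportionally faster on the higher-clocked CPU, all else (which rarely is) being equal.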
Another common component we consider when choosing a host is the memory. Physical memory is a circuit containing capacitors, which keep bit representations of data. CPU access to memory is generally fast (even the instructions we need to run are stored in memory), so memory capacity is usually maximized to hold things that need to be accessed with minimal disk or network I/O delay; an example case is an in-memory cache server.
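The in-memory caching idea can be sketched in a few lines: keep frequently-read values in RAM so lookups avoid disk or network I/O. This is a minimal illustration with a simple time-to-live policy, not a production cache; the class and key names are made up for the example.

```python
import time

class TTLCache:
    """Minimal in-memory cache: values live in RAM and expire after a TTL."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazily evict expired entries
            return default
        return value

cache = TTLCache(ttl_seconds=60)
cache.set("user:42", {"name": "Alice"})
cache.get("user:42")  # served from memory, no disk or network round trip
```

Real cache servers such as Redis or Memcached add eviction policies, memory limits, and network protocols on top of the same core idea.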
The other component to consider is the disk. Disk access is generally slow since most of the disks we use are rotating magnetic disks. This type of disk stores data on magnetic platters that rotate while read/write heads move over them to read and modify the magnetic state.
The rotations required for sector lookup take time, so access to this storage is significantly slower than memory access. SSD storage is faster because it doesn't need to perform magnetic disk rotations for lookups and writes, but at present it's relatively expensive.
The type of storage and how much capacity it should have depend on how the storage will be used. We can also set up swap partitions on storage to be utilized as memory whenever we run out of physical memory, but swapping will be slow for the aforementioned reasons why storage access is slower than memory access.
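To put the memory-versus-storage gap in perspective, here are widely-cited order-of-magnitude latency figures. These are rough assumptions for illustration, not measurements of any particular hardware.

```python
# Ballpark access latencies in nanoseconds (order-of-magnitude assumptions).
LATENCY_NS = {
    "main memory access": 100,          # ~100 ns
    "SSD random read": 100_000,         # ~100 us
    "HDD seek + rotation": 10_000_000,  # ~10 ms
}

baseline = LATENCY_NS["main memory access"]
for medium, ns in LATENCY_NS.items():
    print(f"{medium}: {ns:,} ns (~{ns // baseline}x memory access)")
```

Even with generous assumptions, a spinning disk access costs on the order of a hundred thousand memory accesses, which is why swapping to disk degrades performance so sharply.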
As our application software architecture is generally designed to keep no local state on storage, an application server usually needs a decent CPU (multiple cores would be nice, depending on the processing the application performs) and a considerable amount of memory, depending on how much data we expect to be loaded into memory during the application's peak time. We generally don't need a lot of storage capacity for application servers; enough to keep recent log files and a few extra kernel images should be fine.
Database machines would need a bigger storage capacity to be able to store a lot of data, also considerable memory and CPU to keep connections and perform data operations.
After deciding what specifications we need for our machines, we need to think about the placement of the machines. The most obvious consideration when deciding where a machine should be located is where the clients accessing our machines are.
The physical distance between the machines is one thing to consider. Since at present Cermati only serves Indonesian customers, it doesn't make sense to put the application servers in a data center in the US when we have alternatives in Singapore or, even better, in Indonesia (under the assumption that the transmission medium factors stay the same).
A greater physical distance translates to a longer travel time for the data packets communicated through the network, which means more network latency. Since the packets travel at a certain velocity, covering more distance requires more time: t = d/v, so assuming the velocity (v) is constant, the more distance (d) to be covered, the longer the time (t) it takes for the packets to reach their destination.
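We can use t = d/v to put a hard lower bound on latency. Light in optical fiber travels at roughly two-thirds of its vacuum speed, about 200,000 km/s; the distances below are rough great-circle assumptions, and real routes are longer and add switching delays.

```python
# Lower bound on round-trip time over fiber, from t = d / v.
FIBER_KM_PER_S = 200_000  # ~2/3 of the speed of light in vacuum

def min_rtt_ms(distance_km: float) -> float:
    """Best-case round trip: one-way time doubled, converted to ms."""
    one_way_s = distance_km / FIBER_KM_PER_S
    return 2 * one_way_s * 1000

jkt_sg = min_rtt_ms(900)      # Jakarta <-> Singapore: ~9 ms at best
jkt_us = min_rtt_ms(14_000)   # Jakarta <-> US west coast: ~140 ms at best
```

No amount of engineering can beat these bounds, which is why serving Indonesian customers from a US data center starts at a severe disadvantage.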
The actual routing — unless the organization we’re working at is an ISP or large enough to have its own large-scale WAN network infrastructure — is managed by the ISP companies running the intermediary network, so we generally don’t need to plan too much regarding the network route outside the scope of our private network. We only need to focus on the obvious part: the location of the data centers and the physical distance between the users and our machines.
Since the network transmission rate translates to performance for processes with network I/O, we put hosts that communicate intensively with each other closer together, physically and logically, to reduce the network latency between them. We also need to make sure we have sufficient network bandwidth.
Ideally, all of the hosts would stick together so they can communicate effectively and everything can be easily integrated. But some circumstances lead us to have hosts in multiple cloud providers' data centers while also maintaining some physical machines ourselves. Yet these systems have to communicate with each other to stay in sync for the company to operate. We manage this by designing the physically-separated systems to communicate with each other relatively infrequently, with extra failure handling to ensure communication reliability.
Any group of hosts that requires fast and reliable communication with each other should be placed together in the same location, with some of the hosts taking the role of synchronizing the system with the other systems at separate locations.
For static files and assets, we can use a CDN to take some load off our machines and networks by having the CDN cache them and letting the CDN servers, rather than our machines, serve the content.
Security is a bit complicated to talk about, since it covers not only how to build the system but also how to manage and govern it well. The basic concepts of security can be summed up using the CIA triad below.
- Confidentiality means that a resource is accessible only to the people authorized to access it.
- Integrity means that when the resource is accessed we can verify that it hasn’t been tampered with and we can trust the resource we’re accessing is the correct one.
- Availability means that the resource is accessible whenever needed.
The points explained above are general concepts that can be applied in any context, as the resources referred to in them are context-independent. Not every context needs all three; some, such as encryption algorithm strength, only need to consider confidentiality and integrity.
If applied to our infrastructure, we can set the contexts as listed below.
- Confidentiality means that no information regarding the internal workings of our system is leaked due to misconfiguration of the infrastructure. An example violation of confidentiality in our system would be our monitoring dashboard charts becoming publicly accessible due to a misconfigured access control rule.
- Integrity means that the information we have on the infrastructure system must be correct and can't be tampered with in between processes to manipulate the system or its users. An example violation of integrity would be one of our isolated systems synchronizing a set X of data to another, but the target system receiving not set X but a slightly different set Y; even a slight difference means that X ≠ Y, which violates the integrity of the communication.
- Availability means that the system should always be accessible whenever needed. An example violation of availability would be a system outage.
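The integrity point above is commonly enforced with checksums: the sender publishes a digest of the data set, and the receiver recomputes it to verify the payload wasn't altered in transit. This is a minimal sketch of that idea; the payloads are illustrative stand-ins for the synchronized data sets.

```python
import hashlib

def digest(payload: bytes) -> str:
    """Cryptographic fingerprint of a payload; any change alters the digest."""
    return hashlib.sha256(payload).hexdigest()

sent = b"set X of data"
checksum = digest(sent)          # transmitted alongside the payload

received = b"set Y of data"      # tampered with or corrupted in transit
tampered = digest(received) != checksum  # even a slight difference is caught
```

In practice this is usually handled by the transport (e.g. TLS) or the synchronization protocol itself, rather than hand-rolled per message.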
As for what we're doing, we focus more on confidentiality, since some parts of our infrastructure need to be accessible from the public Internet and the confidentiality of the data stored and communicated by the systems is the basic requirement we have when configuring them. Integrity is normally built into the synchronization procedures, while availability is usually covered when we plan for system capacity and reliability.
To ensure the security of our infrastructure access, we can set up security access rules on our cloud infrastructure providers for each of our hosts. AWS and Alibaba Cloud have security groups — virtual firewall rules configurable for each host.
As our system consists of smaller systems distributed across multiple locations and third-party services such as SMS gateway services and partner APIs, we need to think about how the systems can communicate with each other securely.
One method is setting up VPN gateways to allow hosts from other locations to access the target network's machines without exposing them to the public Internet. But setting up VPN gateways on every part of the system would be costly, since we would also need to maintain them.
Another method is using firewalls on network perimeters together with authentication, allowing external access only to authenticated requests from whitelisted source IP addresses. Of course, this only works for systems with static public IP addresses; otherwise we'd need to update the whitelist every time the ISP issues a new public IP address to one of our separated systems.
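The whitelisting check itself is simple set membership over IP networks. A minimal sketch using Python's standard `ipaddress` module is below; the networks listed are reserved documentation ranges, not our actual addresses.

```python
import ipaddress

# Illustrative whitelist: a hypothetical office range and a partner gateway.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # e.g. office static IP range
    ipaddress.ip_network("198.51.100.17/32"),  # e.g. a partner's gateway
]

def is_whitelisted(source_ip: str) -> bool:
    """Return True if the source address falls inside any allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)

is_whitelisted("203.0.113.42")  # inside the office range: allowed
is_whitelisted("192.0.2.10")    # unknown source: rejected before auth
```

In production the same rules usually live in perimeter firewalls or cloud security groups rather than application code, but the logic is the same.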
A huge part of maintaining system security is designing proper authorization: regulating who may access what. Most of that time is spent determining how the access is performed and deciding whether the access should be granted.
We haven't dived too deeply into user account management beyond managing IAM (for AWS), RAM (for Alibaba Cloud), VPN, and SSH accounts. This still works fine at our current scale, but we'll need a way to manage many more accounts in order to scale as our team grows.
Security is an important consideration when building our infrastructure, since anyone with access gains entry to parts of our production system. The impact of insecure infrastructure can range from our system occasionally behaving strangely to massive information leakage and data theft. Keeping our attack surface small while effectively governing our infrastructure and workflow is a challenge of its own.
Infrastructure system management and planning has been an important part of Cermati's engineering since our early days, due to our need to run our own physical call center infrastructure and the nature of our business, which requires us to comply with government regulations. These needs push us toward a mix of cloud and physical infrastructure, hosted across different data centers and cloud providers.
There are a lot of things to consider when building and managing IT infrastructure, and we've covered some of the basics of setting up and managing machines and computer networks. Still, this is only a quick introduction, with examples drawn from what we encounter in our own infrastructure.
In practice, we use a lot of pre-made components and tools. Using readily-available tools can significantly speed up our process, but researching how they fit into the big picture of our systems architecture, which requires a good understanding of how IT infrastructure works, is an important part of the job.