Services and Isolation

What are they, and what are they to us?

Tim Meeuwissen
Jumbo Tech Campus
10 min read · Dec 20, 2021

--

We constantly talk about services. Many of us vaguely understand what they are and get the gist of it. The risk, however, is that since it’s such an easy topic to discuss, we often miss a deep common understanding. In this article I aim to establish a common understanding of what services are and how it all works, without overloading you with a bunch of details.

Which kind of services are we talking about?

The term “Service” means something different depending on who you talk to. For example: when you are more into infrastructure, or perhaps management, a service might mean something along the lines of ‘a service that a company provides for us’, or even ‘something that is done as a courtesy’. Though relevant, this is not the type of service I wish to explore in this article.

The service I want to talk about is the technical kind: the one associated with a ‘service-oriented architecture’. Whenever you speak with a backend developer, they will be inclined to talk about these kinds of services rather than the former. They are:

Technology we employ to optimise the value and output of the development organisation over the course of time

What does it mean for my organisation?

When your company creates its software by leveraging service technology, those services have an impact back on the organisation. Therefore creating services is not always the right way to go. Let me explain.

Conway’s law (about which I’ve written many times before) states that an organisation produces software that mirrors the way it is structured and organised.

This means that when you have an organisation which has multiple development teams working on the solution, you will be inclined to fragment your software in such a way that each team can develop and deploy independently from other teams.

And the same goes the other way around: when you downscale your organisation, all these fragments start to impede your teams, and they will be inclined to consolidate them into one system. This is often overlooked when organisations downscale their dev forces because they have a running product.

Fragmentation into services also comes at a cost. New technology means new ways of governing, new security concerns, new areas of expertise and new boundaries on quality assurance as well. These reflect back on the organisation, which will initially struggle to mature and adapt to this federated way of autonomy. One example is the need for something like a Site Reliability Engineering team: in other words, an on-call team that makes sure incidents are triaged and mitigated when stuff breaks outside of office hours.

Isolation

We often tend to build integrations founded on tight coupling with any system that can fulfil a required piece of the technology chain. But what happens if that piece of tech needs to be replaced, either upgraded or swapped out for a different piece of software?

Picture two people shaking hands while wearing gloves. The handshake won’t be broken when one of the parties retracts their hand and is replaced by someone else who is more capable.

What we need is a layer of isolation around the functionality. For example, say we have a Point Of Sale (POS) solution in place. We could integrate against it directly. Instead, we choose to isolate the POS from the capabilities it serves: it can, for instance, calculate baskets, create invoices and keep track of loyalty points.

If we were to integrate it directly everywhere we need these capabilities, swapping that piece of software out for a new or upgraded one would become a multi-year plan, and a pain to bring to production without risking harm to the operation.

So instead we do the integration of those systems in services that serve a business goal (the Domain-Driven Design methodology). This would, for example, mean that there is a basket calculation service which is responsible for doing basket calculations. That fact will never change. This service is responsible for talking, in our own language, about everything we deem relevant when it comes to basket calculations. The fact that it delegates the responsibility (perhaps in part, or even for the majority) to another system is an implementation choice, but irrelevant to the consumers of the functionality.

Since those responsible services communicate in an internally defined language, we need an adapter that translates the way a vendor (the POS) talks into that language. Adapters are the same kind of service from a technical perspective, with the difference that they contain no business logic other than mapping between languages, and no intermediate storage. They are there to translate, to adapt to our way of talking. That’s all.
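The split between a responsible service and its adapter can be sketched in a few lines of Python (all names here, such as BasketCalculationService, PosAdapter and the vendor payload shape, are hypothetical illustrations, not a real API):

```python
from dataclasses import dataclass

@dataclass
class BasketLine:
    """Our own language for a basket line item."""
    sku: str
    quantity: int
    unit_price_cents: int

class PosAdapter:
    """Translates our language into the vendor's; no business logic."""

    def calculate_total(self, lines: list[BasketLine]) -> int:
        vendor_items = [
            {"id": l.sku, "qty": l.quantity, "price": l.unit_price_cents}
            for l in lines
        ]
        return self._call_vendor(vendor_items)

    def _call_vendor(self, items) -> int:
        # Stand-in for the real vendor call.
        return sum(i["qty"] * i["price"] for i in items)

class BasketCalculationService:
    """Owns the capability; consumers never see the vendor behind it."""

    def __init__(self, pos_adapter: PosAdapter):
        self._pos = pos_adapter

    def total_cents(self, lines: list[BasketLine]) -> int:
        # Delegating to the POS is an implementation detail.
        return self._pos.calculate_total(lines)

service = BasketCalculationService(PosAdapter())
print(service.total_cents([BasketLine("apple", 3, 50)]))  # 150
```

Swapping the POS vendor now only means writing a new adapter; every consumer of the basket calculation service stays untouched.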

How do they work?

Services can only work in an ecosystem that allows them to do so. I will explain the main parts of how it all works so you get a better understanding of all the key components.

Loadbalancer

In its essence, a loadbalancer distributes requests across pieces of the landscape, in which:

  • Multiple copies of the same piece of code provide resiliency and speed; the loadbalancer sends incoming requests to each available copy
  • Routing only happens to healthy pieces of code, providing an optimal experience
  • Each piece of code is wrapped in something we call a Pod
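A minimal sketch of that routing behaviour, assuming a simple round-robin strategy (real loadbalancers offer many more strategies and probe health asynchronously):

```python
import itertools

class Pod:
    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy

    def handle(self, request: str) -> str:
        return f"{self.name} handled {request}"

class LoadBalancer:
    """Round-robin over healthy pods only."""

    def __init__(self, pods: list[Pod]):
        self._pods = pods
        self._cycle = itertools.cycle(range(len(pods)))

    def route(self, request: str) -> str:
        # Try each pod at most once; skip the unhealthy ones.
        for _ in range(len(self._pods)):
            pod = self._pods[next(self._cycle)]
            if pod.healthy:
                return pod.handle(request)
        raise RuntimeError("no healthy pods available")

pods = [Pod("pod-a"), Pod("pod-b", healthy=False), Pod("pod-c")]
lb = LoadBalancer(pods)
print(lb.route("GET /basket"))  # pod-a handled GET /basket
```

Note how pod-b never receives traffic: an unhealthy pod is simply skipped until it reports healthy again.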

Pods

A pod is the smallest scalable unit of software. All the business logic a team develops runs inside of it.

  • A Pod is a group of whales. Each whale represents a container, and a Pod consists of multiple containers. Hence the logo!
  • A container is a ‘fake computer’ running only one application
  • Docker is a mechanism that is able to run those containers on ‘normal operating systems / computers’
  • There is usually one container in a pod that runs the application, plus a couple of sidecar containers that support it

How we talk

Within a container, we expose so-called ‘endpoints’:

  • Software, developed by externals or ourselves, provides endpoints to read, write, update, delete or execute on a specific topic
  • A health endpoint

The health endpoint lets the loadbalancer understand whether the pod is in good shape or not.

Each endpoint is defined by a contract so that other systems understand how to talk with it. Developers have to stick to the contract, since contracts are the foundation on which other teams rely.
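A toy service exposing both a business endpoint and a health endpoint might look like this, using Python’s built-in http.server (the paths /health and /basket/total and the response shapes are illustrative assumptions, not a standard):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

class ServiceHandler(BaseHTTPRequestHandler):
    """One service: a business endpoint plus a health endpoint."""

    def do_GET(self):
        if self.path == "/health":
            # Probed by the loadbalancer to decide whether to route to us.
            self._reply(200, {"status": "ok"})
        elif self.path == "/basket/total":
            # Business endpoint; the response shape is part of the contract.
            self._reply(200, {"total_cents": 150})
        else:
            self._reply(404, {"error": "unknown endpoint"})

    def _reply(self, status: int, payload: dict):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

# Port 0 lets the OS pick a free port; call server.serve_forever() to run.
server = HTTPServer(("127.0.0.1", 0), ServiceHandler)
```

If /health stops returning 200, the loadbalancer takes this pod out of rotation without the consumers ever noticing.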

Meanwhile at Microsoft, Google and Amazon

For the really big players, it makes sense to create their own cloud.

  • Applications they run scale up and down depending on usage
  • They have to overprovision hardware in case demand is very high
  • They can buy hardware cheaply because of the sheer volume (economies of scale)
  • Why not share with others to get even better buying position?

Hypervisor

Even fake computers need resources, but which are free? A hypervisor oversees the sea of hardware and can reserve and allocate its resources to particular systems.

  • They manage all kinds of hardware. Networking, memory, harddisk and CPU.
  • The allocation of those resources is done to something called a ‘node’. A node is a so-called ‘virtual machine’ which requires and consumes resources.
  • The reservation blocks usage by other systems, so we can be certain we can scale our operations in time
  • Nodes run ‘virtually’ on these ‘physical’ machines
  • Before cloud providers did this for us, we had ‘Mesos’ to manage resources across machines.
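The reservation idea can be sketched as follows (a toy model: real hypervisors manage far more than two resource types, and the numbers are made up):

```python
class Hypervisor:
    """Tracks free hardware and reserves it for nodes."""

    def __init__(self, cpu_cores: int, memory_gb: int):
        self.free_cpu = cpu_cores
        self.free_mem = memory_gb

    def allocate_node(self, cpu: int, mem: int) -> dict:
        if cpu > self.free_cpu or mem > self.free_mem:
            raise RuntimeError("not enough free resources")
        # Reserving blocks usage by other systems.
        self.free_cpu -= cpu
        self.free_mem -= mem
        return {"cpu": cpu, "mem": mem}

hv = Hypervisor(cpu_cores=16, memory_gb=64)
node = hv.allocate_node(cpu=4, mem=16)
print(hv.free_cpu, hv.free_mem)  # 12 48
```

Once a node holds its reservation, no other system can claim those cores or that memory until the node releases them.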

Orchestration

We need to run these services on these nodes, and they need to be managed. This is the task of an orchestrator. You might have heard of Kubernetes, or K8s (just shorthand for when you have to type it very often: K + 8 letters in between + s, a.k.a. KuberneteS). An orchestrator can:

  • Release or add another pod to be connected to the loadbalancer
  • Create more pods when required (when the load is high)
  • Enable talking between pods or between nodes

All of the aforementioned is managed by Kubernetes, a.k.a. the ‘Helmsman’ or ‘the man at the wheel’. Perhaps the logo now makes sense ;-).

So how does this work when you put it together?

The flow of a request looks like this:

  • A request comes in
  • The loadbalancer sends it to the right node
  • The node sends it to another loadbalancer
  • The request ends up at a pod
  • The pod forwards it to the right container
  • The container answers with a response

Not enough resources to stay performant?

  1. The orchestrator senses that a service doesn’t live up to its defined performance boundaries
  2. It scales pods first (free: we already paid for the resource allocation per node)
  3. It scales nodes second (at a cost: it reserves extra resources from the cloud provider)
  4. After scaling, the system checks the health of the new pods and nodes
  5. The orchestrator adds the new pods and nodes to the loadbalancers
  6. Requests are now routed over more pods / more resources
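The pods-first, nodes-second decision can be sketched like this (the per-pod load model, the numbers and the pods-per-node limit are made up for illustration):

```python
def autoscale(load_per_pod, pods: int, pods_per_node_limit: int,
              nodes: int, target: float = 100):
    """Scale pods first (capacity on existing nodes is already paid for),
    then nodes (reserving extra resources from the cloud provider)."""
    while load_per_pod(pods) > target:
        if pods < nodes * pods_per_node_limit:
            pods += 1   # free: room left on the nodes we already have
        else:
            nodes += 1  # paid: reserve an extra node first
            pods += 1
    return pods, nodes

# 1000 units of load spread evenly over the pods; aim for <= 100 per pod.
total_load = 1000
per_pod = lambda pods: total_load / pods
pods, nodes = autoscale(per_pod, pods=2, pods_per_node_limit=4,
                        nodes=1, target=100)
print(pods, nodes)  # 10 3
```

Starting from two pods on one node, the sketch grows to ten pods, adding nodes only when the existing ones are full.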

So what’s new?

Back in the day we created modules. Modules were also pieces of code intended for functional isolation. They aimed for much the same thing. So what’s different when we do it with containers?

  • When you wanted to scale one piece, you needed to scale the entire thing
  • Applications often kept a lot of things in memory. Adding a second instance doesn’t help when only the first knows the actual ‘state’, which lives in memory
  • Release trains of the entire thing prevented us from quickly iterating and improving software while testing it on the visitor
  • Developers were bumping into each other. Pieces did not communicate well, which made releases prone to error. Services allow such locality and isolation that they are better testable, and teams become more accountable for their own creations
  • Modules were conventions, but there were no clear or physical boundaries between them. With containers, the boundaries are at machine level. This makes it virtually impossible to cross those boundaries, and they are much easier to reason about

Technically we still aren’t there yet.

Creating services doesn’t automatically prevent your organisation from making tight-coupling mistakes. There are some other technical aspects we need to cover before the structure can do its magic in isolation.

To prevent a situation where isolation works well on the application layer but not on the data layer, a data store may only be accessed and managed by one service.

Because imagine that your application works perfectly, but another team changes the data layer. Your application will break, without them knowing they were the cause.

That sounds simple, but it’s huge

  • Each service contains all the information it needs to work properly. That means data is stored redundantly across services: a product-list service without some product data isn’t a product list at all
  • Each service needs to be able to fail independently
  • Expect other systems you depend on to fail or be absent
  • We need to be able to predict how we talk with each other (a common data model, and ‘contracts’ on ‘endpoints’)
  • There are a lot of updates on the landscape. When a product changes, the product list needs to be aware. This is called ‘eventual consistency’

Eventual Consistency

Since we can no longer rely on ACID compliance (a capability to guard data consistency within one single, often relational, database), we need a different way of ensuring the data is in the right places at the right time.

  • Make sure that all the redundant data is processed timely and non-blocking, while maintaining isolation
  • Use publications and subscriptions:
    * The product service publishes a change
    * All who are interested in the change subscribe to it
    * They process it as soon as they can
  • There is no single point of failure in this way of communicating; data is omnipresent
  • But you need to be clear on who is the master of each particular piece of data
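A minimal in-process sketch of this publish/subscribe flow (real landscapes use a message broker, and subscribers process asynchronously, which is where the ‘eventual’ comes from; the topic name and stores are hypothetical):

```python
from collections import defaultdict

class EventBus:
    """Minimal publish/subscribe mechanism."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict):
        for handler in self._subscribers[topic]:
            handler(event)  # each subscriber processes the change itself

bus = EventBus()

# The product service is the master of product data...
product_store = {"p1": {"name": "Apple", "price_cents": 50}}

# ...while the product-list service keeps its own redundant copy.
product_list_store = {}

bus.subscribe("product.changed",
              lambda e: product_list_store.update({e["id"]: e["data"]}))

# The product service publishes a change; subscribers catch up on their own.
product_store["p1"]["price_cents"] = 60
bus.publish("product.changed", {"id": "p1", "data": product_store["p1"]})
print(product_list_store["p1"]["price_cents"])  # 60
```

The product-list service can keep serving its (briefly stale) copy even if the product service is down, which is exactly the independence we are after.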

It’s not all technology here.

Isolation is super important, but without a well-thought-out design practically everything can be isolated, and things become overly complex.

Solution: Domain-Driven Design. Bundle technology in functional business domains:

  • It helps the business take ownership
  • It makes the technology easy to understand

For example, ‘Logistics’ is too big, so reduce it to a size a team can work on. Usually ‘business capabilities’ like ‘forecasting’ or ‘sending a text message’ are clear and bounded in scope.

To sum it up

Going towards a service-oriented architecture:

  • Increases connectivity
  • Lowers complexity per business domain
  • Heightens complexity in orchestration and enablement of teams
  • Centralises repetitive complexity and makes it manageable
  • Enables us to work smarter and better together
  • Adds resiliency and performance
  • Only works when your organisational structure is as segmented as the technology
