OpenShift — Yet Another Project Abstraction Layer
Implementing and running a complex business model requires many different teams. Usually, each team covers its own business scope, and most of the time those teams depend on other teams. There are three types of dependencies here:
- runtime dependencies (for example the shopping cart team depends on the user team),
- development dependencies (for example, every dev team depends on the operations team, or on the DWH team to figure out whether the product fits the market),
- organisational dependencies (for example, tech teams depend on the HR team to hire colleagues with the expertise they need).
In our case, we leave the organisational and development dependencies out of scope and focus on runtime dependencies only. Runtime dependencies are thought to be easy to manage, since we all build microservices. In reality, it is far from easy. The things that can go wrong are:
- bidirectional and/or circular dependencies,
- multiple sources of truth,
- bottlenecks.
Not that there are no ways to resolve these issues. There are design patterns like Bounded Context that (theoretically) tackle these problems by encouraging us to focus on business domain isolation, reducing all dependencies that don't explicitly belong to a business scope. But in practice things get complicated very fast. In a distributed environment you have no private or protected class methods anymore; your upstream and downstream dependencies live in a network.
Having well-designed boundaries in a distributed architecture requires a lot of infrastructure effort. Effort that is supposed to be automated. One solution to this dilemma is OpenShift.
Ground Zero
Let's start with a bit of history, because OpenShift has been around for a while now. Its first release was published in May 2011, more than two years before Docker even existed, and three years before Kubernetes did. The initial idea was to provide a Platform as a Service, or PaaS for short, for "application developers and teams to build, test, deploy, and run their applications", based on so-called "Gears", Red Hat's name for Linux containers at that time.
In September 2013, Red Hat announced a collaboration between OpenShift and Docker, and so libcontainer, a community-driven containerization standard, was born. In June 2015, the third generation of OpenShift was released, with Docker and Kubernetes as first-class citizens. As you can imagine, it didn't have that much in common with the previous generations of OpenShift.
But at the end of the day, OpenShift V3 still sticks to the original goal it was first developed for: being a single interface for everything from build to deployment, ending up in control of your application state. That sounds pretty much like it targets both the runtime and the development dependency concerns.
The way it does this is pretty straightforward. First of all, we get a homogeneous domain-specific language (DSL) for the whole application lifecycle. Dependencies become a matter of declaring a series of Objects in YAML/JSON; the rest, like Policies, Constraints and Service Accounts, is about organizing your team's working scope.
The range of OpenShift objects spans the Kubernetes objects plus a few custom types.
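To illustrate, here is a minimal sketch of one of those custom types: a Route, which exposes a Service under a hostname. This is not from the article itself; the route name and hostname are assumptions for illustration only.

```yaml
# Route is an OpenShift-specific object (not part of plain Kubernetes)
# that exposes a Service to the outside world under a hostname.
apiVersion: v1
kind: Route
metadata:
  name: upstream-route            # assumed name, for illustration
spec:
  host: upstream.example.com      # assumed hostname, for illustration
  to:
    kind: Service
    name: upstream-service
  port:
    targetPort: 8080
```

Declaring this object is all it takes; OpenShift's router picks it up and starts serving the hostname, in the same declarative style as any Kubernetes object.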
Project as a Bounded Context
Imagine a team providing a checkout process for buying a product. The first thing we would do in OpenShift is create a project for it:
```shell
$ oc new-project product-checkout
```
A project, which is built upon a Kubernetes namespace, gives us a base level of isolation in "environments with many users spread across multiple teams".
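Under the hood, creating a project results in something roughly like the following namespace. This is a sketch; the exact annotations depend on your OpenShift version, and the requester value is an assumed username.

```yaml
# Roughly what `oc new-project product-checkout` produces under the hood:
# a plain Kubernetes namespace carrying OpenShift-specific annotations.
apiVersion: v1
kind: Namespace
metadata:
  name: product-checkout
  annotations:
    openshift.io/display-name: ""
    openshift.io/requester: developer   # assumed username, for illustration
```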
It is pretty much the same as deploying a VM instance at a cloud provider like AWS, with basic properties like:
- only a restricted set of people is able to log in there,
- a network boundary.
It sounds like we already got some isolation for free. Let's take a look at the network boundary in detail. A hostname that is accessible as-is to anyone who thinks they need it does not really isolate much. There is definitely a need for some proxy in front of the instance.
The classical approach would be to put an API Gateway in front of the instance and have every service register itself there. The OpenShift way looks like this:
Code Snippet 1
```yaml
apiVersion: v1
kind: Service
metadata:
  name: upstream-service
spec:
  ports:
    - name: rest
      port: 8080
  selector:
    app: upstream
```
In the classical approach, your app needs to know the address of the API Gateway and register itself there in order to receive requests. In OpenShift this is handled on its own; it is not the app's job to integrate itself into the microservice topology. So in OpenShift's case we have one critical runtime dependency less, without losing any of the resilience aspects an API Gateway provides, like load balancing, failover and so on.
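To make this concrete, a downstream pod can reach the upstream purely through the stable DNS name that the Service above gives it; nothing in the deployment registers anywhere. This is a hedged sketch: the deployment name, image and environment variable are assumptions, not from the article.

```yaml
# Sketch of a downstream deployment. There is no registration step:
# the app simply calls the stable DNS name Kubernetes gives the Service.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: downstream
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: downstream
    spec:
      containers:
        - name: downstream
          image: example/downstream:latest   # assumed image, for illustration
          env:
            - name: UPSTREAM_URL             # assumed variable name
              value: "http://upstream-service:8080"
```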
Let's imagine the team that delivers the upstream service becomes sensitive about who is allowed to talk to it. Maybe they worry about the load, or they just want to know which teams have to be notified if they introduce breaking changes. There is no standard way to do this with an API Gateway, because a downstream instance simply has no identity in terms of the API Gateway (you have to resort to OAuth, firewalls or whatever other techniques enable controlled access).
Since version 3.5, OpenShift has NetworkPolicy. In order to manage access to your pod on your own, all you need to do is provide a NetworkPolicy object (Code Snippet 2).
Code Snippet 2
```yaml
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: upstream-networkpolicy-for-downstream-project
spec:
  podSelector:
    matchLabels:
      app: upstream
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              project: downstream-project
      ports:
        - protocol: TCP
          port: 8080
```
As a final result, we have a masterpiece of isolation where:
- the instance is visible only to those services that are explicitly allowed to reach it,
- the application code just needs to care about the business logic and nothing else.
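One caveat worth spelling out: the namespaceSelector in Code Snippet 2 only matches if the downstream project's namespace actually carries the expected label. A sketch of what that namespace has to look like (the label key and value simply mirror the ones in the NetworkPolicy above):

```yaml
# The downstream project's namespace must carry the label that the
# NetworkPolicy's namespaceSelector matches on, or no traffic is admitted.
apiVersion: v1
kind: Namespace
metadata:
  name: downstream-project
  labels:
    project: downstream-project   # must match the matchLabels in Code Snippet 2
```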
And it is transparent. You can always see which network policies are applied to a project:
```shell
$ oc get -n upstream-project networkpolicies
NAME                                            POD-SELECTOR   AGE
upstream-networkpolicy-for-downstream-project   app=upstream   1m
```
Conclusion
Boundaries are not easy, both from the domain and from the infrastructure point of view. Big IT companies have experienced this as well, some of them ending up with a "death star" architecture.
OpenShift makes the infrastructure part of isolating things easy. It makes isolation explicit and, as a result, transparent for everyone. From now on it is possible to evaluate your whole dependency chain by talking to the OpenShift API. The downside is exposing your whole infrastructure to more or less a single vendor. Though you are still free to do things like building the application outside of OpenShift and relying on it only for deployment, its concept of a project is, after all, just a Kubernetes namespace on steroids; the whole isolation we talked about in this article boils down to namespaces. In the end, the question we have to answer is how much added value you obtain by isolating your runtime dependencies. If you see that value, then you definitely have another good reason to give OpenShift a try in a multi-team environment.
If you found this article useful, give me a high five 👏🏻 so others can find it too, and share it with your friends. Follow me here on Medium (seva dolgopolov) to stay up-to-date with my work. Thanks for reading!