Deploying Tornjak with Helm Charts

Mariusz Sabath
Universal Workload Identity
Sep 10, 2021

In the previous blogs, Brandon Lum provided an overview of the problems that multi-cloud deployments are facing, and he presented a specific solution called Universal Workload Identity, followed by a deep dive into the open-source project Tornjak, a tool for managing workload identities in the cloud.

Brandon’s examples show a quick start and demonstrate a few basic setup scenarios.

In this tutorial, we will walk through new deployment examples that provide a reference architecture for workloads using Tornjak and SPIRE in a Kubernetes environment.

Deployment of Tornjak Server vs SPIRE Agents

Every deployment scenario in this blog post consists of two steps:

The first step deploys the components needed for a centralized organization identity management server, mainly the Tornjak UI service bundled together with the SPIRE server, typically installed in the “tornjak” namespace. The deployment is performed by the “tornjak” Helm chart, which sets up all the components, including the proxy for communicating with the SPIRE server, the server configuration, roles, service accounts, etc. Additionally, this chart contains a plugin for deploying the OIDC extension used in the OIDC Tutorial, and the optional multi-cluster configuration with various nodeAttestors, as described later.
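
As a rough illustration, installing the server-side chart might look like the following. The chart path, release name, and value names are assumptions to verify against the repository’s documentation, and the trust domain is just an example.

kubectl create namespace tornjak
# Chart location and value names are illustrative; check the chart's
# values.yaml in the Trusted Service Identity repository for the real ones.
helm install tornjak charts/tornjak --namespace tornjak \
  --set "spireServer.trustDomain=example.org" \
  --set "clustername=my-cluster"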

The second step deploys the components needed for managing identity on the workload-hosting side. The “spire” Helm chart sets up a Kubernetes DaemonSet that deploys SPIRE agents on every worker node, typically in the “spire” namespace. It also deploys an optional instance of the Workload Registrar, which dynamically registers workload entries in the SPIRE server using a format provided by the Identity Template. More about the workload registrar here.
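
A similarly hedged sketch of the agent-side deployment is shown below; again, the chart path and value names are assumptions, and the server address points at the service created in the first step.

kubectl create namespace spire
# Value names are illustrative; check the chart's values.yaml.
helm install spire charts/spire --namespace spire \
  --set "spireServer.address=spire-server.tornjak.svc.cluster.local" \
  --set "spireServer.trustDomain=example.org" \
  --set "workloadRegistrar.enabled=true"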

Once everything is deployed, the Chief Information Security Officer (CISO) can access the Tornjak UI Dashboard to view and manage the SPIRE cluster configuration and metadata, and to securely manage agents and workload identity entries.

Assumptions

The assumptions below support the specific deployment outlined in this tutorial. There are several ways to customize the components; however, following these assumptions makes the process easier.

Ingress Access

To work properly, the agents need to be able to access the SPIRE server. When everything is deployed in one cluster, the communication can be done internally, using services, as described later in the tutorial. But when the SPIRE server is in a different cluster, it requires opening an Ingress that allows for public access.
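
For a quick experiment, one simple way to expose the SPIRE server outside the cluster is a LoadBalancer (or NodePort) service rather than a full Ingress. The service name and port below are common SPIRE defaults and should be treated as assumptions.

# Expose the SPIRE server's gRPC endpoint (default port 8081) publicly.
kubectl -n tornjak expose service spire-server --name spire-server-external \
  --type LoadBalancer --port 8081 --target-port 8081
kubectl -n tornjak get service spire-server-external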

SPIFFE Trust Domain

A SPIFFE Trust Domain corresponds to the trust root of a SPIFFE identity provider. A trust domain could represent an individual, organization, environment, or department running their own independent SPIFFE infrastructure.

During all the deployments we will be using a single SPIFFE Trust Domain.

All workloads identified in the same trust domain are issued identity documents that can be verified against the root keys of the trust domain.

Each SPIRE server is associated with a single SPIFFE Trust Domain that must be unique within that organization. Without federation, only agents that belong to the same Trust Domain can communicate with the SPIRE Server.

The trust domain takes the same form as a DNS name (for example, prod.acme.com); however, it does not need to correspond to any DNS infrastructure.
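
To make this concrete, every SPIFFE ID issued in the trust domain is rooted at that name. The identity format below is only an illustration, and the pod name and binary path are common SPIRE defaults rather than something this tutorial guarantees.

# A workload identity in the "prod.acme.com" trust domain might look like:
#   spiffe://prod.acme.com/ns/default/sa/my-workload
# Once the server is running, listing the registered entries shows that each
# SPIFFE ID is rooted at the configured trust domain:
kubectl -n tornjak exec spire-server-0 -- /opt/spire/bin/spire-server entry show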

More about SPIFFE Trust Domains.

Persisting Server Data

Tornjak and SPIRE servers are deployed as a Kubernetes StatefulSet. To persist the SPIRE data, the server keeps its artifacts in persistent storage. This can be done using hostPath (directly on the host system) or through persistent volumes.
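
If you want to confirm how the data is persisted after the deployment, you can inspect the volumes behind the server StatefulSet. The StatefulSet name used here is the usual SPIRE default and may differ in the chart.

# Show the volumes used by the server pod (hostPath or PVC-backed).
kubectl -n tornjak get statefulset spire-server -o jsonpath='{.spec.template.spec.volumes}'
# If persistent volumes are used, the bound claims will show up here:
kubectl -n tornjak get pvc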

Deployment Scenarios

This section describes various deployment scenarios, sorted by complexity, starting with the simple ones. We suggest running them in the specified order.

Instructions are based on the Helm charts and scripts available in our open-source Trusted Service Identity repository, so go ahead and clone it now, as described in this documentation.

Deployment Scenario List:

  • Tornjak and SPIRE agents in one cluster
  • Tornjak and SPIRE agents on OpenShift
  • A single Tornjak and multiple SPIRE agent sets in multiple clusters, in multiple clouds

Deploy Tornjak and SPIRE agents in one cluster

This is the simplest scenario in this tutorial. All the components are deployed in one cluster, separated by namespaces. For this scenario, we don’t need any external communication setup (e.g. Ingress). SPIRE agents can communicate directly with the SPIRE Server via Services.

The CISO can access and manage the Tornjak UI directly using HTTP

Single Cluster on local minikube or kind

This scenario requires a local Kubernetes runtime like minikube or kind. The detailed instructions, including the prerequisites, are provided here.
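
Either runtime works; for reference, creating a throwaway cluster is a single command (the kind cluster name is arbitrary):

minikube start
# or
kind create cluster --name tornjak-demo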

First, we deploy the SPIRE server using the helm chart, then we skip the Ingress setup, and go directly to accessing the Tornjak Server as documented here.
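
If you do not have an Ingress, a simple way to reach the Tornjak UI and API from your workstation is a port-forward. The service name and port here are assumptions based on a typical Tornjak setup, so check the names created by the chart.

# Forward the Tornjak HTTP endpoint to localhost and open it in a browser.
kubectl -n tornjak port-forward service/tornjak-http 10000:10000
# then browse to http://localhost:10000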

After successfully testing the access to Tornjak, we can move on to the next stage: deploying the SPIRE agents. The agents need certificates to securely communicate with the SPIRE server via TLS. Certificates are kept in the “spire-bundle” ConfigMap in the “tornjak” namespace. Because we will be deploying the agents in the “spire” namespace, we need to copy the certificates there as well.

This command does the trick:

kubectl -n tornjak get configmap spire-bundle -oyaml | \
  kubectl patch --type json \
    --patch '[{"op": "replace", "path": "/metadata/namespace", "value":"spire"}]' \
    -f - --dry-run=client -oyaml > spire-bundle.yaml

It creates the “spire-bundle.yaml” file that can be used to recreate the certificates in other namespaces and clusters. This is a useful command, and we will use it in all the scenarios.

So, let’s deploy it:

kubectl -n spire apply -f spire-bundle.yaml

Now, let’s set up access to the SPIRE server. We can use the “ExternalName Service” method because everything is deployed in the same cluster, and then we can run the deployment using the “spire” Helm chart.
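
A minimal sketch of the “ExternalName Service” method follows: it creates a service alias in the “spire” namespace that resolves to the SPIRE server service in the “tornjak” namespace, so the agents can use a stable in-cluster name. The service names are assumptions to match against the chart’s defaults.

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: spire-server
  namespace: spire
spec:
  type: ExternalName
  externalName: spire-server.tornjak.svc.cluster.local
EOF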

When deploying the SPIRE agents, we also deploy the Workload Registrar, which dynamically registers workload entries in the SPIRE server as Kubernetes pods are created. This last step requires manually registering the Registrar, after which we are ready to deploy the test workload. These steps are described in the following document.
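
The manual registration of the Registrar goes against the SPIRE server’s registration API. Below is a hedged sketch of that step; the SPIFFE IDs, namespace, and service account are illustrative, and the parent ID should be taken from the output of “spire-server agent list”.

kubectl -n tornjak exec spire-server-0 -- /opt/spire/bin/spire-server entry create \
  -admin \
  -spiffeID spiffe://example.org/workload-registrar \
  -parentID spiffe://example.org/spire/agent/k8s_psat/my-cluster/<agent-id> \
  -selector k8s:ns:spire \
  -selector k8s:sa:spire-k8s-registrar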

Once the Registrar is properly registered, you will see new entries showing up on the Tornjak interface.

Now we can deploy a sample workload, view it on the Tornjak Entry List, and then test the process for obtaining its identity, as described here.
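
Testing the identity retrieval from inside the sample workload typically boils down to calling the Workload API over the agent’s socket. The pod name, socket path, and the presence of the spire-agent binary in the workload image are assumptions here.

# Fetch an X.509 SVID from inside the workload pod via the agent socket.
kubectl exec -it my-workload-pod -- /opt/spire/bin/spire-agent api fetch x509 \
  -socketPath /run/spire/sockets/agent.sock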

Deploy Tornjak and SPIRE agents on OpenShift

The next scenario explores a deployment on OpenShift. Since OpenShift is a cloud-based Kubernetes container platform, the main benefit of deploying there is that it is cloud-provider agnostic: the interface and behavior are the same regardless of the underlying cloud provider. Additionally, it introduces elevated security, integrated Continuous Integration/Continuous Delivery (CI/CD) solutions, and high scalability.

Single Cluster on OpenShift

For this example, we will use Red Hat OpenShift Kubernetes Service (ROKS) in IBM Cloud. To get a test cluster on ROKS, follow the steps outlined here: https://www.ibm.com/cloud/openshift

Tornjak deployment is similar to the local cluster, but since OpenShift introduces additional security constraints, there are a few additional steps outlined in this document.
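
To give a flavor of those extra steps, OpenShift requires explicitly granting the SPIRE service accounts permission to use host-level resources through Security Context Constraints (SCCs). The project, service account, and SCC names below are assumptions, not the exact steps from the document.

# Allow the agent DaemonSet to mount hostPath volumes and access the node.
oc new-project spire
oc adm policy add-scc-to-user privileged -z spire-agent -n spire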

The main benefit of this exercise is that the Tornjak server is now publicly available, so we can easily access it from other locations, other clusters, and even other clouds.
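
On OpenShift, that public access is typically provided by a Route in front of the Tornjak service; the service name is again an assumption.

oc -n tornjak expose service tornjak-http
oc -n tornjak get route tornjak-http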

Deploy a single Tornjak and multiple SPIRE agent sets in multiple clusters, in multiple clouds

Lastly, we will set up a multi-cloud solution where Tornjak and SPIRE server are hosted in one cluster, and workloads with the SPIRE agents are deployed in different clusters hosted in different clouds (IBM Cloud and Amazon EKS).

The benefit of keeping Tornjak and the SPIRE server separate from the SPIRE agents and workloads is a better threat model. The blast radius of attacks initiated from inside a workload would be limited to the agent side and should not affect the SPIRE server configuration. Another advantage is the ability to use one Tornjak instance to manage multiple clusters of agents.

Multi-cluster deployment

When we deploy workloads in clusters that are different from the one that is hosting Tornjak, the SPIRE agents must be able to communicate with the SPIRE server using a public interface. This is typically done by setting up Ingress or External Routes that expose external communication with the cluster. Ingress setup depends on the Cloud Service Provider.
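
Beyond the Ingress, each remote cluster also needs the trust bundle created earlier, so the agents can establish TLS with the server. The kubectl context name below is a placeholder.

kubectl --context my-remote-cluster create namespace spire
kubectl --context my-remote-cluster -n spire apply -f spire-bundle.yaml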

Another challenge is that the SPIRE server must be able to trust the remote agents. This is done through node attestation. The default nodeAttestor is the Kubernetes attestor (“k8s_psat”). The idea is as follows: we need to capture a portable KUBECONFIG file for every remote cluster and pass them to the SPIRE server, so it can call back into each remote cluster and pull Kubernetes-level information about each worker node. This was not needed when the SPIRE server was running in the same cluster as the agents.
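
A hedged sketch of passing the KUBECONFIG files to the SPIRE server is shown below: the kubeconfigs are stored in a Secret that the server can mount, and each key should match a cluster name configured in the k8s_psat attestor’s clusters map. The secret name, file names, and mount mechanics are assumptions about the chart.

# Collect a portable kubeconfig per remote cluster, then hand them to the
# SPIRE server as a secret it can mount.
kubectl -n tornjak create secret generic kubeconfigs \
  --from-file=cluster-eks=./kubeconfig-eks \
  --from-file=cluster-iks=./kubeconfig-iks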

In addition to the Kubernetes attestor, there are also cloud-provider-specific attestors that we are experimenting with. At the time of writing this blog, we were able to successfully use the AWS nodeAttestor.

We suggest using the OpenShift cluster created in the previous exercise to host the central Tornjak server. We can either reconfigure the existing Tornjak deployment or reinstall it using the multi-cloud setup.

The required steps are described in detail in the multi-cluster document.

What’s Next?

Hopefully, you found this set of examples useful, and now you are ready to deploy actual applications that take advantage of the Universal Workload Identity approach, to better control access to services or applications hosted in multiple clouds.

Our next tutorial will discuss Templates for Custom Identity format. This functionality comes in handy when we are using the multi-cluster solution. It is a good idea to pass cluster-specific information like the cluster name, region, or data-center location together with the identity. All this additional information can be configured at the cluster level during the SPIRE agents’ deployment. For now, see more info here.

We will also focus on providing instructions for setting up OpenID Connect (OIDC) as a secure solution for multi-cloud remote access to AWS S3 storage, and secrets managed by HashiCorp Vault, using a single Tornjak server that provides better control over security and identity auditability for your organization.
