Governance for Multiple Teams Sharing a Nomad Cluster

Roger Berlind
HashiCorp Solutions Engineering Blog
May 10, 2020

HashiCorp Nomad is an easy-to-use, flexible workload orchestrator that enables organizations to automate the deployment of any application on any infrastructure at any scale across multiple clouds. While Kubernetes gets a lot of attention, Nomad is an attractive alternative that is easier to use, more flexible, and natively integrated with HashiCorp Vault and Consul. In addition to running Docker containers, Nomad can run non-containerized, legacy applications on both Linux and Windows servers.

A Single Workflow Across Multiple Clouds

Introduction

Like all of HashiCorp’s solutions, Nomad has both open source and enterprise versions. Nomad Enterprise adds features that enable multiple teams within large organizations to run their applications on shared Nomad clusters without interfering with each other.

On May 13, 2020, I delivered a webinar called “Governance for Multiple Teams Sharing a Nomad Cluster” that focused on three of these Nomad Enterprise features: Namespaces, Resource Quotas, and Sentinel Policies. Together, they allow multiple teams to share Nomad clusters without interfering with each other. I also demonstrated these features and Nomad Access Control Lists (ACLs) with a hands-on Instruqt track that you can try yourself. This track contains an updated version of a demo I delivered in another webinar on March 27, 2019. The demo illustrates how a security team can restrict which Nomad task drivers and Docker images can be used by teams deploying applications to a Nomad cluster.

You can read the original blog post associated with that webinar here. That post introduces Nomad’s architecture, describes the four Nomad features mentioned above, and describes the original Nomad Multi-Job demo that I delivered in that webinar.

In this blog post, I will only cover changes that I made to the demo while porting it from the original GitHub repository to the new Instruqt track.

Changes to the Demo

The original version of the demo used Nomad 0.8.6 and Consul 1.3.0. The Instruqt track runs Nomad 0.11.1 and Consul 1.7.2. While Nomad has added many new features in the last year, there were no significant changes to its governance features.

The biggest change in the demo is the way it is deployed. The original demo used Terraform to deploy the Nomad cluster and used Terraform’s Nomad provider to create the Nomad ACLs, resource quotas, namespaces, and Sentinel policies. In contrast, the Instruqt track uses bash scripts to provision the Nomad cluster and uses the Nomad CLI to create all the Nomad constructs. This change was made so that people using the track could create these constructs themselves.
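As a minimal sketch of what creating these constructs with the Nomad CLI can look like (this is not the track's actual script), consider the following. Only the qa namespace and the restrict-docker-images policy name come from this post; the quota name and limits, the namespace description, the allowed image names, and the body of the Sentinel policy are illustrative assumptions. The run helper is also an assumption added for readability: it prints each command unless DRY_RUN=0 is set, so the sketch can be followed without a Nomad Enterprise cluster.

```shell
#!/usr/bin/env bash
set -eu

# Print commands by default; set DRY_RUN=0 to execute them against a real
# Nomad Enterprise cluster (NOMAD_ADDR and NOMAD_TOKEN must then be set).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical resource quota specification (name and limits are assumptions).
cat > qa-quota.hcl <<'EOF'
name        = "qa"
description = "Resource quota for the qa namespace"

limit {
  region = "global"
  region_limit {
    cpu    = 2500
    memory = 2000
  }
}
EOF

# Hypothetical Sentinel policy: every Docker task must use an allowed image.
# The image names below are assumptions, not the demo's actual list.
cat > restrict-docker-images.sentinel <<'EOF'
allowed_images = ["nginx", "mongo"]

all_images_allowed = rule {
  all job.task_groups as tg {
    all tg.tasks as task {
      task.driver is not "docker" or
      any allowed_images as prefix {
        task.config.image matches "^" + prefix
      }
    }
  }
}

main = rule { all_images_allowed }
EOF

# Create the quota, attach it to the namespace, and register the policy.
run nomad quota apply qa-quota.hcl
run nomad namespace apply -quota qa -description "QA team" qa
run nomad sentinel apply -level soft-mandatory -scope submit-job \
  -description "Restrict allowed Docker images" \
  restrict-docker-images restrict-docker-images.sentinel
```

Running the commands one at a time, rather than through a Terraform apply, is what lets people taking the track see each construct appear in the cluster as they create it.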

Additionally, while the original demo ran in AWS, the Instruqt track runs in Google Cloud Platform (GCP), which is Instruqt’s preferred cloud provider.

Another significant change is that I added a new Nomad ACL policy that gives a new user, Charlie, the ability to override Sentinel policies that have the soft-mandatory enforcement level. In the README.md of the original demo, I had mistakenly suggested that the user Bob could have overridden the restrict-docker-images.sentinel policy if he had wanted, but that was not actually true since Bob’s ACL token did not have the sentinel-override capability for the qa namespace. The new policy, override, gives Charlie that capability for all three namespaces used in the demo.
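A policy like this can be sketched roughly as follows. This is an illustration rather than the exact file used in the track: the qa namespace is named above, but "default" and "dev" are assumed names for the other two namespaces, and the run helper (which prints commands unless DRY_RUN=0 is set) is added only so the sketch can be read without a cluster.

```shell
#!/usr/bin/env bash
set -eu

# Print commands by default; set DRY_RUN=0 to execute them with a
# management token exported as NOMAD_TOKEN.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# ACL policy that grants only the sentinel-override capability.
# "default" and "dev" are assumed namespace names; "qa" is from the post.
cat > override.hcl <<'EOF'
namespace "default" {
  capabilities = ["sentinel-override"]
}

namespace "dev" {
  capabilities = ["sentinel-override"]
}

namespace "qa" {
  capabilities = ["sentinel-override"]
}
EOF

# Register the policy and create a client token for Charlie that uses it.
run nomad acl policy apply \
  -description "Override soft-mandatory Sentinel policies" \
  override override.hcl
run nomad acl token create -name "Charlie" -type client -policy override
```

The sentinel-override capability by itself does not allow job submission, so in practice Charlie's token would also need whatever policies grant him submit access to those namespaces. The -policy flag can be repeated on nomad acl token create to attach several policies to one token.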

The demo flow is the same as in the original demo with the following exception: after Bob modifies and deploys the webserver-test.nomad job, Charlie modifies it in a way that violates the restrict-docker-images.sentinel policy and then re-runs the job and overrides the violation of that policy.
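Charlie's last step might look something like the sketch below. It assumes NOMAD_TOKEN holds Charlie's token and that webserver-test.nomad is in the current directory. The run helper and the charlie-override.log file are hypothetical additions so the commands can be printed and inspected without a cluster; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
set -eu

# Print commands by default; set DRY_RUN=0 to run them against a real cluster
# with Charlie's ACL token exported as NOMAD_TOKEN.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

{
  # Without the override flag, a job that violates a soft-mandatory policy
  # is expected to be rejected, so a failure here is tolerated.
  run nomad job run webserver-test.nomad || echo "job rejected by Sentinel"

  # Because the token carries sentinel-override, the same job can be
  # submitted again with -policy-override.
  run nomad job run -policy-override webserver-test.nomad
} | tee charlie-override.log
```

The override only applies to soft-mandatory policies. A policy registered at the hard-mandatory level cannot be bypassed this way, regardless of the token's capabilities.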

Conclusion

In this blog post, I announced the May 13, 2020 webinar, “Governance for Multiple Teams Sharing a Nomad Cluster”, and described the changes made to the original demo for the public, hands-on Instruqt track that readers can use to learn about Nomad’s governance features themselves.

You can view my slides and a recording of the webinar here.

Roger is a Sr. Solutions Engineer at HashiCorp with over 20 years of experience explaining complex technologies like cloud, containers, and APM to customers.