SYSTEM TEST AND ENGINEERING SERVICES

Ephemerally Yours, Part 2

Empowering Engineering Efficiency with Infrastructure-as-Code

Joe Zollo
VMware 360

--

In the first of this two-part series, we took a look at our Infrastructure-as-Code solution and some of the problems we resolved. Now, let’s take a technical deep dive into our architectural designs.

File Photo: Engineering Efficiency = More Time for Nyan Cat Videos!

VMware’s digital workspace platform, VMware Workspace ONE, is a leading Unified Endpoint Management (UEM) solution that enables IT administrators to manage thousands of devices — smartphones, computers and tablets — with ease.

In the first installment, we discussed how VMware, specifically its End-User Computing (EUC) team, has taken a SaaS-first approach to software development — and the fundamental change that comes with that approach. We also discussed how our team developed an internal infrastructure automation platform that enabled and empowered engineering efficiency — but how does it all work?

Obfuscating Difficulty

We’re enabling engineers to be more efficient by reducing the time it takes to build test environments — but we’re also obfuscating the difficulty behind getting these components stood up. Is this generally a good thing? We believe it is, and here’s why.

When it comes to building a test environment, you need a strong understanding of almost every component — even those outside your engineering scope. Sure, there are some rockstar full-stack engineers out there, but they’re a rare breed. When you hire an iOS developer, it would be unreasonable to expect them to know how to stand up and manage a full Exchange or SharePoint server, plus all of their dependencies.

To understand environment requirements, we hold many discussions with our product teams throughout the development process to identify the most common use cases and testing scenarios. For example, we recently had a request to enable the creation of an Active Directory Certificate Authority with an OCSP responder.

Under the Hood

Early on, one of the decisions we had to make was whether to build on one of the most powerful automation and Infrastructure-as-Code platforms available today — Ansible!

With Ansible, we can interface with virtually every component our environment needs. Spin up six virtual machines on vCenter or AWS? Easy. Create load balancer rules, DNS entries and IP addresses? No problem. We obfuscate the complexity of these tools behind a simple web interface — and even APIs for integration into test automation.

Ansible appeals to us because of its extensibility. We’ve leveraged Python and PowerShell to build custom modules and plugins that let us further customize environments. A great example: conditional DNS forwarders. We’ve built custom PowerShell modules that dynamically create forwarders in our central DNS infrastructure, allowing environments to be resolvable anywhere within the lab.
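As a simplified illustration of the idea (the function, zone name and server addresses here are hypothetical, not our actual module code), a small Python helper can compose the `Add-DnsServerConditionalForwarderZone` PowerShell cmdlet call that such a module would run against a Windows DNS server:

```python
def build_forwarder_command(zone: str, master_servers: list[str]) -> str:
    """Compose the PowerShell command a custom Ansible module might
    execute remotely to create a conditional forwarder, so the new
    environment's zone resolves from anywhere in the lab."""
    if not zone or not master_servers:
        raise ValueError("zone and at least one master server are required")
    servers = ",".join(f'"{ip}"' for ip in master_servers)
    return (
        f'Add-DnsServerConditionalForwarderZone -Name "{zone}" '
        f"-MasterServers {servers} -ReplicationScope Forest"
    )

# Hypothetical environment: forward lookups for env42.lab.example.com
# to that environment's own domain controllers.
cmd = build_forwarder_command("env42.lab.example.com",
                              ["10.0.42.10", "10.0.42.11"])
```

The real modules also handle idempotence — checking whether the forwarder already exists before creating it — which is what makes them fit naturally into Ansible’s declarative model.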

The Pipeline

When a new build is submitted, our web service will first perform a series of preflight checks to ensure that there are no naming conflicts. We check DNS records, load balancer nodes and pools, and even infrastructure capacity. If our checks pass, we’ll submit the build to the pipeline.
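A minimal sketch of the preflight idea (the record names and in-memory registries here are illustrative — the real service queries live DNS, load balancer and capacity APIs):

```python
def preflight_check(env_name: str, existing_dns: set[str],
                    existing_lb_pools: set[str],
                    free_capacity_gb: int, required_gb: int) -> list[str]:
    """Return a list of human-readable failures; an empty list means
    the build may be submitted to the pipeline."""
    failures = []
    if f"{env_name}.lab.example.com" in existing_dns:
        failures.append(f"DNS record for {env_name} already exists")
    if f"pool-{env_name}" in existing_lb_pools:
        failures.append(f"Load balancer pool for {env_name} already exists")
    if required_gb > free_capacity_gb:
        failures.append("Insufficient infrastructure capacity")
    return failures

# A build whose name clashes with an existing DNS record is rejected
# before it ever reaches the pipeline.
problems = preflight_check("env42", {"env42.lab.example.com"},
                           set(), free_capacity_gb=500, required_gb=100)
```

Failing fast here is deliberate: a naming conflict caught in seconds at submission time is far cheaper than one discovered halfway through a multi-hour build.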

The first stage in the pipeline involves building an inventory file for Ansible that will direct subsequent stages. Every time the pipeline is run, a new inventory file is dynamically generated, which allows the user to make changes to the environment. The inventory is the source of authority here. If there’s a change, the pipeline must make that change a reality.
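As a simplified sketch (the group names and hostnames are illustrative, not our actual schema), regenerating an INI-style Ansible inventory from a desired-state spec might look like:

```python
def render_inventory(spec: dict[str, list[str]]) -> str:
    """Render an INI-style Ansible inventory from a mapping of
    group name -> hostnames. Regenerated on every pipeline run, so
    the inventory always reflects the user's desired state."""
    sections = []
    for group, hosts in spec.items():
        sections.append("\n".join([f"[{group}]"] + hosts))
    return "\n\n".join(sections) + "\n"

# Hypothetical environment: a domain controller plus UEM components.
spec = {
    "active_directory": ["dc01.env42.lab.example.com"],
    "uem_console": ["cn01.env42.lab.example.com"],
    "uem_device_services": ["ds01.env42.lab.example.com"],
}
inventory = render_inventory(spec)
```

Because the inventory is rebuilt from the spec each run rather than edited in place, adding a host to the spec is all it takes — the next pipeline run sees the delta and makes it real.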

Next, we build out the necessary infrastructure. We support multiple IaaS providers, but of course, vCenter is our favorite! To ensure that our VMs have a good starting point, we leverage HashiCorp’s Packer to build highly optimized OS templates — updates, prerequisites and rock-solid security built right in. We’ve partnered with our friends in EUC SaaS Operations to ensure our templates remain aligned with theirs.

Apps on Apps

We’ve provisioned our infrastructure — that was the easy part. Now it’s time for the meat and potatoes: applications! To make this process as efficient as possible, our code analyzes the inventory file and spins up additional pipeline branches (parallel processes) based on dependencies.

For example, we can start installing the three Workspace ONE UEM components in parallel while Active Directory is building. The only real constraints here are compute, memory and disk I/O within our control plane.
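A minimal sketch of the scheduling idea (the component names and dependency edges are illustrative): given a dependency graph, a topological grouping yields the sets of installs that can safely run as parallel branches at each stage:

```python
def parallel_stages(deps: dict[str, set[str]]) -> list[set[str]]:
    """Group components into stages. Every component in a stage has
    all of its dependencies satisfied by earlier stages, so the
    pipeline can run a stage's members as parallel branches."""
    remaining = {c: set(d) for c, d in deps.items()}
    stages = []
    while remaining:
        ready = {c for c, d in remaining.items() if not d}
        if not ready:
            raise ValueError("circular dependency detected")
        stages.append(ready)
        for c in ready:
            del remaining[c]
        for d in remaining.values():
            d -= ready  # these dependencies are now satisfied
    return stages

# Hypothetical graph: UEM components install alongside AD, but
# post-install domain configuration must wait for both to finish.
graph = {
    "active_directory": set(),
    "uem_console": set(),
    "uem_device_services": set(),
    "uem_api": set(),
    "domain_config": {"active_directory", "uem_console"},
}
stages = parallel_stages(graph)
```

Everything with no unmet dependencies lands in the first stage and runs concurrently; the width of each stage is bounded only by the control plane’s compute, memory and disk I/O, exactly as described above.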

A simplified view of how our pipeline scales dynamically — based on the desired environment size

Once applications are installed and configured, we’ll run through any advanced network configurations that need to be created. Much like our infrastructure layer, we support a variety of DNS and load balancing platforms — Avi Networks, F5, AWS, HAProxy, Dyn, Cloudflare, Route 53 and Azure DNS — just to name a few.

Everywhere

Strong internal tooling can drive and accelerate product testing and ultimately reduce development time. In developing this platform, our philosophy was simple — enable engineering efficiency by removing infrastructure roadblocks, in terms of both difficulty and time investment.

Our development has focused on modular components; we want other teams to leverage the Ansible code we’ve built — roles, modules and plugins. This approach also makes it easier to accept contributions internally. We’re also hoping to contribute some of this work back to the open-source community!

Thus far, we’ve seen a great response from our internal “customers” and we hope to deliver this solution to more teams across VMware in the future.

--

Joe Zollo
VMware 360

(He/Him/His) I’m a Senior Site Reliability Engineer on the VMware Workspace ONE Cloud Services Team.