Embrace the change or why to consider self-service infrastructure

Kseniia Ryuma
HashiCorp Solutions Engineering Blog
6 min read · Feb 24, 2021

We live in a unique world. On one side, we have self-driving, fully automated vehicles that can get us from point A to point B. On the other side, the majority of companies today still rely on a manual, ticket-based approach when infrastructure changes are needed. We have managed to automate our transportation; it is about time we automated the infrastructure for our business.

Like Rome, automated self-service infrastructure cannot be built in a day. Yet success can be achieved by establishing proper leadership, culture, and tools. Before we jump into the details of how to build automated self-service infrastructure, it is important to define the term so we are on the same page. Self-service infrastructure provides a library of approved infrastructure units that lets users provision infrastructure easily, while the automation part ensures that everything being provisioned is compliant and governed. If your organization already uses a library of pre-approved and tested infrastructure units, that does not necessarily mean your infrastructure is self-service from end to end.

As an example, your company can use Terraform open source (OSS) with the public Terraform Registry. HashiCorp Terraform is the world’s most widely used cloud provisioning product, with high adoption in enterprise-level companies. With Terraform you can manage a broad range of resources, including hardware, IaaS, PaaS, and SaaS services, in any environment (AWS, GCP, Azure, or on-prem). Terraform open source, however, does not come with “self-driving” capabilities. Terraform OSS is appropriate for a solo pilot but not for enterprise scale. Consider the image below, which highlights some of the operational limitations discussed in the rest of this post.

Until now, in the majority of organizations, all processes around infrastructure have been handled in functional silos.

“You are in danger of creating another silo. I don’t think the sheer number of choices in terms of infrastructure technology is the problem. I think it is how people make the choices.”
— Brian Dawson

Without standardized tools, there is no consistent way to monitor the infrastructure that has been provisioned and configured. That, in turn, leads to over-provisioned or orphaned infrastructure that drives additional costs for the business. Assume your organization already uses infrastructure as code (IaC) through Terraform OSS (or any other tool). While IaC adds a lot of value to the IT environment, there are some challenges that cannot be overlooked.

  1. Team Management — Most IaC tools are CLI-based. As a result, there is no granular Role-Based Access Control (RBAC) for the individuals who work on a specific unit of infrastructure. In practice, everyone has access to all the git repos where the whole organization’s IaC is stored. Additionally, in some organizations, a majority of developers have access not only to the console (where they can log in anytime and make manual changes) but also to all secrets at the organizational level (API tokens, DB credentials, TLS certificates, and so on). That approach can lead to security vulnerabilities.
  2. Service Governance & Policy — When you push changes to infrastructure, how do you ensure that tags are mandatory in your configurations? How do you monitor the size of the instances used in production or development environments? How do you prevent forbidden IAM actions from being attached to instances? Without policies, you can provision security groups whose ingress rules allow the CIDR “0.0.0.0/0”. One error like that can expose your private resources to the public internet. Hopefully, you see the value of building well-thought-out guardrails that verify what kind of infrastructure is being provisioned.
  3. Advanced Security — Many organizations today use identity management tools and standards such as Okta, Active Directory, and SAML. With Single Sign-On (SSO), IT can manage any employee’s access to any application or device. By using SSO, your organization can centralize the management of users' applications, providing greater accountability and security for identity and user management. Almost none of the IaC tools provide built-in SSO integration. We will discuss the one that does later.
  4. Monitoring — CLI-driven workflows do not allow you to create a cohesive picture of how your infrastructure is being run. Without a single platform and audit logs, an organization will be lacking visibility into key events on the infrastructure that they manage.
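The governance questions in the list above (mandatory tags, instance sizes, open CIDR ranges) can be sketched as an automated check over a Terraform plan. This is a minimal illustration in Python, not the policy language of any particular product. It assumes the plan was exported with `terraform show -json` and reads its `resource_changes` entries; the required tags, the allowed instance types, and the sample plan are all hypothetical values chosen for the example.

```python
from __future__ import annotations

from typing import Any

# Illustrative policy values; a real organization would define its own.
REQUIRED_TAGS = {"owner", "environment"}
ALLOWED_INSTANCE_TYPES = {"t3.micro", "t3.small"}
OPEN_CIDR = "0.0.0.0/0"


def check_plan(plan: dict[str, Any]) -> list[str]:
    """Return human-readable policy violations found in a JSON plan."""
    violations: list[str] = []
    for rc in plan.get("resource_changes", []):
        # "after" holds the planned attribute values; it is null on destroy.
        after = (rc.get("change") or {}).get("after") or {}
        address = rc.get("address", "<unknown>")
        rtype = rc.get("type")

        if rtype == "aws_instance":
            missing = REQUIRED_TAGS - set((after.get("tags") or {}).keys())
            if missing:
                violations.append(f"{address}: missing tags {sorted(missing)}")
            itype = after.get("instance_type")
            if itype not in ALLOWED_INSTANCE_TYPES:
                violations.append(f"{address}: instance type {itype!r} not allowed")

        if rtype == "aws_security_group":
            for rule in after.get("ingress") or []:
                if OPEN_CIDR in (rule.get("cidr_blocks") or []):
                    violations.append(f"{address}: ingress open to {OPEN_CIDR}")
    return violations


# Hypothetical plan fragment, trimmed to the fields the check reads.
example_plan = {
    "resource_changes": [
        {
            "address": "aws_instance.web",
            "type": "aws_instance",
            "change": {"after": {"instance_type": "m5.24xlarge",
                                 "tags": {"owner": "team-a"}}},
        },
        {
            "address": "aws_security_group.db",
            "type": "aws_security_group",
            "change": {"after": {"ingress": [{"cidr_blocks": ["0.0.0.0/0"]}]}},
        },
    ]
}

if __name__ == "__main__":
    for violation in check_plan(example_plan):
        print(violation)
```

Running a check like this before every apply turns the guardrail into part of the workflow rather than a manual review step.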

Problems like inconsistent environments, longer provisioning times, manual intervention, and a lack of end-to-end orchestration arise when there is no standardized platform to rely on. In addition, it is hard, almost impossible, to control role-based access, which eventually leads to security compromises. The recent SolarWinds breach highlights the importance of keeping your software and hardware under close control and supervision. Finally, a lack of standardization in provisioning, as shown above, prevents organizations from scaling to an enterprise-level operation.
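To make the idea of granular role-based access concrete, here is a minimal sketch of a permission check scoped to a single unit of infrastructure. Everything in it is an assumption made for illustration: the access levels, the team and workspace names, and the action names are hypothetical and do not represent any product's API.

```python
from __future__ import annotations

from enum import IntEnum


class Access(IntEnum):
    """Ordered access levels; a higher level includes the lower ones."""
    READ = 1
    PLAN = 2
    WRITE = 3
    ADMIN = 4


# Hypothetical mapping: team -> workspace -> granted access level.
TEAM_ACCESS: dict[str, dict[str, Access]] = {
    "app-developers": {"payments-dev": Access.WRITE, "payments-prod": Access.READ},
    "platform-ops": {"payments-dev": Access.ADMIN, "payments-prod": Access.ADMIN},
}

# Minimum level each (hypothetical) action requires.
REQUIRED_LEVEL: dict[str, Access] = {
    "view": Access.READ,
    "plan": Access.PLAN,
    "apply": Access.WRITE,
    "manage": Access.ADMIN,
}


def is_allowed(team: str, workspace: str, action: str) -> bool:
    """Allow an action only if the team holds a sufficient level on that workspace."""
    granted = TEAM_ACCESS.get(team, {}).get(workspace)
    if granted is None:
        return False  # no grant on this workspace means no access at all
    return granted >= REQUIRED_LEVEL[action]


if __name__ == "__main__":
    print(is_allowed("app-developers", "payments-dev", "apply"))
    print(is_allowed("app-developers", "payments-prod", "apply"))
```

The point of the design is that access is denied by default and granted per workspace, in contrast to the situation where everyone can reach every repository and secret.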

By introducing standardization through self-service infrastructure provisioning, an organization can maximize the use of IT resources, save costs, empower end-users, and ensure security excellence. Ultimately, you want to give your developers the ownership and access they need to provision infrastructure on their own. To keep track of provisioned resources, centralization and standardization are essential. With them in place, you can govern that infrastructure and check it against security, compliance, governance, and operational best practices.

In adopting a new way of operation, the difficulty is not the lack of systems but rather enabling the transition to the new operational model. As a result, change is only possible through a gradual cultural shift on the level of the whole organization. The organization can achieve agility at scale only when successful collaboration across all teams is involved. With the Terraform Cloud platform (which includes Team Management, Governance, Advanced Security, Monitoring, and many more) organizations can bring standardization and a central approach to manage infrastructure among all key players.

Business Line Leader — Monitor data and make calculated decisions. The audit trail provides transparency and defense of records for compliance. A company may use the audit trail for reconciliation, historical reports, future budget planning, and/or risk management.

Policy Owners — Reduce risk with a single workflow to secure, govern, and audit regardless of who provisions. The security team can assemble a security sandbox that prevents out-of-policy infrastructure from being provisioned. That approach also ensures that all infrastructure provisioned through the Terraform platform is compliant and governed.

Dev Team — Increase developer agility by allowing developers to provision their own self-service infrastructure without an operator bottleneck. Developers can focus on managing IT infrastructure through the infrastructure as code approach. The Terraform Cloud platform manages Terraform runs in a consistent and reliable environment. That makes the whole software development life cycle more efficient and raises the team’s productivity.

Ops Team — Increase operator productivity by allowing them to serve more infrastructure requests with predefined modules. Modules are small, reusable Terraform configurations that let you manage a group of related resources as if they were a single resource. The modular approach makes infrastructure reusable.

End-User — Automate the provisioning. An organization with a ServiceNow integration can enable its end-users to provision resources by selecting them through the Terraform Catalog. A user without any knowledge of how Terraform configuration files work can request and stand up a VPC, a new instance, or an application.

Finally, the Terraform admin can log in to the platform and oversee all the commits, versions, and changes of the infrastructure through a user-friendly interface.

Terraform Platform focuses on the methodology in which the balance of people and process drives the action. Operators need access to infrastructure on demand, and a self-service option provides a much faster workflow than making requests to a central IT service. Business leaders must ask themselves whether competitive advantage, security risk reduction, and revenue growth are important to them. If so, their organizations need to embrace the change and embark on the automation journey.
