Engineering the Next Generation of Cloud Governance

Cloud Custodian

Drew Firment
Capital One Tech
Oct 25, 2016

The Need for Cloud Governance

The adoption of Amazon Web Services is fast becoming mainstream among large corporations. An increasing number of enterprises are seeking to reduce cost, accelerate their speed to market, and enable innovation by eliminating traditional data center constraints.

As organizations grow their public cloud footprint, the complexity of managing the cost, compliance, and health of their accounts grows with it. Existing corporate governance models can provide oversight, but they often lack the integrated cloud engineering expertise needed to change enterprise usage patterns.

The Challenge of Scale

Most enterprise implementations on AWS consist of a collection of applications, deployed to multiple VPCs, contained within an account structure, and associated with a geographic region. The large number of AWS accounts that an enterprise manages is often driven by its organizational structure, billing considerations, access management policies, and architecture principles.

Ideally, automating your delivery pipeline with advanced CI/CD techniques will drive consistent and compliant deployments across the fleet of applications within each of your accounts. In reality, for many enterprises in the early phases of cloud adoption, delivery pipelines are optimized alongside “lift and shift” migrations from legacy data centers.

As a result, organizations migrating to the cloud are managing a large number of AWS accounts with varying levels of maturity and compliance. The number of unique snowflake patterns is compounded by the shift of accountability to local AppDev teams embracing DevOps, each empowered by the mantra: “you build it, you run it.”

Engineering the Next Generation of Cloud Governance

In response to this challenge, enterprise IT organizations will benefit from embedding a dedicated engineering role within their governance structure: a Cloud Custodian Engineer.

A Cloud Custodian Engineer is responsible for consolidating many of the ad-hoc engineering responsibilities of cloud management, with a focus on automating the policies and controls that govern your fleet of enterprise accounts.

The Cloud Custodian Engineer is a senior-level engineering role based on its open source namesake. The Cloud Custodian open source tool provides an engineer with a lightweight and flexible framework for deploying cloud management policies and controls at scale.

Organizations can use Cloud Custodian to manage their AWS environments to ensure the hygiene of security and operational policies, sanitization of nonconforming resources, and implementation of cost management controls.

The engineering of cloud governance translates into the ability to codify policies, amplify feedback on compliance to defined controls, and automate corrective action that can be applied across entire fleets.
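The three steps above (codify, report, correct) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Policy` class, the policy names, and the resource dictionaries are hypothetical and are not Cloud Custodian's actual policy format or API. The corrective actions are returned as plain strings rather than executed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical resource record; a real inventory would come from the cloud provider's APIs.
Resource = Dict[str, object]


@dataclass
class Policy:
    """A codified control: a violation test plus a corrective action."""
    name: str
    matches: Callable[[Resource], bool]  # True when the resource violates the policy
    action: Callable[[Resource], str]    # describes the corrective action to take


def unattached_volume(resource: Resource) -> bool:
    return resource["type"] == "ebs-volume" and not resource.get("attached_to")


def unencrypted_volume(resource: Resource) -> bool:
    return resource["type"] == "ebs-volume" and not resource.get("encrypted", False)


# Assumed example policies, for illustration only.
POLICIES: List[Policy] = [
    Policy("ebs-orphaned", unattached_volume, lambda r: f"delete {r['id']}"),
    Policy("ebs-unencrypted", unencrypted_volume, lambda r: f"notify owner of {r['id']}"),
]


def evaluate(policies: List[Policy], fleet: List[Resource]) -> Dict[str, List[str]]:
    """Return the planned corrective actions per policy for every violation in the fleet."""
    return {
        policy.name: [policy.action(r) for r in fleet if policy.matches(r)]
        for policy in policies
    }


# Made-up sample fleet.
FLEET: List[Resource] = [
    {"id": "vol-1", "type": "ebs-volume", "attached_to": "i-1", "encrypted": True},
    {"id": "vol-2", "type": "ebs-volume", "attached_to": None, "encrypted": True},
    {"id": "vol-3", "type": "ebs-volume", "attached_to": "i-2", "encrypted": False},
]

if __name__ == "__main__":
    for name, actions in evaluate(POLICIES, FLEET).items():
        print(name, actions)
```

Keeping the violation test separate from the action means the same report can drive a notification in one account and an automated cleanup in another.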

The Cloud Custodian Engineer

Infrastructure as Code (IaC) gives organizations the ability to manage and automate the oversight of their virtual data centers. A Cloud Custodian Engineer can harness IaC to bring enterprise accounts into compliance with the desired end-state patterns.

  • Operational Controls: Organizations need to establish a robust framework for governing and tracking the operational configuration of deployed AWS services. Common issues the Cloud Custodian Engineer will encounter include stale Amazon Machine Images (AMIs), orphaned EBS volumes, and services deployed without resiliency options enabled.
  • Security Policies: While most AppDev teams will integrate security policies into their CloudFormation templates (CFT), it’s imperative to establish mechanisms for early detection and remediation of non-compliant workloads. The Cloud Custodian Engineer should be vigilant for unencrypted services and overly permissive access control lists.
  • Monitoring Limits: Even with the elasticity of the Cloud, resources are finite. Each service within an account has specific limits that need to be understood and closely monitored. In addition to tracking the AWS service limits, the Cloud Custodian Engineer should carefully review the allocation of CIDR blocks and utilization of IP addresses.
  • Cost Optimization: Resource management tools like AWS Trusted Advisor enable the Cloud Custodian Engineer to identify underutilized resources, and provide insight on opportunities to purchase reserved capacity. Integrating these capabilities into a cloud governance framework driven by an engineering mindset can yield higher returns as usage patterns can be more readily optimized.
  • Evangelism and Education: The Cloud Custodian Engineer is well-positioned to engage directly with the engineering community to solicit feedback and offer guidance. While the policies and controls that govern developers may be well-intentioned, regular check-ins with users provide an opportunity to inspect current patterns, share best practices, and adapt to emerging needs.
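As one example of the limit monitoring described in the list above, the utilization of IP addresses per subnet can be tracked with Python's standard `ipaddress` module. This is a minimal sketch: the function names, the 80% threshold, and the in-use counts are assumptions for illustration, and in practice the counts would come from your own inventory tooling. The one AWS-specific fact used is that AWS reserves five addresses in every subnet.

```python
import ipaddress
from typing import Dict, List

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Number of addresses in the CIDR block that workloads can actually use."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def utilization(cidr: str, addresses_in_use: int) -> float:
    """Fraction of usable addresses consumed; a block with no usable space counts as full."""
    usable = usable_addresses(cidr)
    return addresses_in_use / usable if usable else 1.0


def subnets_over_threshold(subnets: Dict[str, int], threshold: float = 0.8) -> List[str]:
    """Return the CIDR blocks whose utilization meets or exceeds the threshold."""
    return [cidr for cidr, used in subnets.items() if utilization(cidr, used) >= threshold]


if __name__ == "__main__":
    # Hypothetical in-use counts per subnet.
    in_use = {"10.0.0.0/24": 240, "10.0.1.0/24": 50, "10.0.2.0/28": 9}
    for cidr in subnets_over_threshold(in_use):
        print(f"{cidr}: {utilization(cidr, in_use[cidr]):.0%} of usable addresses in use")
```

Small subnets are the ones to watch: a /28 has only 11 usable addresses after the reserved five, so a handful of instances can exhaust it.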

As enterprise IT pivots to leverage the Cloud, the complexity of adoption, migration, and transformation requires strong governance.

Paramount to your success is ensuring that senior-level engineering expertise is tightly integrated within the cloud governance model, or Cloud Center of Excellence (CCoE). Doing so will accelerate the consistency and compliance of your fleet through the automation of policies and controls.

A Federated Model to Enterprise Services

The size, maturity, and culture of an organization are key factors to consider when introducing this role into an enterprise. Should a central team dictate the automation patterns for an entire enterprise? No. Should local teams repeat mistakes and implement inefficient snowflake patterns when enterprise best practices exist? No. There is a balance, and organizations will need to adapt their approach as they mature and transform.

Ideally, federated delivery teams have the right amount of local authority and control to engineer solutions that meet their specific needs — but not at the expense of duplicating efforts.

To encourage the right behavior, enterprise services should be exposed to local delivery teams via inner-sourcing. This approach to enterprise services provides federated teams a mechanism to engineer solutions to solve local challenges, while contributing to the success of the larger system.

As the enterprise services improve and scale based on contributions from federated delivery teams, the role within a Cloud Center of Excellence (CCoE) can shift from innovation towards product management. The CCoE should ensure the evolving product aligns with the overall vision, broader organizational considerations, and external regulations.

With tools like Cloud Custodian, local control is achieved through federated config files that determine the extent to which enterprise services are executed within a specific account. Local accounts leverage the enterprise capabilities served from a central account, but use their config file as a set of levers to fine-tune the local actions. This approach gives federated teams the discretion to apply either liberal policies or draconian measures, depending on their specific needs and maturity.
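The federated config idea can be sketched as enterprise defaults merged with per-account overrides. Everything here is hypothetical: the policy names, the setting keys (`enabled`, `action`, `grace_days`), and the merge function are illustrative assumptions, not Cloud Custodian's actual configuration schema.

```python
from copy import deepcopy
from typing import Any, Dict

Settings = Dict[str, Dict[str, Any]]

# Assumed enterprise-wide defaults, served from the central account.
ENTERPRISE_DEFAULTS: Settings = {
    "ebs-orphaned": {"enabled": True, "action": "notify", "grace_days": 14},
    "ebs-unencrypted": {"enabled": True, "action": "notify", "grace_days": 7},
}


def effective_policies(defaults: Settings, local_overrides: Settings) -> Settings:
    """Merge an account's local config over the enterprise defaults.

    Local teams may tune existing settings, but cannot invent policies or
    settings the enterprise service does not define.
    """
    merged = deepcopy(defaults)  # never mutate the shared defaults
    for name, overrides in local_overrides.items():
        if name not in merged:
            raise KeyError(f"unknown enterprise policy: {name}")
        unknown = set(overrides) - set(merged[name])
        if unknown:
            raise KeyError(f"unknown settings for {name}: {sorted(unknown)}")
        merged[name].update(overrides)
    return merged


if __name__ == "__main__":
    # A mature account opting into strict enforcement.
    strict = effective_policies(
        ENTERPRISE_DEFAULTS,
        {"ebs-unencrypted": {"action": "terminate", "grace_days": 0}},
    )
    # A newer account easing in with a more liberal setting.
    liberal = effective_policies(
        ENTERPRISE_DEFAULTS,
        {"ebs-orphaned": {"grace_days": 30}},
    )
    print(strict["ebs-unencrypted"])
    print(liberal["ebs-orphaned"])
```

Rejecting unknown keys is a design choice: it keeps the local file a set of levers on shared capabilities rather than a place for new snowflake logic.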

Based on early influences from Adrian Cockcroft, my preference is to “turn your monkeys to eleven”. Enforce the right behaviors from the beginning using automation, instead of trying to modify behaviors in later phases of cloud adoption. Psychology is much more difficult than technology.

About the Author

Drew Firment is a Director of Cloud Engineering with a passion for driving the enterprise adoption, migration, and cultural transformation of Cloud Computing. Follow on Twitter @drewfirment.

This post originally appeared on Capital One DevExchange.

For more on APIs, open source, community events, and developer culture at Capital One, visit DevExchange, our one-stop developer portal. https://developer.capitalone.com/
