Box Cloud Management Framework: Multi-Cloud IAM Meta Model (Part 1 of 3)
Enterprises today, both small and large, are choosing to leverage the Public Cloud to power their applications. In many cases, these enterprises are choosing multiple Public Cloud providers. The reasons vary, but they typically include one or more of the following: avoiding vendor lock-in, gaining a lever to drive better pricing, using best-of-breed cloud technologies, and hedging against disasters.
As Enterprises make the choice to consume multiple Public Cloud providers, it is critical to put emphasis on building the right foundational capabilities. One of the most critical (in our opinion) is the establishment of a consistent Multi-Cloud Identity and Access Management (IAM) architecture and governance model. This is essential to ensure:
- Consistent implementation of the “Principle of Least Privilege” over who (human users and programmatic identities) has the right access, to the right resource(s), at the right time
- Visibility into when, where, and why cloud resource(s) were accessed, so that you have a complete audit trail and consistent governance. This is often a must for compliance reasons in some industries.
- Architectural consistency in how you implement IAM capabilities across each cloud
In our original blog post, Box Cloud Management Framework: Our Journey to Delivering a Cloud Management Platform, we introduced Multi-Cloud IAM as one of the 5 capabilities we believe are required based on our hybrid, multi-cloud architecture and our identified operational challenges. In part one of this mini blog series, on Multi-Cloud IAM, we will introduce some of the challenges we faced in our multi-cloud approach and how we defined an IAM meta-model to help address those challenges. In part two, we will dive deeper into our Multi-Cloud IAM challenges and how we approached resolving each of them using our IAM meta-model. In the final blog, we will illustrate how we used that model to apply a consistent IAM implementation across all of the cloud providers we chose to operate in.
Multi-Cloud IAM challenges
Before Box established a well-defined Multi-Cloud IAM model, we faced a number of the classic challenges any organization faces when beginning to leverage a Public Cloud provider to run their applications. These included typical account sprawl, unexpected costs due to overconsumption of cloud resources, security concerns due to lack of governance, and significant gaps and inefficiencies in meeting security and compliance requirements. In addition to those challenges, as we expanded our use to more than one Public Cloud provider, a number of other areas began inhibiting our ability to manage our IAM across multiple clouds in a consistent manner:
- Inconsistent ownership of Organization Root administrator credentials
- Inconsistent support for federated Identities across our clouds
- Limited visibility into who has access to what resources
- Lack of automated Identity and Access Management Life Cycle Management
- Lack of consistent role definitions and role assignments
- Limited input from Box Personas on Role Definitions
- Lack of a common resource tagging mechanism
- Inconsistent naming conventions
- Lack of consistent process to secure and manage programmatic application identities
- Lack of consistent and efficient process to provide critical evidence for compliance audits
As we started our process to define our Multi-Cloud IAM approach, we evaluated a number of cloud specific best practices, talked to a number of industry analysts, and had many discussions internally about how to structure our approach. The Multi-Cloud IAM meta-model we describe in this blog is the result of these efforts.
Multi-Cloud IAM Architectural Meta-Model
In order to ensure a standard governance model over all chosen cloud providers, it is imperative that you define an IAM model that you can apply and manage regardless of the cloud specific implementation details. The model we’ll define below is loosely based on The Open Group Architecture Framework (TOGAF). In particular, we used a simplified 3-step Architecture Model to drive the Multi-Cloud IAM Implementation:
- Define a high level Architectural Meta-Model. This architectural meta-model will capture key architectural principles, vision, and requirements. It will help to define and drive specific architecture investigation areas that will ultimately be used to implement parts of the overall architectural vision.
- Define Architecture recommendations with detailed investigation areas to resolve inefficiencies in meeting security, compliance, and other stakeholder requirements while managing the lifecycle of IAM across multiple cloud providers. This included a number of proof of concept implementations to demonstrate specific parts of the architecture, reduce risk, and understand areas of complexity that might inhibit long term supportability.
- Drive an Incremental Architecture implementation approach. Rome wasn’t built in a day and you should not expect to build a complex IAM model across multiple clouds in a single day (or quarter) either. This will be a multi-year effort in order to ultimately implement the overall architecture. It will require roadmap planning and prioritization to ensure you are implementing the high priority parts of the overall architecture in a phased manner.
The following diagram depicts our overall IAM meta-model.
Terminology
One of the first things you will notice as you look into the details of each of the leading Cloud Providers is that there are significant differences in the terminology used to describe their IAM capabilities. So, before we dive into the details of the Multi-Cloud IAM Meta-Model, it is critical that we define some terms that will allow us to reason about conceptual aspects of each cloud in a standard way.
IMPORTANT NOTE: The Organization Group construct is intentionally not depicted in the diagram as it can be used in a number of different ways (depending on your specific implementation) to group resource groups and shared resource groups. In our implementation examples below, we will show how we used this construct at Box to represent a consistent structure across all of our cloud providers.
Meta-Model Overview
The Multi-Cloud IAM Meta-Model codifies how we think about IAM and where we want to ensure we have proper governance. There are 3 primary areas:
I. Identities cover the federation model and how access restrictions are managed via roles and permissions.
II. Organization provides our methodology for how to lay out the Organization and Resource Groups to manage all cloud resources throughout our environment.
III. Policies provide a methodology for how we define and apply various constraints and conventions to establish guardrails and governance across the organization.
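To make the three areas a little more concrete, here is a minimal sketch of how they might be represented as provider-neutral data structures. All class, field, and function names below (Role, Identity, ResourceGroup, Organization, effective_permissions) are illustrative assumptions and do not come from any cloud provider's API or from Box's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """A named set of permissions (hypothetical shape)."""
    name: str
    permissions: frozenset


@dataclass
class Identity:
    """A federated human user or a programmatic identity."""
    principal: str
    programmatic: bool = False
    roles: list = field(default_factory=list)


@dataclass
class ResourceGroup:
    """A container for cloud resources; tags carry ownership and billing metadata."""
    name: str
    tags: dict = field(default_factory=dict)


@dataclass
class Organization:
    """Top-level container that lays out resource groups and attaches policies."""
    name: str
    resource_groups: list = field(default_factory=list)
    policies: list = field(default_factory=list)  # policy names only, for illustration


def effective_permissions(identity):
    """Return the union of permissions granted through all of an identity's roles."""
    perms = set()
    for role in identity.roles:
        perms |= role.permissions
    return perms
```

Keeping a neutral representation like this separate from provider-specific details is one way to reason about access in a single vocabulary before mapping it onto each cloud.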
Along with the 3 primary areas, we have also defined 3 other constructs that can be used as part of this meta-model:
- Test Organizations provide a critical capability to allow validation of high risk changes in a safe environment prior to applying to your primary organization.
- Custom Roles will often be required to add more granular permissions than what is defined in the provider's pre-defined roles.
- Decommission Organization Group(s) are used to stage the removal of resource groups and associated cloud resources to protect against accidental removal or unexpected issues that may result from removal of those resources.
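As one possible illustration of the Decommission Organization Group idea, the sketch below stages a resource group for removal and only reports it as removable after a hold period has elapsed. The 30-day hold period, the dictionary layout, and the function names are all assumptions made for this example, not details of our implementation.

```python
from datetime import datetime, timedelta

HOLD_PERIOD = timedelta(days=30)  # assumed grace period, chosen only for illustration


def stage_for_decommission(groups, name, now):
    """Mark a resource group as staged instead of deleting it immediately."""
    groups[name] = {"state": "decommission", "staged_at": now}


def restore(groups, name):
    """Undo staging if the removal turned out to cause unexpected issues."""
    groups[name] = {"state": "active", "staged_at": None}


def removable(groups, now):
    """Return the names of staged groups whose hold period has fully elapsed."""
    return sorted(
        name
        for name, info in groups.items()
        if info["state"] == "decommission" and now - info["staged_at"] >= HOLD_PERIOD
    )
```

The point of the staging step is that a mistaken removal can be reversed with restore before anything is actually deleted.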
We will illustrate examples of how you could apply this meta-model to multiple cloud providers in part three of this mini blog series.
Meta-Model Principles
As we defined our model, we decided to establish a set of core principles that would help drive key decisions throughout our investigation, PoC, and implementation process. These principles are defined below:
- Minimize the impact that the variability in Cloud Provider implementations has on the ability to automate recurring operational requirements.
- Define a common federated identity model so that we can ensure we can onboard and off-board identities to the cloud in a consistent and operationally efficient manner.
- Minimize the number of roles and resource groups that are defined across each Cloud Provider. This will ensure efficient automated management of roles and resource groups across a diverse set of Cloud Providers that each have very different implementation requirements.
- Minimize the permissions applied to each role, and apply appropriate policies, to ensure we follow the principle of least privilege.
- Define consistent role and resource group names across all Cloud Providers. This will ensure we can audit and check for compliance issues across each cloud provider with a consistent process.
- Define a consistent tagging model that ensures we label resources consistently across all Cloud Providers. This will help facilitate easier billing and forecasting optimizations across each cloud provider.
- Define enforcement mechanisms that automate compliance checks for proper naming, tagging, network policies, and other security related controls.
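The naming, tagging, and enforcement principles above lend themselves to a simple automated check. The following is a minimal sketch, assuming a hypothetical <team>-<app>-<env> naming convention and a hypothetical set of required tag keys; neither is the convention we use at Box, and the provider labels in the inventory are placeholders.

```python
import re

# Hypothetical convention: <team>-<app>-<env>, lowercase, env from a fixed set.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*-[a-z][a-z0-9]*-(dev|staging|prod)$")
REQUIRED_TAGS = {"owner", "cost-center", "environment"}  # assumed tag keys


def check_resource_group(name, tags):
    """Return a list of violations; an empty list means the group is compliant."""
    violations = []
    if not NAME_PATTERN.match(name):
        violations.append(f"name '{name}' does not match <team>-<app>-<env>")
    missing = sorted(REQUIRED_TAGS - set(tags))
    if missing:
        violations.append(f"missing tags: {', '.join(missing)}")
    return violations


def audit(inventory):
    """Map each non-compliant '<provider>/<group>' to its list of violations."""
    report = {}
    for provider, groups in inventory.items():
        for name, tags in groups.items():
            problems = check_resource_group(name, tags)
            if problems:
                report[f"{provider}/{name}"] = problems
    return report
```

Because the same check runs over every provider's inventory, a single report can cover all clouds, which is the kind of consistent audit process the principles aim for.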
Conclusion
This first blog in the mini series should provide some insight into how we developed a Multi-Cloud IAM Meta-Model to serve as the basis for our IAM governance model. We are continuing to learn as we use this Meta-Model to implement our IAM model across our multiple cloud providers. Our goal is to continue to share this information with customers and partners so that we can provide value to others that may be on a similar journey. In Part Two of this mini blog series, we’ll go into a bit more detail on the Multi-Cloud IAM challenges and how we approached resolving them.
If you are interested in joining us, please check out the open opportunities at Box.
Special thanks to the following people for their detailed reviews and comments:
- Luis Hernanz, Principal Architect, Box
- Matt Bowes, Staff Security Engineer, Box
- Xaviea Bell, Senior Site Reliability Engineer, Box