Allowing Ada to View her Files: Box’s Attribute Based Access Control Solution
As we move toward a future where less and less content is stored on-prem and where companies increasingly pick and choose best-of-breed apps for each individual function, security has never been more important. We are now starting to see large, established companies finally overcome some of their fear of the cloud, allowing them to be more agile and productive. Cloud companies largely wouldn’t have made these inroads without a great deal of attention to security and compliance. At Box, where we focus largely on enterprise customers, this is especially important to us. Allowing only the correct users to access content is at the heart of what we do, and therefore something we think a lot about. There are a number of ways that software companies go about securing objects in their systems. Three of the most common are ACLs (access control lists), RBAC (role-based access control) and ABAC. After a good deal of analysis (which I discuss in a previous post), we determined that ABAC, or Attribute Based Access Control, best met the needs of our system at Box. In this post, I will discuss the high-level architecture of an ABAC system and what that system might look like in practice.
What is ABAC?
ABAC, or Attribute Based Access Control, is one of the standard access control models in computing. It uses policies and attributes to determine whether a user is allowed to perform a particular action. For example, if Ada wants to view her file, the system checks whether there is a policy related to reading files. If it finds one, it evaluates the policy, looking up any missing attributes along the way, to determine whether Ada can access the given file at this time.
Standard ABAC Architecture
The primary standard implementing ABAC is XACML, or eXtensible Access Control Markup Language. Based on the name, XACML sounds like it only describes the language and syntax used, but it also describes the system architecture and how requests and policies are processed. A XACML system contains five main components: the PEP, PDP, PIP, PRP and PAP. The PEP, or Policy Enforcement Point, intercepts any requests the user makes to objects or services in the larger system and redirects them to the PDP. The PDP, or Policy Decision Point, evaluates the request against a set of policies to determine whether the user should have access to the object or endpoint. If any attributes needed to evaluate a policy are missing, the PDP calls one of possibly many PIPs, or Policy Information Points. The PIPs are responsible for looking up additional data from various services, systems and data stores, and they know how to call whichever systems are needed. The PRP, or Policy Retrieval Point, is where the policies are actually stored. It is usually either a database or a filesystem. The PAP, or Policy Administration Point, is the interface that allows users to add and update policies. It is essentially the admin console for the policies themselves.
XACML also describes how requests, policies and decisions are written and structured, along with which sets of conditions lead to which results. The most common representation for requests and policies is XML, but JSON is becoming more and more common, and there is an official JSON profile for requests and responses (so yes, JSON can be considered valid XACML). There are also different options for how decisions are made. For example, the system can be configured to deny unless a specific permit rule is matched, or it can be configured to permit unless a specific deny rule is present. There are a number of variations on these combining algorithms, and the result is that there are four valid responses the system can produce. The system will return Deny if Ada does not have access to view her document. It will return Permit if Ada does have access to view it. The system can also return Not Applicable if it found no policies related to viewing documents, or Indeterminate if it could not reach a decision, for example because an error occurred during evaluation or because the combining algorithm found matching rules with differing responses and no clear way to decide which one is more important. In addition to the main response, the system can also return an obligation. An obligation is a way to tell the caller that something additional must happen. For example, you could use an obligation to indicate that the user has permission to view the file but must see an NDA popup first, or that the user can’t view the document and the admin should be notified.
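To make the decision options concrete, here is a minimal Python sketch of the four responses, an obligation, and three of the combining algorithms. The algorithm names (deny-unless-permit, permit-unless-deny, deny-overrides) come from XACML, but the class and function names are illustrative assumptions, and deny_overrides is deliberately simplified compared with the full specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, List


class Decision(Enum):
    PERMIT = "Permit"
    DENY = "Deny"
    NOT_APPLICABLE = "NotApplicable"
    INDETERMINATE = "Indeterminate"


@dataclass
class Response:
    """A decision plus any obligations the caller must honor (illustrative shape)."""
    decision: Decision
    obligations: List[str] = field(default_factory=list)


def deny_unless_permit(results: Iterable[Decision]) -> Decision:
    """Deny by default; a single Permit is enough to grant access."""
    return Decision.PERMIT if Decision.PERMIT in list(results) else Decision.DENY


def permit_unless_deny(results: Iterable[Decision]) -> Decision:
    """Permit by default; a single Deny is enough to block access."""
    return Decision.DENY if Decision.DENY in list(results) else Decision.PERMIT


def deny_overrides(results: Iterable[Decision]) -> Decision:
    """Simplified: Deny wins, then Indeterminate, then Permit, else NotApplicable."""
    seen = list(results)
    for candidate in (Decision.DENY, Decision.INDETERMINATE, Decision.PERMIT):
        if candidate in seen:
            return candidate
    return Decision.NOT_APPLICABLE


# Ada's rule results: one rule did not apply, one permitted.
rule_results = [Decision.NOT_APPLICABLE, Decision.PERMIT]
response = Response(deny_unless_permit(rule_results), obligations=["show_nda_popup"])
```

Note that only the third function can produce all four responses. The first two always collapse to Permit or Deny, which is why they are convenient at the top of a policy tree.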
In addition to the policies, XACML classifies the types of attributes the system may need. There are four main types of attributes: User (which XACML calls subject), Action, Resource and Environment. User attributes are metadata about the accessing user. For example, is Ada an admin user? Action attributes describe the action the user is trying to take, such as viewing a file. Resource attributes relate to the object Ada is accessing. For example, who owns the file? Has her file been quarantined due to questionable content? Environment attributes are any external attributes. For example, maybe we’ve detected a large number of illegal access attempts from Ada’s IP, so we want to block all access from her IP for now, or maybe the temperature in Ada’s data center is too high, so we want to limit the number of calls.
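One way to picture the four attribute categories is as a request object with one bucket per category. The sketch below is an assumption for illustration only: the class name, field names and attribute keys are hypothetical and do not follow the actual XACML request schema.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class AccessRequest:
    """Attributes grouped by category (hypothetical shape, not the XACML schema)."""
    user: Dict[str, Any] = field(default_factory=dict)
    action: Dict[str, Any] = field(default_factory=dict)
    resource: Dict[str, Any] = field(default_factory=dict)
    environment: Dict[str, Any] = field(default_factory=dict)

    def is_missing(self, category: str, name: str) -> bool:
        """True if an attribute still has to be looked up elsewhere."""
        return name not in getattr(self, category)


# Ada asks to view a file; the caller only knows a few attributes up front.
request = AccessRequest(
    user={"id": "ada", "is_admin": False},
    action={"id": "FILE_VIEW"},
    resource={"id": "file-123"},
    environment={"ip": "203.0.113.7"},
)

# The file's owner was not supplied, so it would need to be fetched later.
needs_owner_lookup = request.is_missing("resource", "owner")
```

Keeping the categories separate makes it obvious which attributes the caller already supplied and which ones still have to be fetched before a policy can be evaluated.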
ABAC in Practice
It’s always useful to see nice diagrams and read definitions, but they don’t always map directly or evenly onto the parts of the system when you actually build it. So what does our system look like in practice? Access control can be used at many layers of an overall system. Even within a company like ours, we have at least three access control frameworks securing different levels of our infrastructure and code. We have one that gives our employees access to the various pieces of software we use, one that we expose to our end users so they can assign high-level permissions to their employees using Box, and the one I’m talking about here. For this use case, we are using XACML on top of our newly built microservices. More specifically, we are using it to secure every exposed service API endpoint that is marked as protected. We mark an endpoint as protected by having a developer specify which permission is needed to access that endpoint. This permission is an action attribute which we can use to match a policy. This means that several endpoints can share the same policy, while an endpoint with special needs can have a unique policy of its own. For example, the GET /files/id endpoint might have a FILE_VIEW permission assigned to it, meaning that we would check whether the user calling the endpoint has permission to view the specified file.
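As a rough illustration of marking endpoints as protected, here is a Python sketch in which a decorator attaches a permission to a handler. Everything in it is an assumption made for the example: the protected decorator, the PROTECTED registry, the toy FILE_OWNERS data and the check_permission stub (which stands in for the call to the permissions service) are hypothetical and are not Box’s actual library.

```python
import functools
from typing import Any, Callable, Dict

# Hypothetical registry: handler name -> permission required to call it.
PROTECTED: Dict[str, str] = {}

# Toy data standing in for a real file service.
FILE_OWNERS = {"file-123": "ada"}


def check_permission(user_id: str, permission: str, resource_id: str) -> bool:
    """Stub for the permissions service: owners may view their own files."""
    return permission == "FILE_VIEW" and FILE_OWNERS.get(resource_id) == user_id


def protected(permission: str) -> Callable:
    """Mark a handler as protected by the given permission (action attribute)."""
    def decorator(handler: Callable) -> Callable:
        PROTECTED[handler.__name__] = permission

        @functools.wraps(handler)
        def wrapper(user_id: str, resource_id: str) -> Any:
            # The check runs before the real endpoint code is executed.
            if not check_permission(user_id, permission, resource_id):
                raise PermissionError(f"{user_id} lacks {permission} on {resource_id}")
            return handler(user_id, resource_id)
        return wrapper
    return decorator


@protected("FILE_VIEW")
def get_file(user_id: str, file_id: str) -> dict:
    return {"id": file_id, "owner": FILE_OWNERS[file_id]}


@protected("FILE_VIEW")  # a second endpoint sharing the same permission
def get_file_thumbnail(user_id: str, file_id: str) -> str:
    return f"thumbnail-of-{file_id}"
```

Because both handlers declare FILE_VIEW, they would be matched against the same policy. An endpoint with special needs would simply declare a different permission.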
The service with the endpoint then uses our PEP library to collect information from the endpoint and from the original request and turn it into a XACML request that it can send to our permissions service. Our permissions service consists primarily of the PDP, the engine that evaluates the policies and returns an access decision. In our initial implementation, this service also contained one PIP for fetching attributes and a pseudo-PRP for storing the policies. We are currently in the process of moving our policies out into a separate storage location (so a real PRP), but for our first version, the policies themselves were actually kept in the PDP code and loaded into memory on startup. It turns out that when you only have one policy, or even a few dozen, the scale is so small that this isn’t a problem. Our main driver for moving the policies out actually wasn’t scale, but that we wanted to enable other teams to write their own policies without having to push commits to the permissions service codebase. The other thing I found interesting is that even though we have PIPs, they are small and are effectively just classes that know how to call some external endpoint. In fact, we found that with some fancy configuration, we could use a single PIP for a large number of the services we might want to fetch data from. For our initial implementation, we didn’t implement a PAP of any sort, or any external system or UI for writing new policies. This is on our roadmap to add eventually, but as an engineering team where the people writing policies are engineers, it is a much lower priority than a number of the other pieces of the system.
So to summarize, at a high level, we have a library that knows how to take a call to a service endpoint and turn that into a XACML request to check the calling user’s permission to access that endpoint. We also have a permissions service that processes that XACML request and contains all of the rest of the permissions logic. The majority of this service is made up of the PDP.
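The idea of one small, configuration-driven PIP can be sketched roughly as follows. This is a guess at the general shape rather than a description of the real implementation: the ConfigurablePip class, the ATTRIBUTE_SOURCES mapping, the example URLs and the field names are all hypothetical, and the HTTP call is injected as a function so the example runs without a network.

```python
from typing import Any, Callable, Dict

# Hypothetical configuration: attribute name -> which endpoint and field hold it.
ATTRIBUTE_SOURCES: Dict[str, Dict[str, str]] = {
    "resource.owner": {
        "url": "http://files.example/files/{resource_id}",
        "field": "owner_id",
    },
    "user.is_admin": {
        "url": "http://users.example/users/{user_id}",
        "field": "is_admin",
    },
}


class ConfigurablePip:
    """A single PIP that serves many attributes by following configuration."""

    def __init__(
        self,
        sources: Dict[str, Dict[str, str]],
        fetch_json: Callable[[str], Dict[str, Any]],
    ) -> None:
        self._sources = sources
        self._fetch_json = fetch_json  # e.g. a thin wrapper around an HTTP client

    def lookup(self, attribute: str, **ids: str) -> Any:
        source = self._sources[attribute]
        payload = self._fetch_json(source["url"].format(**ids))
        return payload[source["field"]]


def fake_fetch(url: str) -> Dict[str, Any]:
    """Canned responses standing in for real services."""
    responses = {
        "http://files.example/files/file-123": {"owner_id": "ada"},
        "http://users.example/users/ada": {"is_admin": False},
    }
    return responses[url]


pip = ConfigurablePip(ATTRIBUTE_SOURCES, fake_fetch)
owner = pip.lookup("resource.owner", resource_id="file-123")
```

With this shape, supporting a new attribute means adding a configuration entry instead of writing a new class, which is one plausible reading of how a single PIP could cover many services.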
An Example Request Through Our System
Since everything is a little easier to follow through an example, let’s walk through Ada trying to view her file. (1) Ada makes a request to get her file. This request hits our API Gateway, which handles things like routing and auth. (2) The gateway decides which service the request should go to and routes it to the endpoint on one of our services that knows how to retrieve her file.
(3) This service has specified that this endpoint is protected, which triggers the PEP library to be called before the main endpoint code is executed. The PEP code knows how to turn the information from Ada’s request into a XACML request, which it sends to the permissions service. The permissions service unpacks the request and attempts to match it against the policies it has access to. If the action attribute from the request (in this case FILE_VIEW) matches the action attribute on a policy, it will execute that policy. The PDP runs through each rule in any matched policies. If, for example, the rule in the matched policy says that it should return Permit if Ada owns the file, the PDP will need to look up the file’s owner. (4) To look up the owner, it calls the specific PIP that knows how to do this. That PIP then calls whatever outside source it needs in order to fetch the information. Once the information is fetched, the PDP continues to evaluate the policy. (5) Once it reaches a decision, the decision is returned to the PEP. (6) If the decision is Permit, Ada’s request is allowed to continue through to the internal service. If the decision is Deny, an error is generated and she won’t be able to access her file. (7) Either the successful response or the error is returned to Ada.
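The walkthrough above can be condensed into a small end-to-end Python sketch covering steps 3 through 7. It is a toy model under stated assumptions: all function names, the POLICIES table, the in-memory owner data and the response dictionaries are hypothetical, the rules are combined deny-unless-permit style, and treating anything other than Permit as a refusal at the PEP is a choice made for this example.

```python
from typing import Callable, Dict, List, Optional

Request = Dict[str, Dict[str, str]]


def owner_pip(request: Request) -> Optional[str]:
    """Step 4: the PIP looks up the file's owner (toy in-memory data)."""
    owners = {"file-123": "ada"}
    return owners.get(request["resource"]["id"])


def owner_rule(request: Request, pip: Callable[[Request], Optional[str]]) -> str:
    """Permit only if the calling user owns the file."""
    owner = request["resource"].get("owner") or pip(request)
    return "Permit" if owner == request["user"]["id"] else "Deny"


# Policies keyed by the action attribute they apply to.
POLICIES: Dict[str, List[Callable]] = {"FILE_VIEW": [owner_rule]}


def pdp_evaluate(request: Request, pip: Callable[[Request], Optional[str]]) -> str:
    """Match a policy on the action attribute, then evaluate its rules."""
    rules = POLICIES.get(request["action"]["id"])
    if rules is None:
        return "NotApplicable"
    permitted = any(rule(request, pip) == "Permit" for rule in rules)
    return "Permit" if permitted else "Deny"


def pep_handle(user_id: str, permission: str, resource_id: str,
               endpoint: Callable[[str], str]) -> dict:
    """Step 3: build the request. Steps 5 to 7: enforce the decision."""
    request: Request = {
        "user": {"id": user_id},
        "action": {"id": permission},
        "resource": {"id": resource_id},
    }
    decision = pdp_evaluate(request, owner_pip)
    if decision != "Permit":
        return {"status": 403, "error": f"access denied ({decision})"}
    return {"status": 200, "body": endpoint(resource_id)}


def get_file(file_id: str) -> str:
    return f"contents of {file_id}"


ada_response = pep_handle("ada", "FILE_VIEW", "file-123", get_file)
bob_response = pep_handle("bob", "FILE_VIEW", "file-123", get_file)
```

Here Ada’s request reaches get_file because she owns the file, while the same call made as a different user is stopped at the PEP before the endpoint code runs.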
Permissions can be complex, and figuring out both what makes sense for your use case and how to implement that solution can be challenging. Now you know a bit more about what ABAC is and what an ABAC solution looks like as part of a microservice architecture. In a later post, I discuss how we evaluated build vs. buy for our ABAC system, some things we considered, and how we went about starting our project once we decided on our course of action.
This is a very high-level and fast overview, but there are some great resources out there. If you want additional overviews, Wikipedia is a great place to start, as is the NIST (National Institute of Standards and Technology) ABAC overview. For much more in-depth explanations, NIST’s Guide to Attribute Based Access Control (ABAC) Definition and Considerations is very good. Several of the companies providing ABAC solutions also provide great overviews, including Axiomatics and Jericho Systems.