AuthZ: Intuit’s Unified Dynamic Authorization System

Published in Intuit Engineering, Aug 16, 2021

This Intuit blog post is co-authored by Vice President and Fellow, Mallik Mahalingam, Distinguished Engineer, Thomas Barnes, Distinguished Architect, Snezana Sahter, and Principal Software Engineer, Bala Dutt

Intuit’s mission is to Power Prosperity Around the World, and our strategy for delivering on that mission is to be an AI-driven expert platform. This is an open, trusted and easy-to-build-on platform where Intuit and other partners solve the most pressing customer problems and deliver awesome experiences. This platform is used by individuals, small businesses, financial experts (outside Intuit and within Intuit) and partners.

Trust is a key piece of the platform, and delivering trust at planet scale is where authorization comes in. Authorization concerns itself with protecting access to resources by reasoning over data. Following authentication of a “subject” (e.g., using username and password or biometrics), authorization gives the subject access to features, data and user experiences. AuthZ is Intuit’s unified dynamic authorization system that protects resources and enforces compliance in our AI-driven expert platform.

AuthZ makes products more secure by enforcing the principle of least privilege (PoLP), which limits users’ access rights to only what is strictly required to do their jobs. AuthZ fundamentally transforms how products are developed by separating business logic from the policies associated with the access. This reduces reaction time and allows policies to evolve outside of the software development lifecycle. AuthZ provides precise access to a variety of subjects and use cases, across organizational boundaries, to diverse resources while enforcing compliance with government regulations and laws. For example, Internal Revenue Code 7216 prohibits disclosure or use of information by preparers of returns for any purpose other than tax preparation.

The Burning Platform

The industry is moving from “castle-and-moat” perimeter-based security to a zero trust model, which calls for a “never trust, always verify” approach and micro-segmentation. Further, the risk of internal threats is on the rise and authorization-related issues have become top security vulnerabilities. For any platform that enables collaboration among multiple users, this approach is an imperative.

Authorization management systems determine whether a given user profile or identity is allowed to access an application or perform a specific action. Traditionally, businesses of all sizes and market sectors with multiple users have simply added users to an account and assigned roles for providing access to applications. This is akin to giving keys to a house to a service person for access, ideally for a limited time and for a specific purpose, yet inadvertently opening up the possibility for wider access, for a longer duration, even after the service person has left. Unfortunately, this approach to managing user authorization has been out of alignment with the principle of least privilege. Further, it has led to increased complexity (e.g., role proliferation) and vulnerability to security attacks, thus putting the burden on the admin to come up with piecemeal, partial solutions and configurations. There is a need for fine-grained authorization.

Industry analysts estimate that 20 percent of product code is typically attributed to authorization.

Legacy code written with custom, role-based authorization has a variety of pitfalls. For example:

If an instance of a resource, like a Google doc, has to be shared with a subject then roles cannot help. Roles are also coarse-grained and do not allow fine-grained access control, causing wider access than necessary.
If access has to be conditional or temporal, like delegation of authority when the primary is on leave, then one has to promptly assign and unassign roles. This could lead to lapses.
Sometimes the user doesn’t belong to the organization and access is needed for customer support/troubleshooting work. In this situation, one is forced to make the user a member, assign a role, and subsequently undo both. This also has the potential to leak the identity of support individuals rather than limiting it to the organization the individual belongs to.
When roles are used in code, the number of roles is limited, and roles can only be defined by administrators, not product developers. Also, every new role requires a code change and an entire software release.
When authorization models are defined independently within products from the same company, seams inevitably show up because the products don’t work together. For example, customers may have to assign more roles than necessary because products do not have common roles. Also, the models may contradict each other and may be complex to understand. In addition, developers have to write more code to hide the seams. And, when third parties are integrated via APIs, this increases complexity, becoming an additional patch. All this complexity is the enemy of security.

As requirements for authorization continue to evolve, traditional systems have become infeasible. Because authorization lives in custom code, every change requires product code changes, which complicates the development lifecycle. It also leads to costly duplication of effort, and to security implications when multiple different models are made to work together using complex integrations.

From a user perspective, traditional approaches to authorization management systems can lead to unintended outcomes. For example, a manager of a team may get wider access to all resources instead of only the resources the team member(s) have access to. The same would be true for a manager of managers, who ideally should have access transitively, but gets access to all resources irrespective of the teams they manage. And, managing an end user’s movement between products and services from a single company (e.g., accounting, payment, tax) in a single workflow can cause friction or access leaks. In both instances, a unified authorization management system could hide the different roles and give the entities involved access only for the duration of the workflow.

Given the nature of Intuit’s financial products and services business, authorization must provide users with an end-to-end, dynamic experience that guards against internal and external threats. This necessitates an intelligent authorization management system that far exceeds traditional rules and configurations devised by security experts, developers and end users. Further, the platform must be implemented company-wide and across product lines in support of ~100 million customers worldwide for TurboTax, QuickBooks, Mint and Credit Karma. Finally, since applications cannot be rewritten for the new system, it must be easy to use, performant and highly resilient.

Intuit’s AuthZ Approach

Following are foundational elements of our approach:

AuthZ makes dynamic decisions that take user rights as inputs, in addition to other inputs and intelligence, to protect against internal and external security threats. The other inputs could be details of the product, the SKU, or the region (country) of the user. Intelligence could be a risk assessment of the user and of the behaviour in the session.
AuthZ provides tools to product developers and end users for PoLP enforcement.
AuthZ provides a platform for product developers to easily build secure products, including support for popular development paradigms for seamless integration with product code within a fast, scalable, reliable workflow.

A Logical Architecture

Below is the logical model for our AuthZ unified dynamic authorization system, based on XACML (eXtensible Access Control Markup Language). The request is intercepted at the Policy Enforcement Point (PEP) and is posed as a decision question to AuthZ. This is a shift from a permission model to a decision model, where decisions are treated as an output rather than simply as information. The resource server, for example a micro-service, only needs to provide minimal details of the resource and subject (which is also transparent, usually). This means that resource server code doesn’t change as new access use cases are implemented. This is a write-once, access-control-many approach.
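To make the decision model concrete, here is a minimal Python sketch of a PEP that asks a decision question instead of checking permissions itself. Every name in it (`DecisionRequest`, `Decision`, `enforce`, `toy_decide`) is hypothetical and chosen for illustration. None of it is AuthZ's actual API, and `toy_decide` merely stands in for the remote decision service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DecisionRequest:
    """Minimal details a resource server supplies (hypothetical shape)."""
    subject_id: str
    resource_id: str
    action: str


@dataclass(frozen=True)
class Decision:
    permit: bool
    reason: str = ""


def enforce(request: DecisionRequest,
            decide: Callable[[DecisionRequest], Decision],
            handler: Callable[[], str]) -> str:
    """PEP: intercept the call, ask the decision question, then proceed or refuse."""
    decision = decide(request)
    if not decision.permit:
        raise PermissionError(decision.reason or "access denied")
    return handler()


def toy_decide(request: DecisionRequest) -> Decision:
    """Stand-in for the decision service; a real PDP would evaluate policies."""
    if request.subject_id == "alice" and request.action == "read":
        return Decision(permit=True)
    return Decision(permit=False, reason="no policy permits this request")
```

The resource server code (`handler`) never mentions roles or rules, so in this sketch a new access use case changes only what sits behind `decide`.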

The Policy Decision Point (PDP) resolves the policies that need to be executed, in consultation with the Policy Administration Point (PAP). It fetches attributes from multiple systems called Policy Information Points (PIPs). Some of these PIPs are intelligent systems that provide risk assessment. These attributes enrich the request, which is the input to the policies being executed. The decisions from the policies are combined. Obligations are requests by a policy for side effects from the system. They are either handled by AuthZ, or passed to the caller, for example for end-to-end authorization.
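The flow described above (enrich the request from PIPs, run the policies, combine their decisions, collect obligations) can be sketched in a few lines of Python. This is an illustrative sketch only. The function names, the attribute keys, the three-valued effects and the combining rule are assumptions, not AuthZ's real interfaces or policy language.

```python
from typing import Callable, Dict, List, Tuple

Attributes = Dict[str, object]
Pip = Callable[[Attributes], Attributes]
# A policy returns an effect ("permit", "deny" or "not_applicable") and obligations.
Policy = Callable[[Attributes], Tuple[str, List[str]]]


def evaluate(request: Attributes, pips: List[Pip],
             policies: List[Policy]) -> Tuple[str, List[str]]:
    """Enrich the request from each PIP, run every policy, combine the results."""
    enriched = dict(request)
    for pip in pips:
        enriched.update(pip(enriched))

    results = [policy(enriched) for policy in policies]
    effects = [effect for effect, _ in results]

    # Assumed combining rule: any deny wins, otherwise a permit is required.
    if "deny" in effects:
        final = "deny"
    elif "permit" in effects:
        final = "permit"
    else:
        final = "deny"

    # Keep only the obligations attached to policies that produced the final effect.
    obligations = [o for effect, obs in results if effect == final for o in obs]
    return final, obligations


def risk_pip(attrs: Attributes) -> Attributes:
    """Hypothetical PIP: a toy risk assessment."""
    return {"risk": "high" if attrs.get("new_device") else "low"}


def owner_policy(attrs: Attributes) -> Tuple[str, List[str]]:
    if attrs["subject"] == attrs["owner"]:
        return "permit", ["audit_log"]
    return "not_applicable", []


def risk_policy(attrs: Attributes) -> Tuple[str, List[str]]:
    if attrs["risk"] == "high":
        return "deny", ["notify_security"]
    return "not_applicable", []
```

Here the obligations (`audit_log`, `notify_security`) are plain strings. In a real system they would be routed either to the authorization service or back to the caller.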

Logical architecture of authorization system

Built for Performance

The system has multiple mechanisms for performance, as follows:

The system is active-active in multiple data centers with geo-based routing to the data center closest to the resource server. With a service mesh using mTLS (mutual Transport Layer Security), east-west (server-to-server) traffic can be fast, with no intermediate hop for a gateway or load balancer. By supporting gRPC (a remote procedure call framework), we gain further latency benefits and conserve computing resources. The system auto-scales horizontally.
The PIPs are runtime configurable. They are accessed in parallel, in a non-blocking manner with their data cached as per configuration. The centralized cache benefits significantly from cross use-case caching and can be in-sync between data centers.
Obligations are handled asynchronously. Batch requests and multiple policy executions are handled in parallel. This makes batch requests very efficient, as described in detail in a later section of this blog.
The resource servers can make calls in batches and cache decisions in memory or in a pluggable remote/far cache. Further, caches can be chained in order of increasing latency, such as request level (L0), heap based (L1) and far cache (L2). Extremely latency-sensitive clients pre-fetch decisions when a user signs in, and serve stale decisions during refresh, so that actual requests do not have any authorization overhead from remote calls.
Some of the policies can be executed locally, at a local PDP in the resource server.
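The chained caching mentioned in the list above can be illustrated with a small Python sketch. It assumes that each tier behaves like a dictionary and that a `fetch` callable performs the remote decision call. The class name and the promotion strategy are assumptions for illustration, and TTLs and eviction are deliberately left out.

```python
from typing import Callable, Hashable, List, MutableMapping


class ChainedDecisionCache:
    """Look up decisions in tiers ordered from fastest to slowest."""

    def __init__(self, tiers: List[MutableMapping],
                 fetch: Callable[[Hashable], bool]):
        self.tiers = tiers    # e.g. [request-level (L0), heap (L1), far cache (L2)]
        self.fetch = fetch    # remote call to the decision service

    def get(self, key: Hashable) -> bool:
        for level, tier in enumerate(self.tiers):
            if key in tier:
                decision = tier[key]
                # Promote the hit into the faster tiers above it.
                for faster in self.tiers[:level]:
                    faster[key] = decision
                return decision
        # Miss in every tier: make the remote call and populate all tiers.
        decision = self.fetch(key)
        for tier in self.tiers:
            tier[key] = decision
        return decision
```

Because the tiers are passed in, a resource server could plug in its own far cache and scope or evict entries as it sees fit, which is the flexibility a pluggable cache is meant to give.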

Performance tactics applied in the authorization system

The diagram below shows a typical cache and decision reuse capability. In a multi-cluster application, each cluster may have a separate far cache. Applications themselves may have a separate far cache. By allowing a pluggable cache, we give resource servers the flexibility to scope cached objects, control eviction and optimize for faster access.

Using pluggable cache for decision reuse

Under the Hood

The decision engine does scatter-gather in a request-specific way to fetch additional information needed to execute the policy. To be performant, it is built as a highly concurrent multi-step process, as shown in the first diagram below. The steps start with a request being broken down into single requests if it is a batch request. Each request is a decision request and may involve multiple policies. For each policy, there may be data needed from multiple PIPs. This data may also be present in cache, depending on configuration.

All of these are broken into smaller tasks and run concurrently, as shown in the second diagram. The results of multiple policy executions are combined to give the final decision. It uses an actor model instead of a traditional one-thread-per-request model. Actors pass messages between them and force implementers to adopt concurrency.
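As a rough illustration of this scatter-gather shape, the Python sketch below splits a batch into single requests and fetches PIP data concurrently for each one. It uses `asyncio` tasks rather than an actor model, so it shows the concurrency structure only, not the engine's actual design. The PIPs, the policies and the "every policy must agree" combining rule are all hypothetical simplifications.

```python
import asyncio
from typing import Awaitable, Callable, List

Pip = Callable[[dict], Awaitable[dict]]
Policy = Callable[[dict], bool]


async def decide_one(request: dict, pips: List[Pip],
                     policies: List[Policy]) -> bool:
    # Scatter: query every PIP concurrently. Gather: merge their attributes.
    fetched = await asyncio.gather(*(pip(request) for pip in pips))
    enriched = dict(request)
    for attrs in fetched:
        enriched.update(attrs)
    # Simplified combining rule (an assumption): every policy must agree.
    return all(policy(enriched) for policy in policies)


async def decide_batch(batch: List[dict], pips: List[Pip],
                       policies: List[Policy]) -> List[bool]:
    # A batch is split into single requests that are decided concurrently.
    return await asyncio.gather(*(decide_one(r, pips, policies) for r in batch))


async def role_pip(request: dict) -> dict:
    await asyncio.sleep(0)  # stands in for a network call
    return {"role": "admin" if request["subject"] == "alice" else "viewer"}


async def risk_pip(request: dict) -> dict:
    await asyncio.sleep(0)  # stands in for a network call
    return {"risk": "low"}


POLICIES: List[Policy] = [
    lambda a: a["risk"] == "low",
    lambda a: a["role"] == "admin" or a["action"] == "read",
]
```

A caller would run a batch with `asyncio.run(decide_batch(batch, [role_pip, risk_pip], POLICIES))`, which returns one decision per request in the original order.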

Tasks broken down into sub-tasks to run concurrently

The Decision Model

AuthZ makes decisions at multiple levels. For example, there could be decisions made at the Intuit platform-level or application-level. Accounts may have compliance-related decision-making considerations. And, a resource type, or an instance of a resource, may guide decision-making. All of this impacts resource access decisions, as shown in the diagram below.

Parallel Authorization

A subject may or may not have access to a resource for various reasons, based on multiple authorization models or multiple personas of the subject. To speed up authorization, AuthZ traverses multiple paths in parallel. Any one path resulting in a deny will result in an overall deny. If there are no denies, at least one path is needed to result in a permit to enable an overall permit. While this is a conceptual model, the implementation has optimizations to avoid high resource consumption from traversing multiple paths in parallel.
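The combining rule in this conceptual model (any deny wins, otherwise at least one permit is required) can be sketched as follows. Each "path" is modeled as a hypothetical zero-argument callable, and a thread pool stands in for whatever parallelism the real system uses. The optimizations mentioned above are not shown.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable

PERMIT, DENY, NOT_APPLICABLE = "permit", "deny", "not_applicable"


def authorize(paths: Iterable[Callable[[], str]]) -> str:
    """Evaluate all paths concurrently; any deny wins, else one permit is needed."""
    saw_permit = False
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(path) for path in paths]
        for future in as_completed(futures):
            outcome = future.result()
            if outcome == DENY:
                # Cancel paths that have not started yet; running ones finish.
                for pending in futures:
                    pending.cancel()
                return DENY
            if outcome == PERMIT:
                saw_permit = True
    # No deny was seen: permit only if at least one path permitted.
    return PERMIT if saw_permit else DENY
```

Note that a set of paths in which none applies yields a deny, which matches the requirement that at least one path must result in a permit.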

Consistency

Not all authorization needs are the same, so AuthZ is built for flexibility, and consistency, for task-based authorization constraints. Here are the mechanisms we employ to optimize flexibility:

The client may ask a strong consistency decision question. This causes caches to be skipped and a decision to be made just in time by executing policies with fresh data from PIPs.
Some of the PIPs may need information that has strong consistency needs. These PIPs can configure low cache TTLs (time-to-live) or none at all.
Policies can control whether a decision can be reused by caching or not, or conditions in which it can be reused, and for how long. The PIPs and policies can invalidate various caches on-demand for particular cached information. This is event-driven, in support of use cases when permission to an actor may have been revoked.
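The three mechanisms above (a strong-consistency request that skips the cache, a configurable TTL, and event-driven invalidation) can be illustrated with a small decision cache. This is a sketch under assumptions: the class, its method names and the injectable `clock` are hypothetical, and `compute` stands in for executing policies with fresh PIP data.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class DecisionCache:
    def __init__(self, compute: Callable[[Hashable], bool], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.compute = compute    # evaluates policies with fresh PIP data
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[Hashable, Tuple[bool, float]] = {}

    def decide(self, key: Hashable, strong: bool = False) -> bool:
        now = self.clock()
        # A strong-consistency request never reads the cache.
        if not strong and key in self._entries:
            decision, expires_at = self._entries[key]
            if now < expires_at:
                return decision
        decision = self.compute(key)
        self._entries[key] = (decision, now + self.ttl)
        return decision

    def invalidate(self, key: Hashable) -> None:
        """Event-driven eviction, e.g. when a permission is revoked."""
        self._entries.pop(key, None)
```

With `ttl_seconds=0` every call recomputes the decision, while a large TTL reuses decisions until they are explicitly invalidated.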

Availability

Decisions have to be available for products to function. AuthZ achieves this through:

Multiple deployments in fault-isolated domains to prevent disasters.
High-quality implementation of changes to the system, through canary deployments, rigorous policy validations, and multiple versions of policy (live and otherwise).
Bulkhead pattern with resource isolation, app and PIP-based quotas to create resiliency.
Ability for AuthZ to automatically transition to read-only in degraded mode.
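The bulkhead pattern with per-app quotas mentioned in the list above can be sketched with one bounded semaphore per caller. The class name, the fail-fast behavior and the quota numbers are assumptions made for illustration. The source does not describe how AuthZ implements its quotas.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class Bulkhead:
    """Caps concurrent calls per caller so one caller cannot exhaust capacity."""

    def __init__(self, quotas: Dict[str, int]):
        self._slots = {
            name: threading.BoundedSemaphore(limit)
            for name, limit in quotas.items()
        }

    def call(self, caller: str, fn: Callable[[], T]) -> T:
        slot = self._slots[caller]
        # Fail fast instead of queueing when the caller's quota is used up.
        if not slot.acquire(blocking=False):
            raise RuntimeError(f"quota exceeded for {caller}")
        try:
            return fn()
        finally:
            slot.release()
```

In this sketch a caller that exceeds its own quota is rejected immediately, while other callers keep their full allowance, which is the isolation property the pattern is named for.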

Flexibility for the Real World

AuthZ allows the client to combine consistency, availability and performance in a way that is tailored to their needs. The system can switch modes between push and pull according to use case. For a very high TTL (time-to-live), the system behaves like a push model. With no TTL, the system is pull-based. In between, there are multiple variations possible. This gives developers of consuming applications the ability to fine-tune according to use case needs.

AuthZ operates as a just-in-time rather than a just-in-case system. As such, it is better than a pure decision computation and push system, as the latter can involve computing a combinatorially high number of decisions, many of which may not be used if the user isn’t online at the time.

AuthZ also includes intelligent inputs to make authorization decisions dynamically based on internal or external threats, at times based on the behavior of a subject.

Conclusion

As Intuit’s customers, experts and partners collaborate on our AI-driven expert platform, our AuthZ system is ever watchful. It empowers administrators with fine-grained access control, watches for their actions and intents, and factors in intelligence and legal compliance to provide sub-millisecond response times.

At Intuit, we’re incredibly honored to serve millions of customers, thousands of experts and hundreds of applications with our AI-driven expert platform. And, we’re especially proud to underpin it with a unified dynamic authorization system that instills trust and confidence with Intuit users and customers alike.

Acknowledgements

Many have contributed to the development of the product and the vision. Cindy Barker, Yi Zhang, Venkat Sonnathi and Jyoti Ahuja started an effort to bring attribute-based access control to Intuit. Several enriching discussions with Mohan Naik, Gayathri Belapurkar, John Panelli, Vijayan Srinivasan, Sourabh Agarwal, Sumit Choudhary and Randhir Sinha contributed to the vision. The AuthZ team (Raghavendra SC, Ravi Chauhan, Sachin Maheshwari, Charu Garg, Anuja Barve and Rashmi GS) made early contributions to the system. The system continues to evolve and many folks continue to contribute to its success.

Trademark Attributions

XACML is a standard published by Organization for the Advancement of Structured Information Standards (OASIS)

Styra is the original creator of Open Policy Agent