Policy-as-Code for Leaner Governance

Published in Salesforce Architects, Jul 26, 2022

As an architect, you know that time spent on strategic, transformational work is much more valuable for your organization than time spent on mundane tasks like enforcing standards. At the same time, strong governance is a necessity for a successful implementation. The challenge is making governance lean enough that it does not pull you away from the activities that deserve most of your time.

After working on several Salesforce implementations that delivered multiple isolated applications inside a single org (as described in Applying Domain-Driven Design with Salesforce), I’ve learned that there is a way to streamline your processes through a combination of IT governance, lean, and DevOps principles.

IT governance

Governance is a large domain. It starts with outlining the composition and decision rights of various committees, and includes building a controls framework to ensure the effective mitigation of risk across the entire value stream.

In a Salesforce context, governance is often situated in a Center of Excellence (COE) and revolves around the execution of principles and rules about how the implementation will happen on an org. This technical governance, which incorporates many aspects of design standards, spans a range of topics, including what policies you have for visibility to records, what standards you adhere to when writing code, and how much evidence you may present to auditors. This post covers technical governance and does not cover decision rights and committee formation.

The term lean governance is gaining adoption in the Salesforce ecosystem as a response to the bulky, slow governance that a centralized COE can sometimes produce.

Learning to see the lean way

Lean improvement initiatives look at processes through the lens of value-added activities. You should be able to group all of the activities associated with any process into one of three buckets from the customer’s perspective:

Value adding (having this will add value to the customer’s business)
Non-value adding, but necessary
Non-value adding, and not necessary

An example of non-value adding, but necessary, is a QA process that doesn’t provide direct value on its own, but is necessary to support the value-adding component of software functionality. Governance falls into the same bucket as QA.

To streamline governance processes, the first step is to make a list of the activities related to each process and think about the risks they are mitigating. If an associated risk is being mitigated through some other activity, or you determine that you are willing to take that risk, then the activity is unnecessary and can be eliminated. Once you’ve removed all unnecessary activities, your next step is to think about how you can automate the remaining “necessary waste” in a way that enables you to focus more of your time on value-adding activities.

Automation in lean governance

The lean approach has a concept for an automation mindset that is well suited to governance: autonomation. It is automation with built-in supervision, allowing a human to intervene in the production process and address defects as they are introduced.

Policy-As-Code is a framework that builds on the ideas of Infrastructure as Code and is a form of autonomation. It enables you to build policy as executable code to check software configurations. In Salesforce, you can apply the Policy-As-Code ideas for lean governance of an org’s metadata.

Similar approaches can be found in products such as AWS Config and HashiCorp Sentinel.

Policy-As-Code for Salesforce

Many Salesforce governance policies include rules on naming conventions, on not having more than one Process Builder per object, or on ensuring triggers always call the org’s trigger handler. With tools such as PMD or CodeScan, you can enforce some of these policies in Apex, but what about the declarative side?

An important value proposition of Salesforce is that more work is declared in configuration rather than written in code. With a more declarative approach, enforcing policies on the metadata with existing tools such as PMD or CodeScan covers only a small portion of the solution. Compounding the challenge, Salesforce democratizes this configuration, enabling individuals with no background in software engineering to deliver significant functionality. The ease with which Salesforce enables “citizen developers” to be software makers can therefore result in Gordian knots of technical debt.

You can write a Policy-As-Code engine for Salesforce metadata to address these issues. Too hard, you say? You’ll be surprised at how much you can achieve if you are willing to stay simple and use common tools. For example, you can use simple regular expressions to build a policy engine that checks any metadata for virtually any policy you can imagine.
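To give a feel for how little is needed, here is a minimal sketch of one such check. It assumes the object metadata file records its organization-wide default in a `<sharingModel>` element, which you should confirm against your own metadata. The function name and message wording are illustrative only, not part of any existing tool.

```javascript
// Hypothetical helper: returns a list of violation messages for one object file.
// Assumption: the organization-wide default appears as <sharingModel>VALUE</sharingModel>.
function checkSharingModel(fileName, xml) {
  const match = /<sharingModel>\s*([^<\s]+)\s*<\/sharingModel>/.exec(xml);
  if (match === null) {
    return []; // nothing to assess in this file; the sketch treats it as out of scope
  }
  if (match[1] !== "Private") {
    return [`${fileName}: sharingModel is ${match[1]}, expected Private`];
  }
  return [];
}

// Example call with an inline snippet standing in for a real metadata file.
const sample = "<CustomObject><sharingModel>ReadWrite</sharingModel></CustomObject>";
console.log(checkSharingModel("Invoice__c.object", sample));
```

A regular expression is not an XML parser, so a check like this trades rigor for simplicity. That trade is the point of staying pragmatic.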

You can then integrate your policy engine with your ALM tool to stop commits, block deployments, and prevent pull requests from being generated whenever the engine identifies a policy violation. In the image below, the Policy-As-Code engine intervenes in the flow of configuration between environments, blocking configuration that does not meet policy from moving to the test environment.

Diagram showing where a policy engine fits in a development landscape

Here are some examples of declarative policies you may want to consider:

All metadata files must have a namespace prefix. (See Applying Domain-Driven Design with Salesforce for details on how we isolate applications and why namespaces are critical for this.)
Organization-wide sharing defaults of an object must be private. (See How to Build a User Security Model for more on why this is a best practice.)
Triggers always invoke the trigger handler framework.
Profiles never provide access to objects, fields, or classes.
All objects must have at least one record type.
Flows can have a maximum of 12 DML nodes (before having to be broken up into subflows).
No Process Builders, ever.
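As a rough sketch, two of the policies above might be expressed as follows. Both rest on assumptions you would need to verify: that DML nodes appear in flow metadata as `recordCreates`, `recordUpdates`, and `recordDeletes` elements, and that your org's handler is invoked as `TriggerHandler.run(...)`. That handler name is hypothetical, and the function names are illustrative.

```javascript
// Assumption: these element names mark DML nodes in flow metadata.
const DML_NODE = /<(recordCreates|recordUpdates|recordDeletes)>/g;

// Count opening tags of DML elements; closing tags do not match the pattern.
function countDmlNodes(flowXml) {
  return (flowXml.match(DML_NODE) || []).length;
}

// Policy: a flow can have at most maxNodes DML nodes.
function checkFlowDmlLimit(fileName, flowXml, maxNodes = 12) {
  const count = countDmlNodes(flowXml);
  return count > maxNodes
    ? [`${fileName}: ${count} DML nodes, maximum is ${maxNodes}`]
    : [];
}

// Policy: every trigger calls the handler framework.
// The default pattern is a hypothetical handler call; replace it with your own.
function checkTriggerUsesHandler(
  fileName,
  triggerSource,
  handlerCall = /\bTriggerHandler\.run\s*\(/
) {
  return handlerCall.test(triggerSource)
    ? []
    : [`${fileName}: trigger does not call the trigger handler`];
}
```

Note that the trigger check would also be satisfied by a handler call inside a comment. Whether that matters depends on how much you trust your contributors.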

You may well suggest that these can all be inspected in the code review process. And you’re right, they can be. But equally, this “necessary waste” can be automated away, enabling you, the architect, to focus on higher-level concerns such as uplifting the capabilities of your administrators or innovating with the platform for business transformation.

How we built our pragmatic Policy-As-Code Engine

The COE team approached the design of our autonomation (that is, our Policy-As-Code engine) in the simplest possible way: a single JavaScript file that recurses through a directory of “rule-scripts” and executes each one. Each rule-script conforms to a common type with an “execute” function, and that function returns a common type: an array of exceptions.
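The entry point described above could look something like the following. This is a sketch under assumptions, not our actual code: the names `listFiles`, `loadRules`, and `runEngine` are invented here, each rule-script is assumed to be a CommonJS module, and each metadata file is assumed to be passed as an object with `path` and `content` properties.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Recursively collect every file under dir that ends with the given extension.
function listFiles(dir, extension) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) return listFiles(fullPath, extension);
    return entry.isFile() && fullPath.endsWith(extension) ? [fullPath] : [];
  });
}

// Load each rule-script and keep only modules that expose an execute function.
// The rule name defaults to the file name unless the module sets its own.
function loadRules(rulesDir) {
  return listFiles(rulesDir, ".js")
    .map((file) => ({
      name: path.basename(file, ".js"),
      ...require(path.resolve(file)),
    }))
    .filter((rule) => typeof rule.execute === "function");
}

// Run every rule against every metadata file and flatten the results.
// metadataFiles is assumed to be an array of { path, content } objects.
function runEngine(rules, metadataFiles) {
  const exceptions = [];
  for (const file of metadataFiles) {
    for (const rule of rules) {
      const found = rule.execute(file) || []; // a rule may return nothing
      for (const message of found) {
        exceptions.push({ rule: rule.name, file: file.path, message });
      }
    }
  }
  return exceptions;
}
```

Keeping rule loading separate from rule execution means a new policy is just a new file in the rules directory, with no change to the engine itself.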

Internally, the execute function takes the metadata file it has been passed (for example, Account.object or myFlow.flow) and runs a regular expression, or a series of regular expressions, to assess the metadata’s conformance to the rule. If it finds issues, it returns an array of exceptions; otherwise, it returns nothing.
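A single rule-script in that shape might look like the sketch below, which takes the “profiles never provide access” policy as its example. It interprets the policy to mean that a profile file should contain no `objectPermissions`, `fieldPermissions`, or `classAccesses` sections at all. That interpretation, the element names, the file name, and the `{ path, content }` file shape are all assumptions made for illustration.

```javascript
// rules/profile-no-access.js (hypothetical file name)
// Assumption: profile metadata grants access through these XML sections.
const FORBIDDEN_SECTIONS = [
  { label: "object permissions", pattern: /<objectPermissions>/ },
  { label: "field permissions", pattern: /<fieldPermissions>/ },
  { label: "Apex class access", pattern: /<classAccesses>/ },
];

// file is assumed to look like { path: string, content: string }.
function execute(file) {
  if (!file.path.endsWith(".profile")) return undefined; // rule does not apply
  // Run the series of regular expressions and collect one exception per hit.
  const exceptions = FORBIDDEN_SECTIONS
    .filter(({ pattern }) => pattern.test(file.content))
    .map(({ label }) => `${file.path}: contains a section for ${label}`);
  return exceptions.length > 0 ? exceptions : undefined; // nothing when clean
}

module.exports = { execute };
```

Adding a fourth forbidden section is a one-line change, which keeps the rule readable for reviewers who are not JavaScript specialists.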

The script executes on the pre-commit hook of every commit. It doesn’t stop the commit, but it returns the results to the ALM tool so that the work cannot move forward until it has been reviewed.
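How the results reach the ALM tool depends on the tool. The sketch below assumes it can pick up a JSON report printed by the hook. The function names, the report fields, and the status values are hypothetical.

```javascript
// Build a machine-readable summary of one engine run.
function buildReport(exceptions) {
  return {
    status: exceptions.length === 0 ? "passed" : "needs-review",
    exceptionCount: exceptions.length,
    exceptions,
  };
}

// In hook mode the returned exit code is always 0, so the commit itself
// is never rejected; the report is what flags the work for review.
function runAsHook(exceptions, write = console.log) {
  write(JSON.stringify(buildReport(exceptions), null, 2));
  return 0; // value to pass to process.exit by the calling script
}
```

Separating reporting from blocking keeps developers unblocked locally while still leaving a record that something needs a second look.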

Taking a leaf out of the static code analysis playbook, we invented an exception model, where any rule could include a list of specific metadata artifacts that were not required to conform to it. There may be, for example, a particular object that must have its organization-wide sharing defaults set to Public Read/Write for some valid reason. The messy world throws up exceptional situations all the time, so you have to account for them.
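One way to sketch that exception model is a wrapper that skips exempt artifacts before the rule runs. To avoid confusion with the exceptions a rule reports, the list is called `exemptFiles` here. That name, the `withExemptions` helper, and the exempt object in the example are all invented for illustration.

```javascript
// Wrap a rule so that exempt artifacts are skipped before the rule runs.
function withExemptions(rule) {
  const exempt = new Set(rule.exemptFiles || []);
  return {
    ...rule,
    execute(file) {
      if (exempt.has(file.path)) return []; // exempt artifact, nothing to report
      return rule.execute(file) || [];
    },
  };
}

// Example rule: sharing model must be Private, with one documented exemption.
const privateSharing = withExemptions({
  name: "sharing-model-private",
  exemptFiles: ["objects/Announcement__c.object"], // hypothetical exempt object
  execute: (file) =>
    /<sharingModel>Private<\/sharingModel>/.test(file.content)
      ? []
      : [`${file.path}: sharing model is not Private`],
});
```

Because the exemption list lives next to the rule, every exempt artifact is visible in version control and can be challenged in review.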

Over time, as we iterated the policy engine, we improved its design. But the core model still exists: an entry-point script that calls a series of rule-scripts in a rules directory, with an exception model to allow the messy world to still exist without reporting a ton of false positives.

Diagram showing a single Policy Engine with multiple rule scripts

Your pragmatic next steps

Once you see the value in adopting this kind of approach, you start thinking about your next steps. Here are some suggestions:

Quantify the cost-saving opportunity of implementing such a tool. Measure how much time you spend in code reviews checking that various standards have been met. This will provide you with the business value of having a tool that could do the checking for you.
Spend some time understanding Policy-As-Code as a domain. Read the AWS Config or HashiCorp Sentinel documentation and consider how the same concepts apply to Salesforce.
Review the Lightning Flow Scanner tool, which applies the methods described in this post, though only for flows.
Write a simple script, in the language of your choice, that parses metadata and makes some policy assertions about your org.
Align your Platform Owner with the value and effort required for a simple start.
Lastly, if you’re really keen, learn to see the lean way by reading up on lean. Lean Thinking or Learning to See, for example, can help you make your processes leaner still.

Bringing all the worlds together

Thinking about governance, and all the “necessary-waste” activities that comprise its domain, makes you wonder if there is some way to make it all go away. How can you “lean” governance down so that it’s not such a burden on the business? How can you free yourself as the architect to spend your time adding high-level value instead of, for example, correcting a new developer on the appropriate usage of Flow?

Every day, our policy engine “stops the production line” and prevents poor-quality metadata from entering the value stream. It has already saved us once from accidentally changing Case to Public Read/Write. You too can be saved from performing low-value work. You’ll also sleep easier, as will your business sponsor, knowing that the policy engine never gets tired and never makes a mistake.