N-Man Rule for Break Glass
Break Glass procedures are commonly used to protect and grant emergency access to critical information, such as high-privileged accounts or secrets. They are not limited to information security and come in many forms. Some require physical access to the secret, while others require two individuals to be available to perform the procedure. The latter is usually referred to as the “Two-man rule” or “the two-person concept.” In this article we discuss a new approach to Break Glass, a distributed and redundant extension of the “Two-man rule,” which we call the “N-Man rule.”
The issue with current Break Glass models
Taking a closer look at current Break Glass models, one immediately finds several limitations and dependencies like the ones previously mentioned, which stem naturally from the protections required to secure such critical information. These precautions are absolutely necessary, as the last thing you want is to have the GodMode keys to your precious assets misused, abused, or exposed.
Although these Break Glass methods can be considered acceptable from a protection standpoint, they don’t scale well with highly dynamic and global operational requirements. Take the Two-man rule, for example: it is possible to split a secret between two individuals, ensuring that no single person can use it autonomously, but you will always need both individuals to be available, and only those two individuals. So if the time comes to (knock on wood) actually use the emergency access, how do you ensure global, 24/7, highly available and immediate access to that secret? How will you rotate that secret efficiently? Will you depend on physical tokens or similar hardware? How would that work in terms of global logistics? And what’s the cost of running all of this?
So what if we could actually extend the current Two-man rule in a distributed and redundant manner, in order to overcome some of these limitations? Enter the “N-Man rule.”
The ‘N-Man rule’ Break Glass model
The operating principle of the “N-Man rule” is set by the following:
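Take a Secret (S) and split it into as many parts (S1 to Sn) as there are Key Custodians (KC1 to KCn). Distribute the parts across the Key Custodians sequentially according to a defined Replication Factor (rf): each Key Custodian receives rf consecutive parts, starting with their own and wrapping around to the 1st part after the last one. The Replication Factor must relate to the number of Key Custodians in the form 1 < rf < KCn.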
No single Key Custodian will have access to the full Secret, but a combination of multiple Key Custodians (depending on the defined Replication Factor) will in fact be able to complete the full original Secret. Nice!
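To make the principle concrete, here is a minimal Python sketch of the split and the sequential distribution. It is not our PoC code: the function names `split_secret` and `distribute` are made up for this illustration, and cutting the Secret into contiguous chunks of near-equal length is an assumption, since the model itself does not prescribe how the parts are cut.

```python
def split_secret(secret: str, custodians: int) -> list[str]:
    """Cut the secret into `custodians` contiguous parts of near-equal length.

    Assumption for this sketch: parts are plain substrings of the secret.
    """
    if len(secret) < custodians:
        raise ValueError("secret is too short to give every part a character")
    base, extra = divmod(len(secret), custodians)
    parts, start = [], 0
    for i in range(custodians):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first parts
        parts.append(secret[start:start + size])
        start += size
    return parts


def distribute(parts: list[str], rf: int) -> list[dict[int, str]]:
    """Give custodian i the parts i, i+1, ..., i+rf-1, wrapping around at the end."""
    n = len(parts)
    if not 1 < rf < n:
        raise ValueError("replication factor must satisfy 1 < rf < KCn")
    return [
        {(i + k) % n + 1: parts[(i + k) % n] for k in range(rf)}
        for i in range(n)
    ]


# Made-up secret, 3 Key Custodians, Replication Factor 2.
holdings = distribute(split_secret("correct-horse-battery", 3), rf=2)
for number, held in enumerate(holdings, start=1):
    print(f"KC{number} holds parts {sorted(held)}")
```

Each Key Custodian ends up with a small dictionary mapping part numbers to part values, which is what they would store in their password/secrets manager.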
Example 1
As an example, and in its simplest form, we can have a Secret (S) split among 3 Key Custodians (KCn=3) with a Replication Factor of 2 (rf=2).
How would we split and distribute the Secret parts among all Key Custodians?
- Key Custodian 1 (KC1) would have the 1st and 2nd Secret parts (S1 and S2), but not the 3rd (S3).
- Key Custodian 2 (KC2) would have the 2nd and 3rd Secret parts (S2 and S3), but not the 1st (S1).
- Key Custodian 3 (KC3) would have the 3rd and 1st Secret parts (S3 and S1), but not the 2nd (S2).
With this distribution in place, no single Key Custodian has access to the full Secret, but every combination of 2 Key Custodians can actually complete the full Secret by combining their corresponding Secret parts. Simple, right?
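Putting the parts back together is just a matter of merging what each Key Custodian holds and checking that nothing is missing. The sketch below assumes each Key Custodian keeps their parts labeled with the part number; the `combine` helper and the "red-green-blue" secret are invented for this illustration.

```python
def combine(*holdings: dict[int, str], total_parts: int) -> str:
    """Merge the numbered parts held by several custodians into the full secret."""
    merged: dict[int, str] = {}
    for held in holdings:
        merged.update(held)
    missing = [i for i in range(1, total_parts + 1) if i not in merged]
    if missing:
        raise ValueError(f"missing secret parts: {missing}")
    return "".join(merged[i] for i in range(1, total_parts + 1))


# Example 1 layout (KCn=3, rf=2) with a made-up secret "red-green-blue".
kc1 = {1: "red-", 2: "green-"}
kc2 = {2: "green-", 3: "blue"}
kc3 = {3: "blue", 1: "red-"}

print(combine(kc1, kc2, total_parts=3))  # red-green-blue
print(combine(kc2, kc3, total_parts=3))  # red-green-blue
print(combine(kc1, kc3, total_parts=3))  # red-green-blue
```

Calling `combine` with a single Key Custodian raises a `ValueError` naming the missing part, which is exactly the property we are after.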
Example 2
Now what if we have 4 Key Custodians (KCn=4) and keep the same Replication Factor of 2 (rf=2)?
- Key Custodian 1 (KC1) would have the 1st and 2nd Secret parts (S1 and S2), but not the 3rd and 4th (S3 and S4).
- Key Custodian 2 (KC2) would have the 2nd and 3rd Secret parts (S2 and S3), but not the 1st and 4th (S1 and S4).
- Key Custodian 3 (KC3) would have the 3rd and 4th Secret parts (S3 and S4), but not the 1st and 2nd (S1 and S2).
- Key Custodian 4 (KC4) would have the 4th and 1st Secret parts (S4 and S1), but not the 2nd and 3rd (S2 and S3).
In this case, and similar to the previous example, no single Key Custodian has access to the full Secret. This time around, however, any 3 Key Custodians are needed to guarantee that you can unlock the full Secret, as the Replication Factor remains the same (rf=2). Two Key Custodians can also complete it, but only in specific combinations (KC1 with KC3, or KC2 with KC4), which we don’t want.
Example 3
Finally (and you’re probably already getting where this is going), we can demonstrate how this would fit in a 4 Key Custodians (KCn=4) distribution, but this time with a Replication Factor of 3 (rf=3). The distribution would be:
- Key Custodian 1 (KC1) would have the 1st, 2nd and 3rd Secret parts (S1, S2 and S3), but not the 4th (S4).
- Key Custodian 2 (KC2) would have the 2nd, 3rd and 4th Secret parts (S2, S3 and S4), but not the 1st (S1).
- Key Custodian 3 (KC3) would have the 3rd, 4th and 1st Secret parts (S3, S4 and S1), but not the 2nd (S2).
- Key Custodian 4 (KC4) would have the 4th, 1st and 2nd Secret parts (S4, S1 and S2), but not the 3rd (S3).
In this final example, once again, no single Key Custodian has access to the full Secret, but with a Replication Factor of 3 (rf=3) any 2 out of the 4 Key Custodians are enough to unlock the full Secret.
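The three examples can be double-checked by brute force. The sketch below assumes the sequential layout used in the examples (each Key Custodian holds rf consecutive parts, wrapping around) and looks for the smallest group size at which every possible group of Key Custodians covers all parts. The helper names `held_parts` and `guaranteed_quorum` are made up for this illustration.

```python
from itertools import combinations


def held_parts(custodian: int, n: int, rf: int) -> set[int]:
    """Part numbers (1-based) held by a 1-based custodian, wrapping around."""
    return {(custodian - 1 + k) % n + 1 for k in range(rf)}


def guaranteed_quorum(n: int, rf: int) -> int:
    """Smallest group size at which every group of custodians holds all n parts."""
    everything = set(range(1, n + 1))
    for size in range(1, n + 1):
        if all(
            set().union(*(held_parts(c, n, rf) for c in group)) == everything
            for group in combinations(range(1, n + 1), size)
        ):
            return size
    raise ValueError("rf must be at least 1")


for n, rf in [(3, 2), (4, 2), (4, 3)]:
    print(f"KCn={n}, rf={rf}: any {guaranteed_quorum(n, rf)} Key Custodians suffice")
```

For this layout the results (2, 3 and 2) line up with KCn - rf + 1: every part is missing from exactly KCn - rf Key Custodians, so any larger group must include someone who holds it.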
The Pros and Cons
Taking what we’ve seen here so far, we can identify a few advantages and disadvantages for this model. For advantages we have:
- Redundant. No single point of failure in terms of availability, as the full Secret can be completed with different Key Custodians, depending on the chosen Replication Factor.
- Distributed and power-neutral. Not a single individual has access to the full Secret to push the Doomsday button.
- Easy to implement and does not require any special tools, although a password/secrets manager is expected.
- Immediately available if required and not dependent on physical media, which is especially important in emergency situations and with tight SLAs (Service Level Agreements).
We must note one important disadvantage, related to the Secret length. Secrets must be both strong enough and long enough to prevent trivial password guessing attacks from a single party, because each Key Custodian already knows most of the Secret and only has to guess the parts they are missing (e.g. a minimum of 24 characters for an 8-character guessing slot in a KCn=3, rf=2 configuration).
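The arithmetic behind that figure can be sketched in a few lines. This assumes the Secret is cut into equal-length parts, so that each Key Custodian is missing KCn - rf parts. The `minimum_secret_length` helper is a name made up for this illustration, and the "guessing slot" is the number of characters a single Key Custodian would still have to guess.

```python
import math


def minimum_secret_length(n: int, rf: int, guessing_slot: int) -> int:
    """Shortest equal-part secret leaving each custodian >= guessing_slot unknown chars."""
    if not 1 < rf < n:
        raise ValueError("replication factor must satisfy 1 < rf < KCn")
    missing_parts = n - rf  # parts a single custodian does not hold
    part_length = math.ceil(guessing_slot / missing_parts)
    return part_length * n


print(minimum_secret_length(3, 2, 8))  # 24
print(minimum_secret_length(4, 3, 8))  # 32
print(minimum_secret_length(4, 2, 8))  # 16
```

Note how raising the Replication Factor improves availability but also demands a longer Secret for the same guessing slot, since each Key Custodian is missing fewer parts.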
What’s next
In this article we’ve looked at some of the current limitations of Break Glass procedures, and proposed an extension to the existing Two-man rule in order to address some of these issues and make the process distributed and redundant.
If you want to give this method a quick run, we’ve created a very (very) simple Python PoC to support it, which reads a Secret from STDIN and does all the splitting for you based on the configured number of Key Custodians and Replication Factor. The code is available here.
We would love to hear your thoughts on this, so please feel free to reach out to us!