Hierarchy of Controls in IT

6 min read · May 14, 2023

Minimizing hazards with IT solutions

If you’ve seen Tyson Gabriels’ takedown of community masking policy (see his Substack), you will not only understand that these policies were unscientific, you will also be familiar with the Hierarchy of Controls and its associated inverted pyramid depiction. Below is a version of that pyramid, somewhat altered to apply to tech. The hierarchy of controls is a systematic set of common-sense approaches we should apply to reduce risks.

Seniors working in IT solution development should be knowledgeable about these controls, so if you are looking to move into a senior role, this will give you a fairly comprehensive overview of topics to look into.

The use of IT solutions is fraught with hazards, although happily software “bugs” don’t cause deadly contagions. There is the hazard of implementing IT solutions that do not work well or are difficult to implement, manage or maintain. This applies to both hardware and software, and delivery and cost are of course risks as well.

A fundamental point to grasp about the different levels of control is their varying degree of effectiveness. The top 3 layers of controls are the most effective, while the lower levels are the least effective because they depend more on the behaviours of many human actors, and changing business culture and individual behaviours can be very challenging.

Let’s dig into the pyramid as technology implementers and see how its wisdom applies to the way we approach IT solutions. As a business scales up and the impact of failure rises, it becomes more critical to work on all these areas of risk.

Elimination

There are a number of conservative principles in software development, such as YAGNI (you-aren’t-gonna-need-it) and KISS (keep-it-simple, stupid), that try to control hazards at source by eliminating them. Thinking more widely, we can even ask things like:

do we need this digital delivery?
do we need this application?
do we need this feature?
do we really need this/these cloud service(s), server(s), laptop(s) etc.?
do we need this software?
do we need a database, cache, message-queue etc.?
do we need this service?
do we need this programming language?
do we need this program?
do we need this method, routine, line of code?

If we can say no to any of these perceived needs, then doing without should be seriously considered as the right course of action.

Anything we can entirely eliminate takes away a point of failure, thereby getting rid of a hazard. This control is best applied at the inception of projects or changes, when it is most practical and has the greatest impact; later on, a choice will probably be difficult to undo.

Substitution

We can’t always choose to eliminate a part of the solution, but there may be an alternative to use instead that will reduce risk. One technology does something more reliably than another, one programming language is less prone to bugs than another, and so forth.

which hardware option has the better MTBF?
will a cloud service be more dependable than on-prem?
is there a more reliable networking tech?
is there a more resilient database?
is there some more reliable library code to use?
what programming language option reduces bugs?
will developer B do a better job than developer A?

As with elimination, substitution can have a hugely beneficial impact on reducing risk and raising the quality of service. Again, this is something to focus on during inception phases, because it may prove impractical to switch to a better alternative later.
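The MTBF question in the list above can be made concrete with a little arithmetic. The sketch below uses the standard steady-state availability formula, MTBF / (MTBF + MTTR), to compare two options. The function names and all the figures are hypothetical and chosen only for illustration; real decisions would use vendor or field data.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a component is up.

    mtbf_hours: mean time between failures
    mttr_hours: mean time to repair (or replace) after a failure
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def expected_downtime_hours_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Expected hours of downtime in a 365-day year."""
    hours_per_year = 365 * 24
    return (1.0 - availability(mtbf_hours, mttr_hours)) * hours_per_year


# Hypothetical figures for two candidate components (not vendor data).
# Option A fails more often but is quick to swap out;
# option B fails less often but takes much longer to repair.
option_a = availability(mtbf_hours=50_000, mttr_hours=8)
option_b = availability(mtbf_hours=100_000, mttr_hours=48)

print(f"Option A availability: {option_a:.5f}")
print(f"Option B availability: {option_b:.5f}")
```

With these made-up numbers, the option with the lower MTBF comes out ahead on availability because it is so much faster to repair, which suggests that MTBF is worth reading alongside repair time rather than on its own.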

Engineering Controls

At the middle tier of the hierarchy of controls we start to get more deeply into mitigating the risks inherent in the systems in place, the ones we couldn’t avoid entirely through elimination or substitution. Engineering controls are systems put in place with technology that reduce the risk of failures or manage them when they occur.

Software development benefits from numerous engineering controls, including:

automated testing
automated code generation
static code analysis
version control
code quality metrics
logging
automated deployment
load testing
security testing
continuous integration and delivery
metrics monitoring

Infrastructure can also include engineering controls such as:

backups and redundancy
power management solutions
network monitoring systems
cloud services
access control systems
system configuration management
systems monitoring

By putting engineering controls in place we isolate the source of problems from the vulnerable systems, or manage the consequences of failure to reduce the effect upon services.
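As one illustration of managing the consequences of failure, here is a minimal sketch of a redundancy control: try a primary provider and fall back to a replica if it fails. The names `call_with_failover`, `primary` and `replica` are hypothetical, and a production version would likely add timeouts, retries and health checks.

```python
import logging
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")
logger = logging.getLogger("failover")


def call_with_failover(providers: Sequence[Callable[[], T]]) -> T:
    """Call each provider in order and return the first successful result.

    Failures are logged rather than silently swallowed, so that monitoring
    can still notice a degraded primary.
    """
    if not providers:
        raise ValueError("at least one provider is required")
    last_error: Optional[Exception] = None
    for index, provider in enumerate(providers):
        try:
            return provider()
        except Exception as exc:  # broad on purpose: any failure triggers failover
            logger.warning("provider %d failed: %s", index, exc)
            last_error = exc
    raise RuntimeError("all providers failed") from last_error


# Hypothetical providers standing in for a primary service and its replica.
def primary() -> str:
    raise ConnectionError("primary is down")


def replica() -> str:
    return "served by replica"


result = call_with_failover([primary, replica])
print(result)
```

The caller still gets an answer when the primary is down, and the logged warning ties this control back to the logging and monitoring items in the lists above.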

Administrative Controls

So far we have looked at implementing technical solutions to reduce the risk of failure. The bottom layers of the hierarchy of controls are measures taken to directly mitigate the risk that people’s behaviours introduce to IT systems. Human behaviour can have the most dramatic effect when it comes to risk. If you have ever sat through IT security training, you will know that something as simple as a successful phishing attack can bring a business down.

Administrative controls are implemented through working practices. They are attempts to cope with the risks that could not be eliminated by the superior approaches above. They may include all manner of best practices, guidelines, standards and procedures that people using IT systems need to follow. Administrative controls can be experienced as onerous, so there will be a tendency to try to avoid or circumvent them.

When administrative controls become onerous, it may also be an indicator of poor engineering, and as human-based systems, administrative controls are subject to human error. For these reasons it is preferable to mitigate risks by the superior methods where possible. Administrative controls are an ongoing expense that adds to delivery costs and timelines. Engineering solutions, by contrast, can prevent issues from arising, so they can be regarded as more of a setup cost that helps to reduce ongoing costs and speed up delivery.

Software developers should be familiar with the following list of administrative controls:

coding standards, best practices and guidelines
code reviews, pair programming
testing: unit testing, integration testing, system testing, acceptance testing, regression testing, performance testing, security testing, usability testing, exploratory testing, user acceptance testing, A/B testing, localisation testing, compatibility testing
version control and change management
secure programming
documentation: requirements documentation (consider BDD), design documentation, test documentation, installation documentation, configuration documentation, user documentation, support documentation, maintenance documentation, security documentation, disaster recovery documentation
continuous integration and deployment

Many software developers are averse to providing comprehensive documentation, since we prefer to code. However, documentation of some form can play an important role in disaster recovery, in support and, of course, in efficient and effective knowledge transfer.

At the business level there are also many ways of better managing risk by implementing administrative controls, including:

specified ways of working to manage delivery: Scrum, XP, Waterfall etc.
risk management
quality assurance
effective communication
interviewing, education and training
the working environment and culture
audits
disaster recovery planning
access measures/security measures

PPE?

So finally, what can we do as individual contributors to reduce our own risk of failure? We can’t just don some personal protective equipment (PPE), but what is there to assist us beyond administrative controls? Here are some individual-centred actions that help reduce the risk of defective delivery:

use of code analysis plugins for IDEs that detect bugs and code smells
assistive prompting tools for IDEs
code defensively, assure quality by checking code metrics
using patterns, consulting experienced experts/expert literature
self development, discussion and study etc.
taking good care of ourselves physically and emotionally
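To make "code defensively" from the list above a little more concrete, here is a minimal sketch of a function that validates untrusted input at the boundary and fails early with a clear message. The function `parse_port` is a hypothetical example, not taken from any particular codebase.

```python
def parse_port(raw: str) -> int:
    """Convert user-supplied text into a TCP port number.

    Bad input is rejected immediately with a descriptive error instead of
    being passed deeper into the system, where it would fail less clearly.
    """
    if not isinstance(raw, str):
        raise TypeError(f"port must be given as text, got {type(raw).__name__}")
    text = raw.strip()
    # Accept plain ASCII digits only: no signs, decimals or exotic numerals.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"port must be a whole number, got {raw!r}")
    port = int(text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be between 1 and 65535, got {port}")
    return port


print(parse_port(" 8080 "))
```

The checks cost a few lines, but they turn a vague downstream failure into an immediate, explainable one, which is about as close as an individual contributor gets to wearing protective equipment.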

There is of course an immediate cost to doing things well enough to keep the risks manageable, and then there are the risks of taking short-cuts. Businesses should plan to apply a level of controls commensurate with the risks and, of course, within budget.

High internal quality reduces the cost of future features, meaning that putting the time into writing good code actually reduces cost. — Martin Fowler, Is High Quality Software Worth the Cost?

As individual contributors, we can have the most significant effect on successful delivery. Making sure we deliver quality, and encouraging others to do so, will help raise the bar and reduce the total cost of ownership. It is better to spend 2 months delivering a project properly than to spend 1 month delivering it poorly and then many more months dealing with the consequences: bugs, defects and code that is hard to maintain.