Threat Modelling Journey — Developing a Centralised Enterprise Capability

Mehran Koushkebaghi
Nationwide Technology
Mar 9, 2022

1. Introduction

Threat modelling is one of the vital activities we perform when building a new system or changing an existing design. Who owns it, which parties are involved, what the key deliverables are, and where it sits in the software development lifecycle all depend on many factors and vary from one organisation to another.

In this article, I’ll explore a model in which threat modelling is done by a centralised enterprise team, which is the first step for many organisations in their threat modelling adoption journey. In this model, threat modelling is performed before deployment to the production environment or when an application migrates from one domain to another (e.g. from on-premise to the public cloud). A central team of security engineers therefore owns the activity and helps the other parties perform the threat modelling.

As the process matures, the threat modelling activity shifts left in the development lifecycle, and the development team takes ownership of the exercise. In future articles, I’ll explain the intermediate steps from the current model to the ideal target model.

2. Process

In this section, I’ll explain the parties that are involved in the process, the different steps and the outcome of the threat modelling activity.

2.1. Involved Parties

The proposed process involves three parties. Each party looks into the system from a different perspective and makes a unique contribution to the developing model.

  • System Owner: Software engineers, system architects and any other individuals responsible for the design and implementation of the system. They help develop a view of the system, as they’re closest to the system under investigation. It is worth noting that this covers anyone who owns the system’s technical decisions and does not refer to its business owners.
  • Security Engineer: The security expert who looks at the system’s design from a security perspective and understands the security implications of fine-grained design and implementation decisions.
  • Risk and Assurance Specialist: The individual who knows the organisation’s risk appetite, can map identified threats to the organisation’s guardrails, security standards and regulations, and facilitates determining the severity of the threats.

2.2. Steps

The diagram below illustrates the threat modelling business process and highlights the necessary steps throughout the journey. It also introduces a shared responsibility model in which the Security Engineer leads the entire process but relies on cross-team collaboration to accomplish the task.

It is vital to recall that the process below runs iteratively as the development team changes the system (e.g. adding a new feature). The “start” and “end” therefore mark each iteration’s boundary.

Threat Modelling Process

2.2.1. Building a View of the System

In this stage, the team identifies the system of interest for threat modelling and collectively builds a system view at the required abstraction level. The entire picture might contain several systems, and it is critical to draw the system boundary to define the scope of the threat modelling. To cover the whole system, we’ll move up and down in the abstraction levels and build the complete view of the system gradually. In section 3, I’ll explain this in more detail.

2.2.2. Developing data flow diagrams

Once we have identified the system of interest, we’ll identify the key components and develop data flow diagrams to understand how data flows between them. The data flow diagram should incorporate the message content, the communication method (protocol), and the sensitivity of the transmitted data.
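For teams that like to keep this information alongside the design, the same attributes can be captured as a lightweight structure. The sketch below is a minimal illustration in Python; the component names, protocols and sensitivity labels are hypothetical placeholders, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class DataFlow:
    """One edge in the data flow diagram."""
    source: str        # component sending the data
    destination: str   # component receiving the data
    protocol: str      # communication method, e.g. HTTPS, AMQP
    payload: str       # short description of the message content
    sensitivity: str   # classification of the transmitted data

# Hypothetical flows for an example payments service
flows = [
    DataFlow("web-client", "payments-api", "HTTPS", "card payment request", "confidential"),
    DataFlow("payments-api", "audit-queue", "AMQP", "payment event", "internal"),
]
```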

2.2.3. Identifying Threats

We can identify threats to the components and their communication channels based on the data flows we created in the previous steps. STRIDE might be a reasonable option for identifying threats in application-centric threat modelling.
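As an illustration of how STRIDE can be used as a prompt against each element of the data flow diagram, the sketch below enumerates one question per category for each flow. The flow descriptions are hypothetical, and the questions are only starting points for the workshop discussion.

```python
# STRIDE categories used as prompts against each element of the diagram.
STRIDE = [
    "Spoofing", "Tampering", "Repudiation",
    "Information disclosure", "Denial of service", "Elevation of privilege",
]

# Hypothetical flows taken from the data flow diagram
flows = ["web-client -> payments-api (HTTPS)", "payments-api -> audit-queue (AMQP)"]

for flow in flows:
    for category in STRIDE:
        print(f"{category}: how could this occur on {flow}?")
```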

2.2.4. Evaluating Threat Exploitability

Performing a risk evaluation helps us assess the threat. The purpose of the assessment is to determine whether an actor can actually exploit it. The system’s sensitivity, the organisation’s risk appetite and the attackers under consideration (e.g. a nation-state external attacker versus a malicious partner) are some of the details that feed into this evaluation. Depending on the outcome, we’ll move to step 2.2.5 or 2.2.6.
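A simple way to make this triage repeatable is to score each threat against the factors mentioned above. The sketch below is purely illustrative: the attacker profiles, weightings and threshold are assumptions, and a real organisation would derive them from its own risk appetite and standards.

```python
# Illustrative exploitability triage; all weightings and the threshold are assumptions.
ATTACKER_CAPABILITY = {"opportunist": 1, "malicious partner": 2, "nation state": 3}
SENSITIVITY_WEIGHT = {"public": 1, "internal": 2, "confidential": 3}

def is_exploitable(sensitivity: str, attacker: str, internet_facing: bool) -> bool:
    """Return True to go to control design (2.2.6), False to log the threat (2.2.5)."""
    score = SENSITIVITY_WEIGHT[sensitivity] * ATTACKER_CAPABILITY[attacker]
    if internet_facing:
        score += 2
    return score >= 5

print(is_exploitable("confidential", "malicious partner", internet_facing=True))  # True
```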

2.2.5. Log the Threat

It is essential to record the threat, as any future change to the system (change in design, technology stack, computing power, etc.) might make it exploitable later. The system owner should log the threat and periodically revisit the register to ensure there hasn’t been a significant change to the initial assumption. The details of threat log ownership (and the business process required to revisit it) are outside the scope of this discussion and should be defined based on organisational requirements.
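The shape of a register entry matters less than the fact that it records the assumption that makes the threat unexploitable today and a date to revisit it. The sketch below shows one possible record; the field names and values are illustrative only.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ThreatLogEntry:
    """One record in the threat register; field names are illustrative."""
    threat_id: str
    description: str
    assumption: str     # why the threat is considered unexploitable today
    system_owner: str
    next_review: date   # when the assumption should be revisited

entry = ThreatLogEntry(
    threat_id="T-042",
    description="Replay of payment events on the internal queue",
    assumption="The queue is only reachable from the private subnet",
    system_owner="payments-team",
    next_review=date(2022, 9, 1),
)
```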

2.2.6. Control Design

Based on the nature and complexity of the threat, we can use existing enterprise tooling, leverage native functionality in the ecosystem (e.g. cloud provider native solutions, host native functionality, etc.) or design a bespoke solution to mitigate the risk. The outcome of this stage is a mitigation control that can minimise the risk but does not yet contain the implementation details.
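To make the distinction with step 2.2.10 concrete, a designed control can be recorded as a statement of intent with an owner, without any implementation detail. The record below is a hypothetical sketch, not a prescribed format.

```python
from dataclasses import dataclass

@dataclass
class MitigationControl:
    """A designed control describes intent, not implementation."""
    control_id: str
    threat_id: str
    control_type: str   # preventative, detective or corrective
    description: str    # what the control must achieve
    owner: str          # team responsible for implementing it

control = MitigationControl(
    control_id="C-017",
    threat_id="T-051",
    control_type="detective",
    description="Alert on repeated failed logins against the payments API",
    owner="platform-engineering",
)
```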

It is worth mentioning that, as we’re performing the threat modelling retrospectively, a certain level of control may already exist in the system at this point. However, a lack of structured threat modelling could lead to controls that are not effective at mitigating the intended threat. The threat modelling activity needs to capture these controls and assess their effectiveness. In my next article, I’ll show a couple of examples of how poor detailed design can undermine the effectiveness of a control.

2.2.7. Risk Assessment

A risk assessment is performed to identify the risk posed to the organisation by not implementing a control for an identified threat. The details of the risk assessment process might vary significantly depending on the nature of the business and its industry. The associated risk enables the business to understand the impact and probability of a potential compromise due to the lack of a control.
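One common way to express this is an impact and likelihood matrix. The sketch below is a minimal illustration with made-up scales and thresholds; a real assessment follows the organisation’s own methodology.

```python
# Minimal impact x likelihood matrix; the scales and thresholds are assumptions.
IMPACT = {"low": 1, "medium": 2, "high": 3}
LIKELIHOOD = {"rare": 1, "possible": 2, "likely": 3}

def risk_rating(impact: str, likelihood: str) -> str:
    score = IMPACT[impact] * LIKELIHOOD[likelihood]
    if score >= 6:
        return "unacceptable: control or exemption required"
    if score >= 3:
        return "tolerable: track and review"
    return "acceptable"

print(risk_rating("high", "possible"))  # unacceptable: control or exemption required
```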

2.2.8. Ask for Exemption

If it is not possible to design a mitigation control (e.g. due to technology constraints or time and budget limits), an exemption needs to be granted for the particular threat. Different organisations have varying internal processes for dealing with dispensations. The request should have a deadline and include a plan to address the issue within the specified timeframe.

2.2.9. Assess Control Effectiveness

After the mitigation control has been designed, the Risk and Assurance Specialist assesses its effectiveness. Additional mitigation controls are required if the designed control doesn’t provide sufficient protection against the threat. Otherwise, we’ll move to the next step.

2.2.10. Control Implementation

Depending on the nature of the control (preventative, detective or corrective), different artefacts will be delivered at this stage. In the case of a preventative control, the security engineer provides the implementation details needed to translate the control into application (or infrastructure) code. Monitoring use cases (for the SIEM solution) and remediation runbooks are potential deliverables of detective and corrective controls.
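As a purely illustrative example of a preventative control translated into application code, the snippet below rejects requests that arrive without a bearer token before any business logic runs. The function and header names are hypothetical and stand in for whatever the designed control actually requires.

```python
# Illustrative preventative control: reject unauthenticated requests up front.
def process_payment(body: bytes) -> int:
    return 200  # placeholder for the real business logic

def handle_request(headers: dict, body: bytes) -> int:
    token = headers.get("Authorization", "")
    if not token.startswith("Bearer "):
        return 401  # the control fires before any business logic runs
    return process_payment(body)

print(handle_request({"Authorization": "Bearer abc123"}, b"{}"))  # 200
print(handle_request({}, b"{}"))                                  # 401
```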

2.2.11. Verification

A certain level of testing is required to ensure the implemented controls work as intended and are effective. Different techniques (e.g. fuzzing, policy-as-code, etc.) can be utilised to verify the implemented controls and complete the process.
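As a sketch of the policy-as-code idea, the check below verifies that a deployed configuration still matches a designed control (no publicly readable storage). It is written in plain Python rather than any particular policy engine, and the configuration shape is an assumption made for the example.

```python
# Minimal policy-as-code style check; the configuration shape is hypothetical.
def check_no_public_storage(config: dict) -> list:
    violations = []
    for bucket in config.get("storage_buckets", []):
        if bucket.get("public_read", False):
            violations.append(f"{bucket['name']} allows public read access")
    return violations

config = {"storage_buckets": [{"name": "payments-exports", "public_read": False}]}
assert check_no_public_storage(config) == []  # the control is still in place
```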

2.3. Outcome

Many artefacts are produced in this activity and should be considered the outcome of the exercise, including:

  • A detailed view of the system’s components (and their interactions). It is a living document and should be kept updated.
  • A list of unexploitable threats (for now!) that will be stored in a repository — described in section 2.2.5
  • A list of exploitable threats with acceptable risk and a remediation plan — explained in section 2.2.8
  • A list of threats posing an unacceptable risk to the system, mapped to the design and implementation details for managing that risk.

3. Building a View of Complex Systems

Threat modelling assists us in building a full view of the system across different layers. It need not be an exhaustive upfront analysis: the system changes for multiple reasons (adding a new feature, replacing old technology, etc.), and it is impractical to create a single document that captures a detailed view of the system and the list of threats.

Performing the threat modelling in multiple distinct layers has several benefits, including:

  • Providing better visibility of the controls at each layer
  • Facilitating the implementation of controls by enabling the delivery team to quickly identify the responsible team (e.g. system architect, platform engineer, software engineer, etc.)
  • Improving the readability of the final artefact
  • Reducing the threat model’s maintenance costs

We will start the exercise at the highest level of abstraction and move vertically and horizontally to cover all the system’s interactions.

The vertical move enables us to move between abstraction layers (up and down), and the horizontal move allows us to switch between components at the same abstraction level. Combining vertical and horizontal moves lets us cover all the interactions between system components.
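One way to picture the two movements is to treat the system as a tree of components: a horizontal move iterates over the siblings at one abstraction level, and a vertical move descends into a component’s children. The sketch below uses hypothetical component names purely to illustrate the traversal.

```python
# Illustrative abstraction layers; component names are hypothetical.
system = {
    "name": "payments-platform",
    "children": [
        {"name": "web-client", "children": []},
        {"name": "payments-api", "children": [
            {"name": "auth-module", "children": []},
            {"name": "ledger-module", "children": []},
        ]},
    ],
}

def walk(component: dict, depth: int = 0) -> None:
    """Each loop iteration is a horizontal move; each recursive call is a vertical move."""
    print("  " * depth + component["name"])
    for child in component["children"]:
        walk(child, depth + 1)

walk(system)
```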

We’ll use various methods, such as studying design diagrams (application architecture, infrastructure, etc.), implementations (application code, infrastructure code, etc.) and workshop sessions with the system owners, to uncover the details and understand the system.

The first stage involves identifying the critical components of the system. At this stage, we’ll choose the system we would like to investigate and study its interactions with other systems, performing all the steps described in the previous section.

Once we have studied this level, we will move down the abstraction layers. By switching the system of interest, we can iterate over all subsystems and study their interactions with the other components. Matin Mavaddat has proposed a methodology for studying the system of interest and its interactions in a separate article.

You might notice that the diagram below lives at the same abstraction level as the one above. We have only switched the system of interest to study another subsystem and its interactions. We make this horizontal movement within a particular abstraction layer to concentrate on each element and its interactions with other systems.

We might decide to zoom in further to study the finer-grained details of the interactions and determine the threats at that layer. The decision depends on the complexity of the component, the amount of control we have over it, and the criticality of the system we’re building. If necessary, we can extend the threat model to a lower level of abstraction and investigate the system at that layer.

The design below might offer a better understanding of the horizontal and vertical movement described above. We can split the overall picture into four major components:

  1. A client web application
  2. The cloud environment that hosts the system under investigation
  3. The on-premise infrastructure that hosts some of the business-critical services
  4. The external services/APIs that the business relies on to implement the business logic

At the highest level of abstraction, we’ll study those components and their communication channels (e.g. the clients’ connection to the APIs exposed on the public cloud, and the link between the public cloud infrastructure and the on-premise environment).
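The same top-level scope can be written down explicitly so the first iteration has a clear boundary. The lists below are a hypothetical rendering of the four components and the channels studied at this level.

```python
# Hypothetical top-level scope for the first threat modelling iteration.
components = [
    "web-client",
    "public-cloud-environment",
    "on-premise-infrastructure",
    "external-apis",
]

channels_in_scope = [
    ("web-client", "public-cloud-environment", "HTTPS calls to the exposed APIs"),
    ("public-cloud-environment", "on-premise-infrastructure", "private link between cloud and on-premise"),
    ("public-cloud-environment", "external-apis", "outbound calls to third-party services"),
]
```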

The outcome of the investigation is a list of threats and the controls designed at that level to secure those communication channels. As we’re at the highest level of abstraction, we don’t yet have visibility of the details of those communications, nor of how the controls that secure the system are implemented. Those details become known as we move vertically and horizontally through the abstraction layers. We need to develop a plan that covers all the interactions at each layer by defining the scope of each threat model around functionalities.

In the subsequent article, I’ll go through a real-world example and show how applying the introduced method allows us to construct a full view of the system throughout the threat modelling practice.

It is essential to remember that, in enterprise environments, multiple teams deliver and maintain different parts of the system. The implicit internal shared responsibility model is not always clearly defined, which adds to the complexity of the threat modelling activity. Threat modelling should take this into account, and the described methodology helps us recognise it in our practice.

4. Conclusion

In this article, I described a process for delivering a centralised threat modelling service. The approach would be helpful for organisations taking their initial steps in introducing threat modelling to their software development lifecycle. I’ll clarify the approach in future posts by giving examples of applying it to real-world systems. In a separate article, I’ll also introduce a strategy for moving from a centralised model to a developer-driven, decentralised one, which unlocks the true power of threat modelling and enables secure-by-design software development.
