Managing cloud security risks with agility

Mark Lee
Apr 21 · 10 min read

It is time to rethink how we manage cloud security risks.

With the rapid advancements in cloud computing, businesses of all sizes and industries are turning to cloud services. These businesses rely on Cloud Service Providers (CSPs) to maintain and enhance the supporting infrastructure, while they focus on the application logic and code that support critical business functions. Upfront capital expenditure (CapEx) is also reduced, as the pay-per-use model offers the flexibility to scale computing resources up or down to match their IT needs.

However, I have observed that even though the usage of cloud services is increasing, the security risks and impact associated with cloud adoption are less well understood. For example, since accountability for risk management and data security remains with the owner, we need to consider what to do if the CSP-managed security controls are insufficient to mitigate the security risks. One possible way is to ‘layer’ the same set of controls that we are accustomed to using in an on-premises environment onto the cloud environment.

Doing so, however, may result in an ‘off-balance’ state in managing risk for cloud adoption, with undesirable outcomes such as:

· Process lapses and oversights by staff, leading to an increase in incidents.

· Ineffective control measures leading to ‘holes’ that may be exploited by threat actors.

· Higher cost and effort required to remediate system design flaws that are discovered late in the development life cycle.

While these issues are neither new nor unique to cloud adoption, their root causes deserve a deeper look. As such, this article seeks to answer the following questions:

a) What are the key contributing factors that lead to these issues?

b) What approaches can we take to manage these issues effectively?

c) How can we operationalise the approach in an agile fashion?

What are the key contributing factors?

A paradigm shift in application and systems development cycles

Upon deeper reflection, I notice a paradigm shift in how applications and systems are developed. The current approach tends to be more time-sensitive and involves multiple iterations akin to the “agile” development model, rather than the more conventional linear “waterfall” model. This can create challenges for the organisation:

a) It is more difficult to align traditional risk management practices with the application sprint processes, which could lead to less-than-ideal rigour in risk identification and thus impair the ability to mitigate security risks.

b) Application and system design flaws are discovered only during security testing and audits, and have to be resolved by repeatedly ‘bolting on’ corrective actions, instead of ‘baking in’ the necessary baseline security measures right at the beginning of development.

Hence, there is a need for a different approach: one that has the agility to keep pace with systems and applications development, yet empowers teams to manage risks and remain in control at every iteration; an approach that strikes the right balance between security and business efficiency considerations for cloud adoption.

What are the approaches to manage risk effectively?

Adopt a “De-Risk by Design” approach, which is team-centric, agile and repeatable, to be risk and control-aware at every process iteration.

The “De-risk by design” approach is a repeatable set of actions that is easy to remember. The goal is to begin security risk management early by identifying risks at the start of the development life cycle. This allows risk analysis to keep pace with each iterative application change through methods such as application threat modelling. As for systems and networks, the mitigating controls can be built into the design as a ‘blueprint’, e.g. Infrastructure as Code (IaC), which consists of a control baseline that can be changed iteratively alongside design changes.
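To make the ‘blueprint’ idea concrete, here is a minimal Python sketch of a control baseline kept as data and checked against a design at each iteration. It is only an illustration under my own assumptions: the control IDs, setting names and the `violations` helper are hypothetical and are not taken from any framework or IaC tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    """One baseline control: a design setting and the value it must have."""
    control_id: str   # hypothetical identifier, not taken from any framework
    setting: str
    required: object


def violations(design: dict, baseline: tuple) -> list:
    """Return the IDs of baseline controls that the design does not satisfy."""
    return [c.control_id for c in baseline if design.get(c.setting) != c.required]


# Iteration 1: baseline agreed at the start of the development life cycle.
BASELINE_V1 = (
    Control("ENC-01", "storage_encrypted", True),
    Control("NET-01", "public_access", False),
)

# Iteration 2: a design change introduces a new risk, so one control is added.
BASELINE_V2 = BASELINE_V1 + (Control("LOG-01", "access_logging", True),)

design = {"storage_encrypted": True, "public_access": False}
print(violations(design, BASELINE_V1))  # []
print(violations(design, BASELINE_V2))  # ['LOG-01']
```

Because the baseline is plain data, it can be versioned and reviewed alongside the design, so a new risk shows up as a small, visible change rather than a late surprise.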

Referring to the diagram below, I have adapted the Deming cycle of “Plan-Do-Check-Act” to bucket common security gaps alongside their mitigating measures, which correspond to the five functions of NIST’s Cybersecurity Framework: “Identify, Protect, Detect, and Respond & Recover”. This provides a common and easy-to-understand framework to promote ownership of security across teams. It is almost akin to a team sport, where teams of security and non-security practitioners (like athletes playing different roles) follow a common “fitness regime”, i.e. a security risk management regime, to keep “fit” together.

Figure 1: De-Risk by Design is a common reference approach for managing security risks as a team sport

But wait! How is this different from the traditional method of managing security risks?

In the traditional method, there is a tendency to focus on controls and solutions instead of the risk. Such an approach can become an arduous process with a laundry list of controls and solutions, which could include irrelevant measures. This old way of “exercising” (i.e. keeping all assets secured) does not keep us “fit”. On the contrary, it provides a false sense of security that the assets are protected, at a potentially higher business cost, when in reality the implemented controls are ineffective.

De-risking focuses on the security objectives and can be done at each iteration

De-risking helps by shifting the focus to security objectives, risks and measures instead of solutions. This means that one can quickly identify new risk areas, or a shift in risk posture, before determining the appropriate and relevant security controls to close the identified risk gaps. Control measures are “baked into” the design, not “bolted on”.

Now that we have a possible approach, we arrive at our final question!

How do we operationalise the approach?

There are two parts to the answer:

(1) Identify the Security Objectives (SOs) as they are critical in ensuring risk-mitigating measures are ‘on target’ to manage security risks within the enterprise’s risk appetite

The broad ‘big picture’ SO helps to ensure that security risks are within the organisation’s risk appetite and tolerance. The diagram below summarises the relationship between these three terms: the SO, risk appetite and risk tolerance. I will explain this concept with an analogy about speed limits on roads and expressways:

⦁ Typical speed limits that we encounter on the roads are 50 or 60 km/h, and up to 90 km/h for expressways. A company’s risk appetite is like the speed limit set by the authorities for safe travelling on the roads.

⦁ If a driver travels at a speed above the limit, the authorities can penalise the driver for speeding. Risk tolerance is the acceptable variance from the risk appetite: how far beyond the ‘limit’ the organisation is prepared to go.

Figure 2: The top-level SO manages security risks within the risk appetite through effective and appropriate risk treatment
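The relationship between risk appetite and risk tolerance can be sketched in a few lines of Python. This is a simplified illustration, assuming that risk is expressed as a single numeric score and that tolerance is the allowed variance above the appetite; the function name, the scale and the values are hypothetical.

```python
def classify_risk(score: float, appetite: float, tolerance: float) -> str:
    """Place a risk score relative to the appetite and the tolerated variance."""
    if tolerance < 0:
        raise ValueError("tolerance must not be negative")
    if score <= appetite:
        return "within appetite"
    if score <= appetite + tolerance:
        return "within tolerance"
    return "exceeds tolerance"


# Hypothetical values on a 1-10 scale, chosen only for illustration.
for score in (5, 7, 9):
    print(score, classify_risk(score, appetite=6, tolerance=2))
```

In practice an organisation may express appetite qualitatively rather than as a number, so treat the score as a stand-in for whatever scale senior management has agreed on.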

In a nutshell, the top-level SO is established to preserve the confidentiality, integrity and availability of information within the risk appetite determined by the senior management, using the appropriate risk treatment methods. In general, the SO does not change drastically over time, though the methods to achieve the SOs are dynamic as threats and technology evolve.

The next-level objectives can be derived subsequently based on the context of the system or application, and the organisation’s risk appetite. An example of an objective could be “to provide a secure cloud application system with the assurance that it is suitable to process and store customer data.”

(2) Adopt an iterative 4-step process to help increase the agility needed to manage cloud risks effectively

Now that the SOs have been established, it is time to assess the security risks and determine the controls that help mitigate them. Cloud control frameworks such as the Cloud Security Alliance’s Cloud Controls Matrix (CSA CCM) are established tools that are useful in building a risk and control map across the various infrastructure and application stacks. Based on the SO(s) identified earlier, you can leverage such tools to perform a systematic security assessment and identify the relevant security risks and corresponding control measures. Let’s look at ‘De-risk by Design’ in action and its corresponding security practices!
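As a rough idea of what a risk and control map could look like in practice, the Python sketch below lists risks per stack layer with their controls and flags risks that have no control yet. The layers, risks and controls are hypothetical examples of my own and are not entries from the CSA CCM.

```python
from collections import defaultdict

# Each row links a stack layer and a risk to its mitigating controls.
# All entries are hypothetical and only illustrate the shape of the map.
RISK_CONTROL_MAP = [
    {"layer": "application", "risk": "injection of malicious input",
     "controls": ["input validation", "code review"]},
    {"layer": "data", "risk": "unauthorised access to customer data",
     "controls": ["encryption at rest"]},
    {"layer": "network", "risk": "exposed management interface",
     "controls": []},
]


def unmitigated(rows: list) -> list:
    """Return (layer, risk) pairs that have no control mapped to them yet."""
    return [(row["layer"], row["risk"]) for row in rows if not row["controls"]]


def controls_by_layer(rows: list) -> dict:
    """Group the mapped controls by stack layer."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row["layer"]].update(row["controls"])
    return dict(grouped)


print(unmitigated(RISK_CONTROL_MAP))
print(controls_by_layer(RISK_CONTROL_MAP)["application"])
```

Starting from the risks rather than from a list of controls keeps the map aligned with the security objectives, and an empty control list is an immediate prompt for discussion.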

Figure 3: De-Risk by Design as an agile and iterative approach to manage security risks at every development phase

1. Plan — ‘shift left’ the risk identification process

I have covered the importance of starting the risk identification process early. The security baseline can be established with questions such as ‘what are our security objectives’, ‘where are the risk areas’ and ‘what controls do we need to address these risks’. The team can think of this baseline as a blueprint that can be layered with controls should new risks surface.

2. Do — perform a review of existing and new risks arising from design changes with agility

The intent is to identify whether the risks have changed, and whether the relevance and effectiveness of existing control measures have shifted. If so, the security baseline can be updated incrementally. This can be further enhanced by automating the process to “bake in” the base set of security controls, so that a consistent security baseline blueprint is reproduced for every deployment.

3. Check — stay vigilant over evolving security threats

Changes in the threat landscape can shift the system’s security posture and hence, it is important to maintain constant awareness of our security posture.

Technology tools can help to detect threats and weaknesses (such as drifts in security configuration, or unauthorised activities) in a timely manner. However, detection by itself is insufficient and it should be combined with follow-up action to analyse the events or alerts. This ensures gaps are remediated and the security baseline can be updated incrementally.
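A drift check of this kind can be sketched in Python as follows. This is a minimal illustration, assuming that both the security baseline and the deployed configuration are available as simple key-value settings; the setting names and both helper functions are hypothetical and do not represent any particular tool.

```python
def detect_drift(baseline: dict, deployed: dict) -> dict:
    """Map each drifted setting to its (expected, actual) pair.

    A setting missing from the deployed configuration is reported with
    an actual value of None.
    """
    return {
        setting: (expected, deployed.get(setting))
        for setting, expected in baseline.items()
        if deployed.get(setting) != expected
    }


def follow_up_actions(drift: dict) -> list:
    """Turn each finding into a follow-up item so detection leads to action."""
    return [
        f"Investigate '{setting}': expected {expected!r}, found {actual!r}"
        for setting, (expected, actual) in sorted(drift.items())
    ]


# Hypothetical baseline and deployed settings for illustration.
baseline = {"storage_encrypted": True, "public_access": False, "access_logging": True}
deployed = {"storage_encrypted": True, "public_access": True}

drift = detect_drift(baseline, deployed)
for action in follow_up_actions(drift):
    print(action)
```

The second function is deliberately simple: its point is that every detected drift produces an item someone has to analyse and close, after which the baseline can be updated if needed.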

4. Act — verify the readiness of response and continuity plans

Though the security management of a cloud infrastructure lies mainly with the cloud provider, it is paramount for the business owner to place priority on response and continuity plans. In the unfortunate event that a security incident occurs, such plans will help in providing a robust response to the incident and recovery actions can be activated quickly to work towards normalcy. Notably, a badly managed incident can affect the reputation of the business.

Each of the 4 steps is iterative, similar to how modern systems are developed: they cater to an immediate business need but are always ready to take on future requirements. Enhancements can be incrementally built and deployed in weeks rather than months. De-risk by design brings agility: doing the right things from the start and “baking in” security as early as possible.

So, by taking on this de-risk by design approach, do we still need to remain vigilant?

This approach is not a silver bullet that solves all security challenges. Gaps are still likely to occur and, in some cases, the gaps may be recurring.

One possible reason is that, despite the remediation actions put in place, the root cause was not addressed. Here is a method to augment the de-risk approach by identifying the root cause using the “5-Why” technique.

The 5-Why technique aims to help identify the root cause of the gaps surfaced

You may have heard of or used this method, which analyses the cause-and-effect relationships behind a problem. If you’re a parent, you may recall, much to your exasperation, your child asking ‘why’ repeatedly! The objective of this technique is to reveal the root cause of a problem through a series of ‘why’ questions, each based on the answer to the previous ‘why’.

As a rule of thumb, the style of questioning is meant to assess the series of events leading to the root cause without the intention of assigning blame. When used correctly, the root cause is usually revealed by the fifth ‘why’. It may well be a process failure that is within your control.

Here’s an example of a problem scenario to illustrate how this works. The questions and responses that are correlated to each other are highlighted in bold.

In this scenario, the root cause of the problem is that there was no proper guide to perform security assessments quickly, which indicates a process failure. You could reduce the frequency of configuration changes or extend the delivery deadlines, but these are unlikely to prevent the problem from recurring. The solution could be to develop a process that guides the developers in conducting a security assessment, and to ensure all team members are kept updated on this process through regular briefings and knowledge tests. This sounds challenging, especially when one has to make many time-pressured iterative changes. Perhaps the de-risk by design approach may help alleviate the situation!

Key Takeaways

In summary, here are the three key takeaways from this article:

1. Determine the security objectives that establish the relevant security risks and the appropriate control measures. This helps ensure the investment (cost and effort) spent on security controls remains relevant and on point to manage risk through iterative application or system changes.

2. Discover security risks and introduce control measures early, preferably at the start of the Design phase. The “De-risk by Design” approach, a repeatable process enabled by agility, can be a practical method for risk identification and treatment.

3. Ensure your response plans are updated and remain relevant. Conduct regular exercises to help each team member stay familiar with the processes. After all, response measures should be kept well oiled so that action can be taken swiftly. And do not forget the “5 Whys” to identify the root cause and plug security gaps!

I hope this article has helped provide an agile and repeatable approach to be risk and control aware at every iterative step of business change. After all, security risk management ought to be a team effort!

CSG @ GovTech

GovTech CSG — keeping the Singapore Government’s ICT and Smart Systems safe and secure
