
A key performance indicator for infosec organizations

Using probabilistic risk KPIs to direct complex risk engineering efforts.

Ryan McGeehan
Sep 16, 2019

I’ve been helping a few security engineering organizations in the Bay Area experiment with quantifiable risk modeling approaches that use clear language. We’re doing this to subject security teams to better measurement beyond (or in addition to) compliance, checklists, grades, color coding, or maturity models.

It’s difficult to unify broad security work across disparate disciplines under a single quantitative key performance indicator (KPI) that addresses rare, high-impact cybersecurity risks.

We will discuss the potential of probabilistic, risk-aware KPIs now being piloted at a few large tech companies. First, some background:

Leadership likes to elevate certain KPIs to guide behavior.

Some examples:

Twitter has Monthly Active Users and Timeline Views; Facebook has similar metrics. Others are more dollar-based: Cloudflare uses Paying Customers and Dollar-Based Net Retention Rate, eBay has Gross Merchandise Volume, and Uber has Trips.

Leadership can choose these as rudimentary proxies for success and for investor awareness. They act as a north star for a massive organization, giving employees a simple, ambient self-assessment for separating good ideas from bad ones.

Problem: The security industry inside tech has not adopted risk-based, probabilistic, quantitative KPIs that leadership can stand up for large and complex engineering organizations.

What is a reliable KPI for a complex security organization?

The following reasoning comes by way of my general conversations with CEOs and CISOs in consulting engagements. If you ask what the goals of a security team should be, they might respond with water-cooler generalities like the following.

  • “We don’t want to lose any customers.”
  • “We don’t want to be fined or regulated.”
  • “We don’t want to be a headline.”
  • “We don’t want to be pulled in front of the senate.”
  • “We don’t want to lose customer data or IP.”
  • “We don’t want to harm our customers.”

These are good goals. They map closely to simple, probabilistic, risk-aware measurements. Here are some:

The probability that within 1 month / quarter / year:

Many of the KPIs below depend on some sort of internal incident classification: a qualitative measure of how bad an incident is. Some companies use P0; the examples below use SEV0 for the most severe class.

  • >N regrettable customer exits resulting from a SEV0.
  • Any party in {set of regulators} formally discusses a SEV0 with us.
  • A SEV0 with >$10M of losses. (Choose your own threshold!)
  • A {set of bloggers and newspapers} publishes commentary on a SEV0.
  • A SEV0 involves confirmed, unauthorized access to customer data.
  • >N% of total users impacted by a SEV0 involving an explicitly defined failure.

These can be measured and tested with a variety of subjective and quantitative risk measurement methods. Understanding how risk based KPIs can be useful in engineering contexts is an important goal for me.
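As a minimal sketch of one such quantitative method, here is a Monte Carlo estimate of a KPI like “the probability of at least one SEV0 with >$10M of losses within a year.” The incident rate and loss probability are hypothetical forecast inputs, not data from any real program:

```python
import math
import random

def poisson_sample(lam):
    """Draw from Poisson(lam) via Knuth's algorithm (stdlib only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def simulate_year(sev0_rate_per_year, p_loss_exceeds_10m, trials=100_000):
    """Estimate P(at least one SEV0 with >$10M losses within a year).

    Both inputs are hypothetical forecasts: a Poisson rate of SEV0
    incidents per year, and the chance any single SEV0 exceeds the
    loss threshold.
    """
    hits = 0
    for _ in range(trials):
        # Draw a count of SEV0 incidents for this simulated year.
        incidents = poisson_sample(sev0_rate_per_year)
        # The trial counts if any incident crosses the loss threshold.
        if any(random.random() < p_loss_exceeds_10m for _ in range(incidents)):
            hits += 1
    return hits / trials

print(simulate_year(0.5, 0.3))  # roughly 0.14 with these example inputs
```

With better forecasts or frequency data, the same loop answers the monthly and quarterly versions of the question by scaling the rate.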

Can we trust a probabilistic KPI?

I think we can. NASA approves and launches missions using a probability of Loss of Crew, and the intelligence community produces probabilistic estimates despite game-theoretic adversaries. We can use probabilistic KPIs too.

KPIs have limitations and faults. They’re a north star when used honestly. They are elevated only to guide behavior and help with decision making.

An organization will likely define its own non-probabilistic KPIs in support of the probabilistic goals anyway. That’s fine; it’s normal for tasks and teams to deviate from organization-wide KPIs.

For example: Mean Time to Detection is a potential metric for a security team. Is it a good, single proxy for your entire security program? Probably not. To avoid maintaining a long list of metrics as a proxy for success, organizations usually home in on a short list of KPIs.

This is an opportunity to be less attached to narrow metrics that represent passing efforts or interests as engineering focus changes, and to hold steady to more reliable, broad KPIs over time. We should explore how to introduce subjective, probabilistic risk into KPIs.

It’s all models, anyway.

Treating risk as an explicit model lets us build expert panels; add, subtract, and prioritize expected values; determine appetite and tolerance; decompose a risk into causes; use frequency data to inform our opinions; measure for error; run Monte Carlo tooling; and more!
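Some of these modeling moves, pooling an expert panel’s forecasts and decomposing a risk into causes, take only a few lines. All names and probabilities below are hypothetical, purely for illustration:

```python
# Hypothetical expert-panel forecasts for P(qualifying SEV0 this year),
# decomposed into causes. Three forecasters per cause.
panel = {
    "credential theft":  [0.05, 0.08, 0.04],
    "vendor compromise": [0.02, 0.03, 0.02],
    "insider misuse":    [0.01, 0.02, 0.01],
}

# Pool each cause by simple averaging (one common aggregation choice).
pooled = {cause: sum(fs) / len(fs) for cause, fs in panel.items()}

# Treating causes as independent, the chance that at least one occurs
# is the complement of none occurring.
p_none = 1.0
for p in pooled.values():
    p_none *= 1.0 - p

p_any = 1.0 - p_none
print(f"P(>=1 qualifying SEV0 this year) = {p_any:.3f}")  # prints 0.091
```

The independence assumption and the simple average are both modeling choices an organization would want to debate; the point is that the model makes those choices explicit and arguable.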

Avoiding the organizational pitfalls of risk based KPIs.

This approach generates problems we must discuss.

No one fully believes that a KPI can truly capture success as a model. Models can only go so far in capturing a normative concept. This is wonderfully fictionalized in The Wire, where gamed metrics are repeatedly portrayed as disingenuous.

So while KPIs are selected carefully, they’re not holy by any means, and risk based KPIs inherit well known problems.

Organizations that make decisions in spite of a KPI are not rule breakers or criminals.

Risk-based KPIs will suffer the same well-known weaknesses, simply from being quantitative and inviting attempts to game them. An organization will eventually see efforts that increase measured risk (like an M&A), and that is OK. Otherwise, this is where toxicity could brew.

Examples: We celebrate functions that discover risks, like red teams, penetration testing, threat intelligence, and hunting. These are good for a company! But they may also increase the risk we measure, because they build evidence about risks.

That’s OK! We wouldn’t want to limit those efforts because they discover information that would increase a risk-based KPI. We’ll need norms that protect measurement: measurements should be accurate, not minimized.

If norms around honest risk measurement are enforced, along with forecast accountability, we can get to a place where a red team can do a good job even when it doesn’t break in. In a healthy environment, red teams will have some of the most interesting measurement potential in a security organization.

Increasingly accurate and calibrated risk forecasting is the direction we’d want to pursue as an industry, regardless of whether it’s finding new or fixing old areas of risk.
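Forecast accountability needs a scoring rule. Here is a minimal sketch using the Brier score, a standard proper scoring rule for probability forecasts; the forecasts and outcomes are illustrative, not real data:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.

    Lower is better. Always guessing 50% earns 0.25, so a calibrated
    forecaster should beat that over time.
    """
    assert len(forecasts) == len(outcomes)
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical quarterly forecasts of "a SEV0 occurs this quarter"
# versus what actually happened (1 = occurred, 0 = did not).
forecasts = [0.10, 0.30, 0.05, 0.60]
outcomes = [0, 0, 0, 1]
print(brier_score(forecasts, outcomes))  # prints 0.065625
```

Tracking a score like this per team or per forecaster is one way to reward accurate measurement rather than minimized numbers.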

More data is not necessarily useful for leadership.

It’s problematic that CISO leadership desires more and more data, but it is also indicative of their needs. Leadership at a (non-security) KPI-driven organization can count on one hand the numbers they steadily lead with. If KPIs and a strategy haven’t been chosen to represent security, it is difficult to show that we are secure.

In lieu of this structure, we pursue many metrics instead.

This brings me to the point: top-level KPIs rarely change. Facebook’s MAU, for example, has seen only minor tweaks for a decade. Underlying business units shift and change their KPIs as the business evolves, but the leading KPI keeps a familiar form. These are not GAAP metrics; they were chosen by some form of consensus or executive decision.

Security teams often run without risk based KPIs, yet are interested in reducing risk. I’d like to continue exploring why!

What’s next?

If you’re exploring the area of risk measurement for engineering organizations, I’d love to hear about it.

Ryan McGeehan writes about security on scrty.io

Starting Up Security

Guides for the growing security team
