Dealing With Risk

jdtangney · Published in Lean Security
4 min read · Aug 10, 2018

In my last epistle, I started exploring risk and vulnerability as properties of bugs, and concluded that security issues, functional defects and usability bugs can often be lumped together. I said we need to look at the risk associated with a bug, not just its mere existence, before we decide to fix it.

I waved my hands a bit and then walked away. But now let’s take a closer look at risk.

There are plenty of models that define risk and how to calculate it. They usually boil down to some measure of the value of the asset you’re protecting, how likely a loss is to occur, and a measure of your exposure. Actuaries have waded in these waters for centuries, and like punters everywhere, they’ve studied the form and placed their bets. When you insure your car, an actuary has a pretty good idea of how likely you are to cost them money, based on crash statistics for a 2007 Ford Fiesta, where you live, how old you are, and a whole lot of things that would make you shudder if you knew just how much they know about you. They have a good idea how much that car is worth. They can calculate a premium that will ensure a profit yet still remain competitive, even if you do have an accident.
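
To make that concrete, here’s a minimal sketch of the arithmetic those models boil down to, using the classic annualized loss expectancy formula (asset value times exposure factor times how often a loss happens per year). The numbers are invented purely for illustration.

```python
# Minimal sketch of a quantitative risk estimate (illustrative numbers only).
# ALE = SLE x ARO, where SLE = asset value x exposure factor.

def annualized_loss_expectancy(asset_value, exposure_factor, annual_rate_of_occurrence):
    """Classic quantitative risk formula: expected loss per year."""
    single_loss_expectancy = asset_value * exposure_factor
    return single_loss_expectancy * annual_rate_of_occurrence

# The insurer's rough view of your 2007 Ford Fiesta:
car_value = 3000    # what the asset is worth
exposure = 0.6      # fraction of the value lost in a typical crash
crash_rate = 0.05   # crashes per driver per year for your demographic

print(annualized_loss_expectancy(car_value, exposure, crash_rate))  # 90.0 per year
```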

But in most areas of life, it’s extremely difficult to come up with a numerical assessment. Few of us know, or can even begin to ascertain, the statistical likelihood of a bug/vuln being exploited. Just putting a value on the assets is slippery: How much is my users’ data, their PII, worth to the biz that’s storing it? How much would it cost in lost reputation if the data was exposed? (Smart people are debating these questions even as I type, but I’m expecting them to let me know when they’ve figured it out.) The textbooks say, Oh, no worries, if you don’t got no numbers, just do a qualitative assessment, it’ll be fine. Which really just means assigning an S, M, or L to each factor.
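
In case that sounds too hand-wavy to be useful, here’s roughly what the qualitative version amounts to in practice: a lookup table, not a calculation. The ratings in this sketch are invented; the table itself is where the judgment lives.

```python
# A qualitative risk "matrix": no numbers, just S/M/L ratings combined by lookup.
# (Likelihood, impact) -> overall risk. The table is the judgment call.

RISK_MATRIX = {
    ("S", "S"): "S", ("S", "M"): "S", ("S", "L"): "M",
    ("M", "S"): "S", ("M", "M"): "M", ("M", "L"): "L",
    ("L", "S"): "M", ("L", "M"): "L", ("L", "L"): "L",
}

def qualitative_risk(likelihood, impact):
    """Combine S/M/L likelihood and impact into an S/M/L risk rating."""
    return RISK_MATRIX[(likelihood, impact)]

print(qualitative_risk("M", "L"))  # "L": medium likelihood, large impact -> large risk
```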

Ok, I’m being dismissive, and as I learn more about risk, I’m sure I’ll regret that tone. But if you’re not building up regrets in this life, you’re not pushing it to the limit, right? YOLO.

So what do you do with risk? The choices are fairly cut and dried: Avoidance, Acceptance, Mitigation and Transference. No, those are not the stages of grief or the names of the Horsemen of the Apocalypse. But their definitions are pretty straightforward.

Avoidance means you don’t do the thing that might cause a problem. Wanna avoid a traffic accident? Don’t drive or ride in a car. Wanna avoid a software bug blowing up? Don’t ship any code. Those are the extremes, and sometimes the extreme is the smart solution, but there are also more subtle and pragmatic avoidance measures. For example, if you’re concerned about PII, don’t store PII. This has given rise to the oft-repeated advice: the less data you gather from people, the less likely you are to regret it later.
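
A sketch of what that can look like in code: persist only the fields the feature actually needs, and let the rest evaporate. The field names here are hypothetical.

```python
# Risk avoidance by data minimization: store only what the feature needs.
# (Field names are hypothetical.)

signup_form = {
    "username": "jdt",
    "email": "jd@example.com",      # PII we don't need to keep
    "date_of_birth": "1970-01-01",  # PII we definitely don't need
    "plan": "free",
}

FIELDS_WE_ACTUALLY_NEED = {"username", "plan"}

record_to_store = {k: v for k, v in signup_form.items() if k in FIELDS_WE_ACTUALLY_NEED}
print(record_to_store)  # {'username': 'jdt', 'plan': 'free'}
```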

Here’s another risk avoidance example: Avoid buffer overflow bugs by using a language and execution environment that makes them impossible. Java or Golang or Python, not C.
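
For contrast, here’s what that same class of mistake looks like in a memory-safe language: the out-of-bounds write becomes a controlled exception instead of silent memory corruption.

```python
# In a memory-safe language, an out-of-bounds write can't scribble over
# adjacent memory; the runtime turns it into an exception you can handle.

buffer = bytearray(8)

try:
    buffer[8] = 0x41  # one past the end -- a classic off-by-one
except IndexError as err:
    print(f"caught it: {err}")  # caught it: bytearray index out of range
```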

Acceptance is what we have traditionally done with known bugs. (And even the unknown ones, as I’ll explain.) We decide that the odds of the bug being encountered, the consequences of it being encountered, and the cost to fix it all point towards just living with it. We accept the risk.

Regarding unknown bugs, we take what smells like a slightly actuarial approach to those. We know there are unknown bugs, and we can take a guess at how many there are, based on our historical find/fix rates and so on. Not many teams do that in any sort of scientific way anymore, and so we tend to rely on intuition (there’s that qualitative thing again) to help us understand whether, on the whole, our team ships crappy code or good code.
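
If you did want to take that guess a bit more scientifically, the back-of-the-envelope version looks something like this. The numbers are invented, and real defect-estimation models are considerably fancier.

```python
# Back-of-the-envelope estimate of latent (unknown) bugs, based on history.
# Numbers are invented for illustration.

historical_defects_per_kloc = 4.2   # bugs eventually found per 1000 lines, past releases
fraction_found_before_ship = 0.85   # how many of those we typically catch pre-release

release_size_kloc = 120

expected_total = historical_defects_per_kloc * release_size_kloc
expected_latent = expected_total * (1 - fraction_found_before_ship)

print(round(expected_latent))  # ~76 bugs we're implicitly accepting
```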

Quite frankly, it seems to me that our industry doesn’t want to stare Acceptance in the face for too long in case we see the murder in its eyes. Just sayin’. I think we accept things because we don’t want to (or can’t) put the time and effort into making a more scientific assessment. I’ll need to ponder that a bit more. What do you think?

Mitigation, you’re up. Mitigation means you put controls in place. You install an alarm or put a fire extinguisher in the kitchen or hang a Keep Out sign on the gate. What are mitigations in a software context? I think of mitigation in two layers. At the inner layer, we might look at an individual bug or a family of bugs and think about how we might prevent that bug from happening. That’s not the same as simply removing (fixing) the bug. For example, we might choose a microservices architecture in order to isolate failures: if the REST API encounters an error, the rest of the system can keep going.
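
Here’s a sketch of that inner-layer mitigation: the call that might fail is fenced off, so a failure degrades one feature instead of taking down the whole request. The service name and fallback are hypothetical.

```python
# Mitigation by isolation: a failing dependency degrades one feature
# instead of crashing the whole request. (Service and fallback are hypothetical.)

import urllib.error
import urllib.request

def fetch_recommendations(user_id):
    """Call the (hypothetical) recommendations service; fall back to a default."""
    url = f"https://recs.internal.example/api/v1/users/{user_id}"
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.read()
    except (urllib.error.URLError, TimeoutError):
        # The bug (or outage) is contained: the rest of the page still renders.
        return b"[]"  # empty recommendations, not a 500 for the whole request

print(fetch_recommendations(42))
```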

At the higher, more abstract layer, we can build mitigation into our processes: test-driven development, a fucktonne of automated tests, static analysis, pair programming. (If you say “code reviews” I’ll send you back to 1974.)
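
And a token example of that outer layer: the kind of cheap automated test that catches a regression long before it becomes a risk decision at all. The function under test is a stand-in for your real code.

```python
# Process-level mitigation: small automated tests (pytest style).
# The function under test is hypothetical.

def sanitize_username(raw):
    """Strip whitespace and lowercase a username before storing it."""
    return raw.strip().lower()

def test_sanitize_username_handles_messy_input():
    assert sanitize_username("  JDTangney \n") == "jdtangney"

def test_sanitize_username_is_idempotent():
    assert sanitize_username(sanitize_username("jdt")) == "jdt"
```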

Transference is the last one. This is what the late Douglas Adams called an SEP Field. (I met him a few times at MacWorld a hundred years ago. He was very tall, and generally a nice person to be around.) You make the problem “go away” by declaring it to be Someone Else’s Problem. I can transfer risk to someone else. I can buy insurance so that I am no longer on the hook if things go bad — I’ve transferred the risk to the insurance company. What does that mean in the software context?

Damned if I know. LMK if you figure it out.
