Want to improve quality and security of machine learning? Design it better

Taylor Armerding
Feb 21

When it comes to analog life in general, “Don’t worry, be happy” might be a good motto. But when it comes to our massive, sprawling, complex digital world, worrying a bit more might be a more effective path to happiness.

Especially when it comes to machine learning (ML).

ML is already embedded in modern society in numerous ways but is still rapidly expanding in both power and risks. So if those creating ML systems worried more about the risks, they might actually do something to address them. Which would give us all reason to be happier — and safer.

That is what motivated the relatively new Berryville Institute of Machine Learning (BIML) to describe a few things to worry about regarding ML.

More than a few, actually. Its recent report, “An Architectural Risk Analysis of Machine Learning Systems,” offers 78 of them, conveniently grouped into nine categories that the authors say are based on their “understanding of basic ML system components and their interactions.”

The way to address those risks, the report said, is with rigorous ARA — architectural risk analysis — a well-established method of reducing bugs and defects in applications, systems and networks built with software.

Knowing possible risks makes it easier to design a system or product that avoids them. Indeed, an estimated half of the software defects that create security problems are due to flaws in design.

ML can demonstrably do many good things faster and better than humans — detect financial fraud, diagnose diseases, make airline security screening faster and more effective, translate documents, give us better online searches and make smart homes and vehicles ever smarter.

Dumb to dangerous

But it can also do things that range from irritating to dangerous. An ML algorithm that keeps serving you shoe ads on every website you visit, right after you bought a pair and won’t want another for a while, seems pretty stupid but is mainly just a nuisance.

But if an ML system is improperly trained, through accidental or malicious design flaws or through vulnerabilities that let attackers tamper with its functioning, it can create serious problems or danger — everything from discrimination to intrusive surveillance to catastrophic accidents.

According to a report just this past week, a couple of security researchers fooled an autonomous vehicle camera system into thinking a 35-mph speed limit sign actually read 85 mph, simply by putting some tape on it.

Then there was Microsoft’s “Tay” chatbot, launched on Twitter in 2016 with the idea that it would learn from having “conversations” with others. It did learn, from snarky users who turned it into a misogynistic, racist xenophobe in less than 24 hours. Which was both amusing and disturbing.

Those and other weaknesses of ML systems, say the BIML authors, mean that “an architectural risk analysis (ARA) is sorely needed at this stage.”

Gary McGraw, software security expert, cofounder of BIML and one of four coauthors of the report, said that while ML has been in use for decades, “we’re still doing the same stuff as in the late ’80s, except computers are much more powerful and there’s much more data. And we’re not doing much to secure it.”

ARA, the report said, can help create ML systems that avoid those mistakes. It “takes a design-level view of a system and teases out systemic risks so that those risks can be properly mitigated and managed as a system is created.”

The report imagines a “generic” ML system with nine components that include both “processes” and “things or collections of things”: raw data, dataset assembly, datasets, learning algorithms, evaluation, inputs, trained model, inference algorithm, and outputs.
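One way to keep that nine-part model in front of a design review is simply to write it down as data. The sketch below is an illustration only: the component names come from the list above, but the `Kind` labels (which components count as processes and which as things) are this sketch's own reading of the names, not a classification quoted from the report.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Kind(Enum):
    PROCESS = "process"
    THING = "thing or collection of things"


@dataclass(frozen=True)
class Component:
    number: int
    name: str
    kind: Kind  # assumption: inferred from the component name, not quoted from the report


# Names and order follow the article's list of nine components.
COMPONENTS = [
    Component(1, "raw data", Kind.THING),
    Component(2, "dataset assembly", Kind.PROCESS),
    Component(3, "datasets", Kind.THING),
    Component(4, "learning algorithms", Kind.PROCESS),
    Component(5, "evaluation", Kind.PROCESS),
    Component(6, "inputs", Kind.THING),
    Component(7, "trained model", Kind.THING),
    Component(8, "inference algorithm", Kind.PROCESS),
    Component(9, "outputs", Kind.THING),
]


def names_of(kind: Kind) -> List[str]:
    """Return the component names of one kind, in pipeline order."""
    return [c.name for c in COMPONENTS if c.kind is kind]


print(names_of(Kind.PROCESS))
```

A team could hang its own risk notes off each entry, which is roughly the bookkeeping an ARA asks for.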

As is obvious, ML focuses on, and depends on, data, more data and still more data. That is how it “learns.”

It’s all about the data

So “building security in” to an ML system is philosophically similar to building secure software: an application built with rigorous ARA from the start is much less likely to have bugs and defects. With ML, however, the emphasis is on the risk that the data is flawed at the start, or can be corrupted or exploited.

Questions about that data include: Where did it come from? Who had control of it before an ML team decided to use it? Is there enough of it? Is it of good quality? Is it vulnerable to manipulation? Can the system continue to learn from more data after it has memorized one data set? Is it secure? Is it legal (under the various data collection and privacy laws) to possess it?
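Several of those questions (where the data came from, who controlled it, whether it has been manipulated) can at least be written down and partly checked by machine. Here is a minimal sketch, assuming a team records provenance next to a SHA-256 fingerprint taken when it accepts a data set. The `DatasetRecord` fields and the sample values are hypothetical, and a fingerprint only reveals changes made after registration; it says nothing about data that was already flawed or poisoned when it arrived.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    name: str
    source: str       # where did it come from?
    custodian: str    # who controlled it before the ML team used it?
    legal_basis: str  # why is it legal to possess it?
    sha256: str       # fingerprint taken when the data was accepted


def fingerprint(data: bytes) -> str:
    """Hex SHA-256 digest of the raw data set bytes."""
    return hashlib.sha256(data).hexdigest()


def register(name: str, source: str, custodian: str, legal_basis: str, data: bytes) -> DatasetRecord:
    """Record provenance answers together with the data's fingerprint."""
    return DatasetRecord(name, source, custodian, legal_basis, fingerprint(data))


def is_unmodified(record: DatasetRecord, data: bytes) -> bool:
    """True if the data still matches the fingerprint taken at registration."""
    return record.sha256 == fingerprint(data)


# Hypothetical sample data and provenance values.
original = b"age,label\n34,0\n51,1\n"
record = register("demo-set", "hypothetical vendor export",
                  "hypothetical data team", "hypothetical contract", original)
tampered = original.replace(b"51,1", b"51,0")  # one flipped label

print(is_unmodified(record, original))  # True
print(is_unmodified(record, tampered))  # False
```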


Those and other questions lead to BIML’s list of 78 risks, but the authors start with the Top 10:

Adversarial examples: Probably the most familiar attack, in which the goal is to fool an ML system “by providing malicious input, often involving very small perturbations that cause the system to make a false prediction or categorization.”

Data poisoning: Pretty much what it sounds like — manipulating data to make the system learn the “wrong” things.

Online system manipulation: An “online” ML system is one that continues to learn while it is being used. So an attacker who can feed it malicious input can train it to do the wrong thing.

Transfer learning attack: This refers to ML systems that are built on a so-called “base model” that is already trained and then modified for a specific task. An attacker who can compromise the base model can then “transfer” that learning to make the system on top of it misbehave.

Data confidentiality: Protecting data is a fundamental duty of all companies that collect and use it. But ML makes that even more difficult. “Subtle but effective extraction attacks against an ML system’s data are an important category of risk,” the report said.

Data trustworthiness: This refers to the risk of not knowing where data came from or whether its integrity has been preserved. “Data-borne risks are particularly hairy when it comes to public data sources (which might be manipulated or poisoned) and online models,” the report said.

Reproducibility: Given the “inherent inscrutability” of ML systems, “ML system results are often under-reported, poorly described, and otherwise impossible to reproduce. When a system can’t be reproduced and nobody notices, bad things can happen,” the report said. Sort of like a scientific paper whose results nobody can replicate.

Overfitting: This describes the risk that an ML system memorizes its training data set so closely that it fails to generalize to, or adjust its learning from, new data.

Encoding integrity: This risk exists because of the reality that an engineering group (of humans) filters, processes and encodes data before it is used in an ML system. “Encoding integrity issues can bias a model in interesting and disturbing ways,” the report said.

Output integrity: This is the risk that an attacker can get between an ML system and the world and then skew its output. “The inscrutability of ML operations (that is, not really understanding how they do what they do) may make an output integrity attack that much easier, since an anomaly may be harder to spot,” the report said.
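To make the first item on that list concrete, here is a toy sketch of an adversarial example against a hand-picked linear classifier. The weights, bias and input are hypothetical, and the perturbation rule (nudge every feature by a small `epsilon` in whichever direction raises the score) is just one simple way to build such an input, not the method the report describes.

```python
from typing import List


def score(weights: List[float], bias: float, x: List[float]) -> float:
    """Linear decision score: positive means class 1."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights: List[float], bias: float, x: List[float]) -> int:
    return 1 if score(weights, bias, x) > 0 else 0


def sign(value: float) -> int:
    return (value > 0) - (value < 0)


def perturb(weights: List[float], x: List[float], epsilon: float) -> List[float]:
    """Shift every feature by at most epsilon in the direction that raises the score."""
    return [xi + epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical, hand-picked model and input.
weights, bias = [2.0, -3.0, 1.0], -0.5
x = [0.2, 0.1, 0.3]
x_adv = perturb(weights, x, epsilon=0.05)

print(predict(weights, bias, x))      # 0
print(predict(weights, bias, x_adv))  # 1
```

No feature moved by more than 0.05, yet the predicted class flipped. That is the same shape of problem as a few strips of tape on a speed limit sign.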

Those are just the most important risks. There are 68 more for development teams to consider when they are getting into the weeds of designing an ML system that will both work as intended and be reasonably secure from attack.
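Of the Top 10, overfitting is probably the easiest to check for: compare how a model does on the data it trained on with how it does on data it has never seen. The toy sketch below uses two hypothetical models and a made-up rule (label 1 when the number is 5 or more). `Memorizer` simply stores its training rows, while `ThresholdRule` learns a cutoff.

```python
from collections import Counter


def true_label(x: int) -> int:
    return 1 if x >= 5 else 0


train = [(x, true_label(x)) for x in (0, 2, 4, 6, 8)]
held_out = [(x, true_label(x)) for x in (1, 3, 5, 7, 9)]


class Memorizer:
    """Stores training rows; falls back to the majority label for unseen inputs."""

    def fit(self, rows):
        self.table = dict(rows)
        self.default = Counter(label for _, label in rows).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.table.get(x, self.default)


class ThresholdRule:
    """Learns a cutoff halfway between the two classes."""

    def fit(self, rows):
        highest_zero = max(x for x, label in rows if label == 0)
        lowest_one = min(x for x, label in rows if label == 1)
        self.threshold = (highest_zero + lowest_one) / 2
        return self

    def predict(self, x):
        return 1 if x >= self.threshold else 0


def accuracy(model, rows) -> float:
    return sum(model.predict(x) == label for x, label in rows) / len(rows)


memorizer = Memorizer().fit(train)
rule = ThresholdRule().fit(train)
print(accuracy(memorizer, train), accuracy(memorizer, held_out))  # 1.0 0.4
print(accuracy(rule, train), accuracy(rule, held_out))            # 1.0 1.0
```

Both models look perfect on the training data. Only the held-out score shows that one of them learned nothing it can reuse.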

Just don’t look to the report for recommended solutions to any of those risks. McGraw said that was not the point. But he said if organizations building ML systems are aware of the risks, “they can design around them.”

Travis Biehn, principal consultant at Synopsys, said an ARA that surfaces the risks to, and possible attack methods against, ML or artificial intelligence (AI) systems can indeed help developers create designs that avoid them.

He said a secure design will yield a system that “meets privacy expectations,” such as not phoning home with telemetry, “doesn’t over-reach on access requests,” such as asking to access all contacts, “doesn’t include hidden features like a camera or microphone, and that appropriately separates data flows, including sanitizing user input before use.”
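As one illustration of “sanitizing user input before use,” here is a minimal sketch of a check that could sit between untrusted input and a model. The function name, the expected length and the allowed range are all hypothetical. Clamping out-of-range values is a design choice made for this sketch; a stricter system might reject them outright.

```python
import math
from typing import List, Sequence


def sanitize_features(raw: Sequence, expected_len: int, low: float, high: float) -> List[float]:
    """Validate untrusted feature values before they reach a model.

    Rejects wrong-sized input, non-numeric values, NaN and infinities,
    then clamps what is left into [low, high].
    """
    if len(raw) != expected_len:
        raise ValueError(f"expected {expected_len} features, got {len(raw)}")
    cleaned = []
    for value in raw:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"non-numeric feature: {value!r}")
        if not math.isfinite(value):
            raise ValueError("feature is NaN or infinite")
        cleaned.append(min(max(float(value), low), high))
    return cleaned


print(sanitize_features([0.5, 2.0, -1.0], expected_len=3, low=0.0, high=1.0))
# [0.5, 1.0, 0.0]
```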

Designing with risks in mind should also yield a system that is secure, has integrity, is transparent and does what it is intended to do.

However, so far, the fascination with what ML can do is apparently much more compelling to those in the industry than the notion of making it secure.

“It’s very hot,” McGraw said. “Everybody wants to use it. But we’re making the same mistakes with it that we made with software.”

Blind optimism

Biehn agreed. “Unfortunately, organizations are still optimistically pursuing AI technologies, not necessarily incorporating standard security practices to address the practical ways that things will go wrong,” he said.

His Synopsys colleague Chandu Ketkar, senior principal consultant, said the hope is that awareness of risks will generate some action. “If ML designers are made aware of how malicious attackers could tweak or fool their learning systems, they would be more inclined to conduct security architecture reviews of the system,” he said.

Which is another way of saying that perhaps the ML motto ought to be, “Worry. Then you’ll be happy.”

The Startup

Medium's largest active publication, followed by +605K people. Follow to join our community.

Written by Taylor Armerding

I’m a security advocate at the Synopsys Software Integrity Group. I write mainly about software security, data security and privacy.
