Machine Learning for Physical Security — an Interview with Stabilitas Data Scientist Dr. Mikhail Zaydman

Chris Hurst
Stabilitas

--

Stabilitas COO Chris Hurst recently interviewed the team’s Lead Data Scientist, Dr. Zaydman.

Dr. Zaydman’s bio is below.

Purpose of the interview: An Intro to ML for Security Professionals

We hope this interview helps security managers, analysts, EP teams, and others understand how Machine Learning (ML) works. After Mark Cuban spoke extensively about AI at this year’s ASIS conference in Dallas (and after seeing a lot of eyes glaze over!), we’re providing our take on how Machine Learning can improve the lives of security professionals.

Chris Hurst: Mikhail, first, what attracted you to the security space?

Mikhail Zaydman: The security space is interesting because it is fundamentally global. You can’t think about security without thinking about global forces. That creates a lot of interesting challenges and a lot of interesting opportunities to understand data trends and to struggle with complex data questions.

For example, how do you gain information across languages, across boundaries, and across types of data? From a data science and a policy perspective, it’s a space to think about big questions, and that’s exciting.

Data science for security also has significant real-world impacts — and those can sometimes be hard to assess. Things that are averted don’t necessarily make a great metric because things that don’t happen are hard to measure. But measuring the unmeasurable and making the undetectable detectable is a cool challenge.

CH: Can you give me a concrete example of how Machine Learning helps security managers? Complete the sentence: “In the world before ML, a security manager struggles because ___. But in an ML world, his or her life is different because ___.”

MZ: In the world today a security manager watches 24 hours of CNN and gets a headache. In a world with ML, the security manager sees just three minutes of useful content and can move on with their day. They spend their time on proactive analysis and response, rather than sorting through huge amounts of meaningless information to find the needle in the haystack.

CH: Is there a particular example of a type of incident that you think ML will help unearth for that security manager?

MZ: In an ideal scenario the answer to that question is “no”! The goal of ML is to unearth every relevant incident — and nothing else.

The way that we have our system set up, you would have the advantage of collapsing a lot of similar reports, so that you are not forced to cycle through 300 stories about, say, civil unrest in Catalonia. They would all be in one package that you can dig into if you’re so inclined.

While civil unrest is happening in Catalonia, if there is a knife attack in Israel, that still pops up, and it’s still given the correct severity. The security manager can take their time reviewing the Catalonia civil unrest and then dig into the knife attack in Israel because that’s what affects them the most. The goal is to have all relevant incidents available for consumption by the security manager — in real time.

CH: Mikhail, many of our customers have been in the military for maybe their whole career — or police, FBI, State Department, or diplomatic security service. It’s rare to have significant exposure in those fields to A.I. or data science. How would you explain A.I. and ML to a security manager?

MZ: The simple way to explain it is: “Have you run a search query on a set of data? Have you Googled the price of a ticket? Have you looked at a document and searched it with control+F for the term you care about?”

All that we are really doing is providing a simple way for the data stream to be filtered to the interests of the Security Manager or Analyst. At the core we are just one very fancy control+F function on a very big document. The only difference is that, rather than you running control+F over a large set of different words and scanning for matches, we do all of that for you.

In developing our training data sets, we’re ensuring we understand the terms and concepts you might search for, and we apply those tools to the large stream of data. We surface those things that you would normally look for. Just like you consume information today, with ML, the information is parsed for you in the way that you would want it to be parsed if you did it yourself.

CH: So last week a security analyst asked, “Is Stabilitas just using keywords to send alerts to me? When the word ‘bomb’ shows up you send the article to me — is that right?” How would you respond?

MZ: No, it’s not right, because if we were to do that [run a keyword search on the word “bomb”], you would get a lot of the wrong type of articles. Maybe you’d get articles from the 1990s saying “that movie was the bomb.”

A common example I encountered in my dissertation work was keyword searches on “depression.” A keyword search would surface articles on economic depression, or depressions in the ground in somebody’s backyard. I needed updates on depression related to mental health — and nothing else.

Words mean different things in different contexts. That’s why keywords are not sufficient for the task at hand. Machine learning accounts for context.
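Dr. Zaydman’s “depression” example can be sketched in a few lines of Python. A bare keyword match fires on every sense of the word, while even a crude context check (here, a handful of co-occurring mental-health terms, a hypothetical stand-in for a trained model) keeps only the intended sense. The article snippets and context vocabulary below are invented for illustration.

```python
# A bare keyword filter versus a crude context-aware filter.
# The articles and context words are illustrative only.

ARTICLES = [
    "New therapy shows promise for patients with severe depression.",
    "Economists warn the downturn could become a full depression.",
    "A depression in the backyard collected rainwater after the storm.",
]

MENTAL_HEALTH_CONTEXT = {"therapy", "patients", "treatment", "anxiety", "clinical"}

def keyword_match(text):
    """Fires on any article containing the word 'depression'."""
    return "depression" in text.lower()

def context_match(text):
    """Fires only when 'depression' co-occurs with mental-health vocabulary."""
    words = set(text.lower().replace(".", "").split())
    return "depression" in words and bool(words & MENTAL_HEALTH_CONTEXT)

keyword_hits = [a for a in ARTICLES if keyword_match(a)]
context_hits = [a for a in ARTICLES if context_match(a)]

print(len(keyword_hits))  # 3: every sense of the word matches
print(len(context_hits))  # 1: only the mental-health article survives
```

A real system would learn those context associations from labeled examples rather than from a hand-written word list, but the filtering effect is the same.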

CH: So let’s jump deeper into the tech. What problems are you working on solving in your first few months at Stabilitas?

MZ: Right now I have been focused on making sure that we have an organized and structured way to think about all our data questions and our data approaches. To that end I’ve spent a lot of energy driving us toward very clear definitions — of relevance to security teams and their firms, of the severity of an incident, of the type of incident (geopolitical versus terror, for example) — definitions that can be easily understood by humans, so that we can hopefully get machines to replicate that human understanding.

Being able to very clearly delineate what is important information and how best to categorize it is a foundational step of any real data work. A machine is not magical; it can only emulate what it is given. So if you are not sure what you’re giving it, it’s going to do a very bad job.

Editor’s note for clarity: Here, Dr. Zaydman is describing the challenge that humans may disagree on the relevance of a particular incident. For example, an executive protection firm overseeing the security of a few individuals may need to know if a particular street is closed when a water main breaks. That street may be the designated route to a hospital, or a critical meeting. Meanwhile, a 3-person corporate security team overseeing the security of thousands doesn’t have the bandwidth to think through the implications of street closures. Without human consensus on relevance or severity, a computer can’t be trained on a model. We’ve had to address this by building consensus internally on our team regarding relevance.

MZ, continuing: Like a little kid, if you don’t give the machine clear instructions and clear parameters, it kind of loses its mind.

To that end we’ve built up a whole protocol — the use of tight definitions to be sufficiently clear and formal so that then we can have people actually provide replicable training data for the machine.

And then all of this is leading towards the core problem we are trying to solve. That is, how do you turn a large volume of unstructured human text into meaningful information for our customers?

Editor’s note for clarity: Here, the “large volume of unstructured human text” is broadly described as “all open source information” — or rather, all digital news and social media.

MZ, continuing: And so at the moment we have four dimensions. First, we have to assess whether an article is useful. Second, we have to assess what type of article it is — whether it is war, terror, a public health crisis, etc. Third, we have to assess how severe it is. Fourth, we have to determine where in the world it is happening, just from words.

And so by being able to clearly delineate what each of these things means to us and getting it so clear that humans can replicate it consistently, we are refining our tools to answer these questions with machines at scale and very rapidly.

Based on our current work, we are seeing good performance with traditional Machine Learning approaches. We’re also experimenting and pushing further into deep learning where we teach the machines to abstract patterns in data that a human can’t recognize to make a classification more effective.
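As a toy illustration of the “traditional ML” approach mentioned above, the sketch below trains a tiny bag-of-words Naive Bayes classifier to guess an incident type from text. The categories, training sentences, and test sentences are invented for the example; they are not Stabilitas’s actual data or model.

```python
import math
from collections import Counter, defaultdict

# Toy labeled training data, invented for illustration.
TRAIN = [
    ("protesters blocked roads during a general strike", "civil_unrest"),
    ("demonstrators clashed with police at a rally", "civil_unrest"),
    ("a bomb exploded near the market killing several", "terrorism"),
    ("attackers stabbed commuters at the station", "terrorism"),
    ("officials confirmed new cases in the cholera outbreak", "public_health"),
    ("hospitals struggled as the flu epidemic spread", "public_health"),
]

def train(examples):
    """Count word frequencies per category (multinomial Naive Bayes)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        words = text.split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def classify(text, word_counts, label_counts, vocab):
    """Pick the category with the highest smoothed log-probability."""
    total = sum(label_counts.values())
    best, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.split():
            # Laplace smoothing so unseen words don't zero out a category.
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

model = train(TRAIN)
print(classify("police fired tear gas at protesters", *model))   # civil_unrest
print(classify("a suicide bomb attack wounded dozens", *model))  # terrorism
```

The classifier never sees a keyword list; it learns word-to-category associations from the labeled examples, which is why the quality and consistency of the human-labeled training data Dr. Zaydman describes matters so much.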

When built correctly, tech can help security analysts do their jobs better: an exoskeleton, not a replacement. Image source: http://crunchwear.com/

CH: Let’s close with this: you once mentioned that our system is kind of like an exoskeleton for Security Managers. Can you say what you mean by that?

MZ: The goal is to not change the way in which the Security Manager or Analyst is producing value. The Security Manager wants to have information, make decisions on that information, and ensure the safety and security of those that are their responsibility. We want the Security Manager to continue to do that.

It’s just that with our system, the security manager has a mechanical suit — an exoskeleton — which allows them to tear through enormous volumes of data. Whereas before security managers could follow CNN and a few other sources, now they have the capacity to effectively see the results of thousands of online sources from around the world, across different languages. Before, one person could only look at a few places and make decisions; now they get information out of thousands of sources and can make decisions based on those sources in a fraction of the time. It’s much, much stronger.

CH: Thank you, Mikhail!

For more information about AI, physical security, risk management, or all-in-one incident detection and crisis communications technology, please reach out to us at info@stabilitas.io or (202) 683–7760. Or schedule a demo at https://stabilitas.io/.

Feedback for Future Posts

What did we miss? What questions do you want to see answered? Let us know at info@stabilitas.io.

Dr. Zaydman’s bio

Dr. Zaydman has a PhD from Pardee RAND Graduate School. In his dissertation, he applied machine learning techniques to social media data to understand popular attitudes towards mental health and mental health treatment. Other work included econometric analysis of health care educational standards and integration of digital technologies into the health care sector. He has published on topics of national security and policing and justice. Dr. Zaydman has a Bachelor’s from Georgetown and prior data science work at the World Bank and Geico insurance.

About Stabilitas

Stabilitas was founded in 2013 by US Army veterans and Harvard Business School and Kennedy School of Government alums. They set out to address security problems and shortcomings they found during their service. “We had used the military’s intelligence tools, but we didn’t think they were forward looking enough. Something was missing.” Aiming to improve incident detection, data visualization, and crisis communications, they began work on a tech platform for security professionals. Stabilitas customers range from small NGOs to global companies. Stabilitas is a 2015 Techstars alum, and was awarded a National Science Foundation SBIR grant, as well as the “Security’s Best” award by ASIS International in 2015.

Published in Stabilitas: Stabilitas keeps people safe and organizations productive by creating technology to solve challenging security problems.

Written by Chris Hurst: Entrepreneur: Tech for Physical Security; Army Vet; Humanitarian Risk Manager. Spent my 20s and early 30s working abroad. Loves Tech, Econ, Policy, Development