Totally flirting for your password…

What does a world with automated social engineering look like?

And how should that change the way we approach security and disclosure?

The technology now exists to build tools that gather public information on people and spear-phish them automatically, at scale, or to build systems that place calls using realistic forged voices to impersonate someone. These new attack capabilities are made possible by modern AI, and they may have significant implications for how we should approach security disclosure.

So what exactly is new?

Two examples: advances in AI enable conversation and impersonation

  • We can make text bots that far more realistically imitate a conversation with a human.
  • We can make human-quality speech from text. We can even imitate a particular person’s voice extremely well.

These capabilities can be combined to automatically create believable conversations with seemingly real people.

How do we currently think about security and vulnerability disclosure?

Let’s start with the basics. Security through obscurity is clearly bad. But that doesn’t mean we should spill all the goods right away.

The CERT disclosure framing asks us to “minimize the harm to society posed by vulnerable products”, and we need to understand what that means when the “product” is a non-patchable, non-replaceable human.

There are two basic models for disclosure:

  • Full disclosure: just publish it all, ideally including a proof of concept. This gets things fixed faster and helps avoid repeating the problem in future products.
  • Coordinated disclosure: security researchers work with vendors to minimize the total harm from a vulnerability, though of course this only makes sense if vendors are cooperative and responsive.

Fundamentally, coordinated disclosure works through delay. It gives good actors time to fix problems before bad actors can exploit them.

What this means for AI advances

The increasing ability to automate conversation and impersonation enables social engineering at scale. When you automate social engineering, you’re actually writing code that hacks humans, not computers.

Who is the vendor here? God? Evolution? Either way, it’s not particularly cooperative or responsive! Patching is rather difficult!

So clearly we should do full disclosure, right?

Not so fast. Think back to the underlying goal — the role of disclosure is to minimize harm from a vulnerability — by enabling good actors to mitigate the exploit, and potentially delaying bad actors from learning about it. But not only are humans hard to patch — you also can’t “replace” them with new models. There are significant limitations on what you can do to improve future security. This is related to the “infohazard” problem.

So perhaps just releasing all the goods — the full “exploit” code — doesn’t actually help with our goals. As an analogy, consider what I call the “easy nuke scenario.”

Say you discover a way to create nuclear weapons with supplies found in a standard kitchen.

Maybe it’s not the best thing to just release instructions for making the nukes on GitHub? Physics isn’t patchable either…

So what do you do?

Are there alternate approaches for sharing an exploit if your ultimate goal is to protect people from attacks?

There are no easy answers here. To really get into this, we have to break down the threat models and capabilities, and perhaps even consider which time horizons we are prioritizing (especially given that approaches for mitigation run the gamut from education to downstream security fixes). I aim to do that in a follow-up or expanded piece, but here are two potential approaches to consider in lieu of the standard approach of “open sourced” disclosure:

  • “Ask for source”: someone would need to reach out to the exploit author, or some designee, to get the proof-of-concept code, perhaps undergoing some vetting process first. This sometimes exists today on an ad hoc basis, but trusted institutions could be adapted to manage access, with vetting perhaps analogous to biosafety levels in biology.
  • “POC as a service”: the author or their designee would maintain control of the POC (proof of concept) code, and if someone wanted a demonstration, the POC code would be run from the author’s or designee’s servers. This allows use of the POC code to be permissioned and/or logged, hopefully preventing malicious use.
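
To make the “permissioned and/or logged” idea a bit more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any existing system: the names PocService, vet, request_demo, and harmless_demo are all hypothetical, and the “POC” here is a harmless stand-in function. The point is only that the code stays on the maintainer’s side, every request is recorded, and only vetted requesters get a demonstration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Set


@dataclass
class AuditEntry:
    """One record per demonstration request, whether allowed or denied."""
    requester: str
    allowed: bool
    timestamp: str


@dataclass
class PocService:
    """Hypothetical gateway: the POC only ever runs on the maintainer's side."""
    run_demo: Callable[[str], str]  # stand-in for the real POC (assumption)
    vetted: Set[str] = field(default_factory=set)
    audit_log: List[AuditEntry] = field(default_factory=list)

    def vet(self, requester: str) -> None:
        # In practice vetting would be a human or institutional process;
        # here it is reduced to adding a name to an allow-list.
        self.vetted.add(requester)

    def request_demo(self, requester: str, target: str) -> str:
        allowed = requester in self.vetted
        # Log every attempt, including denied ones, before doing anything else.
        self.audit_log.append(
            AuditEntry(requester, allowed, datetime.now(timezone.utc).isoformat())
        )
        if not allowed:
            raise PermissionError(f"{requester} has not been vetted")
        return self.run_demo(target)


def harmless_demo(target: str) -> str:
    # Placeholder for the demonstration itself; it does nothing real.
    return f"demo run against consenting test target {target!r}"


if __name__ == "__main__":
    service = PocService(run_demo=harmless_demo)
    service.vet("alice@example.org")
    print(service.request_demo("alice@example.org", "lab-1"))
    try:
        service.request_demo("mallory@example.org", "lab-1")
    except PermissionError as err:
        print("denied:", err)
    print([(e.requester, e.allowed) for e in service.audit_log])
```

The same allow-list could back an “ask for source” process, with the difference that a vetted requester would receive the code itself rather than a demonstration run on someone else’s servers.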

So how could this apply to our original examples?

  • What if a POC for an “automated NLP big data spear phishing system” used ask-for-source?
  • What if a “Google Duplex” for social engineering used proof-of-concept-as-a-service?

Other approaches — which are not mutually exclusive — include having an “incomplete POC” — with bugs or missing pieces that require the most specialized knowledge to resolve, or just delaying release or conditioning release (e.g. on seeing exploits in the wild).

There is also the potential to take advantage of the machine learning component of the “exploits” to weaken them, by providing limited training data or weakened models (e.g. trained for less time, less tuned, or with less data).

These approaches might apply not just to security researchers, but also to machine learning researchers who happen to (likely unintentionally) develop tools that can be used for social engineering attacks. Of course, none of these are novel; they are simply ingredients that can be mixed and applied depending on the particular exploit and threat model. The goal here is not to be prescriptive, but to at least get some gears spinning in the heads of the security researchers who are primarily focused on the (significant!) benefits of disclosure.

I began this piece with the question “What does a world with automated social engineering look like, and should that change the way we approach security and disclosure?” While I won’t prescribe what exactly should be done, I can say that we should definitely be asking these questions, exploring options to see what actually achieves our security goals, and taking steps to ensure worst-case scenarios don’t come to pass.

We need to figure out how to handle exploits that hack humans, not computers, given that God seems to ignore all emails to his security@ email alias.


This is an edited and slightly expanded version of my DEFCON AI Village recorded exhibit talk from August 2018, though only a stepping stone toward a more comprehensive framework. I intend to update it through further feedback and conversations with people in the security and machine learning community.

Please reach out if you have questions, answers, thoughts, rants, or would like to collaborate — you can find me on the open web at av@ aviv.me, or on Twitter @metaviv (and this mailing list).