How can we avoid adversarial machine learning in public policy?

Faraz Ahmed
Published in BuzzRobot
Jan 6, 2018 · 4 min read

Machine learning and artificial intelligence have garnered a lot of attention in recent times. They have been praised for enabling advances such as self-driving cars and image-recognition technology, and, at the same time, criticized for unethically collecting citizens' personal data and for making biased judgements against black prisoners, to name just two examples. Recent advances in machine learning have also made it possible for attackers to deliberately cause systems to make mistakes. This is called "adversarial" machine learning, and it can have a huge negative impact on society.

A lot has been written about what adversarial machine learning is, how to detect it, and how to guard a system against adversarial attacks. In this short blog post, I will talk about how adversarial machine learning harms the social good, and what we can do to prevent that from happening.

Recently, when the Federal Communications Commission (FCC) invited the public to comment on whether net neutrality should be repealed, the New York State Attorney General's Office found that many of the comments submitted were fake. In most cases, those fake comments used the identities of real people who wanted net neutrality protected. Prompted by that investigation, data scientist Jeff Kao used machine learning to estimate how many of the comments were actually fake. He found that, out of the 22+ million comments, more than a million were fake, generated by spambots.

Kao clustered the comments and looked for pairs of words that were repeated across them. The spambots had generated the fake comments by recombining the same word pairs into superficially different sentences. With more than 22 million comments, a quick glance cannot tell which comment is authentic and which is not. Generating fake comments with machine learning is easy; detecting them with machine learning is hard.
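To make that concrete, here is a minimal sketch, in Python, of how repeated word pairs can give templated comments away. This is not Kao's actual pipeline; the sample comments, the bigram representation, and the similarity cutoff are all invented for illustration.

```python
from itertools import combinations

def bigrams(text):
    """Return the set of lowercased adjacent word pairs in a comment."""
    words = text.lower().split()
    return set(zip(words, words[1:]))

def jaccard(a, b):
    """Overlap between two bigram sets: 0 means nothing shared, 1 means identical."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Toy comments invented for illustration; the first two follow the same template.
comments = [
    "the heavy handed rules imposed on the internet are smothering innovation and hurting consumers",
    "the heavy handed rules imposed on the internet are damaging investment and hurting consumers",
    "as a small business owner i rely on an open internet and oppose repealing net neutrality",
]

# Flag comment pairs that share most of their word pairs,
# a hint that they were stamped out of the same template.
THRESHOLD = 0.5  # illustrative cutoff, not tuned on real data
grams = [bigrams(c) for c in comments]
for (i, gi), (j, gj) in combinations(enumerate(grams), 2):
    score = jaccard(gi, gj)
    if score >= THRESHOLD:
        print(f"comments {i} and {j} look templated (bigram overlap {score:.2f})")
```

At the scale of 22 million comments one would use approximate methods, such as hashing or clustering the bigram sets, rather than comparing every pair, but the underlying signal is the same: machine-generated comments reuse the same building blocks.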

Had no one investigated whether the net neutrality comments were authentic, the consequences could have been disastrous. By drowning out the voices of people who genuinely did not want net neutrality repealed, adversarial attacks can disrupt free speech and the democratic process.

Even when mistakes are not engineered on purpose by a hacker or a spambot, they can still sneak into a system inadvertently. In machine learning, this often takes the form of "bias". When that happens, the system or the algorithm can make a more harmful decision about a person than they deserve. Bias can arise when a minority population is underrepresented in the training data, when humans encode their own biases into the data, and when features in the data are closely correlated with one another; the sketch below illustrates the first of these causes.
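As a rough illustration of underrepresentation, here is a small synthetic sketch in Python (using scikit-learn). The groups, cutoffs, and sample sizes are all made up; the point is only that a model trained on data dominated by one group can be far less accurate for the group it rarely saw.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def simulate_group(n, cutoff):
    """Simulate one group: a single feature, and a label that flips at `cutoff`."""
    x = rng.normal(loc=cutoff, scale=1.0, size=(n, 1))
    y = (x[:, 0] + rng.normal(scale=0.3, size=n) > cutoff).astype(int)
    return x, y

# The majority group dominates the training set; the minority barely appears.
X_maj, y_maj = simulate_group(5000, cutoff=0.0)
X_min, y_min = simulate_group(250, cutoff=2.0)
model = LogisticRegression().fit(np.vstack([X_maj, X_min]),
                                 np.concatenate([y_maj, y_min]))

# Test on fresh samples from each group: accuracy drops sharply for the group
# the model rarely saw, even though both groups are equally predictable.
for name, cutoff in [("majority", 0.0), ("minority", 2.0)]:
    X_test, y_test = simulate_group(2000, cutoff)
    print(f"{name} accuracy: {model.score(X_test, y_test):.2f}")
```

A fairer setup would represent the two groups in proportion to how the system will actually be used, and would report accuracy per group before deployment rather than a single overall number.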

One notable example of bias was uncovered recently by the investigative news organization ProPublica. They claimed that COMPAS, an algorithm that informs bail and sentencing decisions by scoring how likely a prisoner is to reoffend, was biased against black offenders. By analyzing the risk scores of white and black offenders, they found that black offenders were nearly twice as likely as white offenders to be labeled as future criminals without actually going on to reoffend. Even when the effects of race, age and gender were isolated in ProPublica's statistical analysis, black offenders were still more likely to receive high risk scores. In real life, if a sentencing judge blindly trusts COMPAS, a black offender is very likely to spend more time in prison than a white counterpart.

COMPAS scored black offenders as more likely to reoffend than white offenders. Source: ProPublica

Cases of adversarial attacks and biased algorithms are likely to keep recurring. People, whether they want to work in artificial intelligence, use devices powered by it, or are simply curious about technology's impact on society, should be more skeptical of technology. They can act on that skepticism in two ways. The first is to consider the ethics and privacy issues of every emerging technology powered by artificial intelligence. Is the technology collecting data on human beings? If so, who controls that data? Is that information being shared with other people or organizations? One can then make purchasing decisions accordingly.

The second is to make data systems less susceptible to human biases. Of the three causes of bias I mentioned, two are human-related. Engineers and scientists who design these systems should think about how to fairly represent people of different religions, races, age groups and so on, and how to build more objective (and widely accepted) standards for what counts as right or wrong, so that the "ground truth" labels in the data are themselves sound. When people start to think seriously about how technology can impact society, the real rewards can be reaped.


Faraz Ahmed
Technology and politics are two of my favorite things.