Results of the NIPS Adversarial Vision Challenge 2018

The winners of the NIPS Adversarial Vision Challenge 2018 have been determined. Overall, more than 400 participants submitted more than 3000 models and attacks. This year the competition focused on real-world scenarios in which attacks have only low-volume query access to models (up to 1000 queries per sample). The models returned only their final decision, neither gradients nor confidence scores. This mimics a typical threat scenario for deployed machine learning systems and was meant to spur the development of efficient decision-based attacks as well as more robust models.

The completed model track on the CrowdAI platform.

All winners perform at least an order of magnitude better (in terms of median L2 perturbation size) than standard baselines (like transfer from vanilla models or the vanilla Boundary attack). We asked the top-3 entries in each track (defenses, untargeted attacks, targeted attacks) for an outline of their approach, and you can find their answers below. The winners will present their methods at the NIPS Competition workshop on the 7th of December from 9:15–10:30.

A common theme of the winning entries in the attack tracks is low-frequency versions of the Boundary Attack and the use of ensembles of different defenses as substitute models. In the model track, the winning entries use a new formulation of robust models (I guess we have to wait for the details to be announced at the workshop) and a new gradient-based iterative L2 attack for adversarial training. We will publish another post in the coming weeks with more detailed results, including visualisations of the adversarial examples generated against the defended models. The code of the winning entries will be released in the same time frame.

I’d like to give a shout out to all the great people that helped to initiate and run this competition. First and foremost, Sharada Mohanty, Florian Laurent, Anhad Jai Singh, Marcel Salathé and everyone else at CrowdAI have worked countless hours to keep the complex orchestration backend running. Jonas Rauber and Behar Veliqi helped with many technical issues that arose for participants throughout the challenge. Finally, Alexey Kurakin, Nicolas Papernot and Matthias Bethge helped with the setup and the promotion of the competition.


Defenses

1st place: Petuum-CMU (91YXLT in the competition)
Yaodong Yu*, Hongyang Zhang*, Susu Xu, Hongbao Zhang, Pengtao Xie and Eric P. Xing (*: equal contribution), Petuum Inc, Carnegie Mellon University, University of Virginia.

In order to learn deep networks that are robust against adversarial examples, we analyzed the generalization performance of robust models on adversarial examples. Based on our analysis, we proposed novel formulations to learn robust models with generalization and robustness guarantees.

2nd place: Team Wilson
Xuefei Ning, Wenshuo Li, Yu Wang (Tsinghua University, Beijing, China)

We performed adversarial training using mutual learning and distillation, on both black-box and white-box generated adversarial examples.

3rd place: Team LIVIA (JeromeR on the leaderboard)
Jérôme Rony & Luiz Gustavo Hafemann (ETS Montreal, Canada)

We trained a robust model with a new iterative gradient-based L2 attack that we propose (Decoupled Direction and Norm — DDN), that is fast enough to be used during training. In each training step, we find an adversarial example (using DDN) that is close to the decision boundary, and minimize the cross-entropy of this example. There is no change to the model architecture, nor any impact on inference time.
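The training step described above can be sketched on a toy model. The following is a minimal NumPy sketch of the idea, not the team's implementation: an attack that decouples the perturbation's direction (a gradient step) from its norm (which shrinks when the example is already adversarial and grows when it is not), used inside an adversarial training loop for a simple logistic-regression classifier. All function names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def ddn_attack(x, y, w, b, steps=30, gamma=0.1):
    """Sketch of a Decoupled Direction and Norm (DDN)-style attack on a
    logistic-regression model z = w.x + b: take a normalized gradient
    step in the loss direction, then rescale the perturbation to an
    explicit L2 norm eps that shrinks when x + delta is adversarial and
    grows when it is not, so the result ends up near the boundary."""
    delta = np.zeros_like(x)
    eps = 1.0  # current target perturbation norm
    for _ in range(steps):
        z = w @ (x + delta) + b
        adversarial = (z > 0) != (y > 0)          # currently misclassified?
        p = 1.0 / (1.0 + np.exp(-z))              # sigmoid output
        g = (p - (y > 0)) * w                     # d(loss)/d(input)
        # direction step: move along the loss gradient
        delta = delta + gamma * g / (np.linalg.norm(g) + 1e-12)
        # norm step: adjust eps independently of the direction
        eps = eps * (1 - gamma) if adversarial else eps * (1 + gamma)
        n = np.linalg.norm(delta)
        if n > 0:
            delta = delta * min(1.0, eps / n)
    return x + delta

def adversarial_train(X, Y, lr=0.5, epochs=50):
    """Adversarial training sketch: replace each example with a
    near-boundary adversarial example and minimize the cross-entropy on
    it; no change to the model, no impact on inference time."""
    rng = np.random.default_rng(0)
    w, b = rng.normal(size=X.shape[1]), 0.0
    for _ in range(epochs):
        for x, y in zip(X, Y):
            x_adv = ddn_attack(x, y, w, b)
            p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
            w = w - lr * (p - y) * x_adv
            b = b - lr * (p - y)
    return w, b
```

The key design point is that the attack is cheap enough (a few dozen gradient steps) to run on every training example in every epoch, which is what makes it usable for adversarial training.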


Untargeted Attacks

1st place: Team LIVIA (JeromeR on the leaderboard)
Jérôme Rony & Luiz Gustavo Hafemann (ETS Montreal, Canada)

Our attack is based on a collection of surrogate models (including robust models trained with a new attack we propose — Decoupled Direction and Norm — DDN). For each model, we select two directions to attack: the gradient of the cross entropy loss for the original class, and the direction given by running the DDN attack. For each direction, we do a binary search on the norm to find the decision boundary. We take the best attack and refine it with a Boundary attack.
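The binary-search step over the perturbation norm can be sketched as follows, assuming only the decision-based oracle the competition exposed (a hypothetical `is_adversarial` callable that returns whether an input is misclassified):

```python
import numpy as np

def boundary_binary_search(x, direction, is_adversarial, hi=100.0, steps=25):
    """Binary search on the L2 norm along a fixed attack direction:
    find the (approximately) smallest scale at which x + scale * d
    crosses the model's decision boundary. `is_adversarial` is a
    decision-only oracle; `hi` is an assumed upper bound on the scale."""
    d = direction / np.linalg.norm(direction)
    if not is_adversarial(x + hi * d):
        return None  # this direction never crosses the boundary within hi
    lo = 0.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if is_adversarial(x + mid * d):
            hi = mid  # keep the adversarial side of the interval
        else:
            lo = mid
    return x + hi * d  # just past the boundary, hence still adversarial
```

Each of the two directions per surrogate model would be searched this way, and the smallest resulting perturbation then seeds the Boundary-attack refinement.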

2nd place: Team TSAIL (csy530216 on the leaderboard)
Shuyu Cheng & Yinpeng Dong (Tsinghua University, China)

We use a heuristic search algorithm to refine adversarial examples, which shares a similar idea with the Boundary attack. The starting point is found by a BIM attack transferred from the Adversarial Logit Pairing baseline. In each iteration, the random perturbation is sampled from a Gaussian distribution with a diagonal covariance matrix, which is updated using past successful trials to model promising search directions. We restrict the perturbation to the central 40×40×3 region of the 64×64×3 image: we first generate 10×10×3 noise, then resize it to 40×40×3 using bilinear interpolation. Restricting the search space makes the algorithm much more efficient.
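The restricted, low-resolution proposal distribution can be sketched like this. This is a minimal NumPy version of the sampling step only; the fixed `sigma` stands in for the per-coordinate covariance the team adapts from past successful trials:

```python
import numpy as np

def bilinear_resize(img, h, w):
    """Minimal bilinear interpolation of an HxWxC image to hxwxC."""
    H, W, _ = img.shape
    ys = np.linspace(0, H - 1, h)
    xs = np.linspace(0, W - 1, w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def sample_perturbation(rng, sigma=1.0):
    """Draw 10x10x3 Gaussian noise, upsample it bilinearly to 40x40x3,
    and paste it into the central region of a 64x64x3 perturbation,
    leaving the border untouched (the restricted search space)."""
    low = rng.normal(0.0, sigma, size=(10, 10, 3))
    mid = bilinear_resize(low, 40, 40)
    pert = np.zeros((64, 64, 3))
    pert[12:52, 12:52, :] = mid
    return pert
```

Sampling in a 10×10×3 space instead of 64×64×3 cuts the search dimensionality by a factor of about 40, which matters a lot under a 1000-query budget.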

3rd place: Petuum-CMU (91YXLT on the leaderboard)
Yaodong Yu*, Hongyang Zhang*, Susu Xu, Hongbao Zhang, Pengtao Xie and Eric P. Xing (*: equal contribution), Petuum Inc, Carnegie Mellon University, University of Virginia.

We ensembled different robust models and different adversarial attack methods under several distance metrics from Foolbox to generate adversarial perturbations. We then selected the best attack, i.e. the one that minimized the maximal distance when attacking the robust model under the different distance metrics.
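The selection step can be sketched as follows. This simplified sketch keeps the successful candidate with the smallest L2 distance rather than the min-max over several metrics the team describes; `attacks` would be a list of callables (e.g. Foolbox attacks against different surrogate models), and all names are illustrative:

```python
import numpy as np

def best_ensemble_attack(x, attacks, is_adversarial):
    """Run several candidate attacks on x and keep the successful
    perturbation with the smallest distance to the original input.
    `is_adversarial` is the decision-only oracle of the target model;
    attacks may return None on failure."""
    best, best_dist = None, np.inf
    for attack in attacks:
        candidate = attack(x)
        if candidate is None or not is_adversarial(candidate):
            continue  # failed or did not transfer to the target model
        dist = np.linalg.norm(candidate - x)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best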


Targeted Attacks

1st place: Team Petuum-CMU (91YXLT on the leaderboard)
Yaodong Yu*, Hongyang Zhang*, Susu Xu, Hongbao Zhang, Pengtao Xie and Eric P. Xing (*: equal contribution), Petuum Inc, Carnegie Mellon University, University of Virginia.

We ensembled different robust models and different adversarial attack methods from Foolbox to generate adversarial perturbations. We found that the ensemble approach makes our targeted attack more effective against various robust models.

2nd place: Team fortiss (ttbrunner on the leaderboard)
Thomas Brunner, Frederik Diehl & Michael Truong Le, fortiss GmbH

Our attack works similarly to the Boundary Attack, but does not sample from a random normal distribution. Among others, we employ low-frequency patterns that transfer well and are not easily filtered by a defender. We also use the projected gradient of a substitute model as a prior for our sampling. In this way, we combine the best of both worlds (PGD and Boundary Attack) into an attack which is both flexible and sample-efficient.
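A biased proposal step in this spirit can be sketched as follows. This is a sketch under stated assumptions, not the fortiss implementation: the low-frequency component is coarse noise upsampled to full resolution, the prior is a precomputed substitute-model gradient passed in as `surrogate_grad`, and the mixing weight is an illustrative choice:

```python
import numpy as np

def biased_proposal(shape, surrogate_grad, rng, freq=8, grad_weight=0.5):
    """Biased Boundary-Attack proposal: instead of i.i.d. Gaussian
    noise, mix a unit-norm low-frequency random pattern with the
    unit-norm gradient of a substitute model, then renormalize."""
    h, w, c = shape
    # low-frequency component: coarse noise blown up to full resolution
    coarse = rng.normal(size=(freq, freq, c))
    low = np.repeat(np.repeat(coarse, h // freq, axis=0), w // freq, axis=1)
    low = low / np.linalg.norm(low)
    # prior component: direction suggested by the substitute model
    g = surrogate_grad / (np.linalg.norm(surrogate_grad) + 1e-12)
    proposal = (1 - grad_weight) * low + grad_weight * g
    return proposal / np.linalg.norm(proposal)
```

The low-frequency part keeps each query informative even when the substitute's gradient is misleading, while the gradient prior makes the search far more sample-efficient when it is not.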

3rd place: Team LIVIA (JeromeR on the leaderboard)
Jérôme Rony & Luiz Gustavo Hafemann (ETS Montreal, Canada)

Our attack is based on a collection of surrogate models (including robust models trained with a new attack we propose — Decoupled Direction and Norm — DDN). For each model, we select two directions to attack: the gradient of the cross entropy loss for the target class, and the direction given by running the DDN attack. For each direction, we do a binary search on the norm to find the decision boundary. We take the best attack and refine it with a Boundary attack.