Adversarial Attacks on Vulnerable Visualizations

BenTwist · Published in VisWeekly · Aug 16, 2018 · 4 min read
Deep learning models are vulnerable to adversarial attacks (1). Given an image overlaid with carefully crafted, nearly imperceptible noise, a convolutional neural network can be easily fooled, for example into labeling a panda as a gibbon with high confidence. Inputs crafted to mislead machine learning models in this way are called adversarial examples, and they have made the security of artificial intelligence an urgent concern.

An adversarial input, overlaid on a typical image, can cause a classifier to miscategorize a panda as a gibbon.
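One common way to craft such noise is the fast gradient sign method introduced in reference (1). The minimal PyTorch sketch below assumes a pretrained classifier `model`, a normalized input tensor `image`, and its true class index `label`; the step size `epsilon` of 0.007 matches the value used for the panda example in that paper.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.007):
    """Fast gradient sign method: nudge the image in the direction
    that most increases the classification loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # A small step along the sign of the gradient is often enough to flip the label.
    return (image + epsilon * image.grad.sign()).detach()
```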

Similarly, visualizations are not immune to adversarial attacks. Visualizations map data to visual stimuli so that people can decode them through graphical perception and make decisions themselves, rather than delegating those decisions to machines. However, carefully designed perturbations of the underlying data, although they ought to be detectable, may be erased or amplified by careless or malicious visualization design. This phenomenon has been termed black hat visualization (2), inspired by the analogous concept in computer security. It matters because such attacks are too subtle to perceive, yet they can dramatically affect how data are visualized and subsequently interpreted. For example, Erin, a visualization designer with malicious intent, can alter the decisions of stakeholder Brook by producing adversarial visualizations.

A model of a visualization “attack.” Data scientist Alex has a dataset they wish to communicate to stakeholder Brook. Unfortunately, Alex must go through visualization designer Erin, who has malicious intent. Erin has many potential visualization designs at their disposal, but chooses one that is likely to cause Brook to have an incorrect impression of the data.

A recent study (3) shows that adversarial visualizations depend in complex ways on chart type and parameter settings. The researchers study how well different distribution visualizations reveal data flaws, and show that choices (careless or malicious) of visualization design parameters can make the visual signatures of those flaws more or less prominent.

Researchers added a “flaw” (in this case, 10 sample points all with the same value) to a dataset. Visualizations in the left column are of the original distribution; those on the right are of the flawed distribution.

They illustrate this by examining the vulnerability of three visualizations commonly used in exploratory analysis of univariate data: the histogram, the dot plot, and the density plot. Given an initial set of samples, they inject flaws into the data, such as noise, mean shifts, or gaps. They then search the parameter space of each visualization: the number of bins for histograms, the KDE bandwidth for density plots, and the mark radius and opacity for dot plots. By minimizing the difference between the visualization of the original data and the visualization of the flawed data, they obtain a set of visualizations that appear adversarial; the figure above shows some of these synthetic examples. This pilot experiment shows that it is possible for a designer to create visualizations that seem to accurately convey the data but in fact hide or obscure important features or flaws.
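As a rough sketch of that search (with simplifying assumptions: the flaw is a spike placed at the median, and a simple L1 distance between binned counts stands in for the paper's measure of visual difference), one could look for the histogram bin count that best hides the flaw:

```python
import numpy as np

rng = np.random.default_rng(0)

# Original data, plus a "flawed" copy in which 10 points collapse onto one value.
clean = rng.normal(size=50)
flawed = clean.copy()
flawed[:10] = np.median(clean)  # assumed spike location

def binned_difference(a, b, bins):
    """L1 distance between the histograms of a and b for a given bin count."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi))
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    return np.abs(ha - hb).sum()

# Search the histogram's parameter space (its bin count) for the setting that
# makes the flawed data look most like the clean data.
best_bins = min(range(2, 40), key=lambda b: binned_difference(clean, flawed, b))
print("Bin count that best hides the spike:", best_bins)
```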

This pilot raises further questions: How good is a general audience at detecting data flaws in standard visualizations of distributions? Do certain visualizations make flaws more reliably detectable than others? How sensitive are these visualizations to the design parameters required for their construction? To answer these questions, the researchers conduct a crowdsourced experiment evaluating how detectable different data flaws are across different visualizations, and how robust this detectability is across different design parameter settings. They recruit 32 participants and ask them to pick out the single hidden adversarial visualization from a set of otherwise ordinary charts.

Figures (a) and (b) show the same univariate datasets. Nineteen of these charts are “innocent” random samples from a Gaussian. One “guilty” chart is mostly random draws, but 20% of samples have an identical value.

Take the image above as an example. For each chart type, 20 visualizations of random samples from a Gaussian distribution are shown to participants. One “guilty” chart is mostly random draws, but 20% of its samples share an identical value. The oversmoothed density plot makes this abnormality difficult to see (participants were only 35% accurate at picking out the correct density plot). Low-opacity dot plots, however, make the dark black dot of the repeated mode easier to detect (85% accuracy). Can you guess which chart is the guilty one? Find the answer at the end of this article.
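A minimal sketch of how such a set of stimuli could be generated is shown below; the 50-point sample size, the random position of the guilty chart, and the choice of repeated value are assumptions, and the actual rendering of each chart is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

n_charts, n_points = 20, 50
guilty = rng.integers(n_charts)          # position of the flawed chart

lineup = []
for i in range(n_charts):
    samples = rng.normal(size=n_points)  # an "innocent" Gaussian sample
    if i == guilty:
        # The "guilty" chart: 20% of its points collapse onto one identical value.
        samples[: n_points // 5] = samples[0]
    lineup.append(samples)

# Each entry of `lineup` would then be rendered with the same chart type and
# parameter setting, and participants asked to pick out the guilty chart.
```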

Three kinds of data flaws

They test three kinds of data flaws: spikes, gaps, and outliers, as illustrated above. Different flaw magnitudes and parameter settings are tested during the experiment, and the results are summarized in the figure below.
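Before turning to those results, here is a minimal sketch of one plausible way to inject each of the three flaw types into a univariate sample; the exact constructions and magnitudes used in the paper may differ.

```python
import numpy as np

def add_spike(samples, k, rng):
    """Spike: k points collapse onto a single repeated value."""
    flawed = samples.copy()
    flawed[:k] = rng.choice(samples)
    return flawed

def add_gap(samples, lo, hi):
    """Gap: empty out the interval (lo, hi) by shifting its points past hi."""
    flawed = samples.copy()
    inside = (flawed > lo) & (flawed < hi)
    flawed[inside] = hi + (flawed[inside] - lo)
    return flawed

def add_outliers(samples, k, distance, rng):
    """Outliers: k points displaced far beyond the bulk of the distribution."""
    flawed = samples.copy()
    flawed[:k] = samples.mean() + distance * samples.std() + rng.normal(scale=0.1, size=k)
    return flawed

# Example usage on a Gaussian sample of 50 points.
rng = np.random.default_rng(2)
data = rng.normal(size=50)
spiked = add_spike(data, k=10, rng=rng)
gapped = add_gap(data, lo=-0.25, hi=0.25)
outlying = add_outliers(data, k=5, distance=4.0, rng=rng)
```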

Performance of flaw detection task across chart types and their parameters
Accuracy at identifying which visualization contained a specific data flaw, given the size of that flaw out of 50 total points.

As the figure shows, accuracy increases as the number of flawed points grows. Moreover, no single visualization dominates for all flaw types, and liberal parameter settings often yield higher accuracy than more conservative ones.

This research provides warnings to visualization designers:

We should not only test the robustness of our design parameters in terms of the quality of visualizations in normal scenarios, but also specifically consider whether data quality concerns are easily discoverable or detectable across the relevant parameter spaces.

Answer: The chart on the bottom left is “guilty”.

Bibliography:

  1. Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. “Explaining and Harnessing Adversarial Examples.” ICLR 2015.
  2. Michael Correll and Jeffrey Heer. “Black Hat Visualization.” Workshop on Dealing with Cognitive Biases in Visualizations (DECISIVe), IEEE VIS 2017.
  3. Michael Correll, Mingwei Li, Gordon Kindlmann, and Carlos Scheidegger. “Looks Good To Me: Visualizations As Sanity Checks.” IEEE InfoVis 2018.
