Complex visualizations and visualized complexity: how can we interpret the world around us?

Massimo Conte is the Editorial Coordinator of the Complexity Education Project. An Italian version of this article is also available.

  • How can we orient ourselves in the complexity of the world?
  • How aware are we of what we see, given that this awareness is necessary to interpret the world?
  • What are the new basic skills required of citizens and of those who work with data, such as scientists, designers and journalists?

Introduction

In 2020 the impact of Covid-19 caused a sharp discontinuity in several areas of human activity: health, the economy, work, education. A less eye-catching dimension cuts across them all: our vision and understanding of the complexity of the interconnected, non-linear dynamic systems in which we live.

Unlike the many quarantines humankind has experienced in the past, this is the first worldwide one, and it is characterized by a detachment between body and mind: physical isolation to avoid contagion, together with digital hyper-connection. People’s knowledge is often entirely mediated by traditional and digital media. Consequently, our worldview depends on the information sources we choose: our perception, and the choices that follow from it, are shaped by the interpreters we rely on.

The Coronavirus pandemic has already had consequences for our relationship with knowledge, which is the central theme of this article. Scientific thought and method have returned to the center of the public scene: governments make fundamental decisions on public order and health based on the advice of scientific committees and, ultimately, on the interpretation of data; news broadcasts present daily updates on graphs, curves and trends as breaking news; and citizens debate those data, and their opinions about them, on social media.

Two key points stand out:

  • the lack of a widespread culture of the basic skills necessary to understand, represent and narrate data. Not only among citizens, but often also among professionals;
  • the lack of awareness that these skills are missing.

So, on one side there is “how we think”: the method we use to create knowledge by formulating hypotheses and checking them against data. On the other side there is “what we look at”: scientific thinking may not be enough if the frame of reference is a linear, reductionist worldview. The effort required is therefore twofold: an awareness of the tools we use to think, and of those used by whoever presents a thesis to us; and an understanding of the basic characteristics of complex thinking, which is necessary to interpret a complex world. Let’s proceed step by step and follow the process (split into six stages) of a person trying to get informed and form an opinion today, starting from data and arriving at the social debate on facts and events. Let’s start with the first step.

DATA → REPRESENTATION → PERCEPTION → BIAS → POLARIZATION → DEBATE?

In the digital society in which we live, endless amounts of data are produced every day. Some estimates assume that in 2020 each individual produced on average 1.7 megabytes… per second. A frequently quoted figure puts the total at about 2.5 exabytes of data produced by humans every day, i.e. 2.5 × 10^18, or 2,500,000,000,000,000,000 bytes. Each day.
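To get a feel for these orders of magnitude, the unit arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes decimal (SI) units, and the constant and variable names are chosen here for the example, not taken from any source.

```python
# Unit arithmetic for the two figures quoted above.
# Assumption: decimal (SI) units, i.e. 1 megabyte = 10**6 bytes,
# 1 gigabyte = 10**9 bytes and 1 exabyte = 10**18 bytes.

BYTES_PER_GIGABYTE = 10**9
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# 1.7 megabytes per second, written as an integer number of bytes.
per_person_bytes_per_second = 1_700_000

# What one person would produce in a day at that rate.
per_person_bytes_per_day = per_person_bytes_per_second * SECONDS_PER_DAY
per_person_gigabytes_per_day = per_person_bytes_per_day / BYTES_PER_GIGABYTE

# 2.5 exabytes, written as an integer number of bytes (25 * 10**17 = 2.5 * 10**18).
daily_total_bytes = 25 * 10**17

print(f"Per person, per day: {per_person_bytes_per_day:,} bytes "
      f"(about {per_person_gigabytes_per_day:.2f} GB)")
print(f"2.5 exabytes: {daily_total_bytes:,} bytes")
```

At 1.7 megabytes per second, a single person would generate roughly 147 gigabytes in a day; 2.5 exabytes is a 19-digit number of bytes.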

Cognitive overload is not a risk but an insane drift that everyone experiences daily. Big data was the engine behind the five Big Tech companies, which became the world’s leading companies in just twenty years. But what do we mean by “data”? Data are the basic elements of information, especially facts or numbers, collected to be examined and considered and used to help decision-making, or information in an electronic form that can be stored and used by a computer (source: Cambridge Dictionary). To orient ourselves, form an opinion and make decisions (albeit with bounded rationality), we need to add meaning to the raw material that data constitute, creating connections within our mental models and placing the data in context.

For data and events to become information, they must be communicated to the user/citizen. When receiving this information, alongside the typical “5 W + 1 H” of journalism, namely

  • What?
  • Who?
  • Where?
  • When?
  • Why?
  • How?

it may be useful to activate what we can define as the “critical thinking kit” (adapted from R. Paul, L. Elder, 2020, p. 15), namely to question the elements necessary to develop reasoning, such as

  • Purpose: why is this information presented?
  • Questions: what issue are we talking about?
  • Information: what information is used to arrive at the conclusions? How do we know the information is true?
  • Inferences: how did we reach these conclusions?
  • Concepts: what is the main idea that the author wanted to express?
  • Assumptions: what is taken for granted in this reasoning?
  • Implications: if this position is accepted, what follows?
  • Points of view: from which point of view is the phenomenon being observed? Are there others?

Obviously nobody, not even the most willing, has the time and desire to ask all these questions about every news item. This is even more true in times of continuous cognitive overload generated by the attention economy, in which many players constantly compete for our attention.

Turning the matter around, we could say that questions should be weighed according to the resources required to answer them (L. Floridi, 2020). Asking good questions and looking for satisfying answers is, first of all, the job of scientists. And once data have become knowledge, that knowledge must be represented. Leaving aside the epistemological issues relating to the methods used to do science, here we will focus on the last part of the process: information/data visualization, an interdisciplinary area that has received growing attention in recent years.

DATA → REPRESENTATION → PERCEPTION → BIAS → POLARIZATION → DEBATE?

Let’s partially change direction and shift the focus from the final user of the information (the citizen/user) to those who create information and data visualizations for professional reasons. This category includes information designers and, in a different way, journalists; and, before them, scientists and researchers, each obviously with different nuances.

The “W” questions of journalism translate differently for those who create and communicate information starting from data (K. Börner, 2015). Here the task is to define the object of study, that is, to decide which questions to ask in order to study a phenomenon:

  • WHEN: temporal analysis;
  • WHERE: geospatial analysis;
  • WHAT: topical analysis;
  • WITH WHOM: network analysis;
  • Statistical analysis.

In other words, it is a matter of problem definition. In the field of complex phenomena, the clear definition of the problem is the problem.

Behind the choice of which analysis to perform, even more important issues of design ethics arise. The very concept of “data” as “given” (the Latin word data is the plural of datum, “(thing) given”, the neuter past participle of dare, “to give”), “factual” or “certain” must be handled with great care: “data” are actually always “capta”, that is, taken, selected, constructed (J. Drucker, 2011). This brings with it all the possible negative consequences, well expressed by the “garbage in, garbage out” formula used in machine learning: if you use unsatisfactory data, you will get poor results.

Those who work with data should know what the dark data in their project are, so that they can clearly define the scope of their work and of its results. It is even possible to use a taxonomy of dark data (D. Hand, 2020, pp. 294–8):

  • data we know are missing (known unknowns);
  • data we don’t know are missing (unknown unknowns);
  • choosing just some cases;
  • missing data essential for what matters;
  • data which might have been;
  • changes with time;
  • discordant definitions of data;
  • (excessive) summaries of data;
  • measurement error and uncertainty;
  • data that give distorted representations of the underlying reality;
  • information asymmetry;
  • intentionally darkened data;
  • fabricated and synthetic data;
  • extrapolating beyond your data.

The methodological problems to be addressed are many, and well known to those who work with data. The counterintuitive point to emphasize is this: although what is “given” is perceived as a synonym for “proof”, the best way to approach a representation of data is with healthy skepticism about dark data, that is, by delimiting what it is and is not possible to say on the basis of those data.

Data representations can therefore be manipulated in many ways. Think, for example, of correlation, which indicates the relationship between two variables, that is, the regularity with which they increase or decrease together. The fact that two phenomena have similar curves (such as the number of films Nicolas Cage appears in each year and the number of people who drowned in swimming pools, as shown by Tyler Vigen in Spurious Correlations) does not necessarily mean that one influences the other. Spurious correlations, taken out of context, can be thoroughly deceptive. Correlation is not causation.

With data it is possible to deceive others or oneself, and to see or show correlations where there are none.

Credits: tylervigen.com
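To make the point about correlation concrete, here is a minimal Python sketch that computes the Pearson correlation coefficient for two series. The numbers are invented for illustration (they are not Tyler Vigen’s data), and the names pearson_r, series_a and series_b are hypothetical. The two series have nothing to do with each other; they simply both grow over time.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented yearly counts for two unrelated phenomena (illustrative only).
series_a = [1, 2, 3, 4, 5, 6]
series_b = [10, 12, 15, 15, 19, 22]

r = pearson_r(series_a, series_b)
print(f"r = {r:.2f}")  # prints "r = 0.98"
```

A coefficient this close to 1 says only that the two curves move together. It says nothing about whether one phenomenon influences the other, which is exactly the trap described above.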

DATA → REPRESENTATION → PERCEPTION → BIAS → POLARIZATION → DEBATE?

A data visualization may be interpreted as a process in the “data-information-knowledge” continuum involving two subjects: on the one hand the designer, who designs it, and on the other the final user/recipient, who will read and interact with this artifact (Masud, Valsecchi, Ciuccarelli, Ricci, Caviglia, 2010).

An effective data visualization, according to Alberto Cairo (2016, p. 45), should have five characteristics:

  • be truthful, i.e. based on honest research, which does not aim to deceive the public;
  • be functional, to help people to interpret information correctly;
  • be beautiful, that is to generate the sensation of harmony, as a combination of sensorial and intellectual pleasure;
  • be insightful, that is to lead the recipient to make discoveries that otherwise would have remained inaccessible;
  • be enlightening, i.e. give access to the information that people need to increase their well-being and change their minds for the better.

Going so deeply into design guidelines is only apparently a technicality: an aware audience for data visualizations is good both for the (ethically honest) designer and for society as a whole. We could define this awareness as visual critical thinking or visual literacy.

Alongside literacy (the ability to read and write) and numeracy (the ability to apply basic mathematical concepts and logical reasoning), there is an increasing need to include graphicacy (graphical literacy) among the basic skills. Graphicacy is the ability to correctly interpret graphics made up of words, numbers and images. We understand an infographic if we can decode the graphic system it uses (its visual conventions) and if we have some knowledge of the topic it summarizes.

It is not enough for a data visualization to be correctly designed: it can still mislead readers, who may interpret it the wrong way if they do not pay adequate attention or lack the basic skills to read it.

A data visualization can “lie”, that is, be misleading, for several reasons (A. Cairo, 2019):

  • because it is poorly designed;
  • because it displays dubious data;
  • because it shows the wrong amount of data (too many or too few);
  • because it hides or confuses uncertainty;
  • because it suggests misleading patterns;
  • because it supports our expectations or prejudices.

We have already addressed some of these points. Here we will focus on the last one, relating to prejudices or, more generally, to cognitive biases. One of the reasons a data visualization may lie is that we are inclined to lie to ourselves, that is, to accept and judge more favorably the information that confirms our point of view.

Credits: Alberto Cairo

DATA → REPRESENTATION → PERCEPTION → BIAS → POLARIZATION → DEBATE?

Cognitive biases are systematic deviations from rationality in judgment, based on distorted perceptions or prejudices, used to make decisions quickly and effortlessly (A. Testa, 2020). They are mental automatisms that lead to wrong decisions starting from distorted data.

In dealing with the complexity of the world, we save mental energy by relying on mental schemes to recognize and classify situations: these are cognitive heuristics, mental shortcuts that lead us to quick conclusions with minimal cognitive effort. Heuristics are a good example of Kahneman’s “fast thinking”: imperfect but convenient. Cognitive biases, by contrast, are ineffective heuristics, which in the long run become prejudices.

In the age of social networks (over 4 billion active users at the end of 2020), some biases are particularly widespread. The list of cognitive biases is endless, but they can be divided into four areas: what we should remember; cases in which we have to act fast; cases in which there is not enough meaning; and cases in which we have too much information (a rich, interactive mind map on the subject, the “cognitive bias codex”, exists in various versions). One bias that has become especially well known and topical in recent years belongs to the last category: confirmation bias.

Credits: Wikipedia

DATA → REPRESENTATION → PERCEPTION → BIAS → POLARIZATION → DEBATE?

Recently the spreading of information has become a key issue, further amplified by social networks. Any information is now at our fingertips, on the screen of a computer or smartphone: economy, politics, health, terrorism, immigration. A few searches on the Internet are enough to believe you are an expert (the reference here is to the Dunning-Kruger effect). Hoaxes, false but credible information produced intentionally, go viral, spreading in a short time and with great impact.

Memes are created (the term was introduced by Richard Dawkins in 1976 in the book The Selfish Gene, as an attempt to explain how cultural information spreads by imitation): these are small pieces of information that propagate and survive on the web, just as if they were viruses. For this reason they sometimes come back: they continue to survive online. The way social networks work reinforces this: to keep spreading, a meme needs to arouse intense emotions, such as anger towards immigrants or the fear that a vaccine will cause autism. These emotions can take root if you don’t look at the real data. The next question is: am I sharing fake news because I don’t notice that it is fake, or because it confirms my bias?

The quality of information on social media thus ends up losing relevance: people take as true whatever confirms their beliefs, and fake news thrives on this credulity. We accept new information only if it is consistent with our existing belief system. One tendency is to simplify causal mechanisms; but the way I phrase the question (for example “chemtrails damage”) may already contain the premises of the answer. Here we return to what was said earlier about critical thinking and about the questions that those who analyze data and create graphical representations must ask first of all. By contrast, a “healthily” skeptical critical approach (not that of a conspiracy theorist) is a good way to evaluate the information we receive.

The issue then becomes this: not only do we tend to deceive ourselves by selecting only the information that confirms our point of view, our worldview and our prejudices, but we also tend to stay in circles of like-minded people. This phenomenon is strongly amplified by the algorithms underlying social networks, which keep us inside echo chambers, niches of people with common interests. This can lead to the polarization of users, who feel a strong sense of belonging to interest groups that are effectively tribes (Quattrociocchi et al., 2016).

Credits: jamesclear.com

DATA → REPRESENTATION → PERCEPTION → BIAS → POLARIZATION → DEBATE?

At this point, if we live our “onlife” in niches, if we feel we belong to tribes that face each other with irreconcilable points of view, how can we get out?

In online discussion we face many challenges (V. Gheno, B. Mastroianni, 2018): the distancing effect generated by social media; the emotional involvement that leads us to defend our own world; the rationalization that accentuates the weight of words, leaving less room for interpretation because paraverbal cues are absent; and the public dimension, which makes what we write available, permanent and reproducible. The viable (and demanding) way forward is to take the field and accept the debate, sharing a common grammar of “engagement”: involvement in discussions through non-contentious dissent that can be fruitful of new meanings.

Mastroianni (2020, pp. 80–97) lists six strategies for “joining the fray” in our onlife, depending on the disagreement and objections of our interlocutor: ignore (in the case of aggressive expressions); accept the other’s reasoning (and give the discussion a profitable continuation); partially accept (to move from competition to cooperation in the creation of meaning); ask for reasons or proof (to bring out the interlocutor’s real goal); reject and refute the objections (if the argument is relevant and the evidence compelling); and personally attack the interlocutor (a losing move, the step before any quarrel).

These different approaches, depending on the situation, open up (or close off) different spaces for debate and dialogue, even when the starting positions differ. Dialogue, or rather the “happy dispute” (“disputa felice” in the original Italian definition by Mastroianni, 2017), is like a dance, which however requires the commitment of all the participants. Although each of us, depending on the topic, has our own point of view and biases, a useful attitude when discussing others’ opinions is to open up to doubt and uncertainty, to crack the solid building of certainties (which sometimes risk becoming unshakable faiths).

Not always consciously, we witness or are actively involved in debates between different narratives. This happens both in science, understood as the analysis and interpretation of phenomena based on verifiable methodologies, and among citizens/users, in the latter case obviously on another level, that of opinions and visions of the world. Underlying all of this there remains the need to open up to uncertainty and to the complex paradigm of network thinking in order to look at the world and choose how to represent it.

Conclusions

Let’s conclude by returning to the question that gives this essay its title: how can we interpret the world around us? There is no single answer, but it is important to keep the space open for discussion, sharing some common “rules of the game”:

  • the inclination to question ourselves to understand the information and data visualizations shown to us;
  • the willingness to deal with different viewpoints, focusing on content, without descending into personal attacks.

It is therefore a matter of worldview and storytelling. Science is objective in its method, but it is also pluralistic: it accepts differences, and indeed it is alive thanks to the ability and desire to question what exists. This should not be confused with conspiracy thinking, which questions without speaking the same language and without reliable, shared sources and methods.

The citizen, the user, all of us have on many occasions found ourselves trying to interpret and understand graphs, data visualizations and representations. This implies, on the one hand, the need for a good “toolbox” of the skills necessary to approach and critically evaluate the information displayed. Hence the first half of the title of this essay: complex visualizations.

At the same time, a further, deeper level intersects with the first one. It concerns the model of the world, the paradigm underlying the representations that designers, scientists and journalists create and present: the complex view of the world and the way scientists (and citizens) look at phenomena. We have seen it with the graphs of the epidemic reported every day by the media. A linear, reductionist, cause-and-effect approach is not enough to explain and understand them. The numbers of positive cases, hospitalizations and, unfortunately, deaths lag a few weeks behind the public health policy decisions that are taken. 2020 is a clear example of an emergent phenomenon, of a complex, networked system: actions on one part of the system affect other parts in a non-linear way; nodes (actors) are linked to one another in varying ways; clusters of subjects form small networks that must not come into contact, so as not to spread the virus. The concept of the network (as Manuel Lima also explained in Visual Complexity) is one of the emerging metaphors of complexity, useful and necessary to understand the dynamics of phenomena that we would otherwise be unable to explain.

Hence the second part of the title: visualized complexity. Complexity as a new shared paradigm, as a common ground, a new grammar from which to start designing, visualizing, discussing.

Talking about data means doing politics: not only in the case of the color-coded classification system, corresponding to three risk scenarios, applied in Italy during the second wave of Covid-19 at the end of 2020 (each color involves different restrictions), but also, for example, in the case of climate change, the next great crisis, which always remains unspoken, in a kind of collective repression.

Credits: Mackaycartoons

With the data we choose, which as we have seen are commonly perceived as synonymous with “objectivity”, we actually express a point of view and the paradigm we have adopted to see the world. We provide interpretations and tools that can be used by decision makers. Data analyses and representations are biased, influenced by the questions underlying the process and by the starting assumptions: “if you torture the data long enough, they will confess anything” (D. Huff, 1954). So perhaps the next time we see data displayed (in a graph or in a more complex data visualization), we can ask not just which question they answer, but which others they leave out, and how this will affect the subsequent debate. We live and act in complex interconnected systems: facts, data, representations, decisions, debate, public opinion. They are written here as a list, but they must be thought of as an interconnected network.

In closing, we can paraphrase Hans Rosling (2018), who says that “the world cannot be understood without numbers. And it cannot be understood only with numbers”. In addition to the factfulness Rosling refers to, we need, and will increasingly need, visual literacy to be aware of what we are looking at, from what point of view, and with what goal for the future.

Bibliographical references:

A. Cairo, The Truthful Art. Data, Charts, and Maps for Communication, New Riders, 2016.

A. Cairo, How Charts Lie. Getting Smarter about Visual Information, W. W. Norton & Company, 2019.

K. Börner, Atlas of Knowledge. Anyone Can Map, The MIT Press, 2015.

R. Dawkins, The Selfish Gene, Oxford University Press, 1976.

M. Del Vicario, A. Bessi, F. Zollo, F. Petroni, A. Scala, G. Caldarelli, H. E. Stanley, W. Quattrociocchi, Echo chambers in the age of misinformation, Proceedings of the National Academy of Sciences, Jan 2016, 113 (3), 554–559.

J. Drucker, Humanities Approaches to Graphical Display, in Digital Humanities Quarterly 2011 5.1.

L. Floridi, Pensare l’infosfera. La filosofia come design concettuale, Raffaello Cortina Editore, 2020.

V. Gheno, B. Mastroianni, Tienilo acceso. Posta, commenta, condividi senza spegnere il cervello. Longanesi, 2018.

D. J. Hand, Dark Data. Why What You Don’t Know Matters, Princeton University Press, 2020.

D. Huff, How to Lie with Statistics, W. W. Norton & Company, 2009 (first published 1954).

D. Kahneman, Thinking, Fast and Slow, Farrar, Straus and Giroux, 2011.

M. Lima, Visual Complexity. Mapping patterns of information, Princeton University Press, 2011.

B. Mastroianni, La disputa felice. Dissentire senza litigare sui social network, sui media e in pubblico, Franco Cesati Editore, 2017.

B. Mastroianni, Litigando s’impara. Disinnescare l’odio online con la disputa felice, Franco Cesati Editore, 2020.

L. Masud, F. Valsecchi, P. Ciuccarelli, D. Ricci, G. Caviglia, From Data to Knowledge. Visualizations as transformation processes within the Data-Information-Knowledge continuum, in 2010 14th International Conference Information Visualisation.

R. Paul, L. Elder, The Miniature Guide to Critical Thinking. Concepts and Tools, Rowman & Littlefield, 2020.

W. Quattrociocchi, A. Vicini, Misinformation. Guida alla società dell’informazione e della credulità, Franco Angeli, 2016.

H. Rosling, Factfulness. Ten Reasons We’re Wrong About the World and Why Things Are Better Than You Think, Flatiron Books, 2018.

A. Testa, Il coltellino svizzero. Capirsi, immaginare, decidere e comunicare meglio in un mondo che cambia, Garzanti, 2020.

T. Vigen, Spurious Correlations. Correlation does not equal causation, Hachette Books, 2015.

Editorial Coordinator of Complexity Education Project; Digital Learning Manager. Explorer in complexity, data visualization, network science