What Is The Boundary Between Data Visualization and Other Types of Images?
A DVS #Historical-Viz channel discussion
The Data Visualization Society (DVS) #Historical-Viz channel is devoted to discussing lesser-known examples from history, but because we all come from diverse backgrounds, the conversation occasionally drifts towards the philosophical. Since this conversation is on Slack, I’ve made an attempt to capture the important concepts and conversational flow so that others can share the information and join in the commentary.
It is important to note that almost all of the posts have been edited for grammar and content in order to present a more focused narrative.
Stephanie Tuerk: Ok, so here is a question: what is the boundary between a data visualization and other types of images? (In particular, I am thinking of the diagram, but other kinds of descriptive images as well) And how do we think about the location of this boundary historically? I.e. is the distinction that we would make for something made in 2010 the same as what we would make for an image from 1910 or 1810? I ask not to establish boundaries to police, but rather because I think that a lot of knowledge is produced through the process of attempting to determine ontological distinctions between things/ideas, even if there are always exceptions to those distinctions.
Jason Forrest: Thank you for asking! I have (obviously) been thinking a lot about this question. It’s my opinion that visual information design *should* overlap with data visualization as both help the audience understand something that is immaterial. What does the group think?
Stephanie Tuerk: How do you define “visual information design”? Smiley face but also a serious question! But yes, I was attempting to generate general discussion about this rather than prompting an attempt to come up with an actual, agreed upon, universally-applicable answer to my question. (No one answer is going to be sufficient) Curious to hear what people think!
Jason Forrest: Well, that’s the question of the hour! I left it a bit vague on purpose, but it could be statistical chart making or scientific illustration, but does that include architectural renderings? Does that include music notation? What about artist renderings?
Stephanie Tuerk: Yes, exactly. So let me rephrase: what are the properties of an image that make it a (data) visualization versus some other kind of image?
Arnold Platon: I’ll latch onto this thread because it’s connected to something that’s been bugging me. Looking at the coat of arms of Austria-Hungary, I’ve been wondering whether heraldry can be a form of information design, especially when it comes to more complex entities. I mean, this one has information about governance (the fact that it’s the union of two monarchies, and which dynasty rules) and a hierarchy concerning territory (core and periphery encoded in how the arms are stacked). I feel that in a sense they can be somewhat analogous to modern “political system charts”. Just a thought… :thinking_face:
Jason Forrest: I find this idea absolutely amazing! It’s an excellent example of “information” that is transmitted visually.
Stephanie Tuerk: The addition of this coat of arms is making me wonder whether there is a difference between “information” and “ideas.” I think that an argument could be made that basically all images embed some kind of idea in them. This is something that many art historians, particularly those who have embraced a shift to “visual studies,” would claim, and work like, say, John Berger’s “Ways of Seeing” or Michael Baxandall’s “Painting and Experience in Fifteenth-Century Italy” argues that even if the subject matter is, say, a still life, ideas of the time are embedded in (and thus can be read in) the manner of execution of the artwork.
The coat of arms reminded me of the 14th-century Sienese Allegory of Good and Bad Government frescoes, which also encode information in a way, but… I think that few of us would say that these paintings are visualizations? Or at least I would not, and I feel like expanding the definition to include them (the frescoes) weakens the argument that a visualization is a distinct type of image.
I think that in my mind what is starting to emerge as a litmus test is some sort of two-way encoding of information in an image — meaning, there is a system for translating the information into graphic form, and the same? a? system for the viewer to retrieve it on the other side. In the case of coats of arms, people at the time may have indeed been able to extract the information (by virtue of knowing the system that produces a coat of arms). And maybe there is a case to be made that the symbols/tropes of early modern art — particularly those related to religiosity — were actually legible at the time? IDK, just thinking out loud here…
Not sure if people are familiar with Charles Sanders Peirce’s work on semiotics, but whenever I find myself thinking about these kinds of questions, I always make an attempt to think through them with Peirce’s taxonomy of signs…and then I’m like, ugggghhhh, it’s been years since I read that and I really need to refresh before I can really do this. Anyway, throwing it out there as something that I feel like should be included were there ever to be a “Theory of Data Visualization” reader.
Arnold Platon: I actually came to think about the subject because I’ve developed a recent fascination with “political systems diagrams” and while studying historical examples I realized that some (like the Austro-Hungarian coat of arms at the top) are not that far, in essence, from more viz-like examples, such as the 1862 “Diagram of the Federal Government and American Union” by N. Mendal Shafer.
PS. All the other interventions in this conversation are absolutely fascinating :thinking_face:
James Lytle: Very interesting convo. As I’m thinking about it now, what stands out in my mind is this idea of mapping and comparison. When it comes to this question of “what is information design or data visualization?” I’m a bit of a Tuftian traditionalist, in that I think the heart of DV is about 1) quantitative presentations and 2) meaningful comparisons. There definitely are overlaps into other areas of visual communication, graphic design, symbolism, etc., but I think the heart of DV always finds its way back to meaningful comparisons that support a communication method for context + details. I also agree that it isn’t about getting everyone to agree on _the_ definition of what DV is, but we do have a big pile of best practices, and I am guessing the less those best practices are needed in an image, the less likely it should be considered a “data visualization”, semantically speaking that is :grinning:.
Slight aside: this makes me think about the early days when “infographics” became trendy online, largely driven by marketing teams who wanted to present something visually engaging. This drove me absolutely crazy in those days because most of them were clearly the case of a graphic designer with zero experience in data vis getting thrown a project to create a cool graphic with numbers on it, and maybe adding some rando charts and curvy arrows and things. Before I knew it, data was becoming more mainstream and people thought, “Oh you make those infographics” and I’m all :expressionless:.
Anyway, you have to choose your battles, so I think in general the most helpful discussion is not “what is a data visualization?” or “what is an infographic?” but rather “what is a good data visualization or good infographic?” Then the definition is more what is good and what are good expectations and less about a rubric of boxes to check. Thanks for humoring my rant :smirk:
Jason Forrest: I think your second point, “meaningful comparisons,” is the grey zone. One of the ideas that I’m chasing right now is where the boundaries of that grey zone lie. I realized that Neurath’s Isotypes are data-driven, but they are aggregated to the point of making a more generic point.
My particular fascination on this subject is ultimately tied to my curiosity on the nature of persuasion, which I recognize is something entirely different (and not exactly well stated here either, haha)
Stephanie Tuerk: Interesting thoughts, James. From my perspective, “must be made from quantitative data” and “must be comparative” seem like apt descriptors. They even help me clarify why I feel like a musical score is *closer* to being a data visualization than an architectural plan. While the latter embeds information for sure, extracting that information into a quantitative form, while possible, would require more leaps and bounds than extracting a musical score into quantitative form. Really enjoyed the insight on best practices; it is so not how my mind works to think of defining something that way.
Jason, I’m curious about your interest in persuasion! I feel like the best way to have substantial conversations about unanswerable questions is to understand where people are coming from/have everyone lay their cards on the table, sooooo… I look forward to this interest trickling out over future posts and learning something from it. Also: can you explain how Neurath’s Isotypes have distanced themselves from the data that underlie them? I’m not sure I follow what you mean by “they make a more generic point.”
Jason Forrest: My point, Stephanie Tuerk, is that there are no actual numbers here, just an illustration that shows a proportional comparison. I have no doubt that this is based on data, but it has been rounded to the point of being “generic” (or do I mean vague?)
lord: Sorry if I’m missing the point but I would think a data visualisation is an attempt to represent data visually, informatively and accurately. The question of what constitutes data is therefore important… Defining data as recorded information would mean an image that documents the structure of a butterfly’s wing is a data visualisation.
Rounded numbers are still accurate to that level of rounding… The representation of those rounded numbers still has meaning. If the scales representing the data are arbitrary or not referenced, they are not informative and therefore not visualising the data. Data visualisations without a reference key are not complete data visualisations.
Just a few thoughts…
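To make the rounding point concrete, here is a minimal Python sketch of an Isotype-style encoding, assuming each icon stands for a fixed quantity stated in a key. The function names, the unit, and the population figure are hypothetical and chosen only for illustration; this is not a reconstruction of Neurath’s actual method.

```python
def to_icons(value, unit):
    """Encode a quantity as a whole number of icons, each worth `unit`."""
    return round(value / unit)


def from_icons(icon_count, unit):
    """Decode: read the icon row back into a quantity using the key."""
    return icon_count * unit


unit = 1_000_000        # hypothetical key: one icon = 1 million people
population = 4_350_000  # made-up figure, not real data

icons = to_icons(population, unit)
recovered = from_icons(icons, unit)

# Draw the icon row together with its reference key.
print("#" * icons, f"(1 icon = {unit:,})")
print("read back:", recovered)

# The reading is off by at most half an icon.
assert abs(population - recovered) <= unit / 2
```

Under these assumptions the row of icons is accurate to within half a unit, but only a viewer who has the key can read a number back out of it.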
Elijah Meeks: I am so on board with this topic and I think the current trend in computer-assisted data visualization that tends to ignore diagrammatical tradition is a really bad one. Also, +1 on Peirce, there’s a reason why I named my dataviz library “Semiotic”. Another title for the bookshelf: John Bender has some interesting explorations of this topic here.
Stephanie Tuerk: lord, to continue the conversation I will ask: is a photograph of a butterfly’s wing a data visualization? Certainly a photograph of a butterfly’s wing documents the wing’s structure, no? I would also ask, what constitutes “information” for you? The question wasn’t intended as a practical one, but more of a philosophical one. (Hence, the earlier claim that there is no singular or correct answer) I posed it because I think that if I am arguing (or, we are arguing) that data visualization has a particular salience in contemporary culture (it’s certainly an argument I’m interested in making/do make), I think it’s important to be able to discuss “why data visualization now?” i.e. what kind of knowledge can data visualization produce that diagrams, figurative images, etc. etc. can’t produce. In other words, what is unique to data visualizations. I also think that developing a definition of a data visualization that is not merely “it’s a chart”, but instead is a set of properties, will allow us to continue to open the realm of what a data visualization is, and continue to innovate, and when people are like, “what are you talking about, you data viz people can’t claim that territory for yourselves/can’t claim to be experts in this thing you just did, you just make charts!” we can reply by saying, no, here is the historical tradition of (data) visualization and if you look at the common threads/definitive properties of all of these you will see that my avant-garde-ish thing I made is indeed a data visualization. Obviously all of this doesn’t matter all that much in the context of, “ok, I have a data set and someone wants a visualization in two weeks, what do I do?” but I think it’s important that at least some people in the data viz world think about these broader theoretical questions!
lord: Good question Stephanie Tuerk… By “image that documents the structure” I meant one that labels/documents key aspects and therefore the data is the information that is relayed by the given key. Like this diagrammatic map of a butterfly¹ wing by Vladimir Nabokov:
I wasn’t thinking of just a lone picture of a butterfly, or a butterfly’s wing. But there’s no reason why a picture of a butterfly can’t be part of a visualisation… or even a butterfly itself. If the intent is to portray the information “butterfly”… Is one data point representative of the data? Is a chart with one data point a data visualisation? Maybe a number of data points are necessary… Or is it only when key aspects of the data are labelled and highlighted that it becomes a visualisation?
Something I keep coming back to is the idea that the information needs to be represented meaningfully with some kind of systematic analysis behind it so that understanding is gained. The collection of butterflies pinned in the case above is a lot of data about “butterfly” — the butterflies are even named — but is it meaningful? After Nabokov had systematically analysed butterflies he was able to order the aesthetic forms in an instructive way, giving meaning to the underlying data:
(I’d like to say that no butterflies were harmed in the making of these posts… but… )
I have a question, James Lytle and Stephanie Tuerk… why do you feel the data needs to be quantitative? Can qualitative data not be visualised… or does it only constitute data visualisation if the underlying data is quantitative?
Pierre Dragicevic: Stephanie Tuerk, I’ve been having the exact same questions as you. Jacques Bertin’s monosemic/polysemic distinction can help understand conceptual differences between data visualizations and other types of information-carrying images: “A system is monosemic when the meaning of each sign is known prior to observation of the collection of signs” […] “a system is polysemic when the meaning of the individual sign follows and is deduced from consideration of the collection of signs” (Semiology of Graphics). Others (e.g., Leland Wilkinson, Yuri Engelhardt) talk about visualizations as having a grammar. Robert Kosara also has a very interesting discussion on visualizations vs. other types of images:
What is Visualization? A Definition
In his blog post, there’s an interesting comment from Hadley Wickham that visualizations should be “invertible”. This is easier to understand in the context of automatically-generated computer visualizations, which have been often described as multi-stage processes that turn raw data into images (the “visualization pipeline”). The idea is that a reader should be able to start from the image and get back to the data somehow. None of these analyses goes very deep into what makes a good vs. bad visualization, but I think it’s interesting to try to reflect on what an artifact is (a philosophical question) irrespective of whether it is effective at achieving its designers’ intent (an empirical question).
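A minimal Python sketch of what “invertible” could mean for a simple bar chart, assuming a linear scale whose data domain and pixel range are both known to the reader. The `make_scale` helper and the sample values are hypothetical; real visualization pipelines have more stages than this.

```python
def make_scale(domain, pixel_range):
    """Return (encode, decode) functions for a linear data-to-pixel scale."""
    d0, d1 = domain
    r0, r1 = pixel_range

    def encode(value):
        # Data value -> bar height in pixels.
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    def decode(pixels):
        # Bar height in pixels -> data value (the inverse mapping).
        return d0 + (pixels - r0) / (r1 - r0) * (d1 - d0)

    return encode, decode


# Assumed axis: data values 0..50 drawn on a chart 200 pixels tall.
encode, decode = make_scale(domain=(0, 50), pixel_range=(0, 200))

data = [12, 30, 45]                           # made-up observations
bar_heights = [encode(v) for v in data]       # what the designer draws
recovered = [decode(h) for h in bar_heights]  # what a reader can get back

print(bar_heights)
print(recovered)
```

The decode step only works because the axis (domain and range) is available to the reader; without it the bar heights are just lengths.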
lord: Insightful, Pierre Dragicevic… there’s something else too (I don’t know how this is expressed in theory) where there is emergent understanding from the collected signs (a gestalt) that is qualitatively different from the various data.
Pierre Dragicevic: Right. There is a perceptual side to that (e.g., ensemble perception) and also a cognitive side: researchers talk about going from data to knowledge, insights, sometimes wisdom. Those are key aspects of visualization but it’s much more difficult for me to think in that space.
Jason Forrest: Pierre Dragicevic — this “invertible” idea is VERY interesting and one that I think is especially important when the subject matter is more intangible. I started thinking about this a lot when I wrote about Hilma af Klint as she was using a Theosophical semiology, which itself was part scientific and part poetic, so her visualizations were designed to be studied and meditated on as (invertible?) spiritual guidance.
lord, the mix of qualitative and quantitative I think is one of the defining issues at the moment, and I think it’s something that separates much of the more analytical work from being more persuasive. IMO, it’s that editorial/qualitative nudge that people seem to connect to.
Stephanie Tuerk: Jason Forrest, I would argue that the “invertible” requirement regulates not only how information is symbolized but also what *kind* of information can be encoded in a visualization. Meaning that “information” that is too complex for its comprehension/perception to be easily and unambiguously confirmed can’t be represented in a visualization, i.e. images that attempt to embody that kind of “information” (which I would call something like a “concept” or “idea”) are some other type of image than a visualization. That is to say, representation is a larger category than visualization.
Jason Forrest: I’m not sure I agree, as an illustration of “heaven” is still a translation of an idea. But maybe I’m pushing too far at an extreme with that example. Would you consider an early diagram of an atom a visualization even if it turns out to be wildly inaccurate later on?
Stephanie Tuerk: No, I would consider any diagram of an atom to be a diagram, not a visualization! Contemporary or historical, irrespective of the image’s relationship to an actual atom.
Also, to me the status of something as a visualization has nothing to do with the veracity of the information that is visualized (responding to your comment about “even if the image is inaccurate later on”) but rather depends on the fact that the information shown begins as something that the author has in a form that is completely distinct from any graphical representation… and then is translated into graphic form… and then can be retrieved by the viewer and reconstructed in that distinct, non-graphical form.
lord: I agree with Stephanie Tuerk about the Bohr Atom… while data drove his theory, he wasn’t visually presenting any underlying data — he was representing a theory of atomic structure… it was a theory visualisation…
Jason Forrest: maybe it’s graphic translation?
James Lytle: lord, good question. So, in general, I tend to view DV as the language of “How much?” or the grammar of scale, which grounds the heart of vis in how much of this or that (how much time, how much space, how much who/what). Naturally, in order to properly describe who or what you are talking about, it is helpful to describe the profile of things (with numbers and characteristics), but then it is back to how those things relate to each other (i.e. how does this system work or interrelate, how are the parts of the petal structured in relation to each other?). So system diagrams are still very much vis to me inasmuch as they communicate how different pieces relate to one another.
Arran LR: I found Boris Müller’s piece on considering DV as a cultural image interesting:
Picture, Depiction and Deception: Why Data Visualisations are Cultural Images
I think the question I’m interested in is considering DV as communication and as a tool for insight.
Stephanie Tuerk: I totally agree with the argument that data visualizations are cultural images, but I think that there is a lot of literature, in the history of science for example, that argues that the kinds of images that Müller is setting up as strawmen/“technical” images are also cultural images. I mean, it’s the same argument that you see a lot these days (or at least I see a lot) arguing that the data themselves are cultural products and not “objective.” Essentially… cultural/technical is a false binary.
The question of “why quantitative” is a good one. (Am I allowed to say, “oh lord, that is a good question!”?) I guess first I want to retract a bit and say that I think that ordinal information can be visualized as well, but I think that once you get to the idea of turning an idea into an image, you are in linguistic territory (and that of classic semiotics), in that you are making an image the signifier of the idea of cat the same way that the word “cat” is the signifier of the signified “cat.” I guess that is to say, I think that a characteristic of visualization is that it, in and of itself, doesn’t create meaning, although it communicates information. (People may interpret meaning from it, but that meaning is not in the visualization itself.)
I mean, there is no one definition and sure some people are going to want to say that all diagrams are visualizations. But then we ignore that there is a smaller class of things within diagrams that make meaning in a very specific way, in which graphical elements are used to represent discrete relationships whose success in faithfully transmitting information relies on the lack of interpretability of the original information, and we fail to recognize that that is something distinct because we have expanded the term visualization to include the larger class of things.
Anyway, I went to the wrong source (Tufte). Pierre already mentioned this, but (as almost always), Bertin provides sound counsel on the matter.
lord: I loved the way you described the importance of relationships James Lytle. It’s these relationships that allow a discerning person to pick the signal from the noise…. Like, what aspects of a butterfly are functional (wing dimensions) and what are too variable (have too much noise) to be analytically interpreted (eg number of spots).
So I have another question… If the diagram of an atom is not a data visualisation… What about something that represents an equation? There’s no actual data… But the equation encodes data. Does that count? For example: a graph that represents the equation of an aerofoil, or an animated metronome that is defined by a sinusoidal equation.
Pierre Dragicevic: I like the equation case, that’s a tough one. You can also imagine you have a simulated dataset and visualize it with a set of complex plots. Most people would probably say they are visualizations, and yet that’s just a more complex version of your equation case. You could argue the parameters of your models are the data. In the case of a sinusoidal equation, your data would then be a set of three quantities (period, phase and amplitude). An interesting implication is that a sinusoidal curve can be seen as a technique for visualizing three quantities. That’s one possible response to your question, there are probably many others.
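The idea that a sinusoidal curve “visualizes” three quantities can be sketched in Python: draw the curve from (amplitude, period, phase), then read the three numbers back from the sampled curve alone. The parameter values, the sampling step, and the `read_back` helper are assumptions made for illustration, and the recovery is approximate.

```python
import math


def sinusoid(t, amplitude, period, phase):
    """The 'drawing' step: three quantities become a curve."""
    return amplitude * math.sin(2 * math.pi * t / period + phase)


def read_back(ts, ys):
    """Approximately recover (amplitude, period, phase) from sampled points."""
    amplitude = max(abs(y) for y in ys)
    # Locate upward zero crossings by linear interpolation between samples.
    crossings = []
    for t0, y0, t1, y1 in zip(ts, ys, ts[1:], ys[1:]):
        if y0 < 0 <= y1:
            crossings.append(t0 + (t1 - t0) * (-y0) / (y1 - y0))
    period = crossings[1] - crossings[0]
    # At an upward crossing the sine argument is a multiple of 2*pi.
    phase = (-2 * math.pi * crossings[0] / period) % (2 * math.pi)
    return amplitude, period, phase


params = (2.0, 1.5, 0.7)               # made-up amplitude, period, phase
ts = [i * 0.001 for i in range(5000)]  # sample 5 time units of the curve
ys = [sinusoid(t, *params) for t in ts]

recovered = read_back(ts, ys)
print(recovered)
```

This sketch needs at least two upward zero crossings inside the sampled window, so the window has to cover more than one full period.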
About butterflies: many would probably agree that naturally-occurring phenomena like footsteps on a beach are not visualizations. Yuri Engelhardt discusses this in his thesis. To him, a visualization needs to be an artifact (i.e., something created by a human). So if you come across a bunch of dead butterflies during a walk in the woods, you could infer a lot of things using many of the same perceptual and cognitive skills as the ones you use when you read a visualization, but you couldn’t really call this a visualization. In your photos, however, the dead butterflies were purposefully arranged in a way that makes it presumably easier to extract information. That, I think, comes closer to a visualization.
Stephanie Tuerk: …but does visualization not imply the human act of translation of information from a non-visual medium into a visual medium?
Jason Forrest: See, that’s where I am still/too. In my world, the data aspect is important, but not all-encompassing.
Stephanie Tuerk: I mean, I would never argue that a display of butterflies does not convey information. It very much does! So to me, it returns to a different unanswerable question, which is, what are we hoping to get out of defining the term visualization? For me, I’m trying to find a definition that distinguishes (data) visualization from other forms of representation — like, a visualization may often be a diagram, but I’m interested in what makes not all diagrams visualizations. Obviously one could also want to define it as broadly as possible for different reasons, and then I guess the question is, when does something stop being a visualization? I’m not sure if these are two sides of the same coin or not…
lord: I agree, Stephanie Tuerk that there needs to be a human act of presenting the data in a form that facilitates understanding that could not be gained otherwise… But I’m on the fence about the form not being visual to start with.
If you take a bar chart and plot a single dot for each observation instead of an incremental rectangle, you are visualising a number. Each dot has been arranged informatively… The overall visualisation gives you information you can glance at and reveals more about the data than you’d get from just looking at a bunch of numbers.
How is this different from Nabokov’s scientific arrangement of butterflies? If I had been counting butterflies and placed them one on top of each other, like I did with these chocolates… Is that no longer a visualisation?
Stephanie Tuerk: lord, as I see it, if you are counting the butterflies and then using the butterflies that you counted to make the visualization by stacking them one on top of the other, the information you are visualizing is a number, not the butterfly itself. (Using the very things you count to make the histogram is a particularly cheeky kind of visualization; it’s probably an index in Peirce’s terms.) In scientific displays, by contrast, what is on display is the butterfly itself.
lord: Not really… I picked them up and arranged them in order of type, one on top of each other. Nabokov’s arrangement reveals even more about the data because he analysed more details about the butterflies and then arranged them in a structured way that reveals more about them to the discerning eye.
Stephanie Tuerk: By the same token, if one took all of the Impressionist paintings in MoMA’s collection, and put them in a room, and ordered them by artist and decade, and put them on the wall in these groupings so that you can see the visual differences between them, has one made a visualization? I guess for me visualization involves a signified and a separate signifier (the graphic element that represents the signified), which come together to make a sign.
Jason Forrest: You mean like this:
MoMA The Museum of Modern Art - Google Arts & Culture
Stephanie Tuerk: When is something the object itself and when is it a representation of the object? If you put images of artworks in a timeline, to me they come way closer to actually being “data”. They aren’t the paintings themselves. If you took a room at MoMA and painted a timeline on the wall and hung paintings on it though…1. omg would the art world be pissed, and 2. I feel like that is really some kind of precipice of something! To me that is the equivalent of the histogram of candy bars. (I appreciate the provocation btw!)
Jason Forrest: I honestly think that most museums are exactly arranged by timeline. Some actually have a line and year, so I don’t see that aspect as that unusual.
Stephanie Tuerk: Yeah, I mean, after all, these “artworks” here TRULY are data in that what we see is literally a reconstruction of hex code somewhere. Oh god, now I just made the whole internet a data visualization.
Thanks to everyone in the #Historical-Viz channel! We’ll be posting more regularly so please follow The DVS publication to get more stories right in your inbox!
¹ Lord would like it acknowledged that all thoughts on butterflies grew from conversations with David Low, without whom there would have been no gestalt.
The Data Visualization Society is developing meaningful resources in order to establish a discourse and remove barriers between practitioners across tools and industries. To sign up please register at datavisualizationsociety.com/join