Visualize… Data… Action! What Dataviz Has in Common With Documentaries

Documentaries employ a number of useful techniques to marry information and narrative

Joshua Smith
Published in Nightingale
11 min read · Aug 31, 2020

Data visualization is a great way to celebrate our favorite pieces of art as well as reveal connections and ideas that were previously invisible. More importantly, it’s a fun way to connect things we love — visualizing data and kicking up our feet for a movie night. All week, Nightingale is exploring the intersections between data visualization and all kinds of entertainment.

From the sensational Tiger King, to the meme-inspiring Ancient Aliens, to the hypnotic Blue Planet, to the [cathartically?] disastrous Fyre, documentaries are a mainstream, pop-culture vehicle for “truth-telling” and nonfiction storytelling.

Anecdotally, it feels like lately I’m more likely to be recommended a documentary on a topic than a book or even a podcast. Not only are they doing increasingly well at the box office, but documentaries have also captured a growing share of the content we consume from the big screen: from less than 2% of all films and TV series 50 years ago to 20% last year.

Line chart of documentary releases as a percentage of all film and TV releases where it has risen dramatically since 2000
Visualization made by the author

Documentaries are the natural byproduct of our love of motion picture storytelling expressed during the information age. Documentary producer Dan Cogan, whose company has produced or funded over a hundred documentaries, said, “We are in the Golden Age of documentary film making. … There has never been as great storytelling in nonfiction film as there is today.”

We get to learn while enjoying all the things we love about fiction narratives: edge-of-our-seat intrigue, obsession over mystery, teary-eyed redemption arcs, comedic tragedies.

What a wonderful time to learn.

While I’d love to wax poetic on documentary narratives, this piece is really about the similarity to data storytelling: both arts are informing via narrative.

Given the rising interest in documentary as a medium that so artfully marries learning and story, it makes sense that similar demands would fall on the “data” world. Our users and audiences aren’t satisfied with facts—they desire entertainment, exposition, intrigue.

We crave the form of meaning that only stories can deliver. It’s not enough to know something, we want to experience it through the perspective of a narrator and characters.

Along with the rise of documentaries, we can follow the rise of “data storytelling.” Since 2011, Google searches for the phrase “data storytelling” have dramatically increased (with a noteworthy acceleration in 2015 after Cole Nussbaumer Knaflic published her discipline-defining title, Storytelling with Data).

Line chart of Google searches for “Data Storytelling” from 2004 to 2019 where there is a major jump in 2015
Visualization made by the author

What might have once been a buzzword is now codified into a sought-out skill set. Nussbaumer Knaflic’s book was followed by this book and this book and this book and this book and this book and this book and this book and this book and I’m sure others that aren’t waving their hands at the top of Amazon’s search algorithm. Storytelling, UX, and information design was a track at this year’s DataVizLive conference. It’s long been a criterion for judging in Tableau Conference’s IronViz competition.

Data storytelling isn’t a trendy buzzword, it’s a preference in how we consume information and how we learn.

Don’t give me a chart—give me a story.

So we’ve seen a parallel in the rise of the documentary and data storytelling, both crafts that pair information with narrative, insights with entertainment.

Although both crafts are currently very prevalent, documentaries have been at this game for a lot longer. While identifying the “first” documentary requires navigating some academic arguments on genre, the consensus is that the ethnographic Nanook of the North, by Robert Flaherty, was the first—in 1922.

An advertisement for the first documentary, Nanook of the North.
Advertisement for the American documentary film Nanook of the North (1922), from the insert after page 8 of the February 3, 1923 Exhibitors Trade Review. Image from Wikimedia Commons

Fast forward nearly a century: Nussbaumer Knaflic’s groundbreaking book Storytelling with Data came out in 2015. Of course people were pairing narratives with data before that book, but it feels pretty clear that as a practiced and disciplined craft, documentaries are our elder.

And, just as our elders can say “Oh, I’ve solved that before, just try this,” documentaries can show us a number of techniques we can use to marry information and narratives.

The myth of neutrality

Documentaries—especially observational documentaries—create a sense of truth about what we’re seeing, especially when the film crew are “invisible.” This isn’t just true of documentaries, it’s true of a lot of video.

However, filmmaker choice guides what we see. Editing determines the order in which we see things and what we don’t see. Things like music and cinematography affect the value we place on what we’re seeing (for example, the same interaction between two people could be perceived as positive or negative, depending on the background music).

On the more malicious end, Nanook of the North intentionally staged the narrative to portray the Inuit in a negative and subhuman light. Much of what we saw of the protagonist, Nanook (whose actual name was Allakariallak), is distorted or fabricated. Many of Flaherty’s directorial decisions played into colonial voyeurism, forcing real people to fit into a Euro-American narrative of Indigenous people.

Yet providing an “unbiased” recording of reality isn’t possible (this isn’t to justify Flaherty’s colonialist narrative—our biases can certainly be intentional and malicious). In the literary version of social science’s Observer Effect, simply bringing a camera into an environment changes the behavior of the subject being filmed. In addition, some environments can’t be filmed as they are, and so observing “truth” may require modifying the environment. Nanook of the North provides another example here, as the filmmakers rebuilt the igloo as a three-walled structure to accommodate the large cameras. Giving viewers a glimpse of the living conditions inside an igloo required modifying the igloo itself.

Much of this holds true for data storytelling. As analysts, we navigate similar decisions as documentary filmmakers. Filmmakers choose what to observe, we choose what data to analyze. Filmmakers choose their camera lens, we choose our analytic technique. Filmmakers edit out material and rearrange sequences, we choose which data to filter out and the ordering of our insights. Filmmakers choose music, images, fonts—so do we.

As analysts, we have to admit that we rebuild igloos, too.

There’s no such thing as a neutral documentary, and there’s no such thing as a neutral data story. Documentaries have been thinking about this far longer than we have, and it seems like perhaps a critical first question we can bring to the wisdom of our elder craft.

Voice-overs and narration

One of my biggest turn-offs in data storytelling is a wall of text. Like a real wall, a large body of text can prevent your audience from accessing your message, especially when used up-front. I find myself skimming the first few lines, then either skipping the rest or just closing the page altogether. Here’s one of the worst examples, from my own portfolio:

A Tableau dashboard showing a large body of text on the left and a chart on the right. The data is from ESPN.
An example of terrible use of text from my own work. The data from this work came from ESPN.

Meanwhile, just this past weekend I was watching a documentary. About two minutes in, I found myself bored and distracted as a narrator paced around a set, rambling through a tedious exposition. I turned the documentary off.

Instead I turned on Cosmos: A Spacetime Odyssey. From the moment the studio logo disappeared, I was greeted with stunning images of beautiful landscapes, paired with a tribute Carl Sagan quote. The camera zoomed in to that landscape to find the narrator, Neil deGrasse Tyson, who provided me with a 14-second introductory exposition packed with intrigue and mystery. I was hooked as Tyson’s voice-overs elegantly explained the wondrous high-definition images I witnessed.

Video for the intro to Cosmos: A Spacetime Odyssey.

For a similar mastery in data storytelling, we can look to the Guardian’s 2017 “Bussed Out,” an award-winning data journalism piece about homelessness in the U.S. Journalism is a medium that typically relies on text, but from the get-go, “Bussed Out” integrates the exposition with images and animated data visualizations.

This piece really captures the essence of masterful narration as the charts fill out while I scroll through the story of individuals and society. Words live with charts, rather than just above, below, and beside them. Any lengthy bodies of text are accompanied by formatted quotes and plenty of negative space, creating openings in what might otherwise feel like walls.

Too often I feel as if I’m given an instruction manual for a chart: Here’s what it shows, read it like this. “Bussed Out,” on the other hand, narrates my experience, seamlessly weaving evidence and explanation. We want to avoid being a tedious narrator, instead weaving visual voice-overs into our evidence.

Interviews

I recently checked out the Netflix reboot of Unsolved Mysteries. I nostalgically remember Robert Stack’s distinctive voice as he narrated the mystery from foggy alleys, dark churches, cemeteries, or staged dispatch centers. The new Unsolved Mysteries, though, lacks a narrator. Instead, interviews are spliced together, feeding us clues and emotions like a carefully calculated multi-course meal, each preparing us for the next. The trailer, also lacking a narrator, showcases their interviewing technique.

The difference is compelling: Whereas the original Unsolved Mysteries was intriguing and suspenseful, I found this reboot to be tragic and sad as I felt hints of the loss of the friends and family being interviewed. In the original Unsolved Mysteries, the narrator was a mediator, protecting me from the tragedy by turning it into suspense. When the narration comes directly from the friends and family, I feel much closer to the story, and to the raw emotion that comes with it.

Lisa Charlotte Rost discussed the inverse relationship between aggregation and empathy in her 2017 OpenVis Conf presentation, A Data Point Walks Into a Bar: Designing Data for Empathy. It’s just easier to empathize with individuals we can see or hear.

To that effect, I can still remember the first time I saw interviews paired with a data story: ProPublica’s “Losing Ground.” I stumbled across this in 2014, and I can still recall the powerful photograph of Earl Armstrong, a cattle rancher, standing in his field under a foot of water.

I can remember the sound of his voice from the short recording. Granted, I’ve returned to this image over and over—but that’s because of the powerful way ProPublica paired aggregated data with individual interviews. Leading up to my discovery of “Losing Ground” I had seen my share of charts and data stories about climate change and rising coastal waters, but this was the first one that really stuck in my mind.

Both Unsolved Mysteries and ProPublica let people tell their own story. Whenever possible, they get out from between the subjects and the audience. All too often we think of data storytelling and data humanization as “telling the stories of the people in our data,” but I’d argue that we should look for opportunities to let people tell their own story, rather than assume we have the power and right to speak on their behalf. We won’t always have that opportunity, but I think it’s a powerful device we should at least try to employ.

Reenactments

I’m a big fan of the show River Monsters: it blends my hobby of fishing with my interest in folkloristics, all tossed together with a level of excitement. The host and angler, Jeremy Wade, chases down marine monsters by following stories of bizarre and often dangerous encounters. The show alternates between interviewing and reenacting the story they are telling. For some examples, watch some of the video below (content warning: mildly graphic descriptions and images of injury and death):

Forensic Files, Unsolved Mysteries, and quite a few SyFy channel mystery shows employ this technique. Of course, it helps to inject suspense into the show—but it also provides educational value to help us understand the progression of events. Often reenactments are paired with step-by-step narratives to help our imaginations materialize the story: First this happened like this, then that happened like that, and so on.

Perhaps the most obvious parallel between reenactments in documentaries and data viz is seen in animations. A strong example comes from NYT’s “The Dangerous Flaws in Boeing’s Automated System.” This piece brilliantly combines relatively simple graphics with scrolling annotations, narrating the plane’s angle of flight so the reader can follow exactly how faulty sensors may have caused a crash. We see how sensors are supposed to work, we see the safe angles for flight and we’re given a vocabulary. Then, the animations show us what happens when a sensor provides a false reading, and the diagram makes the potential danger very clear:

Static screenshot from NYT’s “The Dangerous Flaws in Boeing’s Automated System”.

While reenactments might bring animations to mind, there are plenty of strong static “play-by-plays” out there, such as “How the Notre Dame Cathedral Fire Spread,” another example from NYT. Static images provide us a play-by-play breakdown of how the fire began and spread, and how French firefighters sought to combat the fire. Each moment provides us with a beautifully drawn cross-section paired with photographs of the fire, allowing us to connect what our eyes saw to the hidden and vulnerable architecture.

All too often my own tendency is to summarize events over time in a single line chart. For the fire above, perhaps I’d have shown some measure of fire intensity plotted at each hour. However, a reenactment-style play-by-play allows readers to understand what’s happening within each of those dots along a line.

We, and our readers, don’t experience time as a line—we live in a single dot at a time, a single moment. Reenactments—and the NYT team’s storytelling in this example—let me experience each dot, enabling me to contextualize the data and the insights into the narrative almost as if I was omnisciently witnessing the fire in person.

The few concepts I’ve outlined above are just examples. The more I dig into documentary narrative techniques, the more parallels I find to examples of excellence in data storytelling. Our mediums are different, so these parallels are never one-to-one—but the similarities are strong enough that I’m convinced we should look to the art of documentary filmmaking as our elder, with a lifetime of experience to share.

Perhaps the most poignant opportunity to see the similarities between documentaries and data storytelling comes from a quote by five-time Emmy Award-winning documentary filmmaker Ken Burns:

I realized very early on that the laws of storytelling also apply to the documentary. That instead of the documentary necessarily being didactic and educational and, you know, politically advocating, it could also just tell a story using the same expositional tools that a feature film would. And then you’ve got the possibility of moving people at that same level. And you have the added advantage of it being true. Steven Spielberg and I obey the same laws of storytelling. And the only difference is he can make shit up, and I can’t.

Love y’all.

I am a user experience researcher, a data scientist, and a public folklorist.