Creating data stories with ScrollyTeller

Ryan Shackleton

Published in

IHME Tech

10 min read · Sep 8, 2020

ScrollyTeller is a JavaScript library that streamlines the process of building scrolling data stories from tabular data

ScrollyTeller helps take chart and tabular data and convert it to scrollytelling HTML

Introduction

In the last 5+ years, the data visualization community has seen a burgeoning of data storytelling in various forms. A popular format is the “scrollytelling” data story, where scrolling is the primary user interaction; scrolling controls both the pace of the narrative text that moves across the screen and the animations within a dynamic data visualization or other relevant content. Unlike video content, users can move forwards or backwards at their own pace, and the data visualization can contain other interactions that reinforce the data story or allow further exploration of the data. From a web development perspective, the basic technical implementation of scrollytelling is relatively straightforward, and there are great tutorials on how to build one, some of which are described below. That said, there are many moving parts that must be managed: scrolling text, multiple interactive data visualizations, and the usual mobile responsiveness that is made more difficult by scrolling and “sticky” elements.

Bloomberg measles scrolly — Bloomberg’s 2015 scrollyteller on measles outbreaks

I work at the Institute for Health Metrics and Evaluation (IHME), where multiple stakeholders collaborate to create data stories about global health issues. In an organizational context, managing the many components of a data story and the technical infrastructure around them is quite complex. Subject matter experts, data scientists, graphic designers, copywriters, and many other people play a role in the creation of a data story, but few are fluent in HTML or JavaScript. As data visualization developers, it is our job to manage all of these elements while also creating complex interactive data visualizations. Because many data stories are time sensitive, streamlining all aspects of the creation process, especially the technical ones, becomes very important. Good storyboarding can make a huge difference in reducing unnecessary code changes and text edits, but modifications to all parts of a data story are common during the creation process.

The Guardian’s groundbreaking 2017 Bussed Out data story

Motivation

Like many other software libraries, ScrollyTeller was created out of frustration. When creating scrollytelling data stories, I kept running into the same problems, all centered on the management of editorial text and code.

Too much HTML boilerplate

First, I found that scrollytelling data stories can require a lot of very brittle, boilerplate code and HTML. Here’s an example of some HTML boilerplate you might need to create three text elements that scroll past your chart. There’s a lot of repetition here: each <div> element representing some narrative text (a “step” in scrolly parlance) differs only by the paragraph content and the index of the step, making this ripe for some templating.

Too much JavaScript boilerplate

Another example of brittle, boilerplate code in scrolling stories is the event handling code that links scroll events to actions in the data visualizations. Take Jim Vallandingham’s excellent how-to article describing how to build a scrolling data story. While the article is exceptional and very easy to understand, it’s easy to see how slight changes to the story can result in a lot of code change. I’ve modified Jim’s code slightly below, but the approach is still the same: defining an array of functions to respond to each scroll step.

What if we wanted to switch the order of steps 2 and 3? In that case, we would need to change the HTML above to re-order the steps, then ensure that the new indices of steps 2 and 3 correspond to the functions we want to trigger. In longer data stories with much more complex edits, code changes like this become unwieldy and incredibly error prone. Furthermore, different developers might use different methods, such as hash mapping instead of array indices, to link “steps” with chart events, making collaboration between developers even more difficult.
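To make the fragility concrete, here is a minimal sketch of the index-based pattern. It is not Jim’s code: the handler names, the log array, and the onStepEnter() helper are all hypothetical and exist only for illustration.

```javascript
// Hypothetical chart actions; in a real story these would animate a visualization.
const log = [];
const showTitle = () => log.push('title');
const showBarChart = () => log.push('bars');
const showHistogram = () => log.push('histogram');

// Each array position must match the index of a step in the HTML.
const activateFunctions = [showTitle, showBarChart, showHistogram];

// Called by the scroll handler with the index of the step now in view.
function onStepEnter(stepIndex) {
  const activate = activateFunctions[stepIndex];
  if (activate) activate();
}

// Simulate the second step scrolling into view.
onStepEnter(1);
```

Swapping two steps in the HTML means swapping the matching entries in activateFunctions as well, and nothing in the code warns you if the two drift apart.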

Too many text changes

Another frustration is that, as a data visualization developer in a large organization, I find myself responsible for making many text changes to the narrative part of the story. Those text changes may come from copywriters, researchers, designers, or my own minor changes to update text styles. While the collaborative aspects of the editorial process are incredibly rewarding, modifying editorial text in code is tedious and places me (the web developer) between the editor and the copy they are working on. That “editorial friction” can slow the process down and stifle the collaborative creative process that makes visual stories possible.

Office Space meme: could you come in on Sunday to make text changes? — Thaaaanks Peter

ScrollyTeller features and use cases

So how does ScrollyTeller help fix these issues? Let’s talk about a few key features that alleviate these problems, and a few more that help multiple developers work together on a story. The use cases below are just brief overviews that highlight key features of the library. For a much more in-depth look, see the ScrollyTeller tutorial, which serves as both a template for creating scrolling data stories and a tutorial for web developers on how to use the library.

The scrolling data story that serves as a tutorial for how to use ScrollyTeller to create a scrollytelling data story.

Reducing HTML boilerplate

ScrollyTeller’s first functionality was to use JavaScript to template HTML text boxes that scroll in a data story using data from .csv files, Google/Excel spreadsheets, or database tables. Each row in the data table represents one scrolling text box, with various text fields to populate a title, paragraph, and some link text. This relatively simple functionality reduces the boilerplate HTML that data visualization developers need to manage and puts editorial power back in the hands of copywriters because almost anyone can edit a data table.
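The idea of templating text boxes from rows can be sketched in a few lines. This is not ScrollyTeller’s implementation: the column names (id, title, paragraph, linkText, href), the markup, and the helper functions are assumptions made for illustration, and the row text is placeholder copy.

```javascript
// Hypothetical rows, as they might arrive from a .csv file or spreadsheet.
const rows = [
  { id: 'step-one', title: 'Step one', paragraph: 'First block of narrative text.', linkText: '', href: '' },
  { id: 'step-two', title: 'Step two', paragraph: 'Second block of narrative text.', linkText: 'Read more', href: 'https://example.com' },
];

// Escapes characters that would otherwise be interpreted as HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Builds the markup for one scrolling text box from one row.
function renderStep(row, index) {
  const link = row.href
    ? `  <a href="${escapeHtml(row.href)}">${escapeHtml(row.linkText)}</a>`
    : null;
  return [
    `<div class="step" id="${escapeHtml(row.id)}" data-index="${index}">`,
    `  <h2>${escapeHtml(row.title)}</h2>`,
    `  <p>${escapeHtml(row.paragraph)}</p>`,
    link,
    '</div>',
  ].filter(Boolean).join('\n');
}

// One text box per row, in row order.
const html = rows.map(renderStep).join('\n');
```

Because the order and content of the boxes come entirely from the rows, reordering the story becomes a matter of reordering the table.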

Animation from the ScrollyTeller tutorial showing which columns in a data table correspond with each HTML element in a scrolling text box

Reducing text edits and ceding editorial changes to copywriters

When creating the Tobacco Control and Child Mortality visualizations, we configured a development version of our data story to fetch data from a shared Google Spreadsheet using the Google Spreadsheets API. The data visualization team set up the basic structure of the spreadsheets from storyboards that outlined the basic story flow, then we shared the spreadsheets with our editorial teams. Copywriters and other stakeholders could then update the text in the Google Spreadsheet and reload the visualization to see the changes in real time, significantly reducing the burden of making minor text edits in code. Giving editorial teams control over the text also allowed them to feel much more in control of the story, and propose new changes to the story by adding new rows to the spreadsheet. The data visualization team could then add the relevant events and chart animations to complete the story.

A Google Spreadsheet containing the tabular data for a scrolling data story.

Facilitating easier text translation

At IHME, many of our web tools are used by policymakers around the world, so translating text and localizing our applications is an important, if very time-consuming, task. While creating the Tobacco Control visualization, we found that translating the application into several languages was much easier with the text components of the visualization already in tabular form. We were able to share the spreadsheets containing the story “narration” as-is with translators, who returned them translated and still in our data format (we also hid the columns that didn’t need to be translated using the “hide columns” functionality in Google Spreadsheets). Upon receiving the translated text, we only needed to add a language column to our data tables and fetch data by language, and our visualization was translated without any major code changes (other than adding the dropdown to change language).
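As a rough illustration of the “language column” approach, the sketch below filters narration rows by a language code. The row shape, the language codes, and the rowsForLanguage() helper are hypothetical; the real story fetched its data by language rather than filtering in memory.

```javascript
// Hypothetical narration rows with an added `language` column.
const narrationRows = [
  { id: 'intro', language: 'en', title: 'Introduction' },
  { id: 'intro', language: 'es', title: 'Introducción' },
  { id: 'trends', language: 'en', title: 'Trends' },
  { id: 'trends', language: 'es', title: 'Tendencias' },
];

// Returns only the rows for the language picked in the dropdown.
function rowsForLanguage(rows, language) {
  return rows.filter((row) => row.language === language);
}

const spanishRows = rowsForLanguage(narrationRows, 'es');
```

The row ids stay the same across languages, so the chart events attached to each row do not need to change when the text does.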

The Tobacco Control visualization: available in English and Spanish

Standardized event handling

Creating the interactive data visualizations (charts) that underpin many data stories is complicated enough, so it’s nice to have a framework in place to handle the scroll events that trigger animations in the visualizations. There are many libraries that handle scroll events; ScrollyTeller relies heavily on the lightweight Scrollama library, written by Russell Goldenberg, to detect scroll events and scroll progress.

Each scrolling text box corresponds to a chart view

Like Scrollama, ScrollyTeller has several event handling functions that are triggered when a block of text scrolls to a predefined percentage of the page height. ScrollyTeller also provides a framework for triggering chart events: each narrative text block contains a trigger field that can contain a JSON string. The JSON string is parsed into a JavaScript object and passed to an event handler to trigger changes to the chart. Below is an example of two rows representing narrative text with their trigger field shown. In this case, there is just a year property that changes from 1950 to 2008 between the two rows.

Tabular data showing the title and trigger JSON for the data story below
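In code, the two rows and their trigger parsing might look like the sketch below. The row titles and the parseTrigger() helper are hypothetical, and falling back to an empty object on invalid JSON is a choice made here, not necessarily what the library does.

```javascript
// Two narration rows; each trigger is stored as a JSON string in the table.
const narrationRows = [
  { title: 'First view', trigger: '{"year": 1950}' },
  { title: 'Second view', trigger: '{"year": 2008}' },
];

// Parses a trigger string, falling back to an empty object when it is blank or invalid.
function parseTrigger(trigger) {
  if (typeof trigger !== 'string' || trigger.trim() === '') return {};
  try {
    return JSON.parse(trigger);
  } catch (error) {
    console.warn(`Ignoring invalid trigger JSON: ${trigger}`);
    return {};
  }
}

// The `year` property is what an event handler would receive for each row.
const years = narrationRows.map((row) => parseTrigger(row.trigger).year);
```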

We can configure a chart component to update when new state is passed to it, in this case, in the form of the year property that transitions from the 1950 data to the 2008 data.

A bubble chart component that transitions between years when a year property is passed to it from the JSON trigger.

The code snippet below summarizes how the event handling functionality works in ScrollyTeller. Here we are using ES6 destructuring to pick properties from an object passed to onActivateNarrationFunction(). In this case, we pick the year property from the state variable and pass it to our graph’s render function. The ScrollyTeller configuration also provides a function (buildGraphFunction()) to build a data visualization component (a graph) and stores a reference to the component as a graph property for use in the event handler. Notice that the state variable could contain any JSON-parseable entity: arrays, strings, numbers, or even nested objects.
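The original snippet is not reproduced here, so the following is a minimal sketch of the pattern just described. Only the names onActivateNarrationFunction and buildGraphFunction come from the text above; the BubbleChart class, the exact argument shapes (in particular, where the graph reference is exposed), and the manual calls that stand in for the library are assumptions. See the ScrollyTeller repo for the real signatures.

```javascript
// Stand-in for a real chart component (hypothetical).
class BubbleChart {
  constructor(graphId) {
    this.graphId = graphId;
    this.year = null;
  }

  // A real chart would transition its bubbles here; this one just records the year.
  render({ year }) {
    this.year = year;
  }
}

const sectionConfig = {
  // Builds the chart; the library is described as keeping the result as `graph`.
  buildGraphFunction(graphId) {
    return new BubbleChart(graphId);
  },

  // Destructures `year` from the state and hands it to the chart.
  onActivateNarrationFunction({ state: { year }, graph }) {
    graph.render({ year });
  },
};

// Simulate what the library would do as the two text boxes scroll into view.
const graph = sectionConfig.buildGraphFunction('bubble-chart');
sectionConfig.onActivateNarrationFunction({ state: { year: 1950 }, graph });
sectionConfig.onActivateNarrationFunction({ state: { year: 2008 }, graph });
```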

In practice, we find that passing JSON to chart functions makes the process of modifying and updating charts highly flexible, so responding to requests to change the story usually doesn’t require code changes at all. Story changes might just mean modifying or adding a property to JSON in a Google Spreadsheet. This means that during collaborative meetings, we can sometimes make changes in real time without having to redeploy code to see the results.

It’s worth noting here that like many scroll based libraries, ScrollyTeller provides a progress property to an onScroll() function for smoothly transitioning between chart states as text scrolls up and down the page. We find that animations that link directly to scroll progress are much more satisfying and allow much finer control over chart interactions.
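One way to link a chart directly to scroll progress is to interpolate between two states. The sketch below assumes progress is a number between 0 and 1 and that the handler receives it alongside a graph reference; the handler shape, the interpolateYear() helper, and the stand-in graph object are hypothetical.

```javascript
// Maps scroll progress (0 to 1) onto a year between two endpoints.
function interpolateYear(startYear, endYear, progress) {
  // Clamp so that overshooting scroll values never leave the range.
  const clamped = Math.min(1, Math.max(0, progress));
  return Math.round(startYear + (endYear - startYear) * clamped);
}

// Stand-in chart that records the last year it was asked to draw.
const graph = {
  year: null,
  render({ year }) {
    this.year = year;
  },
};

// Hypothetical scroll handler: redraws the chart as the text box moves.
function onScroll({ progress, graph: chart }) {
  chart.render({ year: interpolateYear(1950, 2008, progress) });
}

// Halfway through the text box, the chart sits halfway between the two years.
onScroll({ progress: 0.5, graph });
```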

Management of trigger “state”

With increasingly complicated components come increasingly complicated JSON triggers. Sometimes we might need three or four properties to control chart styles, series visibility, highlighted data points, or pre-specified ranges of data. To maintain consistent state when scrolling forwards or backwards, we would have to store all of the state as JSON in the trigger field of every data row. This turns out to be very verbose and difficult to maintain, so we added functionality to ScrollyTeller that accumulates trigger properties from the top down.

In the table above, the series of JSON triggers is shown in the center column, with the merged state property that ScrollyTeller passes to the event handlers. Notice that the state at each row consists of that row’s trigger merged with the accumulated triggers of the rows above it, so scrolling backwards removes or updates state properties accordingly.
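The accumulation can be sketched as a running merge over the rows. The trigger values and the accumulateStates() helper are hypothetical, and the shallow merge used here is an assumption; the library’s own merge strategy may differ, for example for nested objects.

```javascript
// Hypothetical triggers, one per narration row, already parsed from JSON.
const triggers = [
  { year: 1950 },
  { highlight: 'series-a' },
  { year: 2008 },
];

// Computes the state for every row by merging each trigger into the state above it.
function accumulateStates(triggerList) {
  const states = [];
  let merged = {};
  for (const trigger of triggerList) {
    // Later properties override earlier ones; untouched properties carry forward.
    merged = { ...merged, ...trigger };
    states.push(merged);
  }
  return states;
}

const states = accumulateStates(triggers);
// states[0] has only `year`; states[2] has the updated `year` plus the earlier `highlight`.
```

Because every row gets its own precomputed state, scrolling back to the first row simply hands states[0] to the event handler, and the highlight added further down disappears.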

Other useful features: sections

ScrollyTeller configuration is grouped by sections, so each section can have its own data, chart components, and state. We found this feature to be very useful when multiple developers collaborate quickly on the same project. Adding a new chart usually just means importing the relevant configuration and adding it to a JavaScript object that ScrollyTeller receives upon instantiation.
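A sketch of what a sectioned configuration could look like follows. The property names (sectionIdentifier, narration, appContainerId, sectionList), the file paths, and the assertUniqueSections() guard are assumptions for illustration; check the ScrollyTeller repo for the actual configuration shape.

```javascript
// Hypothetical section configs; in a real project each would be imported from its own module.
const introSection = { sectionIdentifier: 'intro', narration: 'data/intro/narration.csv' };
const bubbleSection = { sectionIdentifier: 'bubbles', narration: 'data/bubbles/narration.csv' };

// Guards against two developers accidentally reusing a section identifier.
function assertUniqueSections(sections) {
  const seen = new Set();
  for (const { sectionIdentifier } of sections) {
    if (seen.has(sectionIdentifier)) {
      throw new Error(`Duplicate section: ${sectionIdentifier}`);
    }
    seen.add(sectionIdentifier);
  }
  return sections;
}

// The single object handed to the library when the story is instantiated.
const storyConfig = {
  appContainerId: 'app',
  sectionList: assertUniqueSections([introSection, bubbleSection]),
};
```

With this layout, adding a chart is one new entry in the list, which keeps merge conflicts between developers small.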

Other useful features: standardized control flow

ScrollyTeller configuration also standardizes the control flow associated with data stories, providing asynchronous functions to fetch narrative text tables and chart data, modify the chart data, then build the chart when the application loads. The data and graph in each section are stored as properties and are accessible as arguments in all of the event handlers.
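The load-time flow described above (fetch, reshape, build) can be sketched with plain promises. Everything here is hypothetical: the function names on the section object, the loadSection() helper, and the made-up sample rows are stand-ins, not the library’s API or real data.

```javascript
// Hypothetical section config; the fetchers return promises like a real network call would.
const exampleSection = {
  fetchNarration: async () => [{ title: 'First view', trigger: '{"year": 1950}' }],
  fetchData: async () => [{ year: 1950, value: '3' }, { year: 2008, value: '7' }],
  // Converts string values to numbers before the chart sees them.
  reshapeData: (rows) => rows.map((row) => ({ ...row, value: Number(row.value) })),
  // Stand-in for building a chart component from the reshaped data.
  buildGraph: (data) => ({ points: data.length }),
};

async function loadSection(section) {
  // Narration and chart data do not depend on each other, so fetch them in parallel.
  const [narration, rawData] = await Promise.all([
    section.fetchNarration(),
    section.fetchData(),
  ]);
  const data = section.reshapeData(rawData);
  const graph = section.buildGraph(data);
  // Keep everything together so event handlers can reach it later.
  return { narration, data, graph };
}

loadSection(exampleSection).then(({ graph }) => {
  console.log(`Built a graph with ${graph.points} points`);
});
```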

Other useful features: scroll tracking with Google Analytics

ScrollyTeller also provides configurable tracking of scroll events in Google Analytics at the section level, or even down to the level of each narrative text block. We found this feature useful for determining how far users progressed in a data story, where we “lost” them, and where users spent the most time.

Wrapping up

I hope this provides a high-level view of ScrollyTeller and how it can be used to make creating scrolling data stories much smoother, more collaborative, and more flexible. To get started, I highly recommend checking out the ScrollyTeller repo, which can serve as a template project for your own data story, and going through the ScrollyTeller tutorial in detail to understand how to set up a project.

I would like to acknowledge IHME for supporting this project and the developers on IHME’s Data Visualization team who contributed to ScrollyTeller and the data stories described in this article, including Evan Laurie, Katherine Beame, David Schneider, Komal Ali, Michael Fernandes, and Ben Hurst. Please don’t hesitate to get in touch if you want to contribute to this project or are interested in working with our team. I would also like to acknowledge the many other developers upon whose open source software ScrollyTeller is based, especially Russell Goldenberg’s Scrollama library that provides the scroll event handling functionality that underpins ScrollyTeller.

Best of luck telling great data stories!