Mapping Media Attention Span

Schema Design Studio

Published in

Schema Design

14 min read · Feb 15, 2019

The Lifespan of News Stories

Explore the visualization

1. Overview

Although the information age has brought an abundance of information into our daily lives, it has not done so without a cost: a scarcity of attention. Businesses and news outlets are constantly fighting for our attention with an endless stream of content. As citizens who want to be well informed, how do we cope with this overload of information?

“The Lifespan of News Stories” is a data visualization article that uses Google Trends data to explore how news events develop over time (Simon Rogers has written an explanation of what Google Trends data is and what it means). The article describes how recurring patterns in Google Trends data offer insight into how we consume news events. This project was a collaboration between Schema Design and Google Trends, with Art Direction by Alberto Cairo and stories selected by Axios.

https://vimeo.com/315759626

In this Medium article, we thoroughly illustrate Schema’s process in developing the project. We hope you enjoy reading about the project as much as we enjoyed working on it!

2. Our Process

At Schema, our process is collaborative and hands-on. We engage deeply with our clients and collaborators. We emphasize dialogue and prototyping as methods for inquiry, discovery, and validation.

Our projects begin by exploring the space and defining the problem (Discovery), which informs a design hypothesis and a round of brainstorming solutions (Concept Design). Next, we create tangible solutions that can be evaluated (Prototype/Evaluate) before implementation and testing (Build).

We believe that design is iterative — an ongoing process of creating and learning. We value ongoing relationships with our clients, where we combine expertise in the pursuit of knowledge and the best possible outcomes.

3. Discovery

Defining the problem and exploring Google Trends data

To start, we used the Google Trends web tool to quickly visualize search interest in news events. This exercise exposed patterns in how these events surfaced, peaked, and faded away. From observing the development of search interest in an event over time, we suspected that a sudden increase in event search interest was triggered by the publication of one or more articles from different news channels. Typically, the resulting spike of intense search interest completely fades away in (approximately) one week; alternatively, it could again intensify if there is a second trigger within the same event (as illustrated in the Hurricane Irma example, below).

The shape of a news event

Finally, we were able to categorize the event type by looking at the shape of the graph. From the shape analysis, we created three groups (visual examples below):

Unexpected events: search interest grows fast and fades away slowly
Expected events: search interest grows slowly and fades away fast
Overlapping events: events with two or more follow-up events that trigger spikes

Unexpected events: Las Vegas Shooting

Expected events: Solar Eclipse

Overlapping events: Hurricane Irma
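
One rough way to make these three groups concrete is to count the large peaks in a series and compare how long interest stays elevated before and after the main peak. The sketch below is only an illustration: the function name and the 10% and 50% cutoffs are assumptions, not the rule used in the project.

```python
def classify_shape(series, floor=0.1, second_peak=0.5):
    """Label a daily search-interest series as 'unexpected', 'expected', or 'overlapping'.

    `floor` and `second_peak` are fractions of the main peak (illustrative values).
    """
    peak = max(series)
    peak_idx = series.index(peak)
    # Local maxima that reach at least `second_peak` of the main peak.
    big_peaks = [
        i for i in range(1, len(series) - 1)
        if series[i] >= second_peak * peak
        and series[i] > series[i - 1]
        and series[i] >= series[i + 1]
    ]
    if len(big_peaks) >= 2:
        return "overlapping"
    # Days with elevated interest before and after the peak.
    rise = sum(1 for v in series[:peak_idx] if v >= floor * peak)
    fall = sum(1 for v in series[peak_idx + 1:] if v >= floor * peak)
    return "unexpected" if rise < fall else "expected"
```

A series that jumps from nothing to its peak and then decays over several days is labeled "unexpected"; one that builds up for days and drops off right after the peak is labeled "expected".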

How time and geography play out

During our analysis, it became clear that the window of time in which an event was framed was critical to understanding its path. A window of two months (30 days pre and 30 days post event) most adequately captured the complete development of search interest in all event types, especially those with multiple peaks and valleys. Our curiosity was piqued by the change of engagement over time, especially when considering the additional factor of geography.

This relationship is illustrated with the examples of the Solar Eclipse and Hurricane Irma. As seen in the animations below, search interest in the Solar Eclipse is concentrated (as one might expect) in the states most heavily affected by it. Similarly, search interest in Hurricane Irma peaks in Florida, where the hurricane moves into the US (and, subsequently, into the dataset) and then up the panhandle.

Solar Eclipse vs. Hurricane Irma

4. Concept Design

Creating a hypothesis

Equipped with information from the Discovery phase, we then began the Concept Design phase: developing a hypothesis and brainstorming possible solutions. From our analysis of the Google Trends data in Discovery, we were able to better understand how news events develop over time and raise questions to help us create a story flow.

Initial Hypothesis

We started with the assumption that we could infer the “attention span” of a news event (e.g., Trump leaving the Paris climate accord) by measuring the length of a tail on a graph of search interest in a related topic over time (e.g., Environment). The tail starting point would occur immediately after media channels either stopped or reduced coverage on that story. The idea behind this hypothesis is that search interest in a topic will always revert to the original trend after the coverage of that story ends, leaving a tail of search interest. The “attention span” of a news event is, then, how quickly that spike returns to the original trend.
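
To make this hypothesis concrete, the “attention span” could be measured as the number of days between the peak and the first day on which interest is back near its pre-event level. The sketch below assumes a flat baseline (the mean of the first few days of the series) and a tolerance expressed as a fraction of the peak-to-baseline gap. Both are hypothetical simplifications, not the project’s actual measure.

```python
from statistics import mean

def attention_span(series, baseline_days=7, tolerance=0.1):
    """Days from the peak until interest is back near the pre-event baseline.

    Returns None if interest never returns to the baseline within the series.
    """
    baseline = mean(series[:baseline_days])
    peak = max(series)
    peak_idx = series.index(peak)
    # Interest counts as "back to trend" once it is at or below this level.
    cutoff = baseline + tolerance * (peak - baseline)
    for offset, value in enumerate(series[peak_idx + 1:], start=1):
        if value <= cutoff:
            return offset
    return None
```

A longer return value means a longer tail, which under this hypothesis corresponds to a longer attention span.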

What triggers a spike

Triggers instigate alterations in the normal search interest trend of a news event. In other words, triggers are moments when several media outlets produce articles around a particular event that immediately attract the public’s attention. The question then becomes, how can we identify these triggers in the data? To begin to answer this, we defined three characteristics that could both describe and differentiate a non-trigger from a relevant trigger:

Intensity — Google Trends’ measurement of the peak of the spike relative to the trend
Duration — How long this event has been in the news
News coverage — How many news sources covered this event

Once we developed these guidelines for selecting meaningful news events, we brainstormed research questions that we would want to ask the data, and organized them into four categories:

Events
How do news events differ in intensity, duration, and attention span?
Topics
What is the attention span of different news events on the same topic? (e.g., for the Police Brutality topic, how do we compare different stories, such as the Trayvon Martin Shooting, Ferguson, and Baton Rouge?)
Regions
How does the media attention span differ by region? What topics keep more people’s attention in each region?
Popularity
Do more popular news events (events with a higher overall search interest trend and more intense spikes) have a longer attention span than less popular stories? Or do popular stories intrinsically have a short attention span?

With these core ideas in mind, we developed concept designs for three visualizations (shown below) that would capture these aspects:

How the event search interest develops over time
How the event search interest compares to that of news events in the same topic
How the event search interest varies by region

5. Prototype

Visualizing the data and verifying our hypothesis

In the Prototype phase, we verify our hypothesis by creating designs and plotting data. Typically, we work through three levels of prototype fidelity (lo-, mid-, and hi-fidelity), as each level helps us answer a different question.

Lo-Fi Designs

Working with low-fidelity prototypes is a good way to quickly verify whether a potential solution will work or deserves further development. Our Lo-Fi designs consisted of quick mock-ups to visualize our assumptions. At this phase, we tend to let our ideas diverge further to explore new possibilities, so that we can later converge knowing that we have explored enough solutions and are focusing on the best direction.

Visualizing the data

Our explorations began with converting a set of sample search terms into area graphs to emphasize the various “shapes” created by each trend. Once we collected various lengths and patterns, we experimented with different comparison methods, as well as with how these patterns could translate to a location-based format.

Verifying our hypothesis

As our collection of sample shapes grew, we noticed that similarly shaped events often had a clear topical or categorical connection. This led us to explore different ways of grouping topics, examining which methods of story categorization were most effective at yielding similar shapes.

From this observation, we additionally hypothesized that one can identify and categorize different types of events by their shape when plotted as search interest over time. This was in addition to our original assumption: that search interest in a news event follows a linear trend with occasional spikes caused by triggers (e.g., a news article related to that event), that these triggers alter the normal trajectory of the trend for a certain time period, and that at the end of that period search interest tends to revert to its original trend.

Mid-Fi Designs

During the Mid-Fi Designs phase, we start converging our ideas and laying the foundation for more detailed design decisions. For this project, we built the initial prototypes in D3.js to further validate our assumptions and visualize the final data, as seen in the charts below.

Hi-Fi Designs

In the Hi-Fi designs phase, we refined the visuals and designed interactions, such as animations and tooltips. During this time, we typically have closer conversations with the development team and fully define the steps to build the visualizations.

6. Build

The Build phase is an iterative process of learning from the data, reevaluating the designs, and defining a consistent methodology. In this phase, both the development team and the design team work together to develop a methodology for data processing and data visualization. Additionally, at this point in the process, design decisions can be reevaluated based on new information from the data.

Data Processing

The Build phase began with a deeper dive into the actual ‘behind-the-scenes’ data that supports the visualizations. Our partners at Axios assembled a list of relevant news events from the past year; we then mapped each event to a relevant search term, and quantified search interest in both the lead-up to and the aftermath of the event. To do this, we used the Google Trends API to return the search trend across a 60-day window, centered at the event’s peak search date.
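
The windowing step can be sketched as follows, assuming the year of daily search interest for a term has already been fetched into a dictionary keyed by date. The API request itself is not shown, and the function name is hypothetical.

```python
from datetime import date, timedelta

def window_around_peak(daily_interest, half_width_days=30):
    """Return (peak_date, start, end) for a window centered on the peak date.

    `daily_interest` maps datetime.date -> search interest for one term.
    """
    peak_date = max(daily_interest, key=daily_interest.get)
    delta = timedelta(days=half_width_days)
    return peak_date, peak_date - delta, peak_date + delta

# Made-up data: flat interest all year with a single peak on October 10.
year = {date(2018, 1, 1) + timedelta(days=i): 1 for i in range(365)}
year[date(2018, 10, 10)] = 100
peak, start, end = window_around_peak(year)
```

With a half width of 30 days, the window spans the 60 days surrounding the peak.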

The Google Trends API returns normalized results on a scale of 0 to 100, and limits the maximum number of comparison terms that can be used at one time. Since we wanted to compare search interest across ~50 events, each of which occurred at different times throughout the year, we needed a workaround. We decided to compare each event against a common, relatively stable search term: “Google News”. This way, each set of event search interest results was expressed as a proportion of the same comparison term, ensuring a fair comparison across events.
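
A minimal sketch of this baseline comparison, assuming the event term and “Google News” were returned by the same request (so both share one 0 to 100 scale) and cover the same dates. The function name and the handling of zero-baseline days are assumptions.

```python
def relative_to_baseline(event_scores, baseline_scores):
    """Express each day's event score as a multiple of the baseline term's score."""
    if len(event_scores) != len(baseline_scores):
        raise ValueError("series must cover the same dates")
    ratios = []
    for event, baseline in zip(event_scores, baseline_scores):
        # A zero baseline leaves the ratio undefined, so record None for that day.
        ratios.append(event / baseline if baseline else None)
    return ratios
```

Because every event is divided by the same reference term, the resulting ratios can be compared across events even though each request was normalized separately.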

Next, we needed a way to quickly visualize the search interest trend results. We used Python and Jupyter Notebook to quickly plot each event’s search results by date (illustrated below).

It was immediately apparent that there was considerable variation in the degree of search interest across events, despite the shared baseline (“Google News”). This was not a complete surprise, since it seemed reasonable that there could be an extremely large difference between the number of searches for an event like “Hurricane Michael” and the number for, say, the “Space X launch.” Nevertheless, since our intent was to show the rise and fall of search interest surrounding each event, we employed a logarithmic scale to better compare the shapes of search interest patterns across events.

We also observed instances in which there were secondary search interest peaks outside of the main peak. It was hard to determine whether this was due to search interest related to the same event, or if that search term became associated with a separate event that occurred around the same time period. We attempted to identify the latter cases by setting a minimum threshold around the main peak. Moving away from the main peak in either direction (meaning forwards or backwards in time), once search interest dropped below our threshold, we zeroed out the remaining dates in that direction. This approach helped us preserve search interest specifically pertaining to the event in question.

Article Selection by Axios

Once we had our search trend results, the next step was to obtain metadata from the news article pertaining to each event. Each event was associated with a specific news article at Axios; thankfully, Axios uses similar HTML formatting across all articles. This made it easy to scrape and store the relevant web content (e.g., Authors, Title, Image URL, etc.) from each article. We used this information to populate event cards for the final visualization.
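
As an illustration of this kind of scraping, the sketch below pulls a title, image URL, and author out of a page’s meta tags using Python’s standard library. It assumes the page exposes Open Graph tags (og:title, og:image) and an author meta tag. That is a common convention, but it is an assumption here and not a description of the markup Axios actually uses.

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collect the content of <meta property="..."> and <meta name="..."> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name")
        if key and attrs.get("content") is not None:
            self.meta[key] = attrs["content"]

def extract_card_fields(html_text):
    """Return the fields needed for an event card (None where a tag is missing)."""
    parser = MetaTagParser()
    parser.feed(html_text)
    return {
        "title": parser.meta.get("og:title"),
        "image_url": parser.meta.get("og:image"),
        "author": parser.meta.get("author"),
    }
```

When every article follows the same template, one small parser like this can be reused across the whole list of events.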

Methodology

This visualization depicts United States search metrics based on a set of 2018 news events, curated by Axios. Each event was represented by a unique search term (or terms). The Google Trends API was used to obtain the date when search interest peaked for each event throughout the year.

A separate Google Trends API request, using the same search term(s), was used to quantify the evolution of search interest across a 60-day window surrounding each event, centered at the event’s peak date. To ensure a fair comparison across terms, the search interest for each event is expressed as a proportion of the search interest for a standard baseline search term during the same time period (in this case, “Google News”). For example, a search interest score of 3 on March 12th would mean that, on that day, search interest for that term was three times the search interest for “Google News”.

For each event, we filtered the search interest timeline by zeroing out search scores in either tail (i.e., moving backward and forwards in time from the peak date) once the search score at a given date dropped below a predefined threshold (search interest score of 0.15).
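
The filtering step can be sketched as follows. The 0.15 threshold comes from the description above; the function name is hypothetical, and the sketch assumes that the first day to fall below the threshold is itself zeroed out along with everything beyond it.

```python
def trim_tails(scores, threshold=0.15):
    """Zero out everything from the first sub-threshold day outward, on each side of the peak."""
    trimmed = list(scores)
    peak_idx = max(range(len(scores)), key=scores.__getitem__)
    # Walk forward in time from the peak.
    for i in range(peak_idx + 1, len(scores)):
        if scores[i] < threshold:
            trimmed[i:] = [0] * (len(scores) - i)
            break
    # Walk backward in time from the peak.
    for i in range(peak_idx - 1, -1, -1):
        if scores[i] < threshold:
            trimmed[: i + 1] = [0] * (i + 1)
            break
    return trimmed
```

Note that a later bump in interest (such as the 0.6 in the test data) is removed once the series has already dipped below the threshold, which is how secondary peaks from unrelated events get dropped.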

Regional results were obtained using the same methodology across a 16-day window surrounding each event. Search interest for each event was then expressed as the proportion of search interest for the standard baseline search term (“Google News”) for the same designated market area. To reduce variability in metro regions, the score for each day was computed as the average search interest across a 5-day window, centered on that day.
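
The 5-day centered average can be sketched like this. How the first and last two days were handled is not stated, so the shrinking window at the edges is an assumption.

```python
def centered_average(scores, window=5):
    """Average each day with its neighbors in a centered window, shrinking at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed
```

A one-day spike of 10 in an otherwise empty five-day series becomes a value of 2.0 on the spike day, with the remainder spread over the neighboring days, which damps day-to-day noise in small metro regions.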

Visualizations

Timeline

The timeline animation was achieved by updating the x-axis domain during each animation frame. This way, only a subset of the entire year of data was displayed in the graph at any one time. For each animation frame, area paths were recalculated so that they coincided with the x-axis domain updates. Area paths outside the current date domain were excluded from calculations.
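
The logic of the sliding domain can be sketched independently of the charting library. The Python below is a stand-in for the browser code, not a port of it; the function names, the 60-day visible window, and the one-day-per-frame step are all assumptions made for illustration.

```python
from datetime import date, timedelta

def frame_domain(year_start, frame, days_per_frame=1, domain_days=60):
    """Return the (start, end) x-axis domain shown on a given animation frame."""
    start = year_start + timedelta(days=frame * days_per_frame)
    return start, start + timedelta(days=domain_days)

def visible_events(events, domain):
    """Keep only events whose date range overlaps the current domain."""
    start, end = domain
    return [name for name, (first, last) in events.items()
            if first <= end and last >= start]

# Hypothetical events, each with the first and last date of its area path.
events = {
    "event_a": (date(2018, 1, 5), date(2018, 2, 4)),
    "event_b": (date(2018, 6, 1), date(2018, 7, 1)),
}
domain = frame_domain(date(2018, 1, 1), frame=10)
```

Only the events returned by visible_events would need their area paths recalculated for that frame.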

https://vimeo.com/315759528

Regional Map

To achieve the regional chart animation, the chart needed to update its data display for each tick of the piece’s internal timer. Each day required its own aggregate set of search data, grouped by metro region. For each passing day in the piece’s internal timer, the regional map would update with that day’s set of data. To simulate fluid changes of the data between each day, a cubic easing transition was used with a duration tuned to the same speed as the piece’s internal timer.
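
For readers unfamiliar with easing, the sketch below shows one common cubic easing curve (ease-in-out) and how it blends a region’s value between two consecutive days. Which cubic variant the piece used is not stated, so the in-out form is an assumption, and the function names are hypothetical.

```python
def ease_cubic_in_out(t):
    """Cubic ease-in-out for t in [0, 1]: slow start, fast middle, slow end."""
    t = min(max(t, 0.0), 1.0)
    if t < 0.5:
        return 4 * t ** 3
    return 1 - ((-2 * t + 2) ** 3) / 2

def interpolate(previous, current, t):
    """Blend yesterday's value into today's as the timer progresses from 0 to 1."""
    return previous + (current - previous) * ease_cubic_in_out(t)
```

If the transition duration matches the length of one timer tick, each day’s values finish easing in just as the next day’s data arrives.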

https://vimeo.com/315759567

Time Scrubber

Both the timeline visualization and regional chart animation are dependent on the piece’s internal timer. One of the greatest challenges of the piece was keeping the two charts’ data displays in sync with the internal timer. The charts animate with each passing day of the timer, and also transition fluidly when the time scrubber is manipulated.

Topic Chart

To display each term’s search interest area on the same scale, each term’s data was normalized to a maximum of 100. Search interest areas were then overlaid on the same chart with peaks stacked on top of one another, which allowed for shape comparisons. To achieve this, search term dates were recalculated as the number of days before or after the peak date. Each chart used an x-domain of -15 to 15 days.
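
The two transformations described here (rescaling to a maximum of 100 and re-keying dates as offsets from the peak) can be sketched as follows. The function name and the input format, a list of (date, score) pairs with a positive peak, are assumptions.

```python
from datetime import date, timedelta

def align_to_peak(dated_scores, half_width=15):
    """Rescale a series to a maximum of 100 and key it by days from the peak date."""
    peak_date, peak = max(dated_scores, key=lambda pair: pair[1])
    aligned = {}
    for day, score in dated_scores:
        offset = (day - peak_date).days
        # Keep only points inside the chart's x-domain.
        if -half_width <= offset <= half_width:
            aligned[offset] = 100 * score / peak
    return aligned

# Made-up series peaking on March 12.
series = [(date(2018, 3, 10) + timedelta(days=i), s)
          for i, s in enumerate([1, 2, 4, 2, 1])]
aligned = align_to_peak(series)
```

After this step every term peaks at 100 on day 0, so the shapes can be overlaid directly regardless of when each event occurred or how intense it was.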

https://vimeo.com/315759600

7. Project Takeaways

This project made it clear that, sometimes, it’s less about immediate takeaways from the data, and more about uncovering insights that are hiding beneath the surface. It is important to be open to approaching the data in an unexpected way to unearth the story behind it. In previous collaborations with Google, we have approached Google Trends data by comparing the magnitude of search interest between different terms; however, in this project, the intensity was not as important as the development of search interest over time. We soon realized that our story was actually in how search interest varied over time, not simply in the peak magnitude of the event search interest. Interestingly enough, regardless of the intensity of interest in an event, similar events had similar graph shapes.

Throughout the process of developing this project, we gained an appreciation for the vast spread in search interest that different news events can elicit, as well as for the difficulty of defining an “event”. Additionally, it was quite challenging to link an event to a particular search term, since we knew that the decisions we made could yield dramatic changes in results. Asking questions about how a news event can be measured via search behavior proved very demanding, as it was difficult to relate particular news events to search behavior. There are some news events for which using a single name as the search term might do an accurate job of returning the interest specific to that event. For instance, a person like Barbara Bush was not in the news very much prior to her death, and thus the interest for the search term “Barbara Bush” is likely specific to her death. However, in other cases, like the resignation of Jeff Sessions, the person at the center of a news event might have been involved in a lot of other news events throughout the year, and restricting the search term to that person’s name has the potential to return broad search interest results that are difficult to ascribe to a single event.

As Simon Rogers said, “[Google] trends data can provide a powerful lens into…how people around the world react to important events.” From this project, we determined that it is possible to use Google search interest data to gain insight into not only how people around the world react to important events, but also into how society is evolving, based on which events persist in public memory, and which do not.