Passing the Baton at the WSJ: How New Tools and Features Migrate Among Our Teams

Louise Story
WSJ Digital Experience & Strategy
6 min read · Oct 21, 2020
Talk2020 user interface

Innovative new ideas can have many twists and turns in the development process, and I want to share a great tale of one of our newest features, Talk2020.

Talk2020 is a tool that helps our audiences look up past statements by the presidential candidates and their vice presidential partners. The value proposition: Look up what they said, when they said it and analyze every word for yourself.

It has been a hit on debate nights. People watching the debate want real-time accuracy checks. This is a tool that lets them do it for themselves.

But Talk2020 actually started life as a reporting tool for our Washington, D.C., bureau more than two years ago. Here, some of the leaders from our Digital Experience and Strategy team who worked on it will share the story of their contributions to Talk2020 and talk about when they passed the baton.

Becky Bowers, Strategy Editor for Washington Coverage

In the world long before Covid-19, a handful of folks in the D.C. bureau gathered on Monday afternoons for Show & Tell. They were something of a digital braintrust — those who work in data and data visualization — and back in late 2018, data news editor Anthony DeBarros was telling the rest of us about some work he had done to scrape transcripts of President Trump’s speeches into a database.

That work powered an article DeBarros wrote with a White House reporter in October 2018. He saw the possibility for a full-fledged tool for other reporters to use, too. I admit I cocked my head a bit — how would a database we’d built be any more useful than a traditional search engine?

Turns out, I was very wrong. Anthony built a prototype tool using a small PostgreSQL database and Django, a web framework with deep news roots. By summer 2019, he was demoing the tool for members of our New York-based R&D team. And that’s where the real alchemy happened.

Alyssa Zeisler, R&D Chief and Senior Product Manager for Editorial Tools

Our R&D team was touring news bureaus when Anthony DeBarros showed us his project. We immediately spotted the potential to turn it into a more robust newsroom tool ahead of the elections. We worked with Factiva to expand the database to include public statements from all the presidential candidates, added tweets through a Twitter API, and integrated machine-learning models to identify language patterns and locations, helping our reporters track candidates’ stances on various issues. This helped reporters find information faster and conduct new types of analysis of political rhetoric. Pretty soon the database contained tens of thousands of public statements, spanning 30 years, made by a number of political figures.

The internal tool version of Talk2020

Acquiring data sets is one of the most common requests the Newsroom Tools team (a combined effort between the R&D and the Tools product team) receives from reporters and editors. We frequently work with reporters and editors to turn unstructured data into usable formats and use computational approaches to help our reporters quickly find quotes, data and context for their stories. With these tools and approaches, we’re creating new ways of finding and telling stories that scale our ability to collect, refine and translate data into usable forms for our reporters and audiences.

The internal Talk2020 tool was immediately popular when we rolled it out a year ago. Paul Beckett, the Washington Coverage Chief, said the tool “is one of the best things we have done for this election cycle.” Senior correspondent Jacob Schlesinger said, “It saved me a ton of time and also gave me better and more focused results.”

It led to many stories, including these:

“The Wall Street Journal identified 23 commonly used three-, four- and five-word phrases and their variations spoken by candidates during the four Democratic presidential debates and tracked the number of times they were said.”

“According to a Wall Street Journal analysis of public statements by the candidates, Mr. Biden more frequently uses words like ‘growth’ and ‘opportunity,’ while Sens. Sanders and Warren are the most likely to invoke terms like ‘billionaire,’ ‘Wall Street’ and ‘rigged.’”
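
The phrase counting described in the first of these stories can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Journal's actual analysis code: the sample statements are invented, the function names are hypothetical, and the sketch does not group phrase variations together the way the published analysis did.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Yield every run of n consecutive tokens as one space-joined phrase."""
    for i in range(len(tokens) - n + 1):
        yield " ".join(tokens[i:i + n])


def count_phrases(statements, sizes=(3, 4, 5)):
    """Count three-, four- and five-word phrases for each speaker.

    `statements` is an iterable of (speaker, text) pairs. Sentence
    boundaries are ignored, so a phrase may span two sentences.
    """
    counts = defaultdict(Counter)
    for speaker, text in statements:
        tokens = tokenize(text)
        for n in sizes:
            counts[speaker].update(ngrams(tokens, n))
    return counts


# Invented sample data, for illustration only.
statements = [
    ("Speaker A", "We will fight for the middle class. The middle class built this country."),
    ("Speaker B", "I will fight for the working families of this country."),
]

counts = count_phrases(statements)
print(counts["Speaker A"]["the middle class"])  # prints 2
```

Keeping a separate `Counter` per speaker makes it straightforward to compare how often different candidates use the same phrase, which is the comparison both stories rely on.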

As the tool was used more frequently internally, there was a growing call to open it up to our audiences. User research showed people want access to raw information they can dig through on their own. And that’s where the baton passed again, to my colleague Tyler.

Tyler Chance, Product Director

In spring 2020, we dug into a user-facing version of Talk2020. We had a great foundation in the R&D tool, which covered the journalists’ use case, but we needed to figure out how this content could answer an audience need. Going through our UX team’s research on reader habits, we zeroed in on three things readers said they wanted:

a) The ability to use WSJ as a knowledge base

b) An experience delivered in a “show, don’t tell” fashion

c) The ability to share what they’d learned with people they know

So, with this information, we took the experience back to the drawing board. The user persona we needed to focus on was a much more casual user who could be running these searches during a live debate. The product was no longer meant to be an academic tool for someone with the time to write a lengthy analysis. It needed an extra layer of “browsability,” so someone could quickly get to their best result.

So, with that, we implemented a few things to maximize ease of navigation and shareability.

  • Natural Language Search. Using the Amazon Kendra service, we were able to give users a search box in which, beyond entering keywords, they could ask actual questions, e.g., “What have the candidates said about fracking?”
  • Categorization and Taxonomy. For people who care more about an entire topic instead of singling out a quote, we made a taxonomy that would help categorize these quotes for higher browsability. This means that anyone could see what every candidate in the system has said about the economy without having to do a search.
  • Flexibility. The political conversation, and what are deemed the most important topics, can vary week by week. We put in a system that allows us to add speakers and quote categories easily. So, for instance, we were able to quickly include the vice presidential candidates without having to rebuild the data ingestion from scratch.
  • Atomic Sharing. Each quote is shareable on its own, so people can share what they’ve learned with the people they know.
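
To make the last three ideas concrete, here is a minimal Python sketch of how category browsing, easily extended speaker lists and per-quote sharing could fit together. It is an illustration only, not a description of how Talk2020 is built: the class and function names, the sample quotes and the `example.com` URL scheme are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Quote:
    quote_id: str
    speaker: str
    category: str
    text: str


class QuoteCatalog:
    """Holds quotes plus editable lists of speakers and categories.

    Speakers and categories are plain data, so adding one is a data
    change rather than a change to the ingestion code.
    """

    def __init__(self, speakers, categories):
        self.speakers = set(speakers)
        self.categories = set(categories)
        self.quotes = []

    def add_speaker(self, speaker):
        self.speakers.add(speaker)

    def add_category(self, category):
        self.categories.add(category)

    def add_quote(self, quote):
        # Reject quotes that reference an unregistered speaker or category.
        if quote.speaker not in self.speakers:
            raise ValueError(f"unknown speaker: {quote.speaker}")
        if quote.category not in self.categories:
            raise ValueError(f"unknown category: {quote.category}")
        self.quotes.append(quote)

    def browse(self, category):
        """Group one category's quotes by speaker, with no search query."""
        grouped = {}
        for quote in self.quotes:
            if quote.category == category:
                grouped.setdefault(quote.speaker, []).append(quote)
        return grouped


def share_url(quote, base="https://example.com/talk2020"):
    """Build a link to a single quote (the URL scheme is made up)."""
    return f"{base}?{urlencode({'quote': quote.quote_id})}"


# Invented placeholder data, for illustration only.
catalog = QuoteCatalog(["Candidate A", "Candidate B"], ["economy", "energy"])
catalog.add_quote(Quote("q1", "Candidate A", "economy", "Placeholder statement one."))
catalog.add_quote(Quote("q2", "Candidate B", "economy", "Placeholder statement two."))
catalog.add_quote(Quote("q3", "Candidate A", "energy", "Placeholder statement three."))

# Adding a running mate later needs only one new entry in the speaker list.
catalog.add_speaker("Running Mate A")
catalog.add_quote(Quote("q4", "Running Mate A", "economy", "Placeholder statement four."))

economy = catalog.browse("economy")
print(sorted(economy))                # ['Candidate A', 'Candidate B', 'Running Mate A']
print(share_url(catalog.quotes[0]))   # https://example.com/talk2020?quote=q1
```

Giving every quote its own identifier is what makes sharing “atomic” in this sketch: the link points at one statement rather than at a search results page.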

There are a lot of ways we will be able to use not just this tool but also the insight it brings. Being able to look at the 11,000+ kinds of queries that have been performed gives us a glimpse into the ways people search for content. Since launch, the tool’s visitor numbers have not been far off those of our live coverage page, a far more established part of our site. This lets us compare how people search with how they browse and learn what to prioritize in future iterations.

We are excited to see what other audience needs we could address with tools like this.

And to bring it all full circle, as the debates kicked off this fall, Becky Bowers was one of the journalists in Washington who used the new Talk2020 feature on WSJ.com in real time during the debates and tweeted the results.

Louise Story
WSJ Digital Experience & Strategy

Journalism leader with a background in product, technology, investigative reporting and masthead-level editing. These columns largely focus on news & technology.