What You Don’t Know You Don’t Know: Using Machine Learning to Unearth Audience Engagement Insights

Jennifer Brandel
Published in We Are Hearken
9 min read · Jun 21, 2022

I’m excited to share a new development in how newsrooms can use machine learning to uncover and serve their communities’ hidden information needs.

But before that — I want to share a story about how we came to this discovery.

What Beats Would The Public Design For Newsrooms? Hint: Not The Ones Newsrooms Have

Back in 2012, I was running a news experiment at WBEZ Chicago called Curious City (it’s now in its 10th year — hooray!). The underlying idea was that, in our team’s little corner of the newsroom, we would let the public tell us what information they needed, instead of deciding what to report on for them through the very squishy concept of “news judgment” and traditional editorial meetings.

In order to solicit their information needs, the Curious City team wrote this broad invitation for the public to send us their questions: “What do you wonder about Chicago, the region or its people that you’d like us to consider investigating?”

At the beginning, we wondered if we should narrow the scope of our request and instead ask our audience to send us questions about a specific beat, because the invitation we had written could generate all different kinds of audience questions. And you know what? It did. And I’m so glad we didn’t limit our request to the beats our newsroom was already covering.

When we accumulated 1,000 questions from our listeners, we decided to do some good old-fashioned analysis. We read through a spreadsheet of all of the questions and categorized them by topic.

Turns out most of the questions did not fit into one of our newsroom’s traditional beats. For instance, someone might want to know what it’s like to live on minimum wage in Chicago, or what it’s like to be a conductor on the CTA. There were enough of these questions about individual experiences and people wanting to live vicariously through or understand others that we created a whole category called “what’s it like to …” I have to chuckle imagining that a newsroom would ever come to that conclusion independent of this process and say, “We should hire a reporter to just do this kind of reporting.” Sure, typical profile pieces can sometimes hit on some of this same kind of coverage, but those tend to cover the most powerful or exceptional sorts of people, not everyday folks who are living in a particular circumstance.

We also learned that Chicagoans had many questions about their built environment — roads, tunnels, infrastructure, etc. — which we grouped into a category we called “urban planning.” And then there was the most delightful surprise to me — people had a ton of questions about how things got to be the way they are, about local history. That became one of the most popular and fascinating categories of questions, and though I don’t have the raw numbers, I imagine at least 25% of the stories Curious City reports have a major focus on a piece of Chicago history.

Based on this experience, if I were running a local newsroom I would conclude I should hire a history reporter and an urban planning reporter. After having seen this trend in other partner newsrooms doing general assignment series, I truly believe these would be some of the most useful and relevant reporters that any local newsroom could hire!

Below are screenshots from WBEZ’s beat structure from their website back in 2014 compared to Curious City’s beat structure in the same year. As you can see, only one category (economy) is the same.

The beat structure at WBEZ circa 2012 vs. the beats that emerged from asking the public what they wanted the newsroom to cover circa 2013, then 2018.

Examining our assumptions, and considering that WBEZ’s traditional beat structure may not have been serving the information needs of Chicago, was an illuminating experience. I wish I could report that the Curious City team was successful at lobbying for reporters to cover these particular categories. We were not. But we did grow our series into the most successful ongoing series WBEZ has ever done (it continues to this day), one that has produced the site’s most popular and shared stories for years on end. And it has grown from 1.5 full-time staff and an intern to a robust desk that collaborates across the newsroom and with other newsrooms and civic and cultural organizations.

Engagement Insights At Scale

So… it’s with much excitement that, 10 years after I started this public-powered approach to journalism at WBEZ, I can share that Hearken, the company born from the lessons of Curious City, has a technology that can now help newsrooms automatically categorize questions from the public into different themes and topics! Rather than spending an entire week of human time analyzing a spreadsheet to understand where the interest and opportunity lay, as we did, newsrooms can categorize questions in minutes and gain valuable insights to apply to their editorial decision-making.

A screenshot displaying one of the categories Hearken’s auto categorization system has identified from a set of questions about COVID-19.

Thanks to funding from GNI, we were able to develop an auto-categorization feature to help make sense of engagement at scale. We’ve been finding Hearken customers who are curious to give this feature a go, and we wanted to share some exciting early examples of how different newsrooms have been thinking about and testing auto-categorization of audience submissions.
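To make the idea of sorting questions into themes concrete, here is a minimal sketch in Python. It is not Hearken’s actual method, which this post doesn’t describe. It swaps in simple keyword matching as a stand-in, and the keyword lists, function names and sample questions are all invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists, invented for illustration; a real system
# would learn or curate these rather than hard-code them.
THEME_KEYWORDS = {
    "local history": {"history", "founded", "named", "origin"},
    "urban planning": {"road", "roads", "tunnel", "tunnels", "bridge", "infrastructure"},
    "economy": {"wage", "wages", "jobs", "rent", "economy"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def categorize(question, theme_keywords=THEME_KEYWORDS):
    """Return the theme sharing the most keywords with the question."""
    words = tokenize(question)
    scores = {theme: len(words & keywords) for theme, keywords in theme_keywords.items()}
    best_theme = max(scores, key=scores.get)
    # No keyword overlap at all means we cannot place the question.
    return best_theme if scores[best_theme] > 0 else "uncategorized"


def group_by_theme(questions):
    """Bucket questions by theme so an editor can review each group."""
    groups = defaultdict(list)
    for question in questions:
        groups[categorize(question)].append(question)
    return dict(groups)


# Made-up sample questions, not real audience submissions.
sample_questions = [
    "Why are there tunnels under downtown Chicago?",
    "How was the city named?",
    "What is it like to live on minimum wage?",
    "Who picks the music at the stadium?",
]

for theme, items in group_by_theme(sample_questions).items():
    print(f"{theme}: {len(items)} question(s)")
```

Anything that lands in the "uncategorized" bucket is worth a human read: as the Curious City experience suggests, the questions that don’t fit existing categories can be where new beats come from.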

Philadelphia Inquirer

Megan Griffith-Greene, Senior Service Journalism Editor

How they tested the auto-categorization feature: 2020 brought a major shift for the Inquirer in every facet of how we worked. One thing that helped our service team work to meet the essential information needs of first the pandemic, and later the election, was transforming our Curious Philly program into a listening tool to help us assess what our audience needed most. And audiences responded — they sent us hundreds of questions about how to navigate the world in its changed state, what was safe, how the rules worked, what the science meant, and how to participate in the election. The input from our audience was rich and critical in shaping our coverage. But it was also overwhelming and onerous to identify themes, respond to people individually and use their input to its full potential.

The problem reveals a pain point for most newsrooms: How can an audience listening program not overwhelm the resources of a small newsroom? We were happy to work with Hearken on a product solution that would help identify patterns in reader responses, identifying themes such as “vaccine,” “transportation,” “city services,” and “local history.”

While we have not yet used the auto-categorization feature in our Curious Philly program, I look forward to how it will increase the impact of the module as an engagement tool and help us manage workflows in the future.

What they are learning: Listening efforts are essential to our work but can be crushingly time consuming. Auto-categorization could be a powerful tool for identifying themes that aren’t immediately apparent to help us identify information needs and prioritize how we use our resources.

Graham Media

Dustin Block, Audience Development Lead

How they are using the auto-categorization feature: On select ClickOnDetroit stories where the odds of a productive conversation among readers are low, we are replacing comment sections with Hearken forms. In open comments, we usually just get trolls, but so far with Hearken forms we’re seeing a mix of much more valuable feedback, including news tips and questions. COVID-19, crime and politics are the broad topics where web producers switch from comments to Hearken. Auto-categorization helps us organize feedback into buckets for our producers to review and respond to, or to assign for further reporting. We’re still organizing our categories, but we’re seeing groups form around questions, comments for the station, and major topics like traffic, weather and politics.

What they are learning: Replacing comments with Hearken is creating a customer service tool for our producers. Incoming material immediately appears in a Slack channel for producers to address, and then filters into the Hearken EMS for categorization and trend analysis. It’s giving us confidence to expand the comment-replacement experiment to more types of stories knowing we can efficiently sift through the material. We also believe it will improve comments by limiting opportunities for trolls and encouraging conversations around topics with a greater chance for success. Matching engagement tools to the material may be one way to deepen our relationships with audiences.

What We’re Excited For Newsrooms To Try

Auto-categorization to guide elections coverage

Hearken has paired up with media critic and NYU professor Jay Rosen to teach more newsrooms how to do a specific public-powered approach to elections coverage called The Citizens Agenda. The key to this model is for newsrooms to start elections coverage by asking the public they’re serving this question: what do you want politicians to be talking about as they compete for your vote?

Since 2020, we’ve trained dozens of newsrooms in launching this approach, and what they’ve learned is fascinating:

  • The information the electorate is asking for is not what newsrooms would expect — people have questions about different topics and concerns than reporters have
  • Politicians are not always talking about the concerns the electorate has — people have questions about different topics and concerns than politicians have
  • The process of including the public in elections reporting results in more original content that better meets the electorate’s needs and builds trust between the public and newsroom
  • Reporters felt more confident in and fulfilled by their coverage choices by basing their decisions off of listening to the public rather than following traditional editorial judgment or horse race politics

(We’re offering this training via Election SOS for the 2022 midterms — learn more here.)

A newsroom that uses this Citizens Agenda approach and generates hundreds or thousands of submissions from the public could use the auto-categorization function to quickly see what themes, trends and categories are coming up, and then shape its coverage accordingly.

Newsrooms like KPCC/LAist in Los Angeles are already doing this as part of their Voter Game Plan series. They’re interviewing the candidates for L.A. Mayor and District 3 L.A. County Supervisor, and basing the interviews they do with these candidates on the questions and concerns coming in from the public, which they collected via Hearken forms on the event registration page.

The questions are also informing a quiz to help Angelenos identify which candidates most align with their values (KPCC/LAist is using THE CITY’s “Meet Your Mayor” framework). Once the candidate discussions are concluded, KPCC/LAist will switch to a Hearken form that collects people’s questions about the election more broadly.

What KPCC is learning

The themes in the questions are just as useful as the individual questions. KPCC/LAist invited some audience members to ask their exact questions during the individual discussions. But they also analyzed the questions they received to identify the topics and angles that are on voters’ minds. This helped them decide how many questions to devote to each topic in the live interviews and the quiz, as well as how to approach the wording of the questions in the quiz to be voter friendly.
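One way to turn theme counts into a number of interview or quiz questions per topic is a proportional split. The sketch below is an assumption, not KPCC/LAist’s documented process: it uses the largest-remainder method, and the function name, topic names and counts are made up for illustration.

```python
def allocate_slots(theme_counts, total_slots):
    """Split total_slots across themes in proportion to their question counts.

    Uses the largest-remainder method: each theme first gets the whole-number
    part of its proportional share, then any leftover slots go to the themes
    with the largest fractional remainders.
    """
    total_questions = sum(theme_counts.values())
    if total_questions == 0 or total_slots <= 0:
        return {theme: 0 for theme in theme_counts}

    # Exact (possibly fractional) share of the slots for each theme.
    quotas = {
        theme: count * total_slots / total_questions
        for theme, count in theme_counts.items()
    }
    slots = {theme: int(quota) for theme, quota in quotas.items()}

    # Hand out the slots lost to rounding down, largest remainder first.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(quotas, key=lambda t: quotas[t] - slots[t], reverse=True)
    for theme in by_remainder[:leftover]:
        slots[theme] += 1
    return slots


# Invented counts for illustration only, not real submission data.
example_counts = {"housing": 50, "transit": 30, "schools": 20}
print(allocate_slots(example_counts, 10))
```

A strictly proportional split is only a starting point. An editor might still reserve a slot for a low-volume topic that matters to an underserved group.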

They wrote more about how question themes guide editorial decisions in this 2021 reflection.

There are layers to the task of categorizing questions. KPCC pulled in questions from a variety of platforms, including Hearken, GroundSource, social media and postcards. To find themes and insights, KPCC/LAist first categorized the questions by topic, such as homelessness, climate change, and public safety. Then, for the topics that had a lot of questions, they categorized them further to get to the heart of what the question for the candidate should be. For example, within the category of homelessness, there were questions about more mental health support for unhoused Angelenos, ending practices that criminalize living on the streets, and broadening who is included in policy making.
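The two-layer approach described above can be sketched as a nested grouping: group by topic first, then split only the topics that drew enough questions. This is a minimal illustration, assuming each question already carries a topic and an angle label. The function name, the `min_for_split` threshold and the sample records are hypothetical.

```python
from collections import defaultdict


def two_level_groups(records, min_for_split=3):
    """Group records by topic, then by angle for topics with many questions.

    Each record is a dict with "text", "topic" and "angle" keys. Topics with
    fewer than min_for_split questions are kept as a single "(all)" group.
    """
    by_topic = defaultdict(list)
    for record in records:
        by_topic[record["topic"]].append(record)

    result = {}
    for topic, items in by_topic.items():
        if len(items) >= min_for_split:
            # Busy topic: break it down further by angle.
            by_angle = defaultdict(list)
            for record in items:
                by_angle[record["angle"]].append(record["text"])
            result[topic] = dict(by_angle)
        else:
            result[topic] = {"(all)": [record["text"] for record in items]}
    return result


# Made-up sample records for illustration; not real submissions.
sample_records = [
    {"text": "Will you fund more counselors?", "topic": "homelessness", "angle": "mental health support"},
    {"text": "Will you end street sweeps?", "topic": "homelessness", "angle": "criminalization"},
    {"text": "Who gets a seat at the table?", "topic": "homelessness", "angle": "policy making"},
    {"text": "How will you cut emissions?", "topic": "climate change", "angle": "emissions"},
]

for topic, angles in two_level_groups(sample_records).items():
    print(topic, "->", list(angles))
```

The threshold keeps small topics readable as a single list while surfacing the distinct angles inside the topics that draw the most questions.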

All of the valuable categorization KPCC has done so far was done by a human, which of course is very time-consuming. The newsroom looks forward to testing out the auto-categorization feature to make this process far more efficient.

We’re excited to work with newsrooms that are committed to public-powering their reporting, and to have the auto-categorization tool support that work with a new level of efficiency and insight.

If you’re curious about learning more, we’re here to answer your questions! Reach out to: info@wearehearken.com

Jennifer Brandel
We Are Hearken

Accidental journalist turned CEO of a tech-enabled company called Hearken. Founder of @WBEZCuriousCity Find me: @JenniferBrandel @wearehearken wearehearken.com