Making Sense of Engagement At Scale: Hearken’s New Classification Features

Published in We Are Hearken · Jul 13, 2021 · 6 min read

It’s a good problem to have: What if your organization gets more audience input and demand for information than you can possibly handle? You don’t want to leave your anxious public hanging (especially during natural disasters, breaking news, or a pandemic). But you also can’t afford to have your team working non-stop. You need help figuring out what to prioritize. This is where the newest feature of Hearken’s Engagement Management System, our automatic classification tool, can help.

(This post is a follow-up to “Classifying Engagement: Hearken Wins Funding from the GNI Innovation Challenge,” sharing what we’re building, how it’s working so far, and what it’ll be able to do when it’s done.)

Hearken’s new classification tool takes the questions and responses that newsrooms get from their audience through Hearken forms and sorts them into groups based on textual similarity. This makes it possible for newsrooms to understand what their audience is most interested in without having to sort every one of their hundreds or thousands of audience submissions manually.

The tool uses machine-learning-enabled natural language processing to group audience submissions into topic areas. Three newsrooms are currently using the classification tool. We’ll be rolling it out to several more and, soon, to all of our partners.
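To make "grouping by textual similarity" concrete, here is a minimal Python sketch. It is an illustration only, not Hearken's implementation: the word-count vectors, the cosine similarity measure, the stopword list, the `threshold` value, and every function name are assumptions chosen to keep the example small.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "the", "is", "are", "i", "my", "can", "get", "of",
             "what", "when", "where", "does"}

def vectorize(text):
    """Turn a submission into word counts, ignoring stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def group_submissions(submissions, threshold=0.3):
    """Put each submission in the most similar group, or start a new one."""
    groups = []  # each group: {"centroid": Counter, "members": [str, ...]}
    for text in submissions:
        vec = vectorize(text)
        best, best_score = None, 0.0
        for group in groups:
            score = cosine(vec, group["centroid"])
            if score > best_score:
                best, best_score = group, score
        if best is not None and best_score >= threshold:
            best["members"].append(text)
            best["centroid"].update(vec)  # group profile grows with members
        else:
            groups.append({"centroid": Counter(vec), "members": [text]})
    return [g["members"] for g in groups]

example = [
    "Where can I get my second dose?",
    "When is my second dose appointment due?",
    "What side effects does the vaccine cause?",
    "Are headaches common side effects of the vaccine?",
]
groups = group_submissions(example)
for members in groups:
    print(len(members), members)
```

With these four made-up questions, the two second-dose questions land in one group and the two side-effect questions in another. Production systems typically rely on richer language models than raw word counts, but the shape of the task is the same.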

How it used to be: When a newsroom received a high volume of audience questions, someone in the newsroom had to manually group similar submissions together. When KQED received hundreds of questions about homelessness, a producer spent half a day dividing the submissions into lists to identify the things people were most curious about. That helped shape an FAQ-style reported piece that directly addressed audience questions, but it was time-consuming to put together.

How it can be now: When someone in the newsroom indicates that a group of submissions is ready for sorting, Hearken’s platform groups them automatically using a machine-learning-enabled natural language processing system. Natural language processing means the system can read and “understand” similarities between submissions. Machine learning means that if a journalist manually sorts or reassigns a submission, the system will use that feedback to sort future submissions more accurately.
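The feedback loop described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the way Hearken's system is built: the `TopicSorter` class, its method names, the stopword list, and the word-overlap scoring are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list for the example.
STOPWORDS = {"a", "an", "the", "is", "are", "be", "my", "i",
             "how", "what", "does", "will", "for"}

def tokens(text):
    """Lowercase words in a submission, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]

class TopicSorter:
    """Toy sorter: one word-count profile per topic."""

    def __init__(self):
        self.profiles = {}  # topic name -> Counter of words seen

    def record_label(self, text, topic):
        """Store a human decision (initial example or later correction)."""
        self.profiles.setdefault(topic, Counter()).update(tokens(text))

    def sort(self, text):
        """Return the best-matching topic, or None if nothing overlaps."""
        words = tokens(text)
        best_topic, best_score = None, 0.0
        for topic, profile in self.profiles.items():
            norm = math.sqrt(sum(c * c for c in profile.values()))
            score = sum(profile[w] for w in words) / norm if norm else 0.0
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic

sorter = TopicSorter()
sorter.record_label("What side effects does the vaccine cause?", "side effects")
sorter.record_label("How long is the wait for an appointment?", "wait times")

question = "How long will my arm be sore?"
first_guess = sorter.sort(question)  # matches "long" in the wait-times profile

# A journalist reassigns the question; the sorter keeps that feedback.
sorter.record_label(question, "side effects")
second_guess = sorter.sort("Is a sore arm normal?")
print(first_guess, "->", second_guess)
```

The first guess is wrong because the only shared word is "long". Once the journalist's correction is recorded, a later question about a sore arm is sorted under side effects.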

This is the effect we’re intending the tool to have on your audience submissions!

Using those automatically generated groups, journalists can efficiently identify the areas where their audiences most need answers, with much less time spent sorting questions. This also makes it easier for the newsroom to respond to everyone who submitted similar questions once an answer has been reported out. And when more people are equipped with answers, they can make better, more informed decisions in their lives (aka the purpose of journalism).

The classification tool in action

For Southern California Public Radio (KPCC + LAist), the tool found similarities and sorted submissions related to the COVID vaccine into one of four topics: second dose dates and locations; wait times and logistics; prioritization; or symptoms, side effects and medications.

Our partners will be able to apply this machine learning to sort any list of questions into topic themes. As we’ve been seeing more partners generate audience questions for each reporter and beat (check out Crosscut as a great example), we’re excited to see how this tool paired with Hearken’s proven public-powered process can save every reporter time, provide insights and boost their confidence about how they spend their resources to do the most good for their communities.

What’s next

Soon, we’ll be rolling out the interface for newsrooms to set up sorting for a topic on their own, without any hand-holding by our engineers or account managers (though we’re always happy to help!).

In this mock-up, the Hearken platform asks the user to predict how varied the responses will be, and explains that it’s totally normal to re-run this step multiple times until you find the sweet spot.

In this mock-up, the user can see how the submissions were divided up, and decide if those divisions make sense.
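Here is one way the "how varied will the responses be?" setting could work, sketched in Python. We are assuming, purely for illustration, that the answer maps to a similarity cutoff: the `VARIETY_PRESETS` values, the Jaccard word-overlap measure, and the function names are hypothetical and not taken from the Hearken platform.

```python
import re

# Hypothetical stopword list for the example.
STOPWORDS = {"where", "when", "can", "i", "get", "my", "is", "the", "are"}

# Assumed mapping: more expected variety -> stricter cutoff -> more groups.
VARIETY_PRESETS = {"low": 0.2, "medium": 0.5, "high": 0.8}

def word_set(text):
    """Distinct lowercase words in a submission, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}

def jaccard(a, b):
    """Shared words divided by all words across both submissions."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def sort_into_groups(submissions, min_similarity):
    """Join the first group whose first member is similar enough."""
    groups = []
    for text in submissions:
        words = word_set(text)
        for group in groups:
            if jaccard(words, word_set(group[0])) >= min_similarity:
                group.append(text)
                break
        else:  # no existing group was close enough
            groups.append([text])
    return groups

submissions = [
    "Where can I get my second dose?",
    "When is my second dose due?",
    "Is the second shot stronger?",
    "Are side effects common?",
]

# Re-run the same step at each setting and compare the divisions.
group_counts = {}
for label, cutoff in VARIETY_PRESETS.items():
    result = sort_into_groups(submissions, cutoff)
    group_counts[label] = len(result)
    print(label, [len(g) for g in result])
```

On this toy data the "low" setting produces two groups, "medium" three, and "high" four. Looking at those divisions side by side is what lets a person decide which setting is the sweet spot.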

We’re also very close to having the system work for sorting submissions in languages other than English. First up, we’re testing with Spanish and Danish submissions. (Fun fact: Hearken has a Northern European branch based in Denmark, and serves European newsrooms as well as member-based and governmental organizations.)

Lessons and challenges

Introducing new concepts: Hearken’s classification interface will probably be the first time our users have interacted with natural language processing or machine learning, and even more likely the first time they’ve trained a natural language processing model. As a result, we have had the challenge of introducing our users to new and complex technical concepts in an interface that still feels familiar and is easy to use. With the help of an outside UI designer and lots of helpful beta testers, we’re feeling good about our ability to address this challenge.

Low volume: Each model needs to be trained using a good number of real submissions (ideally more than 100). That requires that the audience is already engaging before the organization sets up the classifier tool. Also, once the model is trained and running, the audience still needs to be engaging for the newsroom to see the benefits and efficiency of having set it up. For example, there are fewer questions coming into our newsroom partners these days about COVID-19, so while the classifier tool is active, it’s not saving them as much time now as it would’ve six months ago when the question volume necessitated automated sorting. We’re hoping to find ways to identify that key moment when there are enough submissions coming in to train the model but engagement hasn’t hit its peak yet.
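As a rough illustration of that "key moment", the Python sketch below checks two things: whether enough submissions have arrived to train on, and whether volume is still holding or growing. Only the 100-submission figure comes from this post. The week-over-week comparison, the function name, and the input format are assumptions, not a method we have settled on.

```python
# From the post: models ideally train on more than 100 real submissions.
MIN_TRAINING_SUBMISSIONS = 100

def ready_to_train(weekly_counts, min_total=MIN_TRAINING_SUBMISSIONS):
    """Guess whether now is a good time to set up the classifier.

    weekly_counts: submissions received per week, oldest first.
    Returns True when the total exceeds min_total and the latest week
    is at least as busy as the one before (a hypothetical heuristic).
    """
    enough_data = sum(weekly_counts) > min_total
    still_rising = (len(weekly_counts) >= 2
                    and weekly_counts[-1] >= weekly_counts[-2])
    return enough_data and still_rising

print(ready_to_train([10, 30, 70]))  # enough data, volume rising
print(ready_to_train([10, 20, 30]))  # rising, but too few submissions
print(ready_to_train([80, 60, 20]))  # enough data, but past the peak
```

A single week-over-week comparison is noisy, so a real version would likely smooth over several weeks before making the call.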

Limited resources for non-English languages: Many of the natural language processing resources that our classifier tool uses were primarily developed using English-language source material. Even as they’ve expanded to other languages in the past few years, it is difficult to find resources that are as accurate in those languages as their original English-language versions.

Our end goal

Ultimately, we want to make it as easy as possible for the information needs of the public to be surfaced, organized, acted upon and responded to by newsrooms so that citizens can do their jobs and make informed decisions in their communities.

We hope that adding this efficiency-creating feature to Hearken’s Engagement Management System further incentivizes newsrooms to pay more and closer attention to engagement, which in turn will help them be as relevant and as worthy of financial support as possible.

Bring this efficiency to your organization

For current Hearken partners, we’d love to add you to our testing group! Send a note to success@wearehearken.com if you’re interested.

For not-yet Hearken partners, reach out to us here. We look forward to learning more about what you’re interested in and seeing if we can support you and the public you serve!

Special thanks to Julia Haslanger and Kieran Hanrahan for generating these insights and creating this post.