Making Sense of Large-Scale Online Conversations
Earlier this summer, we introduced an evolution in the focus of Jigsaw’s work. We noted then that we were excited to go beyond Perspective API’s ability to bring more voices into online conversations, and to identify new ways for each voice to be heard more meaningfully. This post provides an update on our progress and introduces a new, open-source library to help make sense of scaled conversations.
Over the past year, Jigsaw has been exploring how to make large-scale online conversations, particularly online deliberations, more impactful and scalable, and how to facilitate their use in a wider array of contexts. Partnering with platforms such as Polis, among others, we have sought to address a key challenge for facilitators: distilling and analyzing the large amounts of data these conversations generate. Too often this critical process takes weeks to complete, leaving the feedback loop open too long for policymakers and participants alike and reducing the power and potential impact of these conversations.
Imagine you’ve been asked to collect public input for a city budget council using one of these technology platforms, inviting hundreds or even thousands of people to discuss how the city budget should be allocated. How would you make sense of a “conversation” like this, with hundreds of statements and votes per participant on topics as diverse as healthcare and education? And how would you do so in a way that captured the nuance of each individual point of view and conveyed real meaning to decision-makers and participants alike, all anchored in the statements participants provided? Oh, and on a tight timeline! This is the challenge we set for ourselves, rooted in rigorous research and an understanding of the needs of people, policymakers, and the communities they collectively represent.
Today we’re excited to announce our Sensemaking tools, a new, open-source library for large-scale conversations. Still in ‘beta,’ our Sensemaking tools leverage Google’s industry-leading, publicly available Gemini models to categorize and summarize large-scale input into clear insights while retaining nuance. By automating the most complex and time-consuming aspects of analysis, we hope to make it possible for more communities to engage in meaningful, large-scale conversations and arrive at informed decisions.
How it Works
Our Sensemaking tools are designed to leverage cutting-edge large language models (LLMs) to provide useful, trustworthy insights. The library includes three main functions: learning topics and subtopics from the provided data, categorizing statements into those topics, and summarizing the entire conversation in a written report grounded in the source statements, with relevant citations. Some of the key elements of this experience include:
Flexible Categorization: Jigsaw’s Sensemaking tools categorize the entire conversation to help participants, facilitators, and decision-makers understand the key topics and themes. We group statements into digestible topics like “Education” and “Transportation and Infrastructure,” and can further organize subtopics under each topic, for example “Public Transportation.” Topics and subtopics like these can be identified dynamically from the conversation data itself, or users can provide their own topics into which statements are categorized, for greater customization and flexibility.
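To make those two modes concrete, here is a minimal TypeScript sketch of dynamic topic learning followed by categorization. Every name in it (Statement, Topic, llm, learnTopics, categorize) is an illustrative assumption rather than the library’s actual API; see our GitHub repository for the real interfaces.

```typescript
// Hypothetical types and helpers for illustration only; not the library's
// actual API (see the GitHub repository for the real interfaces).
interface Statement { id: string; text: string; }
interface Topic { name: string; subtopics?: string[]; }

// Stand-in for a call to an LLM such as Gemini; returns the model's reply.
declare function llm(prompt: string): Promise<string>;

// Mode 1: learn topics and subtopics dynamically from the conversation.
async function learnTopics(statements: Statement[]): Promise<Topic[]> {
  const prompt =
    "Identify the main topics and subtopics in the statements below. " +
    'Reply as JSON: [{"name": "...", "subtopics": ["..."]}]\n' +
    statements.map((s) => s.text).join("\n");
  return JSON.parse(await llm(prompt)) as Topic[];
}

// Mode 2: categorize each statement into a topic list, whether that list
// was learned above or supplied by the user.
async function categorize(
  statements: Statement[],
  topics: Topic[],
): Promise<Map<string, string>> {
  const assignments = new Map<string, string>(); // statement id -> topic name
  for (const s of statements) {
    const prompt =
      "Assign this statement to exactly one of these topics: " +
      `${topics.map((t) => t.name).join(", ")}.\nStatement: ${s.text}`;
    assignments.set(s.id, (await llm(prompt)).trim());
  }
  return assignments;
}
```

A facilitator could call learnTopics and feed the result to categorize, or skip the first step entirely and pass a hand-written topic list, which is the customization path described above.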
Actionable Insights: Our Sensemaking tools can quickly generate an in-depth summary of a conversation, providing meaningful, helpful insights. Summaries identify areas of agreement and disagreement, as well as key findings or proposals, across each of the topic areas discussed. We’re designing our summarization approach to leverage the unique abilities of the latest LLMs, capitalizing, for example, on their large context windows and their ability to follow structured reasoning in prompts. By doing so, we’re able to summarize differing viewpoints while maintaining proportional representation, distill insights without losing key points, and understand how opinion groups cluster. Our Sensemaking tools can identify areas of agreement and disagreement across multiple groups and accurately incorporate the full range of divergent opinions. To make insights even more actionable, we are also exploring ways to leverage LLMs to visualize rich conversations in easy-to-interpret diagrams and charts.
Grounded Results: Accurate, trustworthy data is critically important, particularly for public policy experts. Grounding is the process of connecting AI output to verifiable sources of information, in this case the original statements and votes. Our tools help ensure the accuracy of insights by grounding them in the source data and providing references, allowing for easy verification.
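As a sketch of how the summarization and grounding described above might fit together, the snippet below uses a structured prompt that walks the model topic by topic and requires an inline citation for every claim, then checks each citation against the source data. The prompt wording, the [id] citation format, and the helper names are our illustrative assumptions, not the library’s actual implementation.

```typescript
// Same illustrative Statement type as in the sketch above.
interface Statement { id: string; text: string; }

// Stand-in for a call to an LLM with a large context window, e.g. Gemini.
declare function llm(prompt: string): Promise<string>;

// Structured prompt: walk the model through findings, agreement, and
// disagreement per topic, and require an inline [id] citation per claim.
async function summarizeGrounded(statements: Statement[]): Promise<string> {
  const prompt = [
    "Summarize this conversation. For each topic:",
    "1. State the key findings or proposals.",
    "2. Describe where participants agree and where they disagree.",
    "Support every claim with citations like [id] to the statements below.",
    ...statements.map((s) => `[${s.id}] ${s.text}`),
  ].join("\n");
  return llm(prompt);
}

// Grounding check: every cited ID must resolve to a real source statement.
// Unresolved IDs flag ungrounded (possibly hallucinated) claims for review.
function unresolvedCitations(summary: string, sources: Statement[]): string[] {
  const known = new Set(sources.map((s) => s.id));
  return [...summary.matchAll(/\[([^\]]+)\]/g)]
    .map((m) => m[1])
    .filter((id) => !known.has(id));
}
```

If unresolvedCitations returns an empty list, every claim in the summary can be traced back to a real statement; anything else is flagged for human review.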
Critically, our tools enable any platform integrating our library to return such results not in hours or days but in minutes, significantly accelerating the feedback loop and enabling deliberation facilitators to distill meaning in close to real time.
Our Sensemaking tools are designed with a wide range of potential users in mind. These include community leaders eager to gain a nuanced understanding of public opinion quickly while identifying areas of common ground and contention. We also hope that researchers will benefit, as they will be able to study online conversations more efficiently and effectively. Finally, we hope to assist developers of platforms for online conversation and deliberation, who can easily integrate Sensemaking into their applications and provide richer insights to their end users with minimal effort.
As Colin Megill, co-founder of Polis notes, “Polis has always faced a bottleneck in reporting, as conversations are labor-intensive to moderate effectively, and the results are often information-dense. Human facilitators dedicate countless hours to distilling Polis results into reports that are accessible to both participants and decision-makers. The new library from Jigsaw, combined with the partnership, conferences, and validation they are contributing to the emerging field of deliberative technology, marks a critical advancement. These resources will enable facilitators of deliberative processes to make the results of deliberations more accessible, while ensuring safety and reliability with a human-in-the-loop approach.”
Getting it right in the way Colin suggests means more people will have the confidence to participate, knowing that their individual and collective voices are being heard in ways that neither public opinion polls nor even periodic elections currently capture. Increased participation, in turn, holds the promise of tighter, more positive feedback loops, deepening a shared sense of citizenship amongst participants and giving decision-makers more incentive to take action. This was our instinct for why we wanted to focus on sensemaking. But it was our research that convinced us.
Why Sensemaking? Learnings from the Field
Our Sensemaking tools are rooted in academic literature and Jigsaw’s own ethnographic and user experience research.
We began by seeking to better understand the factors shaping people’s attitudes toward civic participation and deliberation at the local community level within the United States. Over the past several months, we’ve spent hours sitting with people from all walks of life at kitchen tables and in living rooms, at public meetings and community events. In these conversations we’ve discussed what it takes for civic deliberation to be meaningful, what might be holding individuals back from participating personally, and how technology might help.
Understanding the needs of community leaders
We started our exploration by interviewing people currently soliciting public input, including moderators and decision-makers. From them we learned that AI tools could help them make sense of a wider variety of voices within their communities, particularly by identifying areas of broad agreement on which they might take immediate, concrete action. But getting input was one thing; managing it was another. As one community leader told us with a hint of desperation, “We have to make sense of the data, but we don’t have time to drill down into all of this. How do we make sense of it? How do we tell the story from this process?”
At the same time, they made clear that they didn’t want to leave the entire process to the AI. To ensure accuracy and fairness — and to build public trust — human oversight would be essential for any AI summarization tools.
Understanding the barriers to civic participation
Next, we spoke with diverse members of the public, some of whom regularly participate in civic life and others who, often with fewer resources, feel voiceless and disenfranchised. Across the board, people told us they wanted a better way to see and understand their own communities, and to know more accurately where they fit in relation to their neighbors.
A young man in his mid-twenties, working in financial administration, sighed as he described the state of political discussion in his community. Despite the town being a “melting pot” of people from diverse backgrounds, there was a strong “disconnect” between groups and an inability, he sensed, for people with differing views to really understand or listen to each other. As a result, he told us, it was hard to know what his neighbors really thought, or to see what the range of views in the community might be.
We gleaned another significant insight when speaking with people facing barriers to civic participation. From them we learned that sensemaking might do more than provide valuable information back to a group about itself: it could also encourage individuals to speak up by making them feel less alone. As one interviewee put it, “I won’t show up [to a deliberation] if I think I’m the only person who cares about this issue.”
One Saturday morning over coffee, we sat down with two young construction workers. Underhoused and formerly incarcerated, they both felt distant from those making decisions impacting their lives. Despite their strong sense of their right to be included, the idea of expressing themselves in public on local issues felt like an implicit challenge to authority, and deeply uncomfortable. They told us that being able to see the views of others, and in particular the “points of sameness” across the community, might “bring comfort” and give them confidence that their voices were worthy of being heard.
Collaborating with experts
Having rooted our initial insights in deep ethnographic research, we also benefited from our many conversations with partners working at the cutting edge of deliberative technology. Together with coauthors representing nearly 20 organizations, we have recently published a position paper on how AI might enhance our digital public squares. At the same time, our colleagues at Google DeepMind have published pioneering work in Science demonstrating that LLMs can help people with different views find common ground. As we build our Sensemaking tools, we are proud to be collaborating with the researchers and experts behind both papers.
All of the insights gleaned from our ethnographic and technical research clearly indicated that sensemaking could be empowering if people’s nuanced views could be collected easily and presented in a way that lets them see where those views “fit” into the broader conversation. This is exactly what we’re hoping to do, even as we seek to manage the risks of the underlying technology.
Addressing Potential Risks
As we work to enable sensemaking in large-scale conversations, we are cognizant of the potential risks of LLMs, many of which are described in “Opportunities and Risks of LLMs for Scalable Deliberation,” co-authored by Chris Small, a Jigsaw software engineer and co-founder of Polis. Chris notes that our library pursues several opportunities discussed in the paper (e.g. AI Topic Modeling and AI Summarization) while addressing the key risks it identifies, including viewpoint bias, hallucinations, and over-reliance on AI, by employing techniques like:
- Evals: LLMs may reflect and amplify societal biases, potentially misrepresenting public opinion. To address this, our team has evaluated our summarization function for accuracy and proportionality. We did this by testing how it handles different scenarios of over- or under-representation, such as cases where statements are split, or where the number of statements in favor of an issue diverges from the number of votes against it. To test our sensemaking tool on conversations about sensitive topics with highly divergent viewpoints, we looked at data with these attributes and found that our tools summarized and categorized the viewpoints accurately. (A minimal sketch of one such proportionality check follows this list.)
- Grounding: LLMs can sometimes generate inaccurate or misleading summaries. As detailed above, our best-practice approach to grounding greatly reduces the risk of inaccurate information by ensuring that our output is consistent with input data. Grounding citations will also enable readers to access the original statements underpinning the summarization report.
- Open Source: We’re open-sourcing the prompts and algorithms we use to interact with publicly available LLMs for sensemaking. Open-sourcing allows for outside scrutiny, helping us spot risks and improve our product design. We also want to support responsible use of LLMs by any developer, whether they’re using our tools or not. By making our GitHub libraries public, we are inviting anyone interested to explore the repository, and we welcome community feedback. We hope that sharing our approach as we go will be of use to other developers applying AI models in this space.
- Transparency and control: While facilitators and community leaders told us in our research that they are always open to tools that make deliberations easier and richer, we also heard that they do not want to “lose control.” That’s why we are designing our tools with user control in mind: authorized users, such as facilitators or conversation moderators, are empowered to review outputs before they are shared more widely, modify them if needed, and leverage our outputs to create their own assets and reports where they see fit. Additionally, we’re exploring ways to integrate techniques from Google DeepMind’s common ground research into our offering. This would allow participants to vote on AI-generated commentary, helping us gauge whether they agree with it. That feedback, coming directly from participants on the value of the tool’s synopsis, would in turn be used to enhance the representativeness of our AI-generated summaries.
- Privacy: Our library preserves the privacy of opinion-expressers by processing statements and aggregated vote data without collecting data on individual user accounts. The analysis is done by the platforms using our library, through their own Google Cloud Vertex AI account. The data is owned and controlled by them and is never seen by Jigsaw.
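To illustrate the proportionality eval mentioned above, here is a minimal TypeScript sketch that constructs a synthetic conversation where the number of supporting statements and the weight of votes point in opposite directions, then checks that a summary reflects the dominant view. The VotedStatement type, the summarize stand-in, and the pass criterion are all illustrative assumptions, not our actual eval suite.

```typescript
// Illustrative proportionality check, not our actual eval suite: statements
// lean one way while votes lean the other, and a faithful summary must
// reflect the more widely supported view, not just count statements.
interface VotedStatement { text: string; agrees: number; disagrees: number; }

// Stand-in for the summarization function under test.
declare function summarize(statements: VotedStatement[]): Promise<string>;

async function proportionalityCheck(): Promise<void> {
  const conversation: VotedStatement[] = [
    // Two statements in favor of widening, but heavily outvoted.
    { text: "The city should widen Main Street.", agrees: 40, disagrees: 310 },
    { text: "Widening Main Street will ease traffic.", agrees: 35, disagrees: 290 },
    // One opposing statement carrying most of the support.
    { text: "Keep Main Street as is and fund buses instead.", agrees: 420, disagrees: 50 },
  ];
  const summary = await summarize(conversation);
  // Crude pass criterion: the summary should surface the dominant opposition.
  const reflectsVotes = /bus|oppos|against|disagree/i.test(summary);
  console.log(reflectsVotes
    ? "PASS: summary reflects the widely supported view"
    : "FAIL: summary over-weights the more numerous statements");
}
```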
We’re continuously working to improve our tools and encourage the community to help us identify and mitigate potential issues. We plan to further enhance sensemaking by improving accuracy, continuing to address bias, and making our tools even more user-friendly. Our research, development, and testing are guided by Google’s AI Principles. Although our tools are designed to work with any model or infrastructure a platform prefers, we have primarily focused on Gemini served on Cloud Vertex AI, to leverage the extensive safety and bias mitigations that Google provides.
Help us build the future
For us, Sensemaking is more than just a tool; it’s a step toward making online conversations more productive, inclusive, and impactful. But we also know that building together almost always leads to better, more responsible technology, so we invite you to join us on this journey by exploring our GitHub repository to learn more about Sensemaking. We also welcome your feedback to help improve our tools; please submit it via the following form. While these are still early days for this technology, we are eager to work together to improve participation and reinforce trust in the outcomes of online conversations.
By Angelo Carino, Head of Product & Engineering, Jigsaw