Disinfodex: an independent aggregator

--

Disinfodex is a project created during the 2020 Assembly Fellowship at the Berkman Klein Center at Harvard University. One of three tracks in the Assembly: Disinformation Program, the Assembly Fellowship convenes professionals from across disciplines and sectors to tackle the spread and consumption of disinformation. Each fellow participated as an individual, not as a representative of their organization. Assembly Fellows conducted their work independently, with light guidance from program advisors and staff.

The Disinfodex project and this post were authored by Gülsin Harman, Rhona Tarrant, Ashley Tolbert, Neal Ungerleider, and Clement Wolf; the project team has backgrounds in journalism, policymaking, and cybersecurity.

Every week, new reports and research about disinformation emerge. As platforms, researchers, governments, and journalists focus on the problem, the pace of the field has increased, and it has become difficult to track and analyze the information. Recognizing this need, our team set out to help those working in the field better access and analyze publicly available information related to disinformation.

The result of our work is the Disinfodex project, a searchable database indexing public disclosures of disinformation campaigns from major online platforms, currently including Facebook, Instagram, Google, YouTube, Twitter, and Reddit. The tool, which is currently a prototype, was developed during the 2020 Assembly Fellowship at the Berkman Klein Center, and ultimately aims to be a source for analyzing disclosures and trends in the disinformation space.

Building infrastructure to track disinformation

Our team came to the fellowship from a variety of backgrounds, including communications, journalism, technology, and design, and we initially set out to help journalists better cover disinformation. However, we quickly realized that the issues newsrooms face in covering this rapidly evolving space are shared by others in the field. There is a profusion of information about disinformation, from open source investigations to research papers, policy papers, intelligence reports, and more, as well as releases from platforms as they increase measures and transparency around disinformation.

As a result, it is not only challenging to keep pace with the information; it can also be difficult to assess and analyze the existing data. We researched projects that addressed similar challenges, such as the Berkman Klein Center’s Lumen database, which hosts and makes searchable more than twelve million legal removal notices from a host of online platforms, and the German Marshall Fund and Graphika’s io-archive.org website, which makes it possible to explore “publicly available and rigorously attributed data points from known Information Operations on social media platforms.”

Aggregating public disclosures of disinformation campaigns

These projects, as well as conversations we had with Assembly Fellowship advisors, researchers, and practitioners in the field, inspired us to build a tool to contribute to shared infrastructure in the disinformation space. Though the field is vast, we decided to begin by indexing the public disclosures that major online platforms have issued about actions taken against disinformation campaigns. Our aim is to help practitioners better understand what has been released publicly and in what manner, and to develop insights or analysis based on the disclosures. The tool is designed to help all of those who study online disinformation, from industry to academia, journalism, government, civil society groups, and more.

WIP Database Prototype

We devised the database to be an independent aggregator, intended to function as a trusted hub to discover and explore information released by others. Therefore, we felt it was important not to take positions on the content of the disclosures that we aggregate; we will not provide assessments of the practices or policies of platforms, or any other sources we intend to add to the database over time. Disinfodex entries reflect the language the individual platforms use to describe the disinformation campaigns, and do not make assumptions based on the data.

Mitigating the risks of aggregating disclosures

Our team weighed several additional considerations and risks while building Disinfodex, including the limited nature of the dataset, which excludes resources from academics, journalists, civil society groups, and other sources, and the potential for misinterpretation of the data. We have expanded on these considerations in our white paper, which is available on our project website.

Our team also considered the risk of unintentionally amplifying the disinformation campaigns that we aggregated. The question of amplification also formed part of an ongoing conversation in the Berkman Klein Center’s Assembly Forum on when and how platforms should disclose disinformation operations. While amplification remains a concern, we believe the risk with Disinfodex is mitigated by the fact that the platforms’ disclosures do not repeat the narratives of these disinformation campaigns.

Seeking feedback and collaboration

At the time of this project’s release, Disinfodex has aggregated and indexed the reports that six online platforms have issued since September 2017. We are committed to furthering the work of Disinfodex and welcome feedback from users, thoughts on how we might build upon it, and ideas about groups or organizations we could partner with to help us expand this project.

Our team can be contacted via email at disinfodex@googlegroups.com, and we intend to communicate updates about the project on the Disinfodex website.

For more information on the Disinfodex project, visit the team’s website. Learn more about the Assembly: Disinformation program at www.bkmla.org.

--

Assembly at the Berkman Klein Center

Assembly @BKCHarvard brings together students, technology professionals, and experts drawn to explore disinformation in the digital public sphere.