Locked doors: researcher access to social media data — a reading list
This reading list foregrounds our new project on researcher access to social media data. It aims to give an overview of the complex challenges that researchers face in accessing social media data, and current attempts to address those challenges across different contexts.
I’m Sasha Moriniere, a Researcher at the ODI, passionate about investigating power dynamics in digital ecosystems and advocating for more equitable access to privately-held data for public interest researchers.
This blog was written collaboratively with Sophia Worth, Jared Robert Keller and Claudine Tinsman from the Open Data Institute (ODI).
New virtual realms and social media environments are integral to our daily, political, cultural and social lives. There are increasing calls for research aimed at understanding how these platforms work, the type of content circulating, the engagement they generate and, importantly, how to hold the companies running these platforms accountable. However, to conduct effective investigations, researchers need data and, regrettably, the companies operating these platforms often resist granting researchers access.
This reading list compiles resources that engage with the complex questions surrounding why access to social media data is so important for public-interest research — and how to make that happen.
Enabling this type of research is important because it plays a critical role in addressing challenges that affect many aspects of society, including trust in institutions and democracy, and public health. These challenges include mis- and disinformation, political polarisation and mental health. Our new programme on global data infrastructure therefore seeks to address why more robust social media data access matters.
The reading list below is a starting point for the questions and challenges we are exploring in our new project on enabling access to social media data for public-interest research. We hope that this will be a helpful resource for:
- Researchers seeking to understand how to access social media data
- Policymakers who want to understand the recent calls for increased access to social media data
- People who want a comprehensive view of the landscape.
The need to access social media data for research
Access to social media data is being restricted despite its societal benefit. In 2023, social media platforms such as X and Reddit restricted researchers’ access, and these moves were heavily criticised by the research community and beyond.
- Imposing Fees to Access the Twitter API Threatens Public-Interest Research
- Twitter just closed the book on academic research
- Facebook Hosted Surge of Misinformation and Insurrection Threats in Months Leading Up to Jan. 6 Attack, Records Show
- Reddit’s upcoming API changes will make AI companies pony up
- Why Reddit’s decision to cut off researchers is bad for its business — and humanity
- Letter: Twitter’s New API Plans Will Devastate Public Interest Research
- Elon Musk goes to war with researchers
- Twitter’s $42,000-per-Month API Prices Out Nearly Everyone
This troubling trend is causing damage. It jeopardises research that uses social media data to study the spread of harmful content and mis- and disinformation, news consumption, public health, elections, and the impact of Covid-19 on teacher resignation and mental health, among other topics.
- ‘Scraping’ Reddit posts for academic research? Addressing some blurred lines of consent in growing internet-based research trend during the time of Covid-19
- Use of Reddit for Social Science Research: A Review of Current Use, Exploration of Potential Sampling Error, and Practical Demonstration Using Reddit to Study Post-Pandemic Teacher Resignation
- How new Twitter API rules could hinder war crimes research and rescue efforts
- When the internet becomes unknowable
- Study warns API restrictions by social media platforms threaten research
This damage both highlights and is exacerbated by the power structures at play in data ecosystems, which in turn create data asymmetries.
- Identifying and addressing data asymmetries so as to enable (better) science
- Rethinking Data and Rebalancing Digital Power
- Access Rules: Freeing Data from Big Tech for a Better Future
- We must fix researcher access to data held by social media platforms
- The Challenges of Conducting Open Source Research on China
Existing solutions, practical guides and resources
Many organisations are working to address these challenges on different fronts.
Academics and civil society actors seek more robust access and are thus formulating recommendations for improved governance or technological developments in this direction.
- Report of the European Digital Media Observatory’s Working Group on Platform-to-Researcher Data Access
- Access to Social Media Data for Public Interest Research: Lessons Learnt & Recommendations for Strengthening Initiatives in the EU and Beyond
- Ethics of Social Media Research: Common Concerns and Practical Considerations
- After the ‘APIcalypse’: Social Media Platforms and Their Fight against Critical Scholarly Research
- Digital Policy Lab (DPL)
- European Digital Media Observatory’s (EDMO) Working Group on Platform-to-Researcher Data Access
- Coalition for Independent Technology Research
Policymakers and government officials have also pushed for more robust data access in recent years. This work has translated into policy and legislation enabling more access in different countries and regions.
- Independent Researcher Access to Social Media Data: Comparing Legislative Proposals
- FAQs: DSA data access for researchers
- Online safety bill: changes urged to allow access to social media data
Different types of organisations are providing more practical tools and guidance for researchers. We have explored existing repositories containing tools for data access on social media (and beyond) and have curated relevant resources that point to various solutions, practical guides, and tools aimed at enhancing social media access for public-interest research purposes.
Some organisations and academic institutions are putting together resources and guidance…
- Social Media Research Guide [University of Michigan: a reading list of library materials for those looking to learn more]
- University of Oxford Best Practice Guidance for Internet-Mediated Research [University of Oxford: more comprehensive guidance]
- Ethics guidelines for internet-mediated research [British Psychological Society]
- ESRC Social Media Research: a guide to ethics
- Good practice in research: Internet-mediated research
- Resources for social media data
Other organisations are providing technical access by developing tools…
- Platform Researcher Access Tools & The Brussels Effect
- Social Media Research, American University of Washington DC
- Investigate TikTok Like A Pro!
- Sentiment Analysis and Text Mining for Social Media Microblogs using Open Source Tools: An Empirical Study
… as well as emerging techniques to better investigate virtual spaces, such as Open Source Intelligence (OSINT).
- OSINT Framework
- Setting Your Moral Compass: A Workbook for Applied Ethics in OSINT
- Curated list of OSINT tools
- Bellingcat’s online investigations toolkit
- These are the Tools Open Source Researchers Say They Need
- Social Media OSINT: A Comprehensive Guide to Gathering Intelligence from Social Media Platforms
- Bellingcat’s resources
We’ve also started to compile technical solutions that come from the social media platforms themselves to enable better access for research.
- Meta Announces New Program To Share Data With Academic Researchers
- Meta Partners with the Center for Open Science to Share Data to Study Well-being Topics
- Researchers will get access to TikTok data — pending company approval
However, these have been subject to scrutiny, and some journalists and researchers have analysed what they consider failed attempts by platforms to open up their data for research.
- The Problem with TikTok’s New Researcher API is Not TikTok
- TikTok’s data rules are keeping researchers from studying the app
- Facebook made big mistake in data it provided to researchers, undermining academic work
Other organisations and researchers are conducting research and providing guidance in related sectors and fields that might be translated to social media access.
- An Introduction to the Dataverse Network as an Infrastructure for Data Sharing
- Towards a sustainable, multilateral, and universal solution for international data transfers
- Mapping the landscape of data intermediaries
- Open banking: setting a standard and enabling innovation
- Case study: Open banking — ODI leads fintech innovation and growth
- OpenSAFELY: factors associated with COVID-19 death in 17 million patients
- OpenSAFELY research
- Privacy-preserving data sharing infrastructures for medical research: systematization and comparison
- Private-sector data for public good: modelling data access mandates
- Data for Research (Data Wrangling)
At the ODI, for example, our past work on data access and sharing has shown that different types of approaches can be used together or separately, which highlights the flexibility and adaptability available in addressing data access challenges.
What’s next?
At the ODI, we have committed to addressing global challenges by enhancing data sharing and infrastructure.
The first project in this work will focus on enabling access to privately-held data for public-interest research, across three key areas:
- Enabling access to social media data: We’re rigorously examining and differentiating methods to access platform and social media data globally, including APIs, scraping, regulatory mandates, and alternative legislative approaches. We will consider how these different approaches can and should be used across different contexts.
- Clarifying ‘public data’: Our research will examine the many nuanced interpretations of ‘public data’. Through collaboration with experts and stakeholders globally, we aim to generate consensus on what constitutes, or should constitute, public data, taking into account varying regional perspectives and their implications for the legality of collection and use.
- Imagining alternative presents: Through this creative work stream we will envision what the last decade might have looked like if researchers hadn’t been able to access social media platform data.
Get involved
To contribute research to this reading list or participate in shaping the project, please get in touch via email at research@theodi.org. Your collaboration will be welcomed and appreciated.