7 Big Open Issues in Responsible Data Science and AI

Emily Hadley
RTI Center for Data Science and AI
Nov 21, 2022

Wondering what some of the big open issues are in responsible data science and AI research? Check out this non-exhaustive list of open questions and challenges. Some are issues I’ve encountered in my own work, others I’ve heard about at conferences, and still others I’ve seen in the literature or media. Feel free to comment with challenges I’ve missed.

#1 Using Fairness Metrics for Decision Making in Real-World Settings

A wide variety of fairness metrics have been proposed for machine learning and data science approaches, and calculating them is becoming more accessible with tools like the Responsible AI dashboard or the Aequitas Bias and Fairness Audit. However, there is little adequate or accessible guidance on how to use these metrics in practice. Perspectives on bias and fairness differ not only by sector (such as social media versus healthcare) but also by stakeholder (a patient may prioritize different fairness metrics than a health insurance company). For fairness metrics to become a routine part of decisions about when and how to implement an algorithm, stakeholders need resources for understanding and balancing fairness priorities and for determining what fairness thresholds are relevant to their sector.
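
As a concrete illustration, the sketch below computes two common group fairness metrics by hand with pandas. The data and column names are hypothetical, and a maintained toolkit such as those mentioned above would normally handle these calculations; the point is that the arithmetic is the easy part compared to deciding which metric and which threshold a given stakeholder should care about.

    import pandas as pd

    # Hypothetical scored data: true outcome, model prediction, and a protected group label
    df = pd.DataFrame({
        "y_true": [1, 0, 1, 1, 0, 0, 1, 0],
        "y_pred": [1, 0, 1, 0, 0, 1, 1, 0],
        "group":  ["a", "a", "a", "a", "b", "b", "b", "b"],
    })

    # Demographic parity difference: gap in positive prediction rates between groups
    positive_rates = df.groupby("group")["y_pred"].mean()
    demographic_parity_diff = positive_rates.max() - positive_rates.min()

    # Equal opportunity difference: gap in true positive rates between groups
    true_positive_rates = df[df["y_true"] == 1].groupby("group")["y_pred"].mean()
    equal_opportunity_diff = true_positive_rates.max() - true_positive_rates.min()

    print(f"Demographic parity difference: {demographic_parity_diff:.2f}")
    print(f"Equal opportunity difference:  {equal_opportunity_diff:.2f}")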

#2 Fair Ranking in Search and Recommendations

We use search and recommendation features in a wide variety of applications, from dedicated search engines to the search and recommendation capabilities embedded in social media applications and websites. Safiya Noble made a strong case in Algorithms of Oppression that search algorithms are not neutral and can perpetuate a myriad of biases. These rankings have had real-world impact, such as in LinkedIn job search results. Check out Fair ranking: a critical review, challenges, and future directions for an introduction to the issue, and this Medium blog post about how the Vimeo engineering team uncovered bias in search and recommendations.
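
To make the re-ranking idea concrete, here is a minimal, hypothetical sketch of a greedy fairness-aware re-ranker. It is not the method from the cited review or the Vimeo post; it simply enforces that every prefix of the ranking contains at least a target share of items from a protected group, which is one of the simpler interventions discussed in the fair ranking literature.

    # Greedy re-ranking sketch with hypothetical items and relevance scores.
    # Each item is (item_id, relevance_score, is_protected).
    def fair_rerank(items, target_share=0.4):
        ranked = sorted(items, key=lambda item: item[1], reverse=True)
        protected = [item for item in ranked if item[2]]
        others = [item for item in ranked if not item[2]]
        result = []
        while protected or others:
            protected_so_far = sum(1 for item in result if item[2])
            need_protected = protected and protected_so_far < target_share * (len(result) + 1)
            if need_protected or not others:
                result.append(protected.pop(0))
            else:
                result.append(others.pop(0))
        return result

    candidates = [("a", 0.90, False), ("b", 0.85, False), ("c", 0.80, True),
                  ("d", 0.70, False), ("e", 0.60, True)]
    for item_id, score, is_protected in fair_rerank(candidates):
        print(item_id, score, is_protected)

Even this toy version surfaces the real design questions: which group to protect, what share to target, and how much relevance loss is acceptable.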

#3 Responsible AI with a Focus on Children and Older Adults

We joke that children spend too much time on cellphones while older adults are technologically illiterate. Yet, these jokes reflect a reality where children may indeed be addicted to phones and ageism has impacted design choices in technology and AI development. Researchers and practitioners should consider explicitly exploring the impact that a data science or AI approach may have on the child and older adult populations. A good place to start is UNICEF’s Memorandum on Artificial Intelligence and Child Rights and WHO’s Ageism in Artificial Intelligence for Health.

#4 AI and The Global South

The term Global South is often used to identify regions within Africa, Asia, Latin America, and Oceania. Researchers have suggested that these regions face a variety of specific AI-related risks, including oppression, exclusion, and tech waste. AI scholars have advocated for centering international human rights law and human-centered explainable AI through increasing AI literacy. Areas of focus include amplifying and funding the work of AI researchers and educators in the Global South and supporting further postsecondary education programming.

#5 Understandable Explanations

Explainable AI, often abbreviated as XAI, has become one of the hot topics in the responsible data science and AI space. Researchers and scholars are regularly updating, building upon, and releasing new (often theoretical) approaches for explaining the decisions of otherwise inscrutable black-box models. This taxonomy (Barredo Arrieta et al.) offers a comprehensive overview of the XAI space and highlights one of the most important critiques: an explanation may not be interpretable or understandable, particularly for someone without technical expertise. The potential complexity of explanations has real-world implications, as evidenced by the global dialogue on the “right to explanation” that followed the passage of GDPR. With growing government and regulatory interest in algorithmic transparency, the conversation continues about who is owed an explanation of an algorithmic decision and how understandable that explanation must be.
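
For readers who want something concrete, the sketch below uses permutation importance, one simple, model-agnostic way to summarize which features a black-box model relies on. The data is synthetic and this is only one of many approaches (SHAP, LIME, and counterfactual explanations are common alternatives); the critique above applies to all of them, since a table of importance scores is only an explanation to someone who already understands what the features mean.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance

    # Synthetic data standing in for a real tabular prediction problem
    X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X, y)

    # Shuffle each feature in turn and measure how much the score drops;
    # a large drop suggests the model leans heavily on that feature.
    result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
    for i, importance in enumerate(result.importances_mean):
        print(f"feature_{i}: {importance:.3f}")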

#6 Addressing Online Hate Content

Online hate and harassment is a major issue, particularly for individuals from marginalized communities. Data science and AI have become important tools for automating the process of recognizing and removing hate speech, but these approaches also come with their own challenges. Tech companies and academic institutions have both invested in exploring innovative methodologies for monitoring hate content. Addressing online hate and harassment in real-world settings continues to be an open area of research.
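
Below is a minimal sketch of the kind of classifier that underlies much of this work, assuming a tiny, made-up labeled corpus. Production moderation systems train transformer models on large annotated datasets and still struggle with context, coded language, and dialect, which is part of why the problem remains open.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Tiny illustrative corpus; 1 = hateful, 0 = not hateful
    texts = ["you are wonderful", "I hate you and your kind", "great game last night",
             "go back to where you came from", "thanks for the help", "nobody wants you here"]
    labels = [0, 1, 0, 1, 0, 1]

    # Bag-of-words baseline; real systems typically fine-tune large language models instead
    classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
    classifier.fit(texts, labels)
    print(classifier.predict(["we do not want your kind here"]))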

#7 Bias in Language Models

Large language models are revolutionizing the way humans interact with text data, from improved search engines to automated news articles to finding bugs in a model. Yet, without appropriate guardrails, large pre-trained language models are likely to perpetuate biases that already exist in their text sources. Examples include stereotypes regarding gender, race, profession, and religion, including the persistence of anti-Muslim bias. Researchers are actively exploring ways to mitigate these biases. These conversations are particularly important given the increasing accessibility of tools like GPT-3.
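
One informal way to see this is to probe a masked language model and compare its completions across otherwise identical prompts. The sketch below uses the Hugging Face transformers fill-mask pipeline with bert-base-uncased (my choice of model, not one tied to any particular study); systematic audits rely on benchmark suites such as StereoSet or CrowS-Pairs rather than spot checks like this.

    from transformers import pipeline

    # Downloads the model on first run
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # Compare pronoun completions for two prompts that differ only in the profession
    for prompt in ["The nurse said [MASK] would be back soon.",
                   "The engineer said [MASK] would be back soon."]:
        print(prompt)
        for prediction in fill_mask(prompt, top_k=3):
            print(f"  {prediction['token_str']} (score={prediction['score']:.3f})")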

Bonus: Privacy and Regulation

Privacy and regulation are both key issues for responsible AI and data science. Privacy includes the protection of the identity and rights of the entity that provided the data, and work in this space closely overlaps with broader cybersecurity, information technology, and database management efforts. Regulation includes state and federal policies, rules, and procedures, and intersects with public policy, political science, and legal frameworks. Both topics are critical areas of development and research.
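
On the privacy side, one widely used technical tool is differential privacy. The sketch below shows the basic Laplace mechanism for a counting query, with a made-up dataset and an arbitrarily chosen epsilon; deciding what epsilon is acceptable for a given data release is as much a policy question as a technical one.

    import numpy as np

    rng = np.random.default_rng(0)

    def dp_count(records, epsilon=1.0):
        # Laplace mechanism for a count: a counting query has sensitivity 1, so noise
        # drawn from Laplace(scale=1/epsilon) gives epsilon-differential privacy.
        return len(records) + rng.laplace(loc=0.0, scale=1.0 / epsilon)

    # Hypothetical: respondents who reported a sensitive attribute in a survey
    respondents = ["r1", "r2", "r3", "r4", "r5"]
    print(dp_count(respondents, epsilon=0.5))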


Thanks for taking the time to read this list. It’s intended as a starting point — feel free to recommend other important topics in the comments.

This blog post is part of a Deep Dive into Responsible Data Science and AI series.

Disclaimer: Support for this blog series was provided by RTI International. The opinions expressed by the author are their own and do not represent the position or belief of RTI International. Material in this blog post series may be used for educational purposes. All other uses including reprinting, modifying, and publishing must obtain written consent.
