If We Want Platforms to Think Beyond Engagement, We Have to Know What We Want Instead

Partnership on AI
Nov 10
A toy claw machine hangs above a pile of emojis

By Claire Leibowicz (PAI), Connie Moon Sehat (Hacks/Hackers), Adriana Stephan (PAI), Jonathan Stray (UC Berkeley CHAI)

Which values should the algorithms driving online content recommendations promote?

What we view on social media, news aggregators, and online marketplaces is largely dictated by algorithms known as recommender systems. These systems are driven by user behaviors such as clicking, liking, or sharing content — collectively known as user engagement. Relying on user engagement metrics can result in the proliferation of misinformation, abusive or extremist content, and addictive use that can negatively affect both individuals and societies. Surely, values other than individual user engagement could be taken into account to improve online spaces. Alternatives include principles such as safety, well-being, agency, justice, and even emotions such as awe and inspiration.

But how to decide among them? And who gets to decide which values are important?

As part of a broader effort to publish a multidisciplinary paper bridging human values and recommender systems, the Partnership on AI (PAI) convened members of its AI and Media Integrity Program (including both representatives of Partner organizations and outside participants) working in industry, media/journalism, civil society, and academia to consider this topic. PAI asked meeting participants to complete a survey about what values they believed were most important for recommender system-based platforms to attend to. After receiving 29 responses, PAI facilitators then broke participants into groups to discuss the results.

This exercise was not intended to authoritatively determine which human values are appropriate to promote online; insights from a 29-person survey and conversation are by no means statistically significant. However, participants in the AI and Media Integrity Program bring broad experience in handling thorny ethical questions of AI from around the world and diverse professional perspectives including those of computer scientists, journalists, and human rights defenders. With input from these stakeholders, the opportunities for addressing the challenges of recommender systems through participatory input became a little clearer.

Including more stakeholders in the development of recommender systems (beyond just those creating them) will be a critical step towards developing greater understanding of what values should be promoted online. Even then, the best approach to incorporating these perspectives requires further study, as revealed by this survey exercise.

There has been a lot of recent research regarding human values and ethics in AI. We began developing the values survey through assessment of different values mentioned previously in recommender system design literature. The BBC, IEEE, and Berkman Klein Center have all published useful resources on values and AI that informed our survey, and we also relied on foundational human rights documents. Further, efforts like New Public have helped promote a broader evaluation of how to create better online spaces.

Ultimately, we selected 31 values for inclusion in the survey, including everything from agency to duty to self-expression to labor.

Creating the list was a major challenge: our list of values needed enough specificity and nuance to distinguish among overlapping and vague concepts. How, for example, should one differentiate precisely among values such as connection, empathy, belonging, and tolerance? In light of this, we aimed for an intermediate level of specificity — not simply “do good,” but not as specific as the precise metrics used by product teams, either. We also leaned heavily on previous literature.

Six types of recommender systems were selected for respondents to consider: social media (e.g., TikTok, Facebook), streaming media (e.g., Spotify, Netflix), news aggregators (e.g., Google News), online shopping (e.g., Amazon’s marketplace), video sharing (e.g., YouTube), and targeted advertising. Respondents were first asked to narrow the scope of their values evaluation by choosing the one type of recommender system they were “most concerned about” from the six options. The survey then asked, “How important is it for the platform you are most concerned about to attend to each of the values?” which respondents answered using a five-point numeric scale. The survey hinted that values might be in tension with one another. “We know that different values might conflict. That’s OK!” read the instructions.

Given that a single list of 31 values seemed longer than a survey respondent could easily digest, we presented the values in four groups. These groups loosely reflect ongoing cultural, psychological, and sociological research around values and value tensions (cases where individual and social values can conflict): the relationship of individuals to society, interdependent or community values, personal development and interests, and moral and ethical social order.

Individuals in Society:
Agency & Autonomy
The platform should help users achieve their goals. The platform should not manipulate users for the benefit of other stakeholders.
Control
The platform should give users ways to control the content selected for them. The platform should give users ways to control how the content they create is shared.
Freedom of Expression
The platform should not stop users from expressing their thoughts and opinions. A platform should not unduly suppress distribution of user posts. Users should feel safe expressing themselves.
Liberty
A platform should not stop users from exploring certain types of information. Users should be able to pursue their own good in their own way.
Privacy
The platform should protect the privacy of users’ data by using and sharing it only for legitimate aims. The platform should allow users to determine if and how their personal data is collected, used to inform personalization, and shared.
Safety & Security
Users should feel safe using the platform. There should be a low prevalence of harmful user outcomes.
Interdependence and Community:
Accessibility & Inclusiveness
The platform should work well for anyone, regardless of their background, ability, or status.
Care/Compassion/Empathy
The platform should foster interpersonal relationships founded on compassion, kindness, sympathy and generosity.
Civic Engagement
Platforms should help users become informed and effective members of their political communities.
Community & Belonging
Platforms should help people come together around shared goals and identities. Platforms should provide users opportunities to become a valued member of a community.
Connection
Platforms should help users build meaningful relationships and feel close to people.
Tolerance & Constructive Discourse
The platform should encourage respect for differences among users. The platform should try to ensure that disagreement or conflict among users is constructive, and prevent it from escalating in destructive ways.
Transparency & Explainability
Platforms should explain how their recommendations appear, answering questions for their users such as “Why am I seeing this?” or “Why has this been removed?” Platforms should publicly disclose and discuss changes to moderation or ranking.
Moral and Ethical Order:
Accountability
The platform should have processes for users to bring up problems and resolve them in a timely manner. The platform should explain its decisions and justify them based on clearly defined principles.
Accuracy
Information on the platform should be accurate, or consistently labelled when not credible. Users should be able to effectively judge the credibility of information on the platform.
Diversity
The platform should expose users to a variety of topics, sources, and perspectives.
Duty
The platform should help people fulfill their obligations to family, community, nation, and one another.
Environmental Sustainability
The platform should suggest information and products with lower environmental impact. The platform should encourage sustainable practices.
Equality & Equity
Platforms should strive for equality of certain types of opportunities and outcomes.
Justice & Fairness
Platforms should provide a morally justifiable, fair, and equitable distribution of benefits, harms, and risks.
Knowledge & Informativeness
Users should see items that keep them informed about topics they care about.
Progress
The platform should promote the advancement of society.
Personal Development and Interests:
Happiness & Well-Being
Users should see content that leads them to experience contentment, joy and pleasure, both ephemerally and over the long term. Platforms should help users feel satisfied with their lives.
Inspiration & Awe
The platform should show users content that will inspire, motivate, or guide them.
Labor
The platform should help users to pursue and engage in meaningful work.
Mental Health
Platforms should help users protect and improve their mental health. Platforms should not encourage unhealthy types or amounts of use.
Physical Health
Platforms should help users protect and improve their physical health, such as by providing accurate health information and encouraging behaviors that contribute to physical health.
Recognition & Acknowledgment
Platforms should provide ways for other people to recognize a user’s contributions or worth.
Self-Actualization & Personal Growth
Platforms should help users reach their full potential. Platforms should address users’ need to learn new things and develop skills.
Self-Expression
The platform should empower users to express their identity (including personality, attributes, behavior) and to decide how it is presented to others.
Tradition & History
The platform should help people live and preserve their cultural heritage and traditions.

This survey has a number of limitations that were clear from the beginning, beyond the most salient one, the survey’s sample size. Despite the inclusion of interdependent values that are significant in non-Western cultural research, the survey was inevitably shaped by a mostly U.S. perspective. While PAI’s membership includes organizations from non-Western regions, the bulk of survey respondents came from the U.S. Still, the exercise provided broader and more systematic insight into value priorities than our small author group could offer. In addition, PAI gained beneficial perspectives and directions from its own membership about desired research and policy directions.

After participants took the survey, we divided them into three groups to discuss their responses and to offer feedback. Two thematic challenges for pinpointing values became clear: how to negotiate value tensions and how to provide adequate context for this negotiation.

The challenge of negotiating values was, for discussants, a question of ethical trade-offs.

How, for example, does one approach the trade-off between giving people the agency to personalize what they see on social media versus promoting diversity? What is the trade-off for someone else’s diminished safety versus liberty when an individual decides what is for their own good? One solution offered was for platforms to operate on the “do no harm” principle by prioritizing major problems and promoting other values only when harm was completely mitigated. Value trade-offs also connected to questions regarding system design. Many participants agreed that there should be transparency around who decides which values are important and what guiding principles platforms use to negotiate them.

We also learned a little more about what kinds of context may be needed to prioritize values.

Participants recognized the same geographical limitations of the survey that we did: Both the list of values and the opinions of the survey respondents cannot be generalized globally. They also asked questions about the scope of the survey’s own aspirations: Are we ranking the most important problems for systems today, or describing the values of a hypothetical system as we’d like it to be? For that matter, are we ranking based on what the ideal version of freedom of expression would look like around the world? Some, noting the ways that the survey alternated between focusing on platform power and user agency, asked if the questions took for granted that platforms actually have the power to affect things that are potentially only within the user’s power to change, such as attending to one’s physical health.

Twenty-nine responses do not offer statistically significant insights. However, taken as a qualitative supplement to our own considerations, and alongside facilitated conversation about the survey, even these few responses provided interesting avenues to explore for future work.

Value prioritization may be different depending upon the type of recommendation product.

Figure 1. Venn diagram highlighting values rated as very important for two types of recommender systems.

Consistent with viewpoints expressed in group discussion, survey respondents rated values differently depending on the platform product they selected. Only two of the six product types received enough responses for meaningful assessment: news aggregators and social media. On such a small set of data, median scores provided a way to roughly generalize priorities. Several values were rated as very important for both news aggregators and social media recommenders. For both types of recommender systems, accessibility & inclusiveness, accuracy, accountability, privacy, and tolerance & constructive discourse were ranked with the highest median score of 5.0 (Figure 1). In news aggregators, justice & fairness and equality & equity were also ranked with the highest median score. For social media, the values of safety & security, mental health, control, agency & autonomy, and transparency & explainability were also ranked as such. Happiness & well-being and connection were rated least worthy of attention for news aggregators, while inspiration & awe was of lesser importance for social media platforms.
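The median-based summarization described above can be sketched in a few lines of code. The ratings below are hypothetical illustrations, not the survey’s actual responses; the sketch only shows the mechanics of grouping 1–5 ratings by platform type and value, taking medians, and flagging values with the top median of 5.0.

```python
from statistics import median
from collections import defaultdict

# Hypothetical ratings: (platform_type, value_name, rating on a 1-5 scale).
# These numbers are illustrative only, not the survey's actual data.
responses = [
    ("social media", "accuracy", 5), ("social media", "accuracy", 5),
    ("social media", "accuracy", 4),
    ("social media", "inspiration & awe", 2),
    ("social media", "inspiration & awe", 3),
    ("news aggregators", "accuracy", 5), ("news aggregators", "accuracy", 5),
    ("news aggregators", "connection", 2),
    ("news aggregators", "connection", 3),
]

def median_ratings(responses):
    """Group ratings by (platform, value) and return the median of each group."""
    grouped = defaultdict(list)
    for platform, value, rating in responses:
        grouped[(platform, value)].append(rating)
    return {key: median(scores) for key, scores in grouped.items()}

medians = median_ratings(responses)

# Values rated "very important" (median of 5.0) for one platform type:
top_social = [value for (platform, value), m in medians.items()
              if platform == "social media" and m == 5.0]
print(top_social)  # → ['accuracy']
```

With only a handful of responses per product type, the median is a reasonable rough summary because it is insensitive to a single outlying rating, which is why it was used here rather than the mean.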

Figure 2. Visualization depicting trends in median scores for the 31 values, by sector.

Value prioritization may differ according to where you work.

Participants from different sectors evaluated values’ priority differently (Figure 2). Only the value of accountability appeared across all sectors among the highest prioritizations, perhaps understandable given the self-selecting ethical AI interests of PAI participants. People from industry tended to think that every value was worthy of at least some attention. Civil society respondents rated a larger portion of the 31 values as most important, which makes negotiating trade-offs more difficult. Academic responses lay somewhere in between.

Simple value prioritizations may result in values competing with one another.

The responses revealed values that are potentially in tension with one another. For example, civil society members rated safety and agency highly. When it comes to users who may have an unhealthy relationship to dieting, how might one promote the value of safety (“Users should feel safe using the platform. There should be a low prevalence of harmful user outcomes.”) alongside agency (“The platform should help users achieve their goals.”)? Negotiating tensions like these in a transparent way will require tools outside of surveys, such as multistakeholder efforts to produce technical definitions, taxonomies and datasets, and/or standards.

Surprises within value prioritization may suggest avenues to better understand tensions and trade-offs between individual and community values.

Freedom of expression, even for this U.S.-centric bunch, was not rated as among the highest priorities. (Interestingly, while freedom of expression did not number among the highest priorities, accountability did for participants from all sectors.) This ranking may reflect participant sensitivity to the fact that freedom of expression may not be valued as highly across the world in contrast to other values such as privacy and safety. At the same time, we can ask questions such as why community & belonging is ranked lower for social media, an ostensibly social platform, than for news aggregators. Understanding the answer to this question, and others stemming from value tensions, may help provide a path toward value negotiations and realization in recommender systems.

In the end, we learned a bit more about how to help larger groups of people offer input into values for recommender systems. Equally desirable values for individuals and society may conflict and require ethical trade-offs, which means that the more that can be done to contextualize participants’ understanding and approach, the better. Bringing specificity to the broader conversation by addressing different types of recommender systems individually, since they have different needs and characteristics, might help better align values and algorithms. Further, since we identified that different sectors and cultures might bring with them differing opinions on which values to prioritize algorithmically, special attention must be placed on incorporating input from these groups in inclusive and comprehensive ways.

There are a number of possibilities for more comprehensive value-definition efforts. This work surveyed AI experts, but we would also want extensive research with recommender system users, both qualitative interviews and large, quantitative surveys (as the BBC has recently done). More ambitiously, it may be possible to involve multiple stakeholders directly in the development of value-sensitive optimization metrics, as the WeBuildAI project has demonstrated.

In the short term, PAI is participating in a multidisciplinary paper describing the opportunities and challenges in aligning human values and recommender systems that will help ground the field. This work will include proposed metrics in place of user engagement for designing recommender systems that promote different value priorities. Through these efforts, participants in the AI and Media Integrity Program can continue learning how to create more participatory design processes for the technologies that affect platform users all around the world.

Acknowledgements: The development of this survey was led by the post’s authors as well as Parisa Assar (Meta), Alon Halevy (Meta), Sara Johansen (Stanford), Lianne Kerlin (BBC), Polina Proutskova (BBC), and Spandana Singh (New America’s Open Technology Institute).

The authors of this Medium post, listed alphabetically at the top, contributed equally to its creation.

AI&.

Advancing positive outcomes for people and society


Written by Partnership on AI

The Partnership on AI is a global nonprofit organization committed to the responsible development and use of artificial intelligence.
