How to use data for good — 5 priorities and a roadmap

Scaling social impact is not easy, but it is worth it

We live in challenging times. From climate change to food insecurity and forced migration, the difficulties confronting decision makers are unprecedented in their variety, and also in their complexity and urgency. Our standard policy toolkit seems stale and ineffective while existing governance institutions are increasingly outdated and distrusted.

To tackle today’s challenges we need not only new solutions but also new methods for arriving at solutions. Data and data science will become more central to meeting these challenges and to social innovation, philanthropy, international development, and humanitarian aid. From analyzing satellite imagery to mapping poverty to using social media data to track the global digital gender gap, “Data Science for Social Good” holds great promise.

The potential (and challenges) of Data Science for Social Good was the focus of a dedicated workshop held by the ISI Foundation, UNICEF and The GovLab as part of the 2019 Web Conference on May 14, 2019. The workshop also explored ways to develop new data collaboratives to unlock data and data science capabilities from both the public and private sectors.

Throughout the workshop, distinguished keynote speakers and paper presenters addressed a wide range of data science applications and experiments. Several participants focused, for instance, on the role of social media in addressing public concerns, such as informing the public on vaccinations, communicating during crises, and promoting electoral integrity. Others discussed how technology could be used to meet local needs, whether that be in health or education.

The workshop also featured a panel of private sector representatives. Chaya Nayak from Facebook, Claudia Juech from Cloudera Foundation and Paul Ko from LinkedIn reflected on their efforts to leverage data science for good through projects like Social Science One and the Economic Graph.

While the overarching message emerging from these case studies was promising, participants identified several barriers that, if not addressed systematically, could undermine the potential of data science to address critical public needs and limit the opportunity to scale the practice more broadly.

Below, we summarize the five priorities for the field that emerged from the workshop.

1) Become People-Centric. Much of the data currently used for drawing insights involves or is generated by people.

These insights have the potential to impact people’s lives in many positive and negative ways. Yet, the people and the communities represented in this data are largely absent when practitioners design and develop data for social good initiatives.

To ensure data is a force for positive social transformation (i.e., that it addresses real people’s needs and impacts lives in a beneficial way), we need to experiment with new ways to engage people at the design, implementation, and review stages of data initiatives, beyond simply asking for their consent.

As we explain in our People-Led Innovation methodology, different segments of people can play multiple roles ranging from co-creation to commenting, reviewing and providing additional datasets.

(Photo credit: Image from the people-led innovation report)

The key is to ensure their needs are front and center, and that data science for social good initiatives seek to address questions related to real problems that matter to society at large (a key concern that led The GovLab to launch the 100 Questions Initiative).

2) Establish Data About the Use of Data (for Social Good). Many data for social good initiatives remain fledgling.

As currently designed, the field often struggles with translating sound data projects into positive change. As a result, many potential stakeholders (private sector and government “owners” of data as well as public beneficiaries) remain unsure about the value of using data for social good, especially against the background of high risks and transaction costs.

The field needs to overcome such limitations if data insights and their benefits are to spread. For that, we need hard evidence about data’s positive impact. Ironically, the field is held back by an absence of good data on the use of data: a lack of reliable empirical evidence that could guide new initiatives.

The field needs to prioritize developing a far more solid evidence base and “business case” to move data for social good from a good idea to reality.

3) Develop End-to-End Data Initiatives. Too often, data for social good initiatives focus on the “data-to-knowledge” pipeline without focusing on how to move “knowledge into action.”

As a result, impact remains limited and many efforts never reach an audience that can actually act upon the insights generated. Unless we become more sophisticated at delivering end-to-end projects and taking “data from knowledge to action,” the positive impact of data will stay limited. Doing so may require involving designers and engaging other disciplines, as well as developing new capacities among both data scientists and decision-makers.

4) Invest in Common Trust and Data Steward Mechanisms. For data for social good initiatives (including data collaboratives) to flourish and scale, there must be substantial trust among all parties involved and among the public at large.

Establishing such a platform of trust requires each actor to invest in developing essential trust mechanisms such as data governance structures, contracts, and dispute resolution methods. Today, designing and establishing these mechanisms takes tremendous time, energy, and expertise. These high transaction costs result from the lack of common templates and the need to design governance structures from scratch each time.

Absent templates or data steward initiatives like a contractual wheel of data collaboration, the start-up and maintenance costs to establish trust may be too high for many projects, especially those envisioned or launched by smaller organizations.

(The Contractual Wheel of Data Collaboration)

5) Build Bridges Across Cultures. As C.P. Snow famously argued in his lecture “The Two Cultures and the Scientific Revolution,” we must bridge the “two cultures” of the sciences and the humanities if we are to solve the world’s problems.

As a field, data science requires collaboration among multiple disciplines and wide-ranging fields of expertise and knowledge. One pathway toward this goal is to nurture what we at The GovLab call “bilinguals,” individuals who possess both domain and data science expertise. Similarly, we need to bridge two other divides limiting the potential of data for social good: the divide between policy practice and scholarship, and the divide between the private and public sectors.

Taken together, these five priorities provide a roadmap to accelerate data science for social good in a systematic, sustainable and responsible way. They may also provide the building blocks to develop more coordinated approaches to what is becoming a fragmented field, including the somewhat divergent focus at this week’s ITU Summit on AI for Good (which ultimately depends on data).

To implement these five priorities we will need experimentation at both the operational and the institutional level. This involves establishing “data stewards” within organizations who can accelerate data for social good initiatives in a responsible manner, integrating the five priorities above.

Stefaan G. Verhulst

(Grateful to Natalia Adler, Daniela Paolotti, Ciro Cattuto, Leo Ferres, and Andrew Zahuranec for their input.)

Reposted from apolitical, May 30, 2019