Data Stewards: Data Leadership to Address 21st Century Challenges
Identifying and connecting leaders within the private sector to increase the responsible use of data for good
We live in challenging times. From climate change to terrorism, the difficulties confronting policy makers are unprecedented in their variety, but also in their complexity. Our existing policy tool kit seems stale and outdated. It is increasingly clear that we need not only new solutions but also new methods for arriving at solutions.
Data, and new methods for organizations to collaborate in order to extract insights from data, are likely to become more central to meeting these challenges. We live in a quantified era. It is estimated that 90% of the world’s data was generated in the last two years, and entirely new inferences can be extracted from that data and applied to help address some of today’s most vexing problems.
In particular, the vast streams of data generated through digital transactions, mobile phones, social media platforms, GPS devices and other sensors, when analyzed responsibly, can offer insights into societal patterns and behaviors. These insights could be harnessed to inform decision-making, solve policy issues, provide for new scientific insights, enable more targeted interventions and improve public service delivery. At the same time, large datasets create their own problems — of complexity and noise, risks to privacy and security, and the potential to have a disparate impact on already vulnerable populations.
One of the greatest tasks of our era may be figuring out how to unlock and harness the value of this data to provide actionable insights for positive social and economic impacts.
How to harness the potential of private data toward public ends?
Over the last two years, we have focused on the opportunities (and challenges) surrounding what we call “data collaboratives.” Data collaboratives are an emerging form of public-private partnership, in which information held by companies (or other entities) is shared with the public sector, civil society groups, research institutes and international organizations. These entities — a new form of collaboration for the data age — have now been tried out across sectors, and around the world. In Estonia, anonymized mobile phone data is being used to understand the volume of tourism and foreign workers, and to tailor government services and transport options to them accordingly. In Namibia, satellite imagery and telecom businesses are sharing data to help track the spread of malaria. Data collaborative-type entities have also emerged in sectors as varied as agriculture, climate change mitigation, migration, economic development, and poverty alleviation, among many others.
For all its promise, the practice of data collaboratives remains ad hoc and limited. In part, this is a result of the lack of a well-defined, professionalized concept of data stewardship within corporations that has a mandate to explore ways to harness the potential of their data towards positive public ends.
Today, each attempt to establish a cross-sector partnership built on the analysis of private-sector data requires significant and time-consuming efforts, and businesses rarely have personnel tasked with undertaking such efforts and making relevant decisions.
As a consequence, the process of establishing data collaboratives and leveraging privately held data for evidence-based policy making and service delivery is onerous, generally one-off, not informed by best practices or any shared knowledge base, and prone to dissolution when the champions involved move on to other functions.
By establishing data stewardship as a corporate function, recognized and trusted within corporations as a valued responsibility, and by creating the methods and tools needed for responsible data-sharing, the practice of data collaboratives can become regularized, predictable, and de-risked.
With support from the Hewlett Foundation, the GovLab has embarked on a project called Datastewards.net toward conceptualizing and professionalizing data stewardship (and the use of data collaboratives) and establishing well-defined data responsibility approaches. Brightfront and Adapt are our operational partners.
To take stock of current practice and to scope needs and opportunities, we held a small yet in-depth kick-off event at the offices of the Cloudera Foundation in San Francisco on May 8th, 2018. It was attended by representatives from LinkedIn, Facebook, Uber, Mastercard, DigitalGlobe, Cognizant, Streetlight Data, the World Economic Forum, and NetHope, among others.
Participants described their experiences with data collaboration, mapped key opportunities and challenges, and also discussed the roles and responsibilities of “data stewards”: groups or individuals within corporations responsible for how data is collected, stored and used, and upon whose decisions the ultimate success or failure of data collaboratives often depends.
Four Key Takeaways
The discussions were varied and wide-ranging.
Several reflected on the risks involved, including the risks of NOT sharing or collaborating on privately held data that could improve people’s lives (and on some occasions save lives).
Others warned that the window of opportunity to increase the practice of data collaboratives may be closing — given new regulatory requirements and other barriers that may disincentivize corporations from engaging with third parties around their data.
Ultimately, four key takeaways emerged. These areas, at the nexus of opportunities and challenges, are worth considering further, because they help us better understand both the potential and limitations of data collaboratives.
1 Maturity: Attendees pointed out that the field of data collaboratives is fledgling and still ill-defined, and that this poses certain challenges to more widespread adoption.
In particular, awareness among the public at large is limited, which leads to ambivalence or even outright suspicion. Corporations themselves also often don’t appreciate the value of sharing data (or, like the public, remain skeptical), which similarly limits the amounts of private data that are shared or made available for the public good. There often exists confusion or a lack of understanding on how to pursue the aims of public good in the context of a profit-driven business model.
In addition, conference attendees pointed out that both holders and recipients of data often lack the requisite skills or resources to adequately use and maximize the potential of shared information. Increasing human and technical capacity — along with raising general awareness — is therefore critical to expanding the use of data collaboratives. More generally, the field requires more research, and a better, shared understanding of what works, and what doesn’t.
2 Transaction Costs: Conference participants also pointed to the high transaction costs (not limited to financial costs) faced by both sharers and recipients of data.
These can take the form of costs of preparing data; identifying and vetting potential partners; de-risking data (e.g., to ensure privacy); and of negotiating both the legal and commercial terms of sharing between participants.
Technical interoperability can also be an issue; data accessibility and usability remain key challenges.
Importantly, transaction costs are especially burdensome for small businesses and other entities that are often under-funded, or under-resourced in other ways. Eliminating or mitigating these transaction costs is therefore essential not only to more widely disseminating data collaboratives, but also to ensuring a level playing field that may widen the scope for innovation.
3 Scaling: Despite lessons learned from existing examples, it is often difficult to scale or replicate data collaboratives. What works in one instance frequently seems to be less successful in another. A promising small-scale sharing project may have trouble spreading its wings to include new and more data. These are all key issues to be dealt with if the potential of data sharing is to be realized.
To overcome these challenges, participants suggested the need for new unified brokerage platforms with mechanisms that could help groups find new opportunities or identify new partners for collaboration.
Pooling of efforts, expertise and experience is perceived as essential, especially for smaller groups that lack the necessary resources.
Overall, there remains a clear need for best practices and standardized pathways that can help guide small, incipient projects into larger, more successful — and more socially resonant — collaborations.
And finally, there is a need for buy-in from senior leadership at the C-level (leaders with a mandate to make decisions and set direction) to embrace the opportunity, set an example, and take responsibility for leveraging data for public good.
4 Community of practice: Finally, there was broad consensus at the conference on the need for a well-established community of practice and expertise where current and future data stewards could share experiences and resources. Such an environment, which could take the form of either formal or informal structures, would offer a safe environment for practitioners to trade stories, tools and lessons.
Some called for the creation of a repository that would include a collection of existing MOUs, contract language, and legal frameworks, as well as a mapping of firms pioneering data uses for public good.
Furthermore, senior leadership from the private sector could assume ownership of such collaborations and recognize that the value of their data can be leveraged for public good without compromising their operational objectives.
Moving forward we will further explore the various forms and manifestations that such a community could take, toward the launch of a first-of-its-kind Data Stewards Network in the next few months. To bring together and support this community of responsible data leaders, in the immediate term, we will:
- Continue outreach and identification of individuals and businesses acting as innovators in the data collaboratives space to build a directory of key actors and develop the community of practice;
- Expand our analysis of roles and practices of data stewardship toward a definition that is informed by practice and need;
- Conduct targeted engagement and fact-finding efforts with data stewards and their peers to bolster our understanding of their needs, challenges, and areas of opportunity;
- Develop a set of tools, including for instance a data stewardship project gallery and directory, aimed at sharing knowledge and empowering those working to unlock the societal value of private sector data;
- Develop a repository of existing MOUs, data sharing agreements and standard contract language;
- Develop the contours of a data responsibility framework and decision tree that can help actors (especially SMEs) to develop data collaborative projects;
- Identify and design new ways to broker relationships and collaborations between data supply and demand actors;
- Organize a “data-stewards of the year” event to “name and fame” committed practitioners; and
- Organize a series of meetings and workshops with the Data Stewards Network around the world toward bringing additional data stewards to the table (including, for instance, representatives from small and medium-sized enterprises, or previously under-represented regions) and identifying ways to mobilize the Network to address specific challenges and strengthen communication to maximize learning and adoption of Data Stewardship practice.
All of the above will eventually be shared at our dedicated site at datastewards.net.
With the creation of key infrastructure and pathways for collaboration, learning and best practices in data stewardship, we anticipate reducing barriers and unlocking new, scaled beneficial partnerships between the private and the public sector, including public officials, civil society, researchers, and international organizations. By working together, we can ensure that data provided by and about communities has a clear mechanism to return value to the population, directly improving their quality of life.
We encourage members of the nascent data stewards ecosystem to get in touch with me or through email@example.com, so that we can work together to move the field of practice forward for all.
(Thanks to Andrew Young, Rose Shuman, Mita Paramita, Claudia Juech, Sarah Lucas and Kara Selke for their input to this blog post)