Making Data Collaboration More Systematic, Responsible, Sustainable

Key Takeaways from the 2nd Data Stewards Camp in Cape Town

Esoko is an award-winning agriculture technology firm based in Ghana that aims to improve outcomes for smallholder farmers and enable them to maximize the market value of their crop yields. The company’s data-driven offerings, for example, are enabling farmers to track inventory and conduct market research in a more evidence-based manner. But while farmers and the agriculture sector in Africa are always the intended beneficiaries of Esoko’s work, they are not the only users of the company’s data. Esoko also collaborates with government and NGO partners, providing them with agricultural information that helps them conduct land surveys, target social protection programs, and assess the on-the-ground impacts of their work.

Similarly, AB InBev, a multinational corporation that produces beers and other beverages, is experimenting with new ways to leverage data not only to improve its internal operations, but also to create positive impacts on society. Its SmartBarley program, a collaboration with other businesses, governments, and NGOs, aims to operationalize the corporation’s agriculture data and insights drawn from that data to enable smallholder farmers to improve their “productivity and environmental performance.”

How data collected by the private sector, including companies like Esoko or AB InBev, can help transform public policy decisions or help solve public problems was the focus of our second Data Stewards Camp, held in Cape Town at the Cape Innovation and Technology Initiative earlier this month. Participants included representatives from Esoko and AB InBev, as well as MCC Group, MyBioMe, Siyavula, Research ICT Africa, Kwantu, Open Data Durban, University of Cape Town, and Stellenbosch University, among several others.

The Need for Data Stewards

Access to privately held data through data collaboratives has the potential to make policy making more evidence-based, for example by helping to determine progress toward public-facing objectives, from national priority areas to the Sustainable Development Goals (SDGs) and other targets. Yet despite this promise, establishing and sustaining these new collaborative and accountable approaches requires significant, time-consuming effort and investment of resources from both data holders on the supply side and institutions that represent the demand side.

Our initiative starts from the premise that, by establishing Data Stewardship as a function recognized within the private sector as a valued responsibility, the practice of Data Collaboratives can become more predictable, scalable, sustainable, and de-risked.

Our Second Data Stewards Camp

On December 3rd in Cape Town, the GovLab and Adapt, along with the Global Partnership for Sustainable Development Data (GPSDD), held the second Data Stewards Camp, aimed at surfacing pathways for advancing the practice and increasing the positive impact of data collaboration and data stewardship across Africa. (Highlights of our first Data Stewards Camp, held in San Francisco, can be found here.)

Bringing together Data Stewards and other stakeholders from across the African data ecosystem, the workshop focused on a number of key issues, including:

  • the value proposition(s) of these emerging practices;
  • questions related to risks and mitigation strategies;
  • technical, legal, and cultural barriers and challenges;
  • tools and methodologies for creating an impact through data collaboration and stewardship;
  • best practices for achieving sustainability; and
  • innovative metrics of success and evaluation techniques, among other issues.

Key Takeaways

The workshop, and a public panel held afterward, highlighted the importance of fostering conversations among data stewards and other stakeholders around those topics. In particular, the workshop made evident that data stewards will play an essential role in making data collaboration more systematic, sustainable, and responsible, yet they will need new approaches and tools for:

  • Improving the way we assess the potential value and impact of data collaboratives;
  • Determining the readiness of the ecosystem for data collaboratives;
  • Identifying, assessing, and, if needed, creating intermediaries to address gaps and needs within the ecosystem;
  • Developing a practical tool or framework to help corporations make decisions throughout the data value chain; and
  • Establishing data collaboratives at the local and city level.

1. Improving the way we assess value and impact of data collaboratives

A survey of the workshop participants indicated that 100% of those attending felt that privately held data can serve the public good. Yet despite the overall consensus on this potential, there was no clear agreement among participants that the public expects companies to share private data (only 30% strongly agreed and 30% strongly disagreed), nor was there a belief that strong incentives exist for the private sector to leverage its data for the public good (0% of participants strongly agreed).

Participants made clear the need for tools, strategies, and frameworks for assessing the potential value of data collaboration. Because the real and important risks surrounding data collection and sharing often dominate the conversation, data stewards frequently lack targeted mechanisms for understanding the potential value of a particular collaboration, or the risks of not sharing data. To move away from a fully risk-averse approach to data use, data stewards need tools to help weigh the potential risks of data sharing against the potential rewards. Moreover, such a value assessment tool could also play a key role in ensuring that data leveraged as part of a data collaborative is fit for purpose and well targeted for addressing the root causes of the problems at hand. A well-defined value proposition upfront will also allow for a more rigorous impact assessment down the line.

In addition, several participants indicated that the value proposition for data collaboration is still not well understood among key stakeholders, including corporations, and expressed the need for more case studies and stories on existing practice.

2. Determining readiness for data collaboration

In order to make an informed decision regarding the viability of a data collaborative opportunity, data stewards need to determine whether the demand side of the equation (e.g., researchers, public authorities or civil society partners) has the requisite infrastructure and capacity in place to meaningfully act on the data. In some cases this is a technical question — if data is to be shared, partners need to possess the technical capacity to analyze (and protect) it. In other cases, namely when insights drawn from the data or data science expertise are shared rather than actual datasets, readiness is more of a question of partners’ ability to act on that intelligence in a meaningful way toward addressing public problems.

Readiness for data collaboration is also a question for data stewards when looking internally. Participants highlighted that data cleaning or additional processing can be required in order to prepare data to be shared with external parties. Although this preparation can represent an additional step for data holders to undertake, there is also potential internal value that could arise from such improvements to the quality and utility of data.

3. Considering intermediaries to address gaps and needs

Quite often, data collaboratives involve additional stakeholders beyond the data suppliers in the private sector and data users in the public or civil sector. A wide range of third-party intermediaries can potentially add value, credibility, and rigor to data collaboratives. Data collaboratives often leverage the skills, expertise, and capacity of intermediaries in academia, topic-relevant community-based organizations, and data science organizations, among other actors. Importantly, participants noted that for data collaboratives in the African context, funders and international organizations often have a seat at the table from the start. Several also raised concerns about providing certain data users, including government, direct access to their data or insights and felt that the use of intermediaries may be more appropriate in some cases.

Strategies for identifying potentially valuable intermediaries, and for maximizing the added value of the different third parties engaged in a collaboration, could help data stewards maximize the public value of the data their companies hold.

4. Meaningfully assessing strategies throughout the Data Value Chain

Positioning a data collaborative for success involves the consideration of risks — which tend to differ depending on the stakeholders, use cases, and types of data at play. Workshop participants identified a need for tools, such as data responsibility frameworks and decision trees, to help data stewards meaningfully assess the risks of data collaboration opportunities. Such tools could act as an evidence-based checklist of key considerations and questions to ask related to potential risks across the data lifecycle, helping data stewards to become more systematic in determining whether or not data collaborative opportunities are viable and worth pursuing in earnest.

Participants also identified an important and related gap in the current evidence base surrounding data stewardship and collaboration: the lack of insights regarding data collaboratives that succumbed to risks. While success stories are often amplified and shared, failures are more likely to be swept under the rug, obscuring potentially valuable lessons on risk identification and concrete examples that could inform the development of risk mitigation strategies going forward.

5. Focusing on city data collaboratives

Finally, participants made clear that data collaboration at the city or municipal level represents an area of particular opportunity in the African context and beyond. Not only do city governments often enjoy a level of agility beyond that of provincial or national governments, but officials working at the city level are also closer to the public problems experienced in their communities, making them optimal partners for ensuring that data collaboratives are fit for purpose.

What’s Next

In 2019, GovLab and Adapt will continue to work together to make data collaboratives more systematic, sustainable, and responsible by providing more tools, resources, and learning and networking opportunities to build and maintain a common practice among data stewards across the globe.

The five key takeaways described above, combined with the insights from the initial Data Stewards Camp event (which highlighted the need for a data stewards network), provide us with a clear roadmap for supporting the Network and growing the ecosystem of data stewards.

In particular, in the coming months we will seek to support data stewards and the creation of data collaboratives by developing, among other things:

  • A value and impact assessment framework for data collaboratives and a series of case studies;
  • A set of criteria and enabling conditions for determining the readiness of the ecosystem for data collaboratives and the need for intermediaries;
  • Decision trees to help data stewards assess options throughout the data value chain; and
  • An assessment and overview of city data collaboratives that can inform more evidence-based design of data collaboratives at the city level (including in Cape Town).

We are also planning another Data Stewards Camp in London with the Open Data Institute in the first quarter of 2019 to make and share progress on the above.

If you are interested in learning more about the Data Stewards Network, or in participating in the upcoming convening in London, please reach out to Stefaan Verhulst (stefaan@thegovlab.org).