Addressing the Challenges of Drafting Contracts for Data Collaboration
By Andrew Young, Andrew J. Zahuranec, Stephen Burley Tubman, William Hoffman, and Stefaan Verhulst 
To deal with complex public challenges, organizations increasingly seek to leverage data across sectors in new and innovative ways, from establishing prize-backed challenges around the use of diverse datasets to creating cross-sector federated data systems. These and other forms of data collaboratives are part of a new paradigm in data-driven innovation in which participants from different sectors provide access to data for the creation of public value. This paradigm provides an essential new problem-solving approach for our increasingly datafied society. However, the operational challenges associated with creating such partnerships often prevent the transformative potential of data collaboration from being achieved.
One such operational challenge relates to developing data sharing agreements through contracts and other legal documentation. Current practice suffers from large inefficiencies and transaction costs resulting from (i) the lack of a common understanding of the core issues in data exchange; (ii) the lack of common language or models; (iii) large heterogeneity in the agreements used; (iv) lawyers’ lack of familiarity with the technologies involved; and (v) a sense that every initiative needs to (re)invent the wheel. Removing these barriers may enable collaborators to partner more systematically and responsibly around the re-use of data assets.
Contracts for Data Collaboration (C4DC) is a new initiative seeking to address these barriers to data collaboration. Its charter members include The GovLab at NYU, the United Nations Sustainable Development Solutions Network’s Thematic Research Network on Data and Statistics (TReNDS), University of Washington — Applied Physics Laboratory, and the World Economic Forum. The partnership, launched in early 2019, has already yielded a number of outputs, including a project inception brief, the Contractual Wheel of Data Collaboration tool (which presents key considerations for the development of data sharing agreements), and an initial analytical framework [5,6].
On September 23, 2019, at One World Trade Center, in the context of the United Nations General Assembly, the partners held a workshop to further advance the project, expanding the field’s understanding of the needs, opportunities, challenges, and risks related to establishing the basis for data collaboration. More than fifty participants from across the data ecosystem explored how greater transparency, access, and understanding of data-sharing agreements can advance data collaboratives and improve people’s lives. Within this scope, the discussion centered on six key items:
- Building a constituency: Data actors need to build a community of collaboration to create common frameworks, norms, and principles and to encourage the replication of successful practices;
- Articulating a shared narrative: Data actors need to create common languages and shared principles and values;
- Understanding the diffuse ecosystem: Data actors should recognize that roles are not binary. The ecosystem of data providers, users, and enablers is diverse and complex;
- Understanding power asymmetries: Data actors should acknowledge that partnerships are not equal. It is important to identify power asymmetries at the outset and ensure both parties are equally equipped to engage in the partnership;
- Engaging individuals: Data actors should make data sharing more inclusive, especially to marginalized communities from which the data is collected; and
- Incentivizing contribution: Data actors can motivate the sharing of data by incentivizing each other and demonstrating opportunities through good cases or examples.
Addressing Legal Barriers to Data Collaboration
The event began by introducing three broadly defined parties often involved in data collaboration:
- Data Suppliers: those who curate and provide data;
- Data Demand Actors: actors wanting to use new sources of data to achieve public good; and
- Ecosystem Enablers: parties that provide financial, technological, or human resources to facilitate collaboration.
From this framework, workshop attendees discussed the barriers they commonly encountered in the contract-development process as suppliers, demanders, or ecosystem enablers. Beginning with data suppliers, participants noted problems related to:
- Lack of a clear and well-defined demand side: To create effective partnerships, data suppliers acknowledged the importance of understanding the landscape of potential users of data. The demand side of data collaboration is often broadly defined, making the launch of specific, targeted data collaboration (and associated legal artefacts) challenging.
- Jurisdictional complexity and variability: In sharing data, data providers can face different legal landscapes and rules depending on their context. For example, one legal practitioner cited her experience with having to operate under the European Union’s General Data Protection Regulation (GDPR) and how it differs from the United States’ legal landscape. Ensuring alignment with different regulatory regimes, especially for international collaborations, can be a major challenge.
- Privacy and security: Data assets can be sensitive, containing personally identifiable or demographically identifiable information. As such, collectors often must abide by strict protocols for aggregating and disseminating information. Suppliers are required to ensure data use (and sharing with additional third parties) respects the privacy of data subjects and any pre-existing confidentiality agreements.
Data demand actors often face challenges such as:
- Power asymmetries: In data collaboratives, parties do not all have the same capabilities, resources, or authority. A small NGO, for example, would not enjoy the same level of in-house legal support as a collaborating multinational corporation, potentially leading to the inequitable distribution of an initiative’s benefits. Data demanders acknowledged that contracts for data collaboratives should recognize parties’ comparative advantages and weaknesses.
- Sustainability: Often data collaboratives are short-term and one-off projects. Limited engagements can create difficulties in establishing more sustainable legal (and financial) bases for effective collaboration.
- Fragmentation of Institutional/Business Mandates: Organizations are rarely united in all aspects of their work. Actors within an organization often have disparate interests that must be accommodated or acknowledged.
Ecosystem enablers, meanwhile, noted they faced problems pertaining to:
- Data Literacy: A contract is unlikely to be successful if the legal authorities drawing up the agreement do not understand the data assets and the practical work required to use them. However, communicating highly technical language and requirements to an outsider not well-versed in the topic can be difficult.
- Legal/Contractual Background: Similarly, those actually operationalizing the contract might not understand all aspects of the agreed-upon contractual framework. If technology operators don’t understand the legal, contractual, and liability issues their organization faces, they are unlikely to act in a way that accommodates those concerns.
The conversation made clear that while a framework of supply, demand, and enablement is useful, the practice of data collaboratives requires more flexibility. The roles, responsibilities, and needs of actors can be fluid; corporations, NGOs, governments, foundations, startups, law firms, regulators, and individual citizens can fit into one or all three of these roles depending on the context.
Solutions for Addressing Legal Barriers to Data Collaboration
These challenges are undoubtedly significant. Stakeholders across the ecosystem recognized that there were no easy solutions. The workshop provided an opportunity to discuss immediate actions to accelerate progress toward addressing these challenges, as well as longer-term objectives and potential solutions.
First, the workshop attendees agreed on the importance of creating an evidence base of current practice in the creation of legal documentation for data collaboration, beginning with the creation of a shared repository of data-sharing agreements, a process already underway as part of the C4DC initiative. Individuals talked about building a glossary of shared terminology to eliminate confusion and creating a virtual hub for actors to share best practices. Lastly, many participants saw value in creating a standard engagement and communication strategy to assist them in engaging with the public and holding more conversations with like-minded organizations on standards for the contract-writing process.
In the longer term, participants focused on three major themes that, if addressed, could steer contracting for data collaboration toward greater effectiveness and legitimacy.
Data Stewardship and Responsibility: First, much of the discussion centered on the need to promote responsible data practices through data stewardship. Though part of this work involves creating teams and individuals empowered to share, it also means empowering them to operationalize ethical principles.
By developing international standards and moving beyond the bare minimum legal obligation, these actors can build trust between parties, a quality that has often been difficult to foster. Such relationships are key in engaging intermediaries or building complex contractual agreements between multiple organizations. It is also essential to come to an agreement about which practices are legitimate and illegitimate.
Incorporation of the Citizen Perspective: Trust is also needed between the actors in a data collaborative and the general public. In light of many recent stories about the misuse of data, many people are suspicious of, if not outright hostile to, data partnerships. Many data subjects don’t understand why organizations want their data or how the information can be valuable in advancing public good.
In data-sharing arrangements, all actors need to explain intended uses and outcomes to data subjects. Attendees spoke about the need to explain the data’s utility in clear and accessible terms. They also noted data collaborative contracts are more legitimate if they incorporate citizen perspectives, especially those of marginalized groups. To take this work a step further, the public could be brought into the contract writing process by creating mechanisms capable of soliciting their views and concerns.
Improving Internal and External Collaboration: Lastly, participants discussed the need for actors across the data ecosystem to strengthen relationships inside and outside their organizations. Part of this work entails securing internal buy-in for data collaboration, ensuring that the different components of an organization understand what assets are being shared and why.
It also entails engaging with intermediaries to fill gaps. Each actor has limitations to their capacities and expertise and, by engaging with start-ups, funders, NGOs, and others, organizations can improve the odds of a successful collaboration. Together, organizations can create norms and shared languages that allow for more effective data flows.
Next Steps and Call for Participation
To collaboratively make progress on these and other objectives, The GovLab, World Economic Forum, TReNDS, and University of Washington — Applied Physics Laboratory are convening additional partners and experts from across the contracts for data collaboration ecosystem. This emerging community of practice will seek to identify and act upon opportunities to advance more systematic, sustainable, and responsible approaches for Contracts for Data Collaboration. If you are interested in joining this campaign, please contact The GovLab’s Chief of Research and Development Stefaan Verhulst (firstname.lastname@example.org).
You can also participate by sending a redacted data-sharing agreement to TReNDS Analyst Hayden Dahmm (email@example.com). Party names, confidential materials, and other identifying information should be removed as necessary from these documents, as they will be made available online in a searchable database. This collection will be used to study and advance understanding of data-sharing arrangements for the common good.
For more information on the Contracts for Data Collaboration project, visit www.contractsfordatacollaboration.org.
About the authors:
Andrew Young, Andrew J. Zahuranec, and Stephen Burley Tubman are, respectively, Knowledge Director, Research Fellow, and Research Intern at NYU’s GovLab (https://www.thegovlab.org/); William Hoffman is Head of Data-Driven Development at the World Economic Forum; Stefaan Verhulst is Co-Founder and Chief Research and Development Officer at GovLab and Editor-in-Chief of the open access journal Data & Policy (cambridge.org/dap).
 The authors would like to thank C4DC project partners Jessica Espey and Hayden Dahmm of SDSN TReNDS and Scott David of University of Washington — Applied Physics Laboratory for their review of this blog post and for their co-design and co-facilitation of the C4DC Workshop discussed in this post.
 “Data Collaboratives.” 2019. Accessed October 9, 2019. http://datacollaboratives.org/.
 TReNDS. “Partnerships Founded on Trust: Introducing Contracts for Data Collaboration (C4DC).” 2019. https://www.sdsntrends.org/research/2019/4/24/partnerships-trust-c4dc?locale=en.
 “Contracts for Data Collaboration.” 2019. Accessed October 9, 2019. https://contractsfordatacollaboration.org/.
 See TReNDS
 “Introducing Contracts for Data Collaboration — The Governance Lab @ NYU.” 2019. http://thegovlab.org/new-initiative-contracts-for-data-collaboration/.
 Young, Andrew. 2019. “Enabling Responsible Data Ecosystems to Use Data for Good.” Medium. September 20, 2019. https://medium.com/data-stewards-network/enabling-responsible-data-ecosystems-to-use-data-for-good-f9e1651eaa29.
 Garg, Radhika. 2018. “Open Data Privacy and Security Policy Issues and Its Influence on Embracing the Internet of Things.” First Monday 23 (5). https://doi.org/10.5210/fm.v22i5.8166.
 GovLab, The. 2019. “Data: The Lever to Promote Innovation in the EU.” Medium. March 28, 2019. https://medium.com/data-stewards-network/data-the-lever-to-promote-innovation-in-the-eu-a1d13404698d.
 Verhulst, Stefaan G. 2019. “How to Use Data for Good — 5 Priorities and a Roadmap.” Medium. https://medium.com/data-stewards-network/how-to-use-data-for-good-5-priorities-and-a-roadmap-df96c3477abc.
 Young supra note 4.
 Data trusts have often been invoked as one such model for contractual arrangements. See: “What Is a Data Trust? — The ODI.” n.d. Accessed October 11, 2019. https://theodi.org/article/what-is-a-data-trust/.