The Emergence of National Data Initiatives: Comparing Proposals and Initiatives in the United Kingdom, Germany, and the United States

Stefaan Verhulst and Roshni Singh

Published in Data & Policy Blog
Dec 3, 2024 · 10 min read


Introduction

Governments are increasingly recognizing data as a pivotal asset for driving economic growth, enhancing public service delivery, and fostering research and innovation. This recognition has intensified as policymakers acknowledge that data serves as the foundational element of artificial intelligence (AI) and that advancing AI sovereignty necessitates a robust data ecosystem. However, substantial portions of generated data remain inaccessible or underutilized. In response, several nations are initiating or exploring the launch of comprehensive national data strategies designed to consolidate, manage, and utilize data more effectively and at scale. As these initiatives evolve, discernible patterns in their objectives, governance structures, data-sharing mechanisms, and stakeholder engagement frameworks reveal both shared principles and country-specific approaches.

Image by KanawatTH from CanvaPro

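This blog offers an initial look at the emergence of national data initiatives by examining three of them and exploring their strategic orientations and broader implications: the United Kingdom’s National Data Library (NDL), Germany’s National Data Institute (NDI, or Dateninstitut), and the United States’ National Secure Data Service (NSDS).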

Brief Introduction to the Three Initiatives

Technical White Paper Challenge announcement from Wellcome
Announcement of Germany’s National Data Institute (Dateninstitut) from Bundesministerium des Innern und für Heimat (BMI)
  • The United States’ National Secure Data Service (NSDS)
    The National Secure Data Service (NSDS) is a U.S. initiative authorized by the CHIPS and Science Act of 2022 to advance the secure and efficient sharing of sensitive data for evidence-based policymaking, research, and innovation. It is being implemented as a five-year demonstration project by the National Center for Science and Engineering Statistics (NCSES) under the National Science Foundation (NSF), in consultation with the Office of Management and Budget and the National Artificial Intelligence Initiative Act of 2020 Interagency Committee. The NSDS seeks to address systemic challenges in federal data sharing by building a shared services platform designed to streamline data access, linkage, and analysis while maintaining stringent privacy protections. Key features of the initiative include the deployment of advanced privacy-preserving technologies such as secure multi-party computation, synthetic data generation, and privacy-preserving record linkage. A key part of the NSDS is facilitating collaboration with public and private stakeholders under America’s Data Hub Consortium. Its ultimate vision is a government-wide infrastructure that supports a federated approach to evidence-building, enabling seamless collaboration across federal agencies, academia, and industry.
The National Secure Data Service Demonstration (NSDS-D) project overview from the National Center for Science and Engineering Statistics (NCSES)
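
To make one of the privacy-preserving techniques named above more concrete, here is a minimal Python sketch of privacy-preserving record linkage using keyed hashing. It is an illustration only and does not describe how the NSDS implements linkage. The shared key, the field names, and the sample records are all hypothetical. The idea is that each data holder converts identifying fields into opaque tokens before sharing, so records can be matched without exchanging raw names or birth dates.

```python
import hashlib
import hmac

# Hypothetical secret agreed between the data holders out of band (assumption).
SHARED_KEY = b"example-key-agreed-out-of-band"


def linkage_token(name: str, birth_date: str, key: bytes = SHARED_KEY) -> str:
    """Return a keyed hash (HMAC-SHA256) of normalized identifiers."""
    # Normalize so trivial formatting differences do not prevent a match.
    normalized = f"{name.strip().lower()}|{birth_date.strip()}"
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize(records, key: bytes = SHARED_KEY):
    """Run by each data holder locally: map token -> non-identifying payload."""
    return {linkage_token(r["name"], r["dob"], key): r["value"] for r in records}


def link(tokens_a, tokens_b):
    """Run by the linking party: match on tokens only, never on raw identifiers."""
    shared = sorted(tokens_a.keys() & tokens_b.keys())
    return [(tokens_a[t], tokens_b[t]) for t in shared]


# Hypothetical sample data held by two different agencies.
agency_a = [{"name": "Ada Lovelace", "dob": "1815-12-10", "value": "grant-17"}]
agency_b = [
    {"name": " ada lovelace ", "dob": "1815-12-10", "value": "patent-3"},
    {"name": "Alan Turing", "dob": "1912-06-23", "value": "patent-9"},
]

matched = link(tokenize(agency_a), tokenize(agency_b))
print(matched)
```

Exact-match tokens like these fail when identifiers contain typos or spelling variants, and they are only as safe as the shared key. Real deployments would likely need error-tolerant encodings and stronger safeguards than this sketch shows.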

Comparing the Initiatives in 10 Ways

Below, we compare the three initiatives along 10 analytical questions. Because all three efforts are still being designed or are under construction, the assessment may need to be updated as more details emerge.

1. Purpose and Strategic Objectives

Analytical Question: What are the primary strategic objectives of each initiative, and how do they reflect the respective country’s broader socio-economic and political priorities?

  • United Kingdom: The NDL seeks to improve access to public sector data for health and scientific research, addressing barriers to data usage and enhancing interoperability.
  • Germany: The NDI focuses on digital sovereignty and innovation through secure, ethical, and interoperable data use, aligned with EU regulations.
  • United States: The NSDS aims to enhance federal data sharing for evidence-based policymaking with a privacy-centric infrastructure.

2. Governance Structure and Implementation

Analytical Question: How are the data initiatives structured and governed? What are the implementation timelines and current operational statuses?

  • United Kingdom: The NDL is in the conceptual phase, engaging stakeholders to define technical and governance frameworks.
  • Germany: The NDI is in development, with pilot projects informing its modular structure and governance involving public, private, and academic sectors.
  • United States: The NSDS is in a pilot phase under NSF, testing scalable data-sharing models with federal and academic partners.

3. Sectoral Focus and Stakeholder Involvement

Analytical Question: What sectors are primarily involved in each initiative, and how do stakeholders contribute to shaping data policies and innovation efforts?

  • United Kingdom: Focus on health and life sciences, engaging researchers and technical experts.
  • Germany: Broad sectoral collaboration with emphasis on AI, public administration, and research.
  • United States: Primarily government research, with plans to expand to academic and private sectors.

4. Approach to Data Sharing and Interoperability

Analytical Question: How does each initiative approach data sharing, and to what extent do they prioritize interoperability across sectors and borders?

  • United Kingdom: Standardizes public sector data access for interoperability across government, academia, and industry.
  • Germany: Enhances interoperability domestically and within the EU through standardized protocols and secure exchanges.
  • United States: Focuses on privacy-preserving interagency data sharing, with potential external partnerships.

5. Privacy and Security Considerations

Analytical Question: How are privacy and data security prioritized in each initiative, and what technologies or policies are in place to ensure compliance?

  • United Kingdom: Prioritizes secure data management and public trust.
  • Germany: Emphasizes data sovereignty, security, and compliance with GDPR.
  • United States: Incorporates privacy-preserving technologies and seeks to comply with federal privacy laws.

6. Innovation and Technological Leadership

Analytical Question: How do the initiatives contribute to technological leadership in AI, data analytics, and other emerging technologies?

  • United Kingdom: Supports research-driven innovation, especially in health and life sciences.
  • Germany: Advances AI and cybersecurity leadership through ethical innovation.
  • United States: Strengthens data infrastructure, laying groundwork for technological advancements, including artificial intelligence.

7. International Collaboration and Impact

Analytical Question: What role do international collaboration and global leadership play in each initiative, and how might these initiatives impact global data governance?

  • United Kingdom: May facilitate international research collaboration with high-quality datasets.
  • Germany: Aligns with EU policies, emphasizing secure and ethical data sharing.
  • United States: Primarily domestic focus but could serve as a global model for privacy-centric frameworks.

8. Ethical and Legal Frameworks

Analytical Question: How do ethical considerations shape the governance and implementation of each data initiative, and how are they embedded in legal and regulatory frameworks?

  • United Kingdom: Guided by ethical principles and adherence to UK GDPR.
  • Germany: Embeds ethics in governance, aligned with GDPR and EU laws.
  • United States: Adheres to the Foundations for Evidence-Based Policymaking Act of 2018 (the Evidence Act), emphasizing ethical data-sharing practices.

9. Progress and Maturity of Initiatives

Analytical Question: What is the current status and maturity of each initiative, and how effectively are they achieving their strategic goals?

  • United Kingdom: In conceptual design with ongoing technical development/outsourcing.
  • Germany: Advancing through pilot projects and governance design.
  • United States: Testing pilot projects; progress depends on legislative support, which may be shifting.

10. Challenges and Future Outlook

Analytical Question: What challenges do these initiatives face, and what are the future prospects for their development and success?

  • United Kingdom: Faces challenges in equitable data access, infrastructure, and public trust.
  • Germany: Likely faces challenges in balancing digital sovereignty with global interoperability and ethical standards, and in adapting to changing political priorities.
  • United States: The changing political landscape, legislative approval, and the scalability of privacy-preserving solutions remain hurdles.

Insights and Reflections

The comparison of the United Kingdom’s National Data Library (NDL), Germany’s National Data Institute (NDI), and the United States’ National Secure Data Service (NSDS) reveals ambitious efforts to advance data access, re-use, and governance. At the same time, they demonstrate distinct priorities and varying degrees of maturity and operationalization.

More importantly, these initiatives place too little emphasis on critical aspects of data stewardship and on the need for a social license. Without integrating these dimensions more explicitly, their ability to deliver trusted, sustainable, and impactful outcomes may be constrained, particularly in the context of rapidly shifting political landscapes.

Image by Monster Ztudio from CanvaPro

Below, we list a set of insights, shortcomings, and reflections:

  1. Acknowledging Data Stewardship as a Core Framework: While these initiatives address interoperability, privacy, and data governance, they fall short of explicitly framing their work around data stewardship principles (including FAIR principles). Data stewardship involves not just managing data but ensuring its reuse is done in a systematic, sustainable, and responsible way. The evolving political environments in Germany and the US necessitate a more proactive focus on embedding stewardship into the DNA of these initiatives to maintain credibility and relevance.
  2. Prioritizing a Social License for Data Use: Public trust cannot be taken for granted. Establishing a social license that reflects community preferences and expectations, and goes beyond just collecting consent, is essential to fostering long-term support for these initiatives. This involves engaging communities early, transparently addressing concerns about privacy and equity, and demonstrating tangible societal benefits. The NDI’s multi-stakeholder approach and the NSDS’s collaboration framework are promising starts but must go further in explicitly building mechanisms for ongoing public engagement and accountability.
  3. Optimizing for Priority Questions and Use Cases: For these initiatives to achieve their full potential, they must adopt a use-case-driven approach and identify priority questions to guide their design and optimization. By focusing on clearly defined societal or economic challenges, these initiatives can better ensure their data libraries and governance frameworks are tailored to deliver measurable impact.
  4. Adapting to Shifting Political Landscapes: The political contexts in the UK, Germany, and the US have shifted significantly, impacting priorities and the feasibility of long-term strategies such as the ones proposed above. These shifts may demand adaptive strategies that can withstand political transitions and continue to deliver value.
  5. Engaging the Private Sector and Civil Society: Multi-sector data collaboration remains critical to the success of these initiatives. However, current approaches seem to lack robust mechanisms to ensure equitable participation by private and civil society actors. For example, while the NSDS has a structured partnership approach, it must expand its engagement scope to ensure these collaborations meaningfully shape policies and implementation. Similarly, the NDI should leverage Germany’s strong civil society to build grassroots trust and innovation.
  6. Shifting from Centralized to Distributed Models: While centralized platforms like the NDL, NDI, and NSDS aim to streamline data governance, a more distributed approach — focused on local or sector-specific platforms — could enhance flexibility, scalability, and responsiveness to diverse stakeholder needs. Experimentation with such distributed models may be needed to determine whether they are better at addressing regional, cultural, and sectoral variations, fostering tailored solutions while maintaining interoperability.
  7. Scaling Innovation Faster: Achieving scale requires modular, iterative models, which was perhaps the idea behind Germany’s pilot-driven framework and the US’s phased demonstration project. However, scaling also requires learning and adjusting rapidly from those iterations or pilots, and it must align with stewardship and trust-building goals to ensure these initiatives remain sustainable and impactful.
  8. Addressing the Rapidly Changing AI Landscape: Advances in AI are reshaping how data is created, re-used, and analyzed. These national initiatives must acknowledge the role of AI in leveraging data and embrace today’s innovation in creating data commons that balance extraction and access.
  9. Setting Up an International Exchange Initiative: Establishing an international exchange platform could facilitate the sharing of research, lessons learned, and feedback among these and similar initiatives globally. Such a platform could promote collaboration, enhance interoperability, and ensure alignment with best practices while fostering innovation in data governance. It could also serve as a hub for benchmarking progress and creating scalable national data initiatives.

About the Authors

Stefaan Verhulst is Co-Founder and Chief Research and Development Officer as well as Director of The GovLab’s Data Program. He is an Editor-in-Chief of Data & Policy, the open-access journal published by Cambridge University Press.

Roshni Singh is a Research Assistant at The GovLab.

***

This is the blog for Data & Policy (cambridge.org/dap), a peer-reviewed open access journal published by Cambridge University Press in association with the Data for Policy Community Interest Company. Read on for ways to contribute to Data & Policy.


This is the blog for Data & Policy (cambridge.org/dap), an open access journal for the impact of data science on governance. Editors-in-Chief: Zeynep Engin (UCL, Data for Policy), Jon Crowcroft (Cambridge, Turing Institute), Stefaan Verhulst (GovLab, NYU). Published by CUP.
