Call for Examples: Open Data, Large Language Models, and Generative AI

Hannah Chafetz
Data Stewards Network
Feb 9, 2024

By: Hannah Chafetz, Sampriti Saxena, and Stefaan G. Verhulst

Since late 2022, we have witnessed the meteoric rise of generative AI and large language model (LLM) applications, including notable tools like ChatGPT, Bard, and Claude. These platforms have quickly become staples of how many people seek and access knowledge. However, the intricate relationship between open data and generative AI, and the vast potential it holds for driving innovation in this field, remain underexplored. As these technologies continue to evolve and integrate into various sectors, understanding and leveraging open data could unlock new levels of inclusion and efficiency in generative AI applications.

What are the intersections between open data and generative AI? How can generative AI be used to democratize open data? How can open data make generative AI more equitable and higher in quality?

Over the last few months, The GovLab’s Open Data Policy Lab has been working to address these questions and develop new resources to accelerate the use of open data for generative AI for the public good.

Currently, the team is working on a white paper that dives deep into these topics and expands upon how open data and generative AI could intersect.

We are looking for examples from across the globe of open data being used for generative AI, or vice versa, to feature in our white paper.

  • Have you come across any new generative AI interfaces that leverage open data as training data?
  • Are you an open data provider leveraging generative AI on your open data portal? Or, are there any key use cases you’ve heard of?
  • Have you seen any use cases where open data is used to fine-tune generative AI models?
  • Has generative AI been used to extract insights or new knowledge from open data?
  • Have you come across any examples of generative AI being used to generate synthetic data to expand training sets?

Let us know by emailing us at datastewards@thegovlab.org
