Our Five Conjectures to Explore in 2023 as They Relate to Data for Good

The GovLab
Data Stewards Network
9 min read · Jan 18, 2023


by Hannah Chafetz, Uma Kalkar, Marine Ragnet, Stefaan Verhulst

Image by Drew Beamer on Unsplash, licensed under CC0.

From the regulations proposed in the European Artificial Intelligence (AI) Act to the launch of OpenAI’s ChatGPT tool, 2022 was a year of significant policy and technological developments. Taking stock of recent data and technology trends, we offer some conjectures as to how these ideas may play out over the next year. Predictions can be dangerous, which is why we position the below as conjectures — propositions that remain tentative until more evidence emerges — that can help advance the agenda and direction of the responsible use of data for the public good.

Below, we provide a summary of the five conjectures that The GovLab will track and revisit throughout 2023.

Conjecture 1. In 2023 … non-traditional data may be used with increasing frequency to solve public problems.

Complex crises, from COVID-19 to climate change, demonstrate a need for information about a variety of developments quickly and at scale. Traditional sources are not enough: growing awareness and (re)use of non-traditional data (NTD) sources to fill the gaps in traditional data cast a spotlight on the value of using and combining new data sources for problem-solving. Over the next year, NTD sources could increasingly be called upon by decision-makers to address large-scale public problems.

NTD refers to data that is “digitally captured (for example, mobile phone records and financial data), mediated (for example, social media and online data), or observed (for example, satellite imagery),” using new instrumentation mechanisms and is often privately held. Our recent report discussed how COVID-19 was a “watershed moment” in terms of generating access to non-traditional health, mobility, economic, and sentiment data. As detailed in the report, decision-makers around the world increasingly recognize the potential of NTD sources when combined with traditional data responsibly. Similarly, developments in the war in Ukraine presented a pivotal moment regarding the use of NTD sources. For instance, satellite images, social media narrative trends, and real-time location mapping have supported humanitarian action and peacebuilding.

These are just two examples of the increasing interest in NTD to solve public problems. This trend could continue to expand as technological advances make non-traditional data more widely available to decision-makers. Already, the financial sector is incorporating non-traditional data to inform decisions such as assessing lending risks. Recently, the fintech business Nova Credit partnered with HSBC to use cross-border data to give immigrants access to credit by predicting creditworthiness from digital footprint and psychometric data. This trend is compounded by increased legislation aiming to open up the re-use of private sector data, particularly in Europe. The increased attention to NTD sources signals a need to prioritize the alignment of the supply and demand of NTD and to develop a systematized approach to integrating it within decision-making cycles.

Conjecture 2. In 2023… corporations, governments, and the public may prioritize the use of data to combat the climate crisis.

Over the next year, CleanTech and GreenTech — technologies to mitigate or reduce the effects of human activity on the environment — could boom. More specifically, we predict that data will become an essential element in combatting climate change by providing stakeholders with the tools to better understand the issue at hand and remedy it.

Public pressure on companies to practice social responsibility by investing in and using sustainable technologies has mounted, leading to calls to apply data to the climate crisis and invest in techniques to reduce data’s environmental footprint. The World Bank found that, while information and communication technology (ICT) use contributes to electricity consumption and greenhouse gas emissions, that contribution has “remained flat” over the past decade thanks to the use of renewable energy sources. Furthermore, ICT use helps cut down other resource consumption, resulting in a net positive environmental impact.

Within operating processes, data and digitalization are assisting the shift toward clean energy in three main ways. First, high-quality data can enhance procedures for monitoring and tracking emissions and contribute to the improvement of existing processes. Second, data accelerates the development of new tools and less carbon-intensive techniques through design and operation procedures. Third, data enables new business models with financial incentives to hasten the transition, via better management of the increasingly diversified energy mix that is forming as the world moves away from fossil fuels and via more effective collaboration. Thus, we foresee a growth of research into environmental assessments and monitoring of data initiatives.

Instances of data and AI used to accurately measure and reduce emissions are already in the works. For instance, to better understand and mitigate its carbon footprint, Microsoft deployed up-to-date data collection technology to reduce unfavorable environmental effects throughout the whole Microsoft ecosystem, including its partners and suppliers.

In 2023, enterprises and governments may continue to face increased pressures to limit their carbon emissions, refine their environmental responsibility practices, and assess the ways by which they understand the environmental impact of the tools and resources they use.

Conjecture 3. In 2023… private companies and government officials could increasingly open access to data, algorithms, and AI tools for public transparency, accountability, accessibility, and informed decision-making.

Users of data and AI tools are demanding the ability to interrogate the training data and algorithms at play in order to trust and verify system results. Learning from past examples of bias and discrimination created by unchecked algorithms, authorities are increasingly regulating the use and disclosure requirements of these tools. Further, to build public trust and social license, tool creators are turning to open-by-default practices. Over the next year, open data and AI regulation and business practices could increase to meet market demand for accountable data tools.

With the rise in advances and interest in AI, policymakers have taken steps to govern the creation, use, and reuse of this emerging technology. For instance, the European Union’s (E.U.) Open Data Directive requires member states to implement a data reuse framework across public sector agencies and remove undue obstacles to data sharing and reuse. Similar cases are on the rise across the world. The Open Data Policy Lab (ODPL)’s living repository contains over 50 examples of recent legislative acts, proposals, directives, and other policy documents scoping this trend. A review of these resources demonstrates growing public interest in open data and data collaboration.

Moreover, regulators are taking steps to monitor AI use and development. The proposed U.S. Algorithmic Accountability Act of 2022 seeks to mandate impact assessments and public disclosure of AI tools acquired and used in the private sector. In parallel, businesses themselves are adjusting practices to open their data, recognizing the value of ‘open’ practices in creating more equitable and trustworthy tools that are received with greater public confidence and use.

ChatGPT, the latest chatbot tech innovation from OpenAI, for example, has been recognized for its open-access design. Similar examples are multiplying. For instance, Amazon Web Services (AWS), Meta, Microsoft, and TomTom combined efforts to launch the Overture Maps Foundation, an interoperable open map data tool that serves as a shared, open-access resource that can improve mapping services globally.

Policy development to oversee mandatory and nice-to-have openness and accountability measures for data and AI remains nascent. This indicates that the trend is likely to continue growing in the coming years, as government and private sector entities become increasingly concerned with demonstrating responsible data use and cultivating sustainable social licenses with data users and broader society.

Conjecture 4. In 2023… the public may demand new processes to establish the social license for reusing data, strengthening data privacy and ‘right to be forgotten’ regulation.

A slew of data protection bills and limits on data mining by online platforms are slated to be read across local and national governments, demonstrating the growing regulatory push to constrain Big Tech practices. Increasingly, the public is calling on decision-makers to enact data privacy measures that limit where, when, how, and from whom data can be collected, stored, and used, in order to reclaim individual agency over data and curtail surveillance capitalist business models. Over the next year, we could see the ratification of legislation and regulation over data rights that could radically upend previously ‘too-big-to-fail’ online platform data practices.

Over the last few years, many countries, especially those in Europe, have adopted major pieces of legislation to protect their citizens’ data. Acting as a trailblazer, the European GDPR and its accompanying suite of data services and digital platform laws have inspired similar consumer data protection regulations in California, Virginia, and elsewhere around the world. However, the implementation of these laws appears to still be in its early stages. Thus far, the E.U. has taken a strong offensive stance on data protection, imposing heavy fines and limits on companies like Meta that violate these laws.

The Future Today Institute notes that many attempts to pass privacy laws were put on hold by the pandemic, but these efforts have since been rekindled. For instance, the Data Protection Act, which would establish a new federal data protection agency, is being reintroduced by U.S. Senator Kirsten Gillibrand. Moreover, we are witnessing an increase in privacy measures, particularly in the E.U., India, and China, that place a strong emphasis on limiting the transfer and storage of personal data outside of national boundaries.

Furthermore, data protection measures are focusing on protecting vulnerable groups, such as children, from the harmful mental health and exploitation consequences of online media. Proposals including the U.K. Online Safety Act and California’s Age Appropriate Design Code Act place the onus on platforms to protect children’s data and privacy online. If ratified, they will severely curtail the exploitative data-mining business models favored by many online platforms. Since 2019, the Responsible Data for Children Initiative has been advocating for children’s unique data protection needs and multidisciplinary methods to enforce them. Attention is also being paid to protecting migrant and refugee data rights across humanitarian initiatives to correct inherent power and agency imbalances between data subjects and data users. We foresee greater policy interest and willingness to flesh out concepts like digital self-determination to increase agency, empowerment, and knowledge about the use, collection, and permissioning of data in humanitarian contexts.

A majority of these data governance measures are being driven by citizen concerns and consumer pressure. According to the 2022 Cisco Consumer Privacy survey, 80% of people are willing to take action to protect their data, and a growing number of people are cognizant of their national data protection laws. This widescale awareness and demand for data privacy and protection is likely to keep increasing as data collection continues to grow.

Conjecture 5. In 2023… generative and adaptive AI is poised to disrupt major industries and reshape how we use data and AI for decision-making and product creation.

Increasingly, data has been applied to power generative and adaptive AI tools that can assist policymakers in complex decision-making environments. Novel ways of applying language and image processing tools, from education to art, demonstrate this AI’s disruptive ability for a myriad of industries. Over the next year, the use of these tools could become more widespread, prompting the need for data stewards to understand how they work and oversee their responsible use.

Adaptive AI has increasingly been put to the test in complex conflict situations to reduce human risk and improve data analysis accuracy. For example, the United Nations Development Programme (UNDP) developed a machine learning algorithm to help identify and classify war-damaged infrastructure. The model uses text mining to extract information from reports and categorize it by type of infrastructure. This model has been replicated by multiple UNDP country offices to help with post-conflict reconstruction and gauge the needs of vulnerable populations. The private sector is also getting into the game: Scale AI, for instance, is using AI to better anticipate Russian airstrikes in Ukraine. The emerging role of AI in PeaceTech, coupled with data stewards’ vigilance over its use and discrimination risks, could, we believe, help push the needle on ‘good’ uses of dual-use technologies.
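To make the text-mining idea above concrete, here is a minimal sketch of how free-text damage reports can be tagged by infrastructure type. This is an illustrative example only, not the actual UNDP model: its real categories, features, and algorithm are not described here, so every category name and keyword below is a hypothetical placeholder (a production system would use a trained classifier rather than keyword matching).

```python
# Hypothetical keyword lexicon: maps infrastructure categories to cue phrases.
# All names here are illustrative assumptions, not UNDP's actual taxonomy.
INFRASTRUCTURE_KEYWORDS = {
    "housing": ["apartment", "residential", "dwelling"],
    "energy": ["power plant", "substation", "transformer"],
    "transport": ["bridge", "railway", "road", "airport"],
    "water": ["pipeline", "reservoir", "pumping station"],
}


def classify_report_sentence(sentence: str) -> list[str]:
    """Return the infrastructure categories a report sentence mentions."""
    text = sentence.lower()
    return [
        category
        for category, keywords in INFRASTRUCTURE_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


report = "Shelling damaged a residential block and the nearby railway bridge."
print(classify_report_sentence(report))  # ['housing', 'transport']
```

Even this naive version shows the core pipeline shape: unstructured field reports go in, structured category labels come out, and those labels can then be aggregated to prioritize reconstruction needs.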

In addition to addressing tactical and operational needs, generative AI has given rise to a digital renaissance — pushing the boundaries of art and music. As mentioned previously, tools including ChatGPT, DALL-E, Character.AI, and a host of other AI tools have hit the market for designing creative outputs. Capitalizing on this gold rush — the generative AI market is estimated to reach USD 109.37 billion by 2030 — private sector companies are ramping up their use and design of this technology across the board, from applying DALL-E to accelerate protein design to writing school essays indistinguishable from student work.

Tech optimists believe that generative AI will enhance the creative process of artists and designers, enriching existing tasks and speeding up ideation and, ultimately, production. Others, however, are concerned about threats to the copyright and intellectual property protections that legally safeguard artistic and creative work, as well as about convincing deepfake creations that make disinformation and misinformation harder to thwart.

Currently, hardly any regulation governs the generative and adaptive AI landscapes. Policymakers may need to act swiftly to balance AI innovation with existing legal frameworks.

***

In 2023, The GovLab will be following these conjectures to see which come true. We will also use them to enhance our knowledge products and guide our efforts on the responsible use and reuse of data and data regulation.

To stay in the loop on these topics and our work, follow The GovLab on Twitter, Medium, and the Data Stewards Network Newsletter.


The Governance Lab: improving people’s lives by changing how we govern. http://www.thegovlab.org @thegovlab #opendata #peopleledinnovation #datacollab