Nationalize the Data: How the US Can Beat China in AI


A member of the class of 2020, Daniel Brennan studies Military History and Political Theory. He is a United States Marine Corps Officer Candidate, a Student Fellow at Penn’s Perry World House, and a rower on the Men’s Varsity Lightweight Crew.

On October 4th, 1957, the Soviet Union launched the first artificial satellite, Sputnik 1, into space. The United States, which had thought its technological supremacy uncontested, now faced the prospect of falling behind the Soviets. President Eisenhower and Congress were quick to react. The following year, Congress authorized the creation of the Advanced Research Projects Agency, later renamed the Defense Advanced Research Projects Agency (DARPA), and passed the National Defense Education Act of 1958. These measures made billions of dollars available to educators and researchers to promote scientific education. America reaped the benefits of this investment for decades as American scientists and engineers won the Space Race, pioneered cutting-edge medical treatments, and built leading research institutions and universities.

With China aspiring to lead the development of artificial intelligence (AI), some American observers have cried “Sputnik” and implored policymakers to pump more money into STEM fields so that the United States does not fall behind. Their concern is valid, but their proposed solution is ham-fisted and misguided. The race for artificial intelligence is not the race to the moon. Whereas the moonshot was a narrowly defined government project, leading the AI revolution is a broad and nebulous societal goal.

The transformative nature of artificial intelligence suggests that the best way for the United States to stay at the forefront of AI development is to leverage its current leading position to enable future innovation. The United States should pass legislation requiring the release of privately held data after a period of several years so as to generate a comprehensive public data bank on which nascent AI enterprises can train their algorithms. Creating these data banks would significantly lower the barriers to AI innovation in the United States, draw international talent, and ensure that the US stays at the forefront of technological innovation.

Economist and historian Joel Mokyr contends that the United Kingdom led the world in industrialization not because of outstanding British inventors such as James Watt, but because Britain had more tinkerers, mechanics, and entrepreneurs than other European countries. As a result, Britain was better able to incorporate new industrial technologies into its commercial and civil enterprises once they were developed. The British example suggests that technological revolutions do not simply occur wherever the best technologists are, but where there are many competent technologists who can apply the discoveries of the brilliant few to everyday life.

If the British precedent holds, realizing the revolutionary potential of artificial intelligence will depend on a large group of well-trained, though not necessarily visionary, programmers and engineers. The United States currently leads the world in AI thanks to large tech companies such as Facebook, Google, and Apple, which employ much of the field’s top talent and command vast troves of well-indexed and organized data. However, that same concentration of data in the hands of a few private firms may soon hamstring US progress by crowding out smaller firms hoping to develop their own AI applications.

Whereas the great inventors of the Industrial Revolution patented their new machines, which protected their intellectual property as others applied it, the tech firms that aggregate the data needed for AI development are not required to share it. Indeed, at least for a short period after its collection, tech firms need to keep the data they collect private so that they can use it to develop products and services. However, if tech firms were required to publish their data after a period of several years, the result would be a boon for AI research and development with minimal impact on company profits.

Under this arrangement, the US federal government would receive and store the “outdated” data from the servers of large tech firms once the data had passed outside the window of proprietary use. The data, purged of names and other identifying information, would then be accessible to start-up AI firms, which could use it for free to train AI algorithms. Making data accessible would foster a new generation of public and private AI enterprises by lowering barriers to entry for less sophisticated or less well-funded groups. Additionally, because big tech would retain exclusive control over its data for several years before publication, companies would still have an incentive to develop their data aggregation capabilities, resulting in higher-quality published data. While big tech has done a good job developing artificial intelligence on its own, we should pass legislation that opens the field to new players to encourage competition and innovation.

In addition to spurring AI development in the US, a data publishing bill may bring other benefits. Publishing the data collected by large tech firms would have the secondary effect of making them more sensitive to the public interest. If firms operate under the assumption that the data they collect will eventually have to be published, they may be reluctant to collect increasingly personal and sensitive information that would incur public condemnation when released.

The unique challenge of protecting our advantage in artificial intelligence requires a more sophisticated response from our government than simply throwing money at STEM. While government investment in scientific research will never hurt, we must recognize that, unlike in the moonshot, the leaders of the AI revolution will be private firms. Because AI will inevitably spread to the public sector as it becomes cheaper to develop and implement, the US government should not waste its time trying to develop it on its own, as the Chinese state has resolved to do. Instead, the US should focus on giving its economy the room to develop.

By passing legislation to require the publication of out-of-date data, the government can help create an economy where more groups can participate in the development of AI, ensuring American leadership in the field for years to come.

Technology, Innovation, and Society

Writings by undergraduates at the University of Pennsylvania, exploring the impact of emerging technologies on the future of politics, societies, and the world. Posts reflect only the views of individual authors.


Account for the undergraduate seminar “Emerging Technologies and the Future of the World” at the University of Pennsylvania.
