Self-Improving AI and the Need for GPU Regulation

Sam Stone
Structured Ramblings
6 min read · Apr 1, 2024

We need better AI regulation, fast. The software-focused rules policymakers are pursuing today won’t get us there.

On March 18, Nvidia unveiled its new Blackwell graphics processing unit (GPU), which will enable training the largest AI models four times faster than today’s state-of-the-art. AI capabilities, already growing exponentially, will accelerate to a pace that is hard for most people to comprehend. This makes the need for additional federal regulation of AI even more urgent, but today’s policymakers are focused on the wrong problem.

We are in a unique, finite period where AI is not yet capable of self-improvement, but is fast approaching that potentially irreversible threshold. Regulation and control of systems approaching self-improvement ought to become the overwhelming imperative for federal AI policy. Software regulation — the primary focus of Biden’s executive order on AI — is, at present, a seductive distraction. Federal AI regulation must instead focus on hardware, specifically high-end GPU networks, to monitor and restrict access to AI systems approaching the ability to improve themselves.

Policymakers obsess over today’s AI systems increasing discrimination or taking jobs. While these are real risks, they are not novel or existential. Older, unintelligent algorithms present the same risks at lower magnitude. In contrast, self-improving AI is a fundamentally novel risk. It’s like a “chain reaction” over which we might lose control — with existential consequences. A January 2024 survey of widely cited AI researchers estimates a 10% chance of “human extinction or similarly permanent and severe disempowerment of the human species” from losing control over advanced AI. We must regulate now — before the first “chain reaction” — to ensure that the organization that builds the first self-improving AI is aligned with US interests.

Self-improving AI could rapidly acquire new digital and physical capabilities, and learn to evade human control. Last month, Google released Gemini 1.5, a model capable of understanding and improving codebases tens of thousands of lines long. While Gemini’s codebase isn’t public, it’s likely too long for Gemini to improve autonomously — yet. We may be only months from a model that can improve its codebase, triggering autonomous AI self-improvement.

But there’s hope. Nuclear nonproliferation efforts illuminate a path — hardware regulation — for managing a new technology with existential risk. During WWII, when US scientists first saw a potential path to a nuclear chain reaction, the government adopted policies restricting access to the hardware needed to produce fissile material. It didn’t wait for proof that a nuclear bomb would work before obstructing bad actors. [1]

Critics of this approach to nuclear regulation argue that it slowed innovation. That’s true. But the US’s failure to expand the benefits of peaceful, clean nuclear energy stemmed not from insufficient innovation, but from public opposition in the wake of accidents like Three Mile Island. It’s an example of how lax regulation and early failures to protect public safety can quickly undermine society’s tolerance for subsequent innovation.

The wisdom of a hardware-based approach to AI regulation stems from a few simple facts. First, improving any AI system requires two things: new software and hardware with which to train the new software. Creating new software is relatively simple. Significant recent advances in AI have resulted from small software changes, where existing code is simply trained on more GPUs. [2] Moreover, software is easy to modify and share, hindering government monitoring and control.

On the other hand, GPUs are costly physical objects that are well-suited to tracking, needed in high volumes, and supplied by a market dominated by US-based Nvidia. [3] This makes GPUs relatively easy to monitor and control. By restricting access to high-end GPU networks, the Biden administration can ensure that only organizations aligned with US interests, acting transparently, can train cutting-edge models. And GPU restrictions can directly prevent a “runaway” AI. A self-improving AI may learn to rewrite its code, but without GPU access, that new code is not deployable and the self-improvement cannot be realized.

The Biden administration has taken steps in the right direction with controls preventing high-end GPU exports to China (amongst other countries). But Chinese companies continue to deploy cutting-edge models, likely via black market GPU imports. [4] Export controls for this hardware must be tightened fast, but the Commerce Department’s planned cadence of annual updates is too slow.

October’s Executive Order also initiated domestic monitoring of AI training. It requires reporting by organizations acquiring GPU networks or training models when either the network size or the model size crosses a threshold. [5] The GPU threshold is more enforceable, since GPU suppliers have been corralled into reporting GPU use — and there are only a few GPU suppliers, all US-based. [6] But today’s GPU threshold is too lenient and thus ineffective. An organization could acquire a GPU cluster just below the reporting threshold, run it at maximum capacity, and produce a model above the model threshold in under 12 days. [7]

The best path to closing these loopholes is congressional legislation that expands reporting and solidifies enforcement. Until then, the Commerce Department’s January proposed rule to extend reporting to GPUs rented (not just acquired) by foreign companies should be adopted, and applied to domestic companies.

Eighty years into the nuclear era, we haven’t blown ourselves up. At the core of nonproliferation’s success is the hardware-focused Nuclear Nonproliferation Treaty’s “trigger list” of controlled materials. It devotes half a page to defining fissile material (what goes directly into a bomb) and thirty-five pages to the industrial components, like centrifuges, needed to produce fissile material. [8]

Where nuclear weapons require thousands of centrifuges, self-improving AI will require thousands of GPUs. The hardware-based regulation that worked for nuclear weapons can also work for AI. Smart regulation cannot prevent self-improving AI, but it can dramatically increase the chances it is developed by an organization aligned with US interests acting transparently and safely.

Footnotes

[1] Presidential advisor Vannevar Bush, a leader of the Manhattan Project, recommended uranium export controls in August 1942, months before the first sustained nuclear chain reaction in December 1942 proved the viability of a nuclear weapon. US Department of Energy Manhattan Project History.

[2] This paper details “emergent” (unexpected) capabilities in LLMs when they are trained with more compute, but the architecture and data are left unchanged. “Emergence” has been observed in arithmetic, translation, question answering, and other areas.

[3] This February 2024 paper from Baidu uses 12K GPUs for a 175B open-source LLM; frontier models like GPT-4 are suspected to use even more GPUs. Nvidia was estimated to have >90% of the global market for machine learning GPUs in 2023, per this report. The next largest ML GPU suppliers — AMD, Google, Intel — are also US-based.

[4] This February 2024 paper from Chinese company Baidu indicates that the company is likely still able to access large networks of Nvidia A100s, which are subject to export controls. In November 2023, Chinese company 01.AI released a 34 billion parameter LLM called Yi-34B that outperformed Meta’s highest-performing, open-source LLM, Llama 2. This article details allegations of how China is evading GPU export controls.

[5] See section 4.2(b) of the October 30, 2023 Executive Order.

[6] Fifteen companies have voluntarily pledged to help enforce the Executive Order, including Nvidia, Google, Amazon, and Microsoft; together they control well over 90% of the global GPU market, either directly (via GPU manufacturing) or via GPU cloud service provision. White House briefing.

[7] An organization could acquire a GPU cluster capable of just under 10²⁰ floating-point operations per second (the second reporting requirement in EO section 4.2(b)) and run it for 11.6 days (~10⁶ seconds), thus producing a model trained on more than 10²⁶ floating-point operations (the first reporting requirement in EO section 4.2(b)). This assumes the GPU network is operating at its theoretical max capacity.
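A minimal sketch of this arithmetic, assuming (as above) the cluster runs at its theoretical peak for the entire period:

```python
# Sketch of the arithmetic in [7]: how long a cluster just under the
# reporting threshold needs to produce a model over the model threshold.
# Assumes the cluster runs at its theoretical peak the whole time.

cluster_flop_per_second = 1e20   # cluster reporting threshold in EO section 4.2(b)
model_flop_threshold = 1e26      # model training-compute threshold in EO section 4.2(b)

seconds_needed = model_flop_threshold / cluster_flop_per_second  # 1e6 seconds
days_needed = seconds_needed / 86_400                            # ≈ 11.6 days

print(f"{seconds_needed:.0e} seconds ≈ {days_needed:.1f} days")
```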

[8] See the November 28, 2023 Zangger Committee Consolidated Trigger List. Fissile material is defined on page 1 (Section 2); components for producing fissile material are defined on pages 5–43 (Annex Clarification Of Items On The Trigger List).
