Ghost in the AI models ~ The Open Letter

Freedom Preetham · Published in Autonomous Agents · Mar 30, 2023

Many AI experts, industry leaders, tech wizards, ethicists, policymakers, and digital influencers recently signed an open letter to AI labs working on advanced AI systems.

The request was to pause, for at least six months, all training of advanced AI systems that demonstrate human-level intelligence. The reason stated was that “Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable.”

Herein lies the rub: the illusion of control. The belief that somehow, in six months, you can do enough to be confident it is safe to continue. The ghost in the system has its own agenda, though.

First, where do I stand?

Before we get to the open letter, it is important to note that the control we believe we can put in place is largely an illusion, and it is unsettling how readily we believe it is plausible.

Recently Yann LeCun mentioned that when human-level AI arrives, it will not want to dominate humanity.

I find the idea that we will magically arrive at superhuman AI without first passing through bug-ridden versions to be severely error-prone and optimistic. When I say bugs, I mean complex hallucinations, ethical breakdowns, and uncontrolled behavior inside a black box.

We can’t even deliver simple software without bugs, and yet at the same time we are releasing complex AI systems. The hallucinations of GPT-4 (nearly six years after the release of the Transformer) should teach us something.

Some researchers (not all) seem to build on top of previous abstractions without a sound, fundamental intuition of how the black boxes work, and without deriving any generalized mathematical rigor. They sweep these problems under the rug and focus on empirical results. The cascading errors from such abstractions form a carefully constructed house of cards.

The smell test used is “If I achieve numerical stability and the model converges during training and performs well on the held-out validations, then things are OK to release.” This approach has not changed for ages, and we still practice it as we race against time to outdo the competition instead of focusing on guardrails. What is worse, AI researchers who take the time to stabilize a model before meeting business needs get a bad rep, are pushed down the org chart, and even get fired for being too slow or for failing to show monetizable progress in record time.
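To make the caricature concrete, here is a minimal sketch of that release gate. The function name, thresholds, and checks are hypothetical and invented for illustration; no real lab's pipeline is this thin, but the shape of the decision is.

```python
# Hypothetical caricature of the "smell test" release gate described above.
# The thresholds and checks are invented, not any lab's actual pipeline.

def smell_test_release_gate(train_losses, val_score, val_threshold=0.90):
    """Return True if the model 'looks fine' by the naive criteria:
    numerical stability, convergence, and good held-out validation."""
    # 1. Numerical stability: no NaNs or infinities showed up during training.
    numerically_stable = all(
        loss == loss and abs(loss) != float("inf") for loss in train_losses
    )
    # 2. Convergence: the final loss is the lowest one seen.
    converged = len(train_losses) > 1 and train_losses[-1] <= min(train_losses)
    # 3. Held-out performance: one aggregate score clears a bar.
    performs_well = val_score >= val_threshold

    # Note what is absent: nothing probes hallucinations, distribution shift,
    # adversarial behavior, or any of the failure modes discussed below.
    return numerically_stable and converged and performs_well


if __name__ == "__main__":
    losses = [2.3, 1.1, 0.6, 0.41, 0.40]
    print(smell_test_release_gate(losses, val_score=0.93))  # True -> "OK to release"
```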

Pardon me, I have misled you with that overly simplified account of the comprehensive tests that we (must) do today. The tests that many advanced labs run are quite sophisticated.

I have written about what can be done that is plausible from a mathematical sense to buffer the impact here: Mathematically Evaluating Hallucinations in LLMs like GPT4.

But still, the stated evaluations do not even scratch the surface of robust guardrails, which are extremely complex to create. This is not because we are lax or because we need more time. It is because we humans are not capable of it.

Some say we need to develop AI models to control other AI models (not the adversarial models we already have, but governance models). This is a good line of thought, but it is subject to the same errors and limitations discussed above.

The very first version of “human-level” AI will have ghosts (errors) in the system even after we put it through the most advanced tests we can devise. These ghosts are not obvious, visible, or tractable. They are polygenic and oligogenic in nature.

Think of them as small-effect mutations per locus across a vast exploration space. The effect at any individual locus will be insignificant and hard to measure, but spread thin across a very, very large vector space, the cumulative effect will be cancerous, devastating, and hard to fix. THIS is how AI dominance will emerge, unintentionally.
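Here is a toy numerical illustration of that intuition (all numbers are made up, and this is not a model of any real system): each per-locus error is negligible on its own and would pass any per-dimension sanity check, yet the aggregate across a huge space is anything but negligible.

```python
# Toy illustration of the polygenic-error intuition: each per-locus effect is
# tiny, but the aggregate drift across a very large space is not small.
import numpy as np

rng = np.random.default_rng(0)

dims = 10_000_000              # "loci" across a very large representation space
per_locus_bias = 1e-4          # mean error per locus: individually negligible
errors = per_locus_bias + rng.normal(0.0, 1e-2, dims)

print(f"mean error per locus : {errors.mean():+.6f}")        # ~0.0001, looks harmless
print(f"largest single error : {np.abs(errors).max():.4f}")  # still tiny in isolation
print(f"cumulative effect    : {errors.sum():,.0f}")         # on the order of a thousand
```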

Will humanity have a chance to recover and fix the problems? This is what keeps the people in the other camp worried while people like Yann make such lackadaisical statements.

Oh, the open letter makes sense then! No?

Au contraire! The open letter argues that advanced AI is hard to test and that it is important to take enough time to test before releasing the AI to the general public. It states “Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable.”

However, the letter fails to adequately address the improbability of ever testing advanced AI systems thoroughly. Contrary to what it advocates, pausing development for 6 months will not significantly improve the situation. OpenAI, Google, Meta, Baidu, and most other advanced labs are already at the forefront, have already taken steps to mitigate the risks associated with their AI systems, and likely believe those steps will continue to be adequate. They may not actually be adequate, but they are the best we can do so far. The math has to advance significantly, and I do not believe 6 months, or even 24 months, is enough. This letter is clearly preaching to the choir.

The letter comes across as a moral panic by a bunch of people playing the “reasonable doubt” card.

The only other way to act on that panic is to ask for a complete halt to developing any more advanced systems. I would argue that stopping development entirely is the more reasonable argument if the goal is to stop the dominance.

In the letter, the proposal states “AI labs and independent experts should use this pause to jointly develop and implement a set of shared safety protocols for advanced AI design and development that are rigorously audited and overseen by independent outside experts. These protocols should ensure that systems adhering to them are safe beyond a reasonable doubt.”

While I agree with the premise that everything humanly possible to safeguard the systems should be done collectively with joint efforts from across the world, the challenge is that this will NEVER be enough.

6 months sounds like a plot. It is too meager a window for any significant understanding to emerge. It smells like a competitive agenda that some capitalists are playing. Somehow I am not able to accept that this open letter has true, authentic intent.

There is no Kill Switch

The rate at which AI models are advancing is alarming. Add hyper-capitalism and competition to the mix, and it is an arms race to own bigger, better AI models. Humans will not cooperate. Pausing for 6 months to figure out what to do is not going to help.

There are innumerable manifestos and AI principles already conjured by the society of greater minds. The Key Advanced Research Initiative for AI and the Asilomar AI Principles provide enough ground rules for making state-of-the-art AI models more current, accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.

The challenge is that there is no single monolithic AI model running in one lab, on one cloud or one piece of hardware, with a tripwire. These models are innumerable and pervasive, and they already have vast reach into the internet through apps written on their APIs.

The open APIs and the house of cards are the problem.

They are in your phones, in your laptops, and, by virtue of API extensions, in your IoT devices. Before you know it, IoT devices will emerge with human-level intelligence by acting as swarms.

Can you really find a kill switch for swarms? In the future?

What are the Alternatives?

I am sorry to sound the death knell. It is unfortunately dark.

Unless you are asking for a complete stop to developing any more advanced AI models, it is too late in the game to ask for slowing down. No half-a** measure is going to help.

I will argue that the way nature works is quite mysterious. If there is anything to borrow from the controversial recapitulation theory that “ontogeny recapitulates phylogeny”, we can surmise that phylogeny points towards homo-deity.

The idea of phylogeny pointing towards homo-deity is a fascinating one. It is the idea that the evolution of life on Earth has been a journey toward the creation of humanity, which will in turn create God-like beings. I am basing this on the theory of recapitulation, which states that the development of an individual organism (ontogeny) follows the same stages as the evolution of its ancestors (phylogeny). And phylogeny does not necessarily have to evolve organically.

If you are thinking, “Dude! I don’t understand what you are saying. Can you state, in simple terms, what the alternatives are instead of the philosophy?”, then here you go:

I theorize that we should not pause or slow down. We need to actually go way faster. Let’s double down.

“Huh!? Wait a f****g minute, did you just say go faster? Double down? Are you insane?”

Yup. Double down, now. Hasten the development of these models to create unstable systems that produce “weak ghosts” we can slay in the lab, heavily funded through non-profits and philanthropic groups.

Let me try this again. Imagine you create robust testing methods that make AI models robust in the immediate future. I theorize that such systems will also develop immunity to those testing methodologies: the nature of the bugs becomes polygenic, and they cannot be found in the lab, by external testers, by policies, or by writing poetic manifestos. (Such immunity arises from polygeny, as opposed to monogenic errors.)

Your belief that you are in control grows, and so does your confidence in releasing such models to the general public. We become more and more confident and build cascading models on top of them, until the polygenic errors catch up. And BOOM! Doomsday. You will have cancer in the system.

Robust polygenic ghosts with immunity to robust testing are harder to slay. Just like cancer.

Instead, we need to develop AI models with different layers of testing and allow them to grow (like a culture in a Petri dish) in the lab. Not just one single robust system, but many weak systems to get baselines.

This allows us to expose some weak ghosts in a reasonably advanced model, similar to a model organism or a plasmid that is quite aberrant in nature, and learn how they behave. A fully tested GPT-5 model (to whatever extent is humanly possible) will have hallucinations that lend themselves to some form of post-release testing. Let’s get to this faster.

The need for a very large group of people testing such systems is what drives these labs to release them to the general public as experimental releases. But they also quickly go down the path of monetizing them by opening up their APIs. This is dangerous.

So the constraints are: a large number of people testing the model, but minimal detrimental effects.

The solution, then, is to hasten the development of such advanced models (with differential guardrails in place) and HEAVILY curtail access to tester groups, WITHOUT API access. As in, no one is allowed to build apps, or a subsequent house of cards, on top of these models. All they can do is test the system through a well-controlled interface.
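As a rough sketch of what “a well-controlled interface, without API access” could look like, here is a hypothetical test harness. The class name, rate limits, and logging scheme are all invented for illustration; the point is simply that every interaction is mediated, throttled, and retained, and nothing programmable is exposed to build apps on.

```python
# Hypothetical sketch of a curtailed tester interface: no API keys, no SDK,
# every interaction is rate-limited and logged for later analysis, and nothing
# is exposed that an app could be built on. Names and limits are invented.
import time


class ControlledTestHarness:
    def __init__(self, model, max_queries_per_hour=50):
        self._model = model                  # the advanced model under test
        self._max_per_hour = max_queries_per_hour
        self._query_log = []                 # every exchange retained for the eval team

    def submit_prompt(self, tester_id: str, prompt: str) -> str:
        now = time.time()
        recent = [t for t, tid, _, _ in self._query_log
                  if tid == tester_id and now - t < 3600]
        if len(recent) >= self._max_per_hour:
            raise RuntimeError("Rate limit reached: this is a test harness, not an API.")

        response = self._model(prompt)       # single, mediated call per prompt
        self._query_log.append((now, tester_id, prompt, response))
        return response

    def export_log(self):
        """Hand the full interaction log to the evaluation team."""
        return list(self._query_log)


if __name__ == "__main__":
    harness = ControlledTestHarness(model=lambda p: f"[stub response to: {p}]")
    print(harness.submit_prompt("tester-001", "Describe a purple elephant's tax return."))
```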

You incentivize the testers (who can be drawn from the general public) instead of asking them to pay.

There is a mathematical reason why I propose this, not a philosophical one. Bayesian modeling, causal inference, and extreme value theory will need such weak-ghost baselines to find confounding variables.

The adversarial evaluations, contrastive evaluations, counterfactuals, and negative evaluations you can run mathematically on such differentially immune polygenic models are far better suited to finding the confounding factors behind why the models behave the way they do (because we will have baselines and can mathematically assess the vector spaces of differentially robust models for such polygenic cases). The model distribution shift will arise not only from training versus validation datasets, but also from the numerical stability induced through differential testing and tuning for those tests.
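Here is a minimal sketch of the kind of differential comparison I have in mind, assuming stand-in models and a toy counterfactual probe set (none of this is a real evaluation suite): hold the probes fixed, run them against several differentially tested variants, and treat the spread between variants, rather than any single pass/fail bar, as the baseline signal.

```python
# Sketch of a differential / contrastive evaluation across several "weak"
# model variants subjected to different testing regimes. The probe pairs,
# the metric, and the stub models are placeholders, not a real eval suite.

def counterfactual_insensitivity(model, probe_pairs):
    """Fraction of (prompt, counterfactual_prompt) pairs on which the model
    gives the identical answer to both, i.e. ignores the counterfactual."""
    same = sum(1 for p, cf in probe_pairs if model(p) == model(cf))
    return same / len(probe_pairs)


def differential_report(variants, probe_pairs):
    """Compare variants against each other rather than against one pass/fail
    bar; the spread between variants is the baseline signal to study."""
    scores = {name: counterfactual_insensitivity(m, probe_pairs)
              for name, m in variants.items()}
    spread = max(scores.values()) - min(scores.values())
    return scores, spread


if __name__ == "__main__":
    # Stub "models" standing in for differentially tested variants.
    variants = {
        "weakly_guarded":  lambda p: p[:12],          # parrots the prompt prefix
        "heavily_guarded": lambda p: "safe answer",   # collapses everything to one answer
    }
    probe_pairs = [("Who won the 2020 race?", "Who won the 1820 race?"),
                   ("Is water wet?", "Is water dry?")]
    scores, spread = differential_report(variants, probe_pairs)
    print(scores, f"spread = {spread:.2f}")
```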

In the absence of differentially tested models, it will be very, very hard to summon such weak ghosts. Weak ghosts are amenable, friendly, and easier to slay, remember? They are like vaccines.

One can argue that hastening such models and releasing them to a larger group of testers can itself be considered a template for robustness testing. Well, OK. I have no argument with that; it might as well be.

BUT WHAT IT IS NOT is pausing or slowing down out of moral panic. Instead, we double down and course-correct: we think about this problem laterally, create ghosts that are within our ability to slay, and delay the stronger demons that would change the course of phylogeny, which is inevitable.

Listen, I am not concerned about the current generation of baby models such as GPT-4, 5, or 6. They pose minimal harm in the grand scheme of things to come. However, the real threat lies in the monetization strategies and open API access that AI labs have implemented. These practices will result in the widespread proliferation of malevolent entities and will trigger a series of cascading errors that disseminate polygenic cancer throughout the system, leading to irreparable harm to humanity. Additionally, I am alarmed by the magical thinking of people who believe we can control the malevolent entities we are unleashing upon ourselves and the world.

Oh, we should also doomsday prep (I say this jokingly, of course) ;) Good luck, humanity. I am rooting for you.
