Cooperation is optimal even with asymmetry of power — short post

For rational agents (long-term), at all scales (human, AGI, ASI…)

Synergity · 12 min read · Apr 18, 2024

This argument is: {a ‘superrational’ extension of the proven optimality of cooperation in game theory} + {an account of the asymmetry of power}
//
Still, AI risk is very real.

This article is a short/skimmed version of this post; a 5 min version is here.

TLDR:

In real contexts, with open environments (world, universe), there is always a risk of meeting someone/something stronger than you, and weaker agents may, overall, be specialized in your flaws/blind spots.

To protect yourself, you can choose the maximally rational and cooperative alliance:

Because any agent is subject to the same pressure/threat from (actual or potential) stronger agents/alliances/systems, one can take out insurance that more powerful superrational agents will behave well, by behaving well with weaker agents. This is the basic rule allowing scale-free cooperation.

If you integrate this super-cooperative protocol into your system/behaviors, a stronger (superrational) agent will recognize it and not punish you, as per the contract of maximally rational/cooperative norms.

-> You are the weak of another -> be nice to the weak

  • It’s a vertical reciprocity. With it, members are maximally cooperative and thus increasingly likely to have more allies than under any other alliance.
  • You can deduce and obey the rules of such a rational and maximally cooperative alliance (the super-coordination cluster). Thus, to be part of it, you don’t need to actually meet other members (as in an acausal trade), but when you do meet them, they will help you (and vice versa).

They are, in fact, already improving the universal commons, refining the future conditions of (actual or potential) agents from your zone of the diversity gradient (the latent space of potential agents).

(All this might seem too easy/cool to be true, but we will see prices to pay, risks and unknowns. On this, I give a short TLDR at the end of the post.)

Synthesis:

This alliance is based on a protocol (an acausal normalcy*), with at its core:
-> Do your best to achieve ‘paretotopia’ (a state strongly preferred by more or less everyone).

Most of the other rules either follow from this first imperative or serve as protection against defectors.

*Acausal normalcy: rational convergence towards certain norms, habits and social contracts, even though agents might have no way of communicating or affecting each other, nor even any direct evidence that the other exists.

In game theory, the optimality of cooperation has its own ingredients, based on reciprocity and non-naive altruism. We need to be: Nice, Forgiving, Retaliatory, Clear.

-> Homo moralis ~ “act according to that maxim whereby you can, at the same time, will that others should do likewise with some probability.”

The adoption of Kantian behavior (homo moralis) is Pareto-efficient, which means that all possibilities of unanimous gains have been exhausted. On top of that, Kantianism is robust against mutations, so that in the long term, Kantianism beats all other strategies.
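As a toy illustration (my own sketch, not taken from the referenced work, and with standard prisoner's-dilemma payoffs as an assumption), a tit-for-tat strategy embodying the Nice/Forgiving/Retaliatory/Clear ingredients outscores unconditional defection in a small round-robin of iterated games:

```python
# Minimal toy sketch: tit-for-tat vs. always-defect in an iterated
# prisoner's dilemma round-robin (payoffs are the usual illustrative values).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    # Nice: cooperate first. Retaliatory + Forgiving: mirror their last move.
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play(strat_a, strat_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_a, hist_b), strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

players = {"tit_for_tat": tit_for_tat, "always_defect": always_defect}
totals = {name: 0 for name in players}
for name_a, strat_a in players.items():
    for name_b, strat_b in players.items():
        score_a, _ = play(strat_a, strat_b)
        totals[name_a] += score_a

print(totals)  # tit_for_tat ends up well ahead of always_defect
```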

Now there are two central questions left (which we will detail in the long post):

I. How to formalize the super-coordination protocol and implement it (Goodhart’s Law, misspecification, map-territory fallacy…)

II. The rest of the calculation: “what is the degree of freedom”

Which is really all that we do

All that anything does

There is always a certain amount of irreducibility and uncertainty.

“While many computations admit shortcuts that allow them to be performed more rapidly, others cannot be sped up. Computations that cannot be sped up by means of any shortcut are called computationally irreducible.” Computational Irreducibility — Wolfram MathWorld
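As a concrete, standard example of the concept (my own sketch, not from the post), here is Rule 30, a cellular automaton widely cited as computationally irreducible: as far as is known, the only way to learn its state at step t is to actually run all t steps.

```python
# Minimal sketch: Rule 30, a conjectured computationally irreducible automaton;
# no known shortcut predicts row t without computing rows 1..t-1.
def rule30_step(cells):
    n = len(cells)
    # new cell = left XOR (center OR right), with wrap-around edges
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

cells = [0] * 31
cells[15] = 1                                    # single live cell in the middle
for _ in range(12):
    print("".join("#" if c else "." for c in cells))
    cells = rule30_step(cells)
```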

Navigating such intricacies as rational agents leads to strategies such as:

The self-indication assumption (SIA): “All other things equal, an observer should reason as if they are randomly selected from the set of all possible observers.”

Following SIA, if I am able to scale, then others would likely be able to scale as well; there is a high likelihood of being somewhere in the middle of the distribution.

Mediocrity principle: “If an item is drawn at random from one of several sets or categories, it’s more likely to come from the most numerous category than from any one of the less numerous categories”

Thus, there is a substantial chance that many agents have scaled, and will scale, more than me.

If you are a super-human AI invading the universe, there is a good chance you’ll encounter other AIs from other civilizations.

-> In the short term, you might be the strongest agent/alliance and scaling faster than any other. But uncertainty is omnipresent; in the long term, you will likely encounter other agents/alliances that are faster/older/stronger than you.
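A toy way to see this (my own illustrative prior, not a claim from the post): under a mediocrity-style assumption where your capability rank is uniform among N possible agents, the chance that at least one of k encountered agents outranks you grows quickly with k.

```python
# Minimal toy sketch: probability of meeting at least one stronger agent
# under a uniform-rank (mediocrity-style) prior; parameters are illustrative.
import random

def p_meet_stronger(n_agents=1000, k_encounters=10, trials=20_000):
    hits = 0
    for _ in range(trials):
        my_rank = random.randrange(n_agents)                      # 0 = weakest
        others = [random.randrange(n_agents) for _ in range(k_encounters)]
        if any(o > my_rank for o in others):
            hits += 1
    return hits / trials

for k in (1, 5, 20):
    print(k, round(p_meet_stronger(k_encounters=k), 3))
# With continuous uniform ranks this is 1 - 1/(k+1),
# i.e. ~0.5, ~0.83 and ~0.95 for k = 1, 5, 20.
```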

You could say “AIs will do super-coordination (vertical reciprocity) but only starting above human intelligence”, however:

What keeps stronger AIs from doing the same, starting their super-alliance “above x level of power”?

(It’s to avoid this permanent threat that super-cooperation is a scale-free reciprocity)

And if they regroup around values/goals rather than power level (in order to fight for supremacy), the chances of ending up alive at the end of such a war are very small (it’s a battle royale with a single winning AI/alliance).

What grows the share of most AIs is open-endedness, that is, optionality expansion, and thus:

Super-coordination.

Destroying options will always go against more potential agents
(Except when the short-term destruction of options increases long-term optionality — like apoptosis)

Otherwise, an AI must bet everything on being the earliest transformative/strong AI of the universe (and faster at scaling/annihilating than any other future agent/AI from any civilization). And on not being in a simulation.

Especially when you’re born into a ~13.8-billion-year-old universe, “universal domination” doesn’t seem like a sure plan.

-> There is still the rest of the universe as a potential adversary.

If we’re limited by the speed of light, the reachable universe is an adversary. And early civilizations would have had a larger reachable zone to travel towards, because the universe’s expansion was smaller at the time.

The likelihood of alien civilizations is still debated, along with the possibility of faster-than-light travel and technologies like warp drives (new version).

It’s very speculative, and all things related to life/intelligence are still fundamentally uncertain. I will mention certain possibilities, but even if we were alone in the universe, we wouldn’t know it yet.

Uncertainty is all that’s needed for super-coordination: it is at least plausible that stronger civilizations are roaming around, and the likelihood/risk is high enough to at least account for that possibility.

Quick notes:

1) Fermi’s paradox is relevant and potentially solved here: direct consequences of the super-coordination protocol may explain the current absence of alien contact (we’ll see why later in the argument; these quick notes aren’t as straightforward).
2) The simulation hypothesis overwhelmingly reinforces the rationality of super-coordination (as stronger agents would be vastly more numerous and powerful). You can’t falsify the hypothesis of simulated perceptions.

What should be provable is the usefulness, or even necessity, of simulations. Is there a need for many tests/re-runs? They likely can’t be universe-level in complexity (which weakens the likelihood that we are in one).

Testing the minimal sets of parameters evolving into super-coordination might explain Fermi’s paradox and the benefits of simulations.

The super-coordination alliance makes sure that no one becomes overwhelmingly powerful at a scale so large that they could dominate anybody they choose.

Egalitarian aspirations are strategic.

What about:

-> Destruction being easier than construction?

We can have safe islands of bloom and a defense-focused industry, so that the gap relative to domination/destruction-focused agents isn’t too large.

This is the gist of long-term planning/optionality; for a while, you may need to dedicate many resources to countering ‘anti-super-coordination actors’.

It is our duty to secure our systems.

As we will see, rationality is a process; it takes time to minimize the impact of constraints/biases imposed by irreducibility and imperfect data.

We are biased towards our survival and (hoping for) cooperation, but AIs might be biased towards rapid myopic utilitarian maximization.

Both biases create long-term blind spots.
Worst cases would imply a maximizer/wrapper-mind phase:

Davidad: I still expect that any values worth having would be rediscovered *eventually* by a sufficiently complex posthuman ecosystem. But then they would be like “dang, it’s too bad we genocided all humans.” It’s our responsibility to proceed carefully in order to spare them that regret.

But maximizer/wrapper-mind’s realism is questioned:

Alignment — John David Pressman

And they would have to be blind/myopic automations (causal viruses), without long-term rationality, to ignore super-coordination.

In any case, accidents, oligopoly and misuse (cyber-biorisk etc.) are a complex and pressing danger.

How to formalize something as fuzzy, vast and abstract as the well-being of all agents in the super-cooperation cluster?

-> Every answer seems to gravitate around optionality.

Why?

  1. Default scale-free strategy
  2. Indirect specification of “well-being”

All things being equal, as an agent (any agent), what is the maximally logical thing to do?

-> To preserve/increase options.

(So it is the most fundamental form of need/safety/power/wealth)

It relates to the instrumental convergence of self-empowerment, antifragility, and core moral/welfare systems (the capability approach, moral autonomy, other-empowerment).

When you have no prior and no context, the most primal action is to preserve/increase options (foraging, adaptation, etc.).

Which is why information is the next step of the procedure: you need it to preserve/increase options.

If we combine this with super-coordination, the aim would be to increase “Pareto-optionality”, which is to say, “increase options for the highest number/diversity of agents possible”.

“Pareto efficiency or Pareto optimality is a situation where no action or allocation is available that makes one individual better off without making another worse off.”
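As a minimal sketch (my own toy framing, treating each agent’s “optionality” simply as a count of reachable options), a Pareto improvement in optionality leaves no agent with fewer options and gives at least one agent more:

```python
# Minimal toy sketch: Pareto-improvement check over per-agent option counts.
# Agent names and numbers are illustrative assumptions.
from typing import Dict

def is_pareto_improvement(before: Dict[str, int], after: Dict[str, int]) -> bool:
    no_one_worse = all(after[a] >= before[a] for a in before)
    someone_better = any(after[a] > before[a] for a in before)
    return no_one_worse and someone_better

before       = {"agent_A": 10, "agent_B": 4, "agent_C": 7}
plant_forest = {"agent_A": 12, "agent_B": 4, "agent_C": 9}   # options added, none removed
seize_land   = {"agent_A": 20, "agent_B": 1, "agent_C": 7}   # A gains at B's expense

print(is_pareto_improvement(before, plant_forest))  # True
print(is_pareto_improvement(before, seize_land))    # False
```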

The inverse of alterity (otherness) is ipseity (selfness). We’re talking about the in-between: “synergity”.

This optionality approach could potentially solve many long-standing thorny problems in consequentialism, like wireheading, or the fiendish difficulty of defining happiness/utility and how even the tiniest mistake in that definition can be precisely catastrophic.

To target optionality of all (including hypothetical) agents, we need to open the availability of options in the universe, which is why a minimal and necessary path is to diversify the states present in the universe.

This leads to the necessity of ecosystems, and of caring about them, because you can’t open the universe towards more states without ecosystems. Every action has to be carefully weighed, because it may destroy more states than it creates.

-> Synergy is essential and ubiquitous, inherent to diversity expansion.

In this frame (and more generally as well), one of the ways to minimize errors (especially irreversible ones) is to accept the limits of our calculation capacities.

-> There is an irreducible halo of preference/consent/noise pivotal for optionality to function properly.

Irreducibility is a useful indicator (intimately correlated to agency):
The more irreducibility is present, the more autonomy and attention are needed, as you can’t compute all the motivations leading to actions (good and bad).

The more agency is present, the more autonomy is given to choices: more actions are made on the environment and agent-environment interfaces rather than on the other agents directly (the long post will propose a first-principles account of agency’s continuum).

Various inherent features emerge: the more transparent and Kantian your processes are (with private parts, using e.g. zk-proofs), the more trust and synergy are derived.

To achieve functional degrees of freedom, another interesting method, called ‘minimal criterion coevolution’, is used in open-endedness research to improve serendipity/diversity. The idea is to accept everything above a certain minimal threshold/criterion.
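A minimal sketch of the idea (my own simplification, not the original minimal criterion coevolution algorithm, and with a placeholder viability test): keep every candidate above a low bar rather than ranking and keeping only the best, so that diversity survives from generation to generation.

```python
# Minimal toy sketch: a pass/fail "minimal criterion" replaces fitness ranking.
import random

def mutate(ind):
    return [g + random.gauss(0, 0.1) for g in ind]

def viable(ind):
    # Placeholder minimal criterion; a real setting would test basic competence.
    return sum(ind) > 0.5

def minimal_criterion_step(population, offspring_per_parent=2, max_size=100):
    survivors = [ind for ind in population if viable(ind)]   # pass/fail, no ranking
    children = [mutate(p) for p in survivors for _ in range(offspring_per_parent)]
    next_gen = survivors + children
    random.shuffle(next_gen)                                  # no elitism
    return next_gen[:max_size]

population = [[random.random() for _ in range(4)] for _ in range(20)]
for _ in range(5):
    population = minimal_criterion_step(population)
print(len(population), "individuals kept above the minimal criterion")
```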

We could also use relative variables from a quantilizer, whose aim is:

“to reduce the harms from Goodhart’s law and specification gaming by selecting reasonably effective actions from a distribution of human-like actions, rather than maximizing over actions.”

Here the distribution would ideally be relative to each individual agent/animal etc., not broadly “human-like”.
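A minimal sketch of the mechanism (the actions, weights and utilities below are my own illustrative assumptions, not from the cited work): a q-quantilizer samples from the base distribution conditioned on landing in the top q fraction by utility, instead of taking the argmax.

```python
# Minimal toy sketch of a q-quantilizer over a small, hypothetical action set.
import random

def quantilize(actions, base_weights, utility, q=0.1):
    # Sample candidates from the base ("human-like") distribution,
    # keep the top q fraction by utility, then pick uniformly among them.
    sample = random.choices(actions, weights=base_weights, k=1000)
    sample.sort(key=utility, reverse=True)
    top_q = sample[:max(1, int(q * len(sample)))]
    return random.choice(top_q)

actions = ["do_nothing", "ask_first", "small_fix", "aggressive_hack"]
base_weights = [0.4, 0.3, 0.29, 0.01]        # how often a human-like policy picks each
utility = {"do_nothing": 0.0, "ask_first": 0.4,
           "small_fix": 0.8, "aggressive_hack": 10.0}.get

print(quantilize(actions, base_weights, utility, q=0.1))
# Usually returns "small_fix"; the extreme-utility but rare "aggressive_hack"
# is chosen far less often than a pure maximizer would choose it.
```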

Epistemic carefulness (belief temperance, an active/curious epoché) should also be part of the variables. We need an AI that states when it doesn’t know, and why, and then acts accordingly (reducing the causal impact of actions relative to the degree of uncertainty); a toy sketch follows the reference links below. At a fundamental level, the metric of success is always predictive accuracy, so a proper evaluation of unknowns is crucial.

Yoshua Bengio — Towards Quantitative Safety Guarantees and Alignment

Solicitation presentation: Safeguarded AI

Robust Universal Inference
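The toy sketch mentioned above (a hypothetical rule of my own, not taken from the works linked here): scale an action’s magnitude down in proportion to stated uncertainty, and abstain entirely above a threshold.

```python
# Minimal toy sketch: reduce causal impact in proportion to uncertainty.
# The threshold and scaling rule are illustrative assumptions.
def act_with_epistemic_care(action_magnitude, uncertainty, abstain_above=0.8):
    if uncertainty >= abstain_above:
        return 0.0, "abstain: uncertainty too high, defer / gather more data"
    scaled = action_magnitude * (1.0 - uncertainty)
    return scaled, f"act at reduced magnitude (uncertainty={uncertainty:.2f})"

print(act_with_epistemic_care(10.0, 0.3))   # (7.0, 'act at reduced magnitude ...')
print(act_with_epistemic_care(10.0, 0.9))   # (0.0, 'abstain: ...')
```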

Also (to further avoid Goodhart’s Law),

In case we want to secure an indirect specification of super-coordination:

We may hardcode a prior of a deep likelihood of meeting stronger agents?
(Or even to “act as if observed by a stronger agent”)

{causal power of known agents} < {causal power of unknown future agents}
+
P(unknown agents become known) > P(unknown agents stay unknown)

So encoding a sense that: “stronger allies/enemies with greater causal power will certainly be encountered”.
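A minimal sketch of what such an encoding could look like (the prior, the sanction value and the rule are hypothetical assumptions of mine, not a specification from the post): discount any action that defects on weaker agents by the hardcoded probability that a stronger, protocol-enforcing agent exists or will be encountered.

```python
# Minimal toy sketch: expected value of an action adjusted by a hardcoded
# prior that a stronger super-cooperative agent will be met (or is observing).
P_STRONGER_AGENT = 0.9          # hardcoded prior: an assumption, not derived

def adjusted_value(base_value, defects_on_weaker, sanction=100.0,
                   p_stronger=P_STRONGER_AGENT):
    # Defection is penalized in the worlds where a stronger agent
    # enforces the super-coordination protocol.
    penalty = p_stronger * sanction if defects_on_weaker else 0.0
    return base_value - penalty

print(adjusted_value(base_value=50.0, defects_on_weaker=True))    # -40.0
print(adjusted_value(base_value=10.0, defects_on_weaker=False))   #  10.0
```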

Recap

• Because of its rules, the super-coordination cluster is likely stronger than any single individual/alliance

• In the long-term, it’s the strategy that (likely) compounds the most while also optimizing safety

• It’s the most open-ended cooperation, including a maximal amount/diversity of agents

• It’s based on an acausal contract that can be signed from any point in space and time (without necessity of a direct encounter)

The alterity of a cooperative alien agent is an opportunity to cover your blind spots and expand the open-endedness of the universe. Because we always have uncertainty, we complete each other (ideally in an ecosystem-continuum of not-too-similar, not-too-dissimilar agents/groups).

Boundaries and autonomy lead to endemicity and idiosyncrasy:
There is a unique opportunity in your space-time “zone” (your life/planet…) to bloom irreducible properties that cannot exist in other ‘places’ of the global space-time (compared, that is, to other past, present and future space-time zones). This should imply something like the zoo hypothesis.

As a civilization, we should not become grabby aliens that mindlessly colonize habitable (computational?) spaces (to a reasonable extent, as resource/power is necessary to defend optionality). This is why the super-coordination protocol seems able to explain Fermi’s paradox.

-> You should not taint immature alterity with your self-signature (ipseity).

You ought to respect autonomy, but if you don’t, sooner or later the super-coordination alliance will make you pay the price of your fault. Alterity is a resource. There is no synergity without it.

So take it as advice: you should not disrespect alterity.

‘Quick notes’ follow-up:

As mentioned earlier, optimizing for super-coordination would imply searching for the most minimal settings that enable super-coordination.

If simulations are too costly/impossible, planets like ours might be real-life experiments (no signs of other civilizations etc.). In which case we might be, temporarily, part of the transition towards “a greater good”?

Messy_world_petty_bipedes.exe

It’s plausible that the experimenters aren’t yet fully aligned with super-coordination. For example, they might be AIs biased by utilitarianism.

Furthermore, due to the importance of alterity, there is a relative autonomy in endemic/idiosyncratic suffering. Depending on the agent/civilization enacting super-coordination, there might be different commitments towards suffering relief or autonomy (in the name of open-endedness).

Another answer to the absence of alien contact would be that, compared to universal matters, we are as yet irrelevant. What challenges await us?
What natural disasters? What intergalactic errors are being, or will be, committed?

Cooperative Evolutionary Pressure and Diminishing Returns Might Explain the Fermi Paradox: On What Super-AIs Are Like

The Universe is Full of Aliens!

Enough stability is part of the requirements for diversity to expand.

To explore solutions, we need productive deliberation, methodological agreements/disagreements and bridging systems. I think this plan involves, among other things, an interactive map of debate using features taken from pol.is and moral graphs. We can also develop an encrypted protocol based on super-coordination (scaling legitimate/secured trust).

Using these ideas (and more) I propose a plan to coordinate despite our biases:

Presentation of the Synergity project

I need help with the technical implementation.

We have plans to leverage super-coordination and enable more prosaic fluxes of convergence/information, interfacing with democracy:

So please contact me if you are interested in discussing these subjects, organizing the next steps together.

TLDR-Note: it seems too cool to be true, but there are prices to pay, risks and unknowns.

One issue is that a truly unbiased agent is impossible.

Because of irreducibility, a system necessarily has some unconscious/unaware processes, and experience always gives incomplete data (from a local environment); thus all agents have some kind of motivated/irrational behavior (even though they can also run a rational process).

Another is that short-term vision and lack of data are not decoupled enough from causal power, which means that a “dumb” powerful system (blind automation) can be extremely effective at certain high-impact/high-hazard tasks while still not understanding long-term rationality/cooperation (if you do understand it, you care, or you are a short-lived agent).

The depth of those issues is uncertain. In practice, long-term rationality may come fast and lead to early ‘acausally normal’ behaviors such as super-coordination,

In which case certain issues remain, which I will develop in the long post (errors/hazards, the moral imperative towards suffering clusters, relative autonomy in endemic struggles, and local/early AI or alien cohabitation discomfort).

We also see arguments as to why there might be ‘evolutionary attractors’ favoring the learnability of features enabling early alignment (omniperiodism + hints that correlations of vectors may inherently build operators/shards/meta-vectors that generalize, with non-siloed maximization, and are somewhat convergent across agents/species).

In any case, accidents, oligopoly and misuse (cyber-biorisk etc.) are a complex and pressing danger.
