Confusing Plausibility with Probability in AI Safety

Philip Dhingra · Published in Philosophistry · Aug 14, 2023


Russell’s teapot, generated by DALL·E

Is there an analog of the Lizardman’s Constant, but for scenario estimation? If someone tells you a plausible scenario that has no outside view, what is the immediate probability that your heart assigns? Is it 5%?

The problem with the standard AI Extinction line is that there is no good outside view and no solid basis for inside views. Plausible stories then rush in to fill the vacuum. Scott Alexander, for example, gives a 33% chance to this sequence:

1. We get human-level AI by 2100.
2. The AI is misaligned and wants to kill all humans.
3. It succeeds at killing all humans.
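To make the conjunction concrete, here is a minimal sketch of the arithmetic, under assumptions the post does not make on Scott’s behalf: that the three steps are independent and equally weighted. On those assumptions, a joint 33% requires each step to sit near 69%, and seemingly modest per-step odds shrink quickly once multiplied.

```python
# Illustrative only: the post doesn't give per-step numbers, so this assumes
# the three steps are independent and equally probable.
joint = 0.33   # the quoted overall estimate
steps = 3

# Geometric mean each step would need for the joint estimate to hold.
per_step = joint ** (1 / steps)
print(f"Each step needs ~{per_step:.2f} probability")  # ~0.69

# The reverse direction: plausible-sounding per-step odds compound downward.
for p in (0.9, 0.7, 0.5):
    print(f"p = {p}: joint = {p ** steps:.3f}")  # 0.729, 0.343, 0.125
```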

#1 has OK outside views. Moore’s Law will continue, or at least some version of it. The Scaling Hypothesis has some credibility, although not as much as Moore’s Law. And there is a “capitalism gets what capitalism wants” or “tech makes what tech wants” driver behind the world economy. This last point is potentially countered by Scott’s own “1960: The Year The Singularity Was Cancelled.” But overall, #1 is relatively clean.

#3 is also OK from an inside view. Eliezer Yudkowsky’s standard scenario is that an Evil AI could spawn a million instances of itself and accomplish whatever it wants. Never mind that such amassing might be bottlenecked by GPU shortages, or…
