“There is no Universe”

Saving the foundations of mathematics comes at a huge price

Detail of an ‘antennae’ formed by the Mandelbrot set

Math has a reputation for certainty. Take, for instance, the phrase “put 2+2 together”, which is universally held to mean “draw the obvious conclusion”. Math is regarded as a language of precision, logic and reason. It provides the bedrock upon which all of our best theories of reality are constructed.

Perhaps then it is hard to imagine that just over a hundred years ago a discovery was made which challenged the very foundations of mathematics. A discovery simple enough to lend itself to a very intuitive, everyday analogy; yet profoundly paradoxical enough to avoid resolution for the best part of two decades. Even today, the implications of this discovery have lead mathematicians to proclaim — in full seriousness — that ‘there is no Universe’.

That discovery was Russell’s Paradox, named after the discoverer, mathematician and philosopher Bertrand Russell. In this article, we will introduce the paradox and see its solution, via a quick tour of set theory. Let’s begin with a famous analogy…

The Barber Paradox

You may well have encountered the following riddle:

“There is a barber who shaves the head of all those who do not shave their own head, and only those. Who shaves the barber’s head?”

The paradox here is one of self-reference, and the key is in the strict interpretation of how the barber is defined. If the barber does shave his own head, then he no longer shaves only those who do not shave their own head.

However, if the barber does not shave his own head, then he no longer shaves the head of all those who do not shave themselves. Either way, whether he shaves himself or not, the barber cannot fulfil both parts of the definition. Hence the paradox.

There is an obvious solution here though — we have no reason to assume such a barber must necessarily exist. If we simply argue that the definition “all and only” is impossible in this case, then the paradox is no more. Such a barber simply cannot exist, and there is no obligation to assume otherwise.

The trouble is, in the set theory version of the paradox, dismissing the ‘definition’ as impossible isn’t so straightforward. It meant dealing a cruel blow to the life’s work of logician Gottlob Frege.

Frege’s work just so happened to be the grandly-titled Grundgesetze der Arithmetik (‘Fundamental Laws of Arithmetic’), which aimed to provide a logical foundation for basically all of mathematics.

It took nearly twenty years to patch things up, and in the process, whole new fields of mathematical research opened up. Let’s take a closer look.

Naive Set Theory

In order to understand the true profundity of Russell’s Paradox (and the solutions subsequently proposed), we must first cover the basics of a branch of mathematics called set theory.

The basics of set theory are almost so simple as to raise suspicion. Do you remember Venn diagrams from junior school? That’s set theory, drawn.

At the turn of the 19th Century, set theory was an up-and-coming branch of mathematics, having been pioneered by Georg Cantor. Today, it is often used as a foundation for all of mathematics.

At the core is the concept of ‘a set’: a collection of distinct objects or ‘elements’, whose membership is defined in some meaningful way.

The language of set theory

Mathematicians use precise notation to describe sets. The symbol ∈ is used to show an element belongs to a given set. Likewise, ∉ is used to show an element does not belong a set.

There are symbols for a range of useful logical connectives:

  • Negation “NOT x”: ¬x
  • Conjunction “x AND y”: x ^ y
  • Disjunction “x OR y”: x y
  • Implication “x implies y”: x y
  • If and only if: x y

There are also symbols mathematicians use to quantify statements:

  • The existential quantifier “There exists x”: x
  • The universal quantifier “For all x”: ∀x

These can be combined to make sentences, such as:

xy(y = x + 1)

Which reads as “for all x, there exists y, where y equals x plus one”. When x and y are numbers, this statement is true.

Sentences like this can be used to build mathematical theories.

Defining sets

A set can be defined using the following curly brace notation:

S = {: P(x)}

Here, P(x) is what is known as a predicate function, which is basically a “membership rule”. More precisely, a predicate function is a function which can return a Boolean value (i.e., “true” or “false”) depending on whether its argument meets a specific condition.

The colon ‘:’ is read as “such that”.

Therefore, the notation above reads as: “S is the set of all objects x for which the predicate P(x) returns ‘true’”.

The choice of predicate determines which elements can and cannot be members of a given set. In a sense, it can be thought of as being a ‘membership requirement’, where elements which ‘pass’ are members of the set, and those which ‘fail’ are not.

For example, we can have:

  • The set of all integers (often denoted by ℤ),
  • The set of all non-integers {x : x ∉ ℤ },
  • The set of all negative integers {x : (x ∈ ℤ) ^ (x< 0)},
  • The set of all dogs {x : (x has four legs) ^ (x barks)}
  • The set of all non-dogs: {x: ¬(x is a dog)}

The number of elements in a set is referred to as the set’s cardinality. This is usually denoted as |S| for a given set S. The set of ‘alphabet letters’ has a cardinality of 26.

One important set is the empty set, often denoted by the symbol ∅. This set has a special significance, in that it is uniquely well defined — it is the only set with no elements (i.e., its cardinality is zero; or |∅| = 0). The empty set’s existence can be asserted by the following logical sentence:

xy(y x)

Or, “there exists a set x such that, given any element y, y is not a member of x”.

Other sets important enough to have their own conventional symbols include:

Set operations

Sets can be joined together to define new sets. We can define the following binary relations:

  • “The union of A and B”, A ∪ B
  • “The intersection of A and B”, A B
  • “The complement of B in A”, A \ B
  • “The difference between A and B”, A~B
  • “The symmetric difference between A and B”, A ∆ B
  • “The Cartesian product of A and B”, A x B

Sets within sets

Another important concept in set theory is that sets can belong in sets!

For instance, consider the ‘set of shapes’. Within this set are, amongst many others: the ‘set of circles’, the ‘set of squares’, the ‘set of triangles, the ‘set of hexagons’. The larger set contains many smaller, well-defined sets.

These smaller sets are subsets of the larger set. Likewise, the larger set is a superset of each of the smaller sets.

In general, if X is a subset of Y, we can write:

X ⊆ Y

We can come up with examples of sets which belong in themselves. For example, consider the set below:

S = { x : |x| > 1 }

This is the “set of all sets which contain more than one element”. It contains (amongst others) the ‘set of prime numbers’, the ‘set of even numbers’, the ‘set of all triangles’… and so on.

This means it contains more than 1 object, and therefore fulfils its own membership criteria. Therefore, S is a member of itself (i.e., S ∈ S)!

Likewise, sets can also not belong in themselves. The ‘set of all integers’ is itself not an integer.

So far, so good — but here is where things get messy…

Russell’s paradox

The notation S = {x : xx} defines a set which raises the following question:

Let S be the ‘set of all sets which do not contain themselves’. Is S a member of S?

Is S a member of itself? Let’s try both possibilities:

  • Yes, S ∈ S → then S fails its own criteria for inclusion!
  • No, S ∉ S → then S meets its own criteria for inclusion!

See how we have encountered a paradoxical situation?! There is simply no valid answer. And unlike the barber analogy, we cannot simply dismiss the paradox by claiming S is impossible. On what grounds would we do so?

The way we have defined S is exactly the way we’ve defined every other set so far — a collection of objects, whose membership is determined by a logical predicate. We haven’t broken any of the rules laid out by the ‘naive’ version of set theory we discussed before.

In math, the principle of reductio ad absurdum dictates that any statement which implies a contradiction (or ‘absurdity)’ is usually dismissed as false. In this case, the basic rules of ‘naive’ set theory appear to do exactly this!

This is a strong indication that these rules alone are not sufficient.

Gottlob Frege (L) and Bertrand Russell (R)

The search for a more complete list of rules ensued after Russell discovered his eponymous paradox in his 1901. He wrote about it to the unfortunate Gottlob Frege in 1902. Frege’s response was to add a rather tragic appendix to his life’s work:

“Hardly anything more unfortunate can befall a scientific writer than to have one of the foundations of his edifice shaken after the work is finished”

Russell elaborated on the paradox in his 1903 work ‘The Principles of Mathematics’, which was the precursor to his and Whitehead’s epic ‘Principia Mathematica’, published the following decade.

Once the paradox was published, the race was on to find a way out. This lead to the development of axiomatic set theory — an effort to rewrite the rules so that mathematicians could avoid paradoxes and other such inconveniences.

Finding an escape route

Ernst Zermelo was a German-born mathematician who rose to prominence in the early 20th Century by solving a long-standing set theory problem called the ‘well ordering’ theorem.

Indeed, set theory became something of his speciality. After learning of Russell’s paradox in 1904, Zermelo began working on providing a new system of rules (or ‘axioms’) to improve upon ‘naive’ set theory.

Zermelo’s work was slow to become accepted, mainly because he couldn’t prove his system was any more consistent than naive set theory. However, it clearly showed enough promise, as by the 1920s his system had been developed further by other mathematicians. It ultimately became the basis for our modern understanding of set theory.

Ernst Zermelo (L), Abraham Fraenkel (C) and Thoralf Skolem (R)

In particular, important contributions were made by Abraham Fraenkel and Thoralf Skolem, which eventually lead to the development of what is now known as Zermelo-Fraenkel set theory. It was this reworking of the fundamental axioms of set theory that provided mathematicians with an escape route from Russell’s Paradox.

The solution was made possible by the introduction of one axiom schema in particular: that of restricted comprehension.

To better understand how this resolves Russell’s Paradox, we first need to take a look at unrestricted comprehension.

Anything goes?

We have (kind of) already seen a definition for unrestricted comprehension — at least, informally.

Above, we looked at how a set can be defined as a collection of elements that return “true” for a given logical predicate. Leaving it at that (i.e., without adding any restrictions) gives rise to Russell’s Paradox, as we just saw.

S = {x : P(x)}

This is the axiom schema of unrestricted comprehension. We are allowing any set to be defined by the elements which return ‘true’ for a given predicate.

[Axiom schema refers to the fact that rather than defining one specific axiom for one specific predicate, we are defining a ‘blueprint’ axiom that holds for any given predicate.]

To avoid Russell’s Paradox, it is clear we need to add some restrictions to the way in which we can define a set. A quick fix would be simply to allow sets to be defined as above, but make an exception in the specific case where the predicate gives rise to Russell’s Paradox (i.e., when P(x) = xx).

However, singling out just one case seems a bit messy — especially for a theory as fundamental as set theory.

Ideally, we want our axioms to be as general and as few as possible. This ties into philosophical notions of mathematical beauty or elegance. If our axiom system requires lots of ‘fixes’ and special exceptions, it suggests we haven’t yet struck upon an abstraction foundational enough.

Luckily, Zermelo-Fraenkel set theory gives us another solution, by defining the axiom schema of restricted comprehension. It is often referred to as the axiom schema of specification, for reasons that will become clear shortly.

∀A∀P∃B∀C[C ∈ B ⇔ C ∈ A ^ P(C)]

This is a logical sentence, like the ones we’ve seen already. It reads as:

“Given any set A and any predicate function P, there exists a set B, where given any set C, C is a member of B if and only if C is a member of A and predicate P(C) is true”

This may still be a little unclear, but we can paraphrase:

  • “Given any set A and predicate function P, there exists a set B…
  • such that for any set C…
  • C is a member of B if and only if it is a member A and P(C) is true”

In plain English — every set must be defined as a subset of another specified set, as well as returning ‘true’ for a given predicate function.

Be specific!

In other words, this axiom schema is a game changer. It alters the rules of how we define sets.

Previously, we could define a set by making reference only to some predicate function, P(x). If P(x) returned true, then x was a member — no further questions asked.

Now, membership of a set has one extra requirement. As well as passing a predicate P(x), an element also has to belong to a specified pre-existing set.

This means, our ‘template’ for defining a set is no longer:

S = {x : P(x)}

But is now:

S = {x Y : P(x)}

where Y is some specified superset of S.

Paradox resolved?

Now, let’s see what happens if we try to conjure up Russell’s paradox with our infamous predicate P(x) = xx.

S = {x ∈ Y : xx}

where Y is some arbitrary superset of S, as required by the axiom schema of restricted comprehension.

What happens if we check whether S belongs in itself?

To be a member of itself, S must meet two ‘membership’ requirements.

  • “S must be a member of Y”, or S ∈ Y
  • “S must not be a member of itself”, or S ∉ S

Let’s start with the first of these requirements, S ∈ Y. Clearly, if S is not a member of Y, it has already failed the ‘membership test’. If so, we can simply show that S cannot be a member of itself. Case closed!

But what if we assume S is a member of Y? If so, it passes the first requirement. Let’s look at the second requirement.

For S to be a member of itself, (i.e., S ∈ S), it must pass the second requirement, which is S ∉ S. In other words, “S is a member of S if and only if S is not a member of S”. That doesn’t sound right…

We can rewrite the above as the following logical sentence:

S ∈ Y ⇒ S ∈ S ⇔ S ∉ S

Which ‘translates’ as:

  • Claim: “S is a member of Y”
  • “If Claim is true, then S is a member of S…
  • …if and only if S is not a member of S”

We’ve hit a contradiction! This sounds very much like the paradox we encountered before. Crucially, this time there is a way out.

  • The logical sentence above is of the form: “Claim implies contradiction”.
  • Therefore, we can dismiss the claim as absurd, and hence false.

And if the claim S ∈ Y is false, then S ∉ Y must be true. S cannot a member of Y.

And if S is not a member of Y, then it has failed one of its own membership criteria. S cannot be a member of S. No paradox invoked.

The axiom of restricted comprehension effectively provides us with a ‘back-up’ scenario that gets us out of trouble when we reach the inevitable contradiction.

However, this axiom does bring about a non-trivial consequence. As phrased by the late Paul Halmos:

“There is no universe”

“There is no Universe”

Fortunately for astronomers, we are talking of the mathematical universe. Specifically, this is a proposed ‘universal’ set that might be defined as the ‘set of all sets’. As a logical sentence:

xy(y x)

or, “There exists a set x, such that for any element y, y is a member of x”.

The issue is, such a set would contain literally everything in mathematics, and this would make it incompatible with the axiom schema of restricted comprehension.

For instance, suppose the universal set U does exist. That is to say, the following logical sentence is true:

∃U∀y(y ∈ U)

Then we could refer to it when defining any other set in accordance with the axiom schema of restricted comprehension. For instance:

S = { x ∈ U : x x }

Because U is the ‘set of all sets’, any set x surely has to be a member, meaning the first 'membership requirement’ is effectively a free pass. This gives rise to Russell’s paradox again!

As we saw above, a set defined in this way has to meet two ‘membership requirements’.

  • “S is a member of U”, or S ∈ U
  • “S is not a member of itself”, or S ∉ S

Exactly as before, we can show how assuming S ∈ U results in an absurdity.

  • S ∈ U ⇒ S ∈ S ⇔ S ∉ S

or, “If S is a member of U, then <contradiction>”

This means we have to take S ∈ U as false. In other words, S is not a member of U. But this leads to another contradiction!

If S is not a member of U, then the sentence:

∃U∀y(y ∈ U)

must be false, since we have found an element which cannot be a member of the so-called ‘universal’ set.

In logical notation:

∃U∀y(y ∈ U) ⇒ ∃S[(S ∈ S ⇔ S ∉ S) ∨ (S ∉ U)]

Which reads as:

  • “The existence of the universal set U implies that…
  • …there exists a set S such that…
  • …either S is a member of S if and only if S is not a member of S…
  • …or, S is not a member of U”

Since we know “S is a member of S if and only if S is not a member of S” is absurd, the statement simplifies to:

∃U∀y(y ∈ U) ⇒ ∃S[S ∉ U]

Which reads as:

  • “The existence of the universal set U implies that…
  • …there exists a set S, such that S is not a member of U”

Which is again, a contradiction! The universal set cannot exist. We are left to conclude that the claim “there is a universe” is absurd.