How do we know that all electrons are identical? Part 2

Domino Valdano
Physics as a Foreign Language
21 min read · Aug 17, 2017

In Part 1, I went over the Gibbs paradox, a paradox of late 19th century statistical mechanics whose resolution suggested that particles must be identical and indistinguishable at some level. This was the first clue, and got some people thinking about the issue — but it wasn’t really the last word.

In Part 2, I’ll complete my explanation of how physicists know that all elementary particles (such as the electron) are identical by delving into quantum mechanics, a fascinating area of physics that was discovered and developed over the first 3 decades of the 20th century (1900-1930). It should be entirely possible to read Part 2 without reading Part 1; even though they both have to do with why particles are identical, both are self-contained and neither is dependent on the other. Part 1 is basically the explanation as could have been understood circa 1900, while Part 2 is the explanation as understood by 1930 — after quantum mechanics had been completed.

In classical statistical mechanics, you can represent the different possibilities for the state of a system by probabilities. For example, if you know the temperature and pressure of a gas, there is a statistical distribution (called a “probability density function”) of the different particles making up the gas. These particles are bouncing around randomly. At a high temperature, you’re more likely to find an individual molecule of the gas moving rapidly; at a low temperature, you’re more likely to find an individual molecule of the gas moving slowly. But either way there is a whole range of possibilities.

In quantum mechanics, the same is true but it gets a little more complicated. The probability density function in quantum mechanics is given by the square of the magnitude of a complex function called the “wave function”. By complex I mean that instead of taking values that are real numbers (x = 1, 2, 3.4, 9.8, etc.), it takes values that are complex numbers, each of which has a real and an imaginary part (z = 1+i, 2+3.5i, 4.8+9i, etc.) If you’ve never encountered this before, I’m sure it sounds really weird. But I can’t say much more other than: this is just how quantum mechanics works — it’s kinda weird!

So for example, if the wave function for an electron is 1/√2 at position x and 1/√2 at position y, then when you square the magnitudes of these you get probabilities: the chance of finding it at position x is 1/2 and the chance of finding it at y is also 1/2. So you’ve got a 50/50 shot if you look for it in either location.
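
If it helps to see this numerically, here is a tiny Python sketch (using numpy purely for convenience) of the rule: square the magnitude of each amplitude to get a probability.

```python
import numpy as np

# The two amplitudes from the example above: 1/√2 at position x, 1/√2 at position y.
amplitudes = np.array([1/np.sqrt(2), 1/np.sqrt(2)], dtype=complex)

# Squaring the magnitudes turns amplitudes into probabilities.
probabilities = np.abs(amplitudes)**2

print(probabilities)        # [0.5 0.5] -- a 50/50 shot at x or y
print(probabilities.sum())  # 1.0 -- the total probability adds up to 1
```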

So far this is still identical to classical statistical mechanics. If you wanted to, you could also just represent the probability density function in classical physics by the square root of itself everywhere, and nothing would change. The difference is, in quantum mechanics the wavefunction acts less like a mental abstraction and more like an actual physical wave, in that it can exhibit interference.

Dark and light interference fringes

Classically, probability waves don’t interfere with each other. Probability is always a positive number, so if two different particles of a gas each have a small probability p of being found at location x, then the probability of finding either of them there is just 2p (to a good approximation when p is small). Classically, the probability for different events happening (or different measurement outcomes occurring) always adds; it never subtracts.

But in quantum mechanics, it’s the wavefunction itself (rather than its square) which acts as a wave. And since the wavefunction at each point can be any complex number (including positive or negative real numbers), sometimes when you combine different possibilities the contributions add, but other times they subtract! When subtraction occurs — for example, if the amplitudes for two different ways an event could happen completely cancel out, making it impossible — that’s called quantum interference.
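
To see the difference in one line of arithmetic, here is a toy Python example with two made-up amplitudes for two ways the same outcome could occur (the numbers are arbitrary; the point is the order of operations):

```python
import numpy as np

a = 1/np.sqrt(2)     # amplitude for one way the outcome can happen
b = -1/np.sqrt(2)    # amplitude for another way (note the minus sign)

# Classical reasoning adds probabilities, which are never negative:
print(abs(a)**2 + abs(b)**2)   # 1.0

# Quantum mechanics adds the amplitudes first, *then* squares:
print(abs(a + b)**2)           # 0.0 -- complete destructive interference
```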

Let’s assume we have 2 electrons and that there are only 2 locations where each electron could be found, location x or location y. If the two electrons were distinguishable, then we could label them “electron A” and “electron B” and this would mean there are 4 possible states the 2-electron system could be in. Either both A and B are at x, both are at y, A is at x and B is at y, or B is at x and A is at y. To summarize, we have AB = xx, yy, xy, or yx. A common notation for representing states like this in quantum mechanics is to enclose them in angle brackets (Dirac’s “ket” notation): |xx>, |yy>, |xy>, and |yx>.

But scientific research in the 1920’s demonstrated a surprising fact: a system of 2 electrons like this cannot be in 4 different states; there’s only 1 possible state it can be in!

Part of the reason for that you should be able to guess: if there is no way to distinguish electron A from electron B, then states |xy> and |yx> are identical. They are just two different ways of representing the same physical state. Either way, there is 1 electron at position x and 1 at position y.

But that still leaves us with 3 states, not 1 — what’s wrong with having a state like |xx> where both electrons are at position x, or |yy> where both are at position y? It turns out that no 2 electrons can ever occupy the same state. In 1925, Wolfgang Pauli proposed this principle — now known as the Pauli exclusion principle — and in 1940 he was able to prove using quantum field theory that it applies not just to electrons but to all particles of a certain type (those with half-integer spin — electrons have spin 1/2).

Wolfgang Pauli

It would take me too far off topic to give a full explanation of what spin is in this post (if you want to know more, you’re encouraged to read my explanation of spin-1/2 here on Quora, which they just notified me was emailed out to over 100,000 people yesterday). But it turns out, all quantum particles fall into 1 of 2 categories: fermions or bosons. Fermions have half-integer spin and obey the Pauli exclusion principle, while bosons have integer spin and do not.

Fermions tend to have more “matter-like” properties. For example, electrons, protons, and neutrons are all spin-1/2 fermions. They are what make up the building blocks of matter (atoms, molecules, etc.) You may have heard somewhere or another that matter cannot occupy the same space at the same time. This is due in part to the Pauli exclusion principle (as well as electrostatic repulsion between different atoms).

Bosons tend to have more “radiation-like” properties. For example, photons — the particles responsible for light and other electromagnetic radiation (radio waves, microwaves, wifi, UV, x-rays, gamma rays, etc.) — are spin-1 bosons. The Higgs boson discovered at the LHC in 2012 is a spin-0 boson. And most theoretical physicists believe that gravity is mediated by a spin-2 boson called the graviton, although this has yet to be detected in a laboratory.

The Pauli exclusion principle is not just an axiomatic rule; it’s a conclusion that can be derived from our best fundamental theories of physics. In fact, fully deriving it requires combining Einstein’s theory of special relativity with quantum mechanics. Because of the way in which spin works, the wavefunction of 2 bosons is always forced to be “symmetric” whereas the wavefunction of 2 fermions is always forced to be “antisymmetric”.

In this context, symmetric simply means that if you interchange the locations of two bosons then nothing happens — you get back exactly the same state. Antisymmetric means something similar, but not quite the same: if you interchange the locations of two identical fermions then you get back the same state but with a minus sign in front of it.

Quantum mechanics is done in a type of vector space called a “Hilbert space” where whenever you have 2 states, there is another state which can be formed from them by adding them together in a “linear combination”. For example, if |xy> and |yx> are both states in the Hilbert space, then |xy>+|yx> is also a state in the same Hilbert space. And so is |xy>-|yx> or any other linear combination like 3|xy>-2|yx>. This way of combining states in quantum mechanics is called “superposition”. Instead of definitely being at one location or definitely being at another, the electron has some chance of being at one and some chance of being at the other.

However, because these states represent a quantum wavefunction, and I mentioned earlier that the square of the magnitude of a quantum wavefunction is a probability distribution, the states must be normalized so that the total probability of the electron being found anywhere adds up to 100% (or 1). Therefore, the coefficients in the linear combinations above have to be divided by an overall factor to normalize them.

Combining this with the requirement that fermionic wavefunctions must always be antisymmetric, it means that the only state these 2 electrons can be in (assuming there are only 2 possible locations for them) is 1/√2|xy>-1/√2|yx>. (Or the same thing multiplied by any complex number of magnitude 1, which is physically equivalent.) If we interchange the x and y in this, we get 1/√2|yx>-1/√2|xy>, which is exactly -1 times the original state. Mathematically this is a different state in the Hilbert space, but physically it means the same thing. If you square the 1/√2 coefficients, they tell you that there is a 1/2 chance that electron A is at x and electron B is at y, and a 1/2 chance that electron B is at x and electron A is at y: a 50/50 split.
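
Here is a small numerical check of these claims in Python. The ordering of the basis states as |xx>, |xy>, |yx>, |yy> is my own arbitrary bookkeeping choice, not anything standard:

```python
import numpy as np

# Basis ordering (my arbitrary choice): |xx>, |xy>, |yx>, |yy>
xy = np.array([0, 1, 0, 0], dtype=complex)
yx = np.array([0, 0, 1, 0], dtype=complex)

psi = (xy - yx) / np.sqrt(2)   # the normalized antisymmetric state

# Swapping the two electrons exchanges |xy> and |yx> (and fixes |xx> and |yy>).
swap = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])

print(np.allclose(swap @ psi, -psi))   # True: swapping gives -1 times the state
print(np.abs(psi)**2)                  # [0. 0.5 0.5 0.] -- the 50/50 split
```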

What we did was take two states that are physically indistinguishable — |xy> and |yx> — and form a superposition of them which has the antisymmetric property required of fermions. But what about the states |xx> and |yy>? These can never be made antisymmetric, because interchanging x with x, or y with y, doesn’t change anything. Because they are intrinsically symmetric states, they just can’t exist for fermions — they only apply to bosons.

As you may have guessed, this means that for bosons there are 3 possible states they could exist in instead of just 1. For 2 photons which could be at location x or location y, the 3 different states they could be in are |xx>, |yy>, or 1/√2|xy>+1/√2|yx> — all of which are perfectly symmetric if you interchange x and y. (No minus sign.)

To summarize, a pair of distinguishable particles which can be at 2 different locations has 4 possible states they could be in, whereas a pair of fermions has only 1 possible state and a pair of bosons has 3 possible states. This leads to very different statistical behavior for fermions and bosons, and explains why a lot of the properties of the 2 kinds of particles are so different.
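
This state counting is simple enough to verify by brute force. A minimal Python sketch, enumerating the possibilities directly:

```python
from itertools import product

locations = ['x', 'y']

# Distinguishable pair: every ordered assignment (A's location, B's location) counts.
distinguishable = list(product(locations, repeat=2))        # 4 states

# Identical bosons: order doesn't matter, so keep one representative per multiset.
bosons = {tuple(sorted(pair)) for pair in distinguishable}  # 3 states

# Identical fermions: Pauli exclusion also forbids both being in the same place.
fermions = {pair for pair in bosons if pair[0] != pair[1]}  # 1 state

print(len(distinguishable), len(bosons), len(fermions))     # 4 3 1
```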

In an earlier post of mine, I told the story of how Max Planck’s study of entropy in the late 1800’s led to the initial discovery of quantum mechanics. During that same time period, there was already a big clue which had been around for a while — a known puzzle concerning Maxwell and Boltzmann’s version of thermodynamics (which later became known as statistical mechanics). Using the Law of Equipartition, classical thermodynamics predicted the wrong heat capacities for many gasses at low temperatures.

A “heat capacity” is the amount of heat something must absorb for its temperature to rise by a fixed amount (usually, 1 degree Celsius). Some gasses are able to absorb a lot of heat (thermal energy) without raising their temperature much, while for others, exposure to only a small amount of heat will send the thermometer soaring. The theory behind this, according to Maxwell and Boltzmann, was that some gasses are better at absorbing and storing thermal energy than others because they have a greater number of internal degrees of freedom — these “degrees of freedom” serve effectively as containers within which the energy can be stored. The Equipartition Theorem (proposed by Maxwell and then proved more generally by Boltzmann) states that in equilibrium, every gas (or liquid, or solid) will have a total internal energy of 1/2 NkT, where N is the number of degrees of freedom in that gas, T is the temperature of that gas, and k is just Boltzmann’s constant. In other words, the gas will have 1/2 kT of thermal energy per degree of freedom.
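
As a quick sketch of what equipartition predicts (nothing here beyond the formula above, just written out in Python):

```python
k = 1.380649e-23  # Boltzmann's constant, in J/K

def internal_energy(n_dof, T):
    """Equipartition: (1/2) kT of thermal energy per degree of freedom."""
    return 0.5 * n_dof * k * T

# The heat capacity per molecule is dU/dT = (1/2) * n_dof * k, with no
# temperature dependence at all -- the classical prediction that failed.
for n_dof, label in [(3, "monatomic"), (5, "diatomic, rotating"),
                     (7, "diatomic, rotating and vibrating")]:
    print(label, internal_energy(n_dof, T=1.0) / k, "k per molecule")
```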

For example, if we have a gas of monatomic Hydrogen (monatomic means every molecule is a single atom), each atom has 3 degrees of freedom because it can move in any of 3 independent directions: up or down, left or right, and backwards or forwards (3 directions, since we live in a 3-dimensional space). Thermal energy can be absorbed by a Hydrogen atom by increasing its kinetic energy in any of those 3 independent directions.

Image Credit: astarmathsandphysics.com

On the other hand, if we have a gas of diatomic Hydrogen molecules (diatomic means each molecule is composed of 2 atoms connected by a chemical bond) then there are more degrees of freedom (possible ways in which each molecule of the gas can move). In addition to the freedom to move linearly in any of 3 dimensions, it also has the freedom to rotate along either of 2 different axes.

Although roughly 75% of the ordinary matter in the universe by mass is Hydrogen, most of the Hydrogen on Earth is diatomic. That’s because it takes extreme conditions, like the temperatures and pressures inside stars (such as the sun), to keep Hydrogen monatomic. Under the range of temperatures found near the Earth’s surface, Hydrogen naturally pairs up into its diatomic form. But what seemed weird in the 1800’s is that depending on the exact temperature, diatomic Hydrogen can have different heat capacities.

Image Credit: Hyperphysics

At room temperature, Hydrogen has a heat capacity per molecule of close to 5/2 k (or if it’s per mole instead of per molecule, this is written as 5/2 R as in the diagram). According to Maxwell and Boltzmann’s view of thermodynamics, this implies 5 degrees of freedom (though classically you’d have expected 7, counting 2 more degrees of freedom for vibrations). But the exact value at room temperature is about 2.47k. And as the gas is cooled down to below 0 Celsius (273 K), it gradually drops down from 2.47k all the way to eventually settle at 1.5k. But 3/2 k would imply that it has only 3 degrees of freedom — in other words, that it’s a monatomic gas! Why would Hydrogen become a monatomic gas at low temperatures? And what does it mean to have a value in between 3 and 5 degrees of freedom? Heat capacity was supposed to be independent of temperature. There were similar known problems with the measured heat capacities of Oxygen and Nitrogen gas.

There were many proposed explanations for this puzzle in the 1800's, but nobody understood the full answer until the development of quantum mechanics. The full answer is that the ways in which rotational degrees of freedom can be excited in molecules are quantized. Classically, something can rotate at any speed no matter how slowly — so any amount of energy, no matter how small, could start something rotating. But in quantum mechanics, angular momentum is quantized so rotations can only happen in certain discrete increments. Either a molecule starts rotating rapidly, or not at all — there is no in between.

Because of this, at low temperatures the average amount of energy that’s getting exchanged between random collisions of molecules is just too small to excite these degrees of freedom. At low temperatures, Hydrogen gas is still diatomic but the 3 translational degrees of freedom are the only ones that can be excited — there is just not enough energy to start the molecules rotating. Once the temperature gets up above a certain threshold, the typical energies involved in collisions become enough to excite rotations. The higher the temperature, the greater the probability for energies high enough to cause rotations; therefore the heat capacity gradually rises up to the level one would have expected for something composed of molecules with 5 degrees of freedom.

If you keep raising the temperature further, eventually it gets hot enough to excite vibrations (imagine the bond between the atoms is like a spring, stretching and compressing alternately), which it turned out were also quantized. At very hot temperatures, diatomic gasses have 7 accessible degrees of freedom, which is what you would have thought was true at any temperature classically. Quantum mechanics provides a similar explanation for the heat capacities of Oxygen and Nitrogen.
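
Here is a minimal Python sketch of how this freeze-out works, using a toy rigid-rotor model. The rotational temperature θ_rot ≈ 85 K for Hydrogen is an assumed ballpark figure, and I’m ignoring the ortho/para-Hydrogen subtleties that modify the real low-temperature curve, so treat this as an illustration rather than a fit to data:

```python
import math

k = 1.380649e-23   # Boltzmann's constant (J/K)
theta_rot = 85.0   # assumed rotational temperature of H2, in kelvin

def mean_rot_energy(T, jmax=50):
    """Average rotational energy per molecule for quantized levels
    E_J = J(J+1) k theta_rot, each with degeneracy 2J+1."""
    z = 0.0   # partition function
    e = 0.0   # energy-weighted sum
    for J in range(jmax + 1):
        E = J * (J + 1) * k * theta_rot
        w = (2 * J + 1) * math.exp(-E / (k * T))
        z += w
        e += E * w
    return e / z

def rot_heat_capacity(T, dT=0.01):
    """Numerical derivative d<E>/dT, in units of k."""
    return (mean_rot_energy(T + dT) - mean_rot_energy(T - dT)) / (2 * dT) / k

# Rises from ~0 (rotations frozen out) toward ~1 k (the classical 2 x 1/2 k).
for T in [20, 50, 100, 300, 1000]:
    print(T, round(rot_heat_capacity(T), 3))
```

Adding the ever-present 3/2 k of translational heat capacity on top of this reproduces the climb from 1.5k to 2.5k seen in the diagram above.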

Einstein proposed in 1906 that quantization could solve this apparent conflict between Maxwell and Boltzmann’s Law of Equipartition and the experimentally measured curves for the specific heats of diatomic gasses. And his hypothesis was confirmed in 1910 by Nernst when he measured the specific heats of various gasses to greater accuracy and found they agreed with Einstein’s theoretical predictions. This was one of the very first experimental tests of early quantum mechanics, and it passed!

But getting back to identical particles, there’s another way in which a quantum mechanical theory of gasses differs substantially from the old classical theory of gasses of the 1800’s.

If the individual particles of a gas were distinguishable, then when you cool the gas down to absolute zero they would all drop into the ground state — the state with the lowest energy. Usually, you would think the ground state is one where every particle is fully at rest and there is no kinetic energy, rotational energy, or any other kind of movement or internal energy.

But for a gas of fermions, their indistinguishability leads to the Pauli exclusion principle which prohibits more than one identical particle from going into the same state. Therefore, they cannot all be in the ground state. Often the energy levels a particle can occupy are represented by a ladder diagram, where each energy level is another rung on the ladder. Usually there is also “degeneracy”, where multiple states have exactly the same energy — in which case they can be represented by the same rung on the ladder as long as we keep track of the fact that there is degeneracy (multiple states) on that rung.

What happens when a gas of fermions (also known as a Fermi gas) gets cooled down to absolute zero is that each state of a given energy gets filled up, starting with the ground state and moving on up the ladder until all of the particles in the gas are accounted for and have a rung. Again, because of degeneracy, multiple particles can be on the same rung. But as long as the degeneracy is small compared to the total number of particles, this still means that lots of rungs will get filled up. Once you fill up all of the rungs with particles, the highest energy level that gets filled is called the “Fermi energy”.
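
A minimal sketch of this filling procedure in Python (the energies and degeneracies below are made-up toy numbers):

```python
def fill_fermions(levels, n_particles):
    """Fill energy levels from the bottom up, at most one fermion per state.
    Each level is an (energy, degeneracy) pair; returns the occupations and
    the highest filled energy (the Fermi energy)."""
    occupied = []
    remaining = n_particles
    for energy, degeneracy in levels:
        put = min(degeneracy, remaining)   # Pauli exclusion: one per state
        occupied.append((energy, put))
        remaining -= put
        if remaining == 0:
            return occupied, energy
    raise ValueError("not enough states for all the particles")

# Toy ladder: (energy, degeneracy) for each rung.
levels = [(0.0, 2), (1.0, 2), (2.0, 4), (3.0, 4), (4.0, 6)]

occupations, fermi_energy = fill_fermions(levels, n_particles=10)
print(occupations)    # [(0.0, 2), (1.0, 2), (2.0, 4), (3.0, 2)]
print(fermi_energy)   # 3.0 -- the last rung that received any particles
```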

In 1910, the same year Nernst confirmed the quantum theory of heat capacities for diatomic gasses, a new type of star was discovered by astronomers. By 1922 it would get named a “white dwarf”, but already in 1910 astronomers noticed it was different from ordinary stars and had some pretty strange properties. The puzzling thing about this kind of star was that it seemed far too dense for classical physics to explain how it was able to shine.

Sirius B (the tiny dot) is the closest white dwarf star

The mass of a white dwarf is similar to the Sun’s, and yet all of that mass is packed into a tiny ball that’s typically about the same size as the Earth. Considering the Sun is about 333,000 times as massive as the Earth, that means it’s an extremely dense kind of matter. At the time, it was far denser than anything physicists had ever seen or heard of, even though stars were supposed to be burning gasses of ions (also known as plasmas), not solid matter. If it was some kind of extremely dense solid, then why would it shine at all?

It turned out it was indeed a plasma, not a solid. But it was really really dense. No classical theory of gasses can explain how a gas could be this dense and not just collapse in on itself due to its own gravity. In 1926, R. H. Fowler correctly explained, using the mathematics of quantum mechanics, that white dwarfs are actually Fermi gasses rather than classical gasses.

In other words, a white dwarf is a gas of identical fermions. Specifically, the gas that matters here is its gas of electrons (the ions are still there and carry most of the mass, but it’s the electrons’ quantum behavior that holds the star up). At high temperatures and low pressures, a gas of electrons behaves no differently than an ordinary classical gas. It doesn’t matter that the individual electrons are identical because there are many more states available than there are electrons. They have a large volume to move around in, and a lot of different ways in which they can move since the temperature is high enough. But cool the same gas down enough, or raise the pressure so that it gets packed into a small enough volume, and the electrons start getting squeezed into the same states. Except that they can’t go into exactly the same state due to the Pauli exclusion principle. So they just fill the states roughly up to the Fermi energy and stop.

If they were distinguishable particles, then they’d all settle into the same ground state and the energy would essentially be zero — no movement in the ground state. But because they are fermions, there is a “degeneracy pressure” which keeps them from going into the same state and prevents the whole thing from collapsing due to gravity. The statistics of how fermions in this situation behave are known as “Fermi-Dirac statistics”, which only become similar to classical “Maxwell-Boltzmann statistics” in the limit of high temperatures and low pressures. Statistics in this context refers to the probability that each particle will have a given energy at equilibrium, as a function of temperature. Or another way of saying it: what is the expected number of particles which will be found at each energy level for a system after it reaches equilibrium?

You can derive Maxwell-Boltzmann statistics by counting how many different unique states the particles can occupy using combinatorics, and then finding out where this distribution of states reaches a maximum (also representing maximum entropy, aka equilibrium). For lower energies, the degeneracy is generally lower so there aren’t as many states. But if the energy of an individual particle is too high, then it reduces the amount of energy left over to be distributed amongst the other particles, resulting in fewer possible combinations. So there is a balance, an equilibrium condition, where the whole system is at maximum entropy when the states of a given energy are filled with an expected number of particles N_i = K_i/e^((E_i-µ)/(kT)). K_i is the degeneracy; it represents how many states are at a given energy level E_i.

The factor of e^(-E_i/(kT)) (where k is the Boltzmann constant and T the temperature) is known as a “Boltzmann factor”. The Boltzmann factor means as we move up the ladder of energy levels, the number of particles occupying each rung gets exponentially less and less (even though there is more and more space for them due to the degeneracy). But the temperature controls how fast this exponential drops off. The Greek symbol µ in e^((E_i-µ)/(kT)) is called the “chemical potential” and is unimportant for now, but it represents how much the total energy of a system would increase if an additional particle were added to it. (For many systems, µ is 0 or approximately 0 so it’s often not even included.)
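
In Python, and in units where k = 1 (a common convenience), the Maxwell-Boltzmann occupation and its exponential drop-off look like this (the energy values are arbitrary toy numbers):

```python
import math

k = 1.0  # work in units where Boltzmann's constant is 1

def maxwell_boltzmann(E, T, K=1.0, mu=0.0):
    """Expected occupation of a level: N_i = K_i / e^((E_i - mu)/(kT))."""
    return K / math.exp((E - mu) / (k * T))

# Climbing the ladder of energy levels: occupation falls off exponentially,
# and the temperature sets how fast.
for E in range(5):
    print(E, maxwell_boltzmann(E, T=1.0), maxwell_boltzmann(E, T=2.0))
```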

As long as the gas is sparse enough that we don’t have to worry about two different particles occupying the same state (the expected N_i’s in all states are less than 1), then the same derivation works just fine for fermions or for bosons — it doesn’t matter, both lead to the same Maxwell-Boltzmann statistics. However, if you consider the case where the gas is very dense or at a low enough temperature, then suddenly it matters a lot whether the particles are fermions or bosons (or neither, which doesn’t actually happen in nature but could be imagined). For fermions, the expected number of particles occupying each energy level once you count the states and find their maximum is N_i = K_i/(e^((E_i-µ)/(kT))+1) — this is what is known as Fermi-Dirac statistics. For the high density conditions in white dwarf stars, or low temperature conditions in other Fermi gasses, the chemical potential µ becomes important and it’s approximately the same as the Fermi energy discussed earlier (and at zero temperature it’s exactly the same). Note that the only difference between Maxwell-Boltzmann statistics and Fermi-Dirac statistics is the “+1” in the Fermi-Dirac formula. Such a slight difference, and yet it has such a huge effect on the way the matter behaves!
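
A quick numerical comparison of the two formulas (again in toy units with k = 1) shows both limits: agreement when occupations are tiny, and the fermionic cap of one particle per state when they are not.

```python
import math

k = 1.0  # units where Boltzmann's constant is 1

def mb(E, T, K=1.0, mu=0.0):
    return K / math.exp((E - mu) / (k * T))

def fd(E, T, K=1.0, mu=0.0):
    return K / (math.exp((E - mu) / (k * T)) + 1.0)  # the "+1" is the only change

# Sparse / high-temperature limit (occupations << 1): the two agree closely.
print(mb(10.0, 1.0), fd(10.0, 1.0))   # both about 4.5e-05

# Dense / cold limit (E below mu): MB blows up, but fermions cap at 1 per state.
print(mb(-5.0, 0.1), fd(-5.0, 0.1))   # astronomically large vs. ~1.0
```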

What about bosons? They don’t obey the Pauli exclusion principle, so wouldn’t a gas of bosons appear no different from an ordinary classical gas? Nope, bosons have their own set of statistics they follow known as “Bose-Einstein statistics”, which is different from both Maxwell-Boltzmann and Fermi-Dirac statistics.

Even though they don’t obey the Pauli exclusion principle, identical bosons are still different from distinguishable particles because the combinatorics are still different. Remember back when we were discussing the quantum states in a Hilbert space? For a pair of identical bosons each with only 2 available states, we saw that the pair has only 3 possible states they can be in instead of the 4 you would expect if they were distinguishable. The generalization of this is that for a set of N identical bosons with K available states, there are “(N+K-1) choose N” = (N+K-1)!/N!/(K-1)! different unique states they could be in, instead of K^N for distinguishable particles. (Where of course, the ! marks are mathematical factorial symbols as in Part 1.) You can easily check that this works for my original example where N=K=2: (2+2-1)!/2!/(2-1)! = 3!/2!/1! = (3*2*1)/(2*1)/1 = 6/2 = 3.
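
The counting is easy to double-check by brute force in Python, using itertools to enumerate the multisets directly (the N = 3, K = 4 case is just an arbitrary toy example):

```python
from itertools import combinations_with_replacement, product
from math import factorial

def boson_states(N, K):
    """(N+K-1)! / (N! (K-1)!): ways to put N identical bosons into K states."""
    return factorial(N + K - 1) // (factorial(N) * factorial(K - 1))

# The example from the text: N = K = 2 gives 3 states (vs 2^2 = 4 distinguishable).
print(boson_states(2, 2))                                      # 3

# A bigger toy case, verified by enumerating multisets directly:
N, K = 3, 4
print(boson_states(N, K),                                      # 20
      len(list(combinations_with_replacement(range(K), N))),   # 20
      len(list(product(range(K), repeat=N))))                  # 64 = K^N
```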

Allowing every energy level to have a different number of degenerate states K_i, the formula has to be expanded to a product of factors, one for each energy level E_i, each of the form (N_i+K_i-1)!/N_i!/(K_i-1)! (the same thing as above, just with i subscripts to distinguish the different energy levels). After using calculus to find the maximum of this expression, the resulting equilibrium state can be identified as the one where there are N_i = K_i/(e^((E_i-µ)/(kT))-1) particles in each energy level E_i. This is the formula for Bose-Einstein statistics. Notice, the only difference between this and the Fermi-Dirac formula is that the +1 is now a -1! This makes all 3 of them easier to remember. Although usually for bosons, the µ is 0 because they can be easily created or destroyed — for example, photon number is not conserved in our universe, so photons can appear and disappear at no cost when needed.
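
And here is the Bose-Einstein occupation in the same toy Python units as before, with µ = 0 as for photons. Note how a single low-energy level happily holds many bosons at once, with no Pauli exclusion:

```python
import math

k = 1.0  # units where Boltzmann's constant is 1

def bose_einstein(E, T, K=1.0, mu=0.0):
    """N_i = K_i / (e^((E_i - mu)/(kT)) - 1): a -1 where Fermi-Dirac has +1."""
    return K / (math.exp((E - mu) / (k * T)) - 1.0)

# With mu = 0 (photons), a single low-energy level can hold many particles:
for E in [0.1, 1.0, 5.0]:
    print(E, bose_einstein(E, T=1.0))   # ~9.5, ~0.58, ~0.0068
```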

The formula for Bose-Einstein statistics was discovered by an Indian physicist named Satyendra Nath Bose, a year or two before Fermi-Dirac statistics was discovered and applied to white dwarfs. The story of how he came upon it is fascinating. He was giving a lecture in 1924 in British India (in what is now Bangladesh) on the “ultraviolet catastrophe”. The ultraviolet catastrophe was the name given in the early 20th century to the problem that nobody knew how to fully derive Planck’s formula for blackbody radiation from statistical mechanics, which I discuss at length in what is so far my most popular piece on Medium (the story of how Planck stumbled upon quantum mechanics by studying entropy).

Planck had been correct in pointing out that the key was to assume energy was quantized somehow, but he hadn’t succeeded in coming up with a perfectly clean derivation all the way from first principles, without including some ad hoc assumptions about the vibrational modes inside ovens. Bose was going through the process of demonstrating to the audience why, starting from the basic combinatorics of states and energy levels, you end up with the wrong formula. Except that at the end, a strange miracle happened — he surprised himself and everyone by somehow accidentally ending up at the right formula. He looked back through what he’d done and realized he’d made a mistake — in counting up the states, he had counted them in the “wrong” way. He had accidentally treated the photons as if they were all identical and interchangeable, instead of distinguishable as had previously been assumed. After thinking about this more, he realized maybe he was on to something — perhaps it wasn’t really a mistake after all. He didn’t know who else to tell about it, so he decided to write a letter to Albert Einstein. Einstein was immediately very excited and helped him publish a paper on it.

Satyendra Nath Bose

So the first key to reproducing Planck’s formula was that light is quantized into individual packets of energy now called photons. But the second big key was that these photons don’t have any individual identities. Aside from some having different energy and momentum than others, they are all identical. With hindsight, this made a lot of Boltzmann and Gibbs’s earlier work on statistical mechanics fall into place. There had been a factor of N! thrown into the equations in order to make the Maxwell-Boltzmann distribution work out right, and in order to make sure entropy scaled properly with volume. Gibbs was aware that this had something to do with treating the particles as if they were interchangeable, but nobody paid much attention to that or had really taken it to heart. Before Bose, generally everyone still assumed that particles would be distinguishable from each other on some level, at least in principle.

Bose’s fortuitous mistake in Bangladesh allowed the entire world of physics to put the nail in the coffin for the idea that quantum particles each have their own identity. If they had, then there would have been more states and we would still have an ultraviolet catastrophe on our hands — the thermodynamics of distinguishable photons would never have been able to reproduce the blackbody radiation that’s been observed in blackbody ovens since the late 1800’s. Nor would we be able to explain why the sun or other light sources don’t radiate out an infinite amount of energy.

And that — my friends — is the story of how we came to know that all electrons are identical!

Please click the clap button if you found this informative, thanks :-)
