Quantum Bits, Superposition, and Entanglement, Mathematically Speaking

13 min read · Oct 2, 2021

When you hear about quantum bits, or qubits for short, what comes to mind? Let me guess: “a qubit can be in state ‘1’ and ‘0’ at the same time”, or “a classical bit can only be in either the ‘1’ or the ‘0’ state, but a qubit can be in both states simultaneously”, or something like that. Well, that’s not wrong, but it’s not really accurate either. Here we’ll see what a qubit is, and what superposition and entanglement mean, mathematically.

Qubits

Before we get to qubits, let’s first talk about what we call an ordinary bit, or classical bit (since it’s described by classical physics). The name bit itself is a contraction of binary digit. A bit is an abstraction of a physical thing that represents two possible values, like “1” or “0”, “on” or “off”, “true” or “false”. Here we’ll use the numbers “1” and “0” for its possible states, so the state of a bit is basically a number. Easy, right? Now, let’s talk about qubits. The qubit is quite a different animal (but an ‘animal’ nonetheless): the state of a qubit is a 2-dimensional vector. You might scratch your head and say, “wait a second, a 2D vector?” Yes, the state of a qubit is (or can be represented as) a 2D vector, and we’ll talk about this vector here too.

Physically, the state of a qubit can be stored on a photon, an electron, an atom, or anything else that behaves quantum mechanically. In the case of an electron, you may have heard that it can be in the spin down state, the spin up state, or a superposition of the two (we’ll talk about superposition later). Note that electrons don’t actually “spin”; “spin” is just a layperson-friendly name for a physical property, a form of intrinsic angular momentum, that has some similarity to something spinning.

Here we won’t talk much about the qubit as a physical object; we’ll talk more about the abstraction, the mathematics of the qubit. Likewise, it doesn’t matter whether a classical bit is a transistor, a switch, or a light bulb; we only focus on what we can do with the “0s” and “1s”.

Vectors

In the previous section we said that a qubit’s state is a 2D vector instead of a simple number like “0” or “1”. So, what is a vector? You may know a vector from physics class as a quantity that has magnitude and direction, like a car’s velocity: 80 km/h to the north, for example. Here you might need to “unlearn” that definition, but no worries, there’s nothing fundamentally wrong with it. In mathematics a vector has a rigorous definition, but naively you can think of a vector as an array of numbers that has some properties and “obeys” some rules. The dimension of the vector is how many numbers there are in that array.

We often say a vector lives in a vector space, where the properties and the rules apply. The dimension of a vector isn’t limited to 2, 3, or 4; it can be very large, even infinite. Now, from physics class you know that a vector is usually written as a bold letter like v, or as a letter with an arrow above it.

Here we’ll introduce a new notation commonly used in quantum mechanics, the Dirac bra-ket notation. It was introduced by the physicist Paul Dirac for the “math” manipulation in quantum mechanics. Using Dirac notation, a vector is written like this:

|v⟩

read: ket v

Intermezzo: we use this notation for vectors in quantum mechanics for historical and convenience reasons. On the convenience side, there is also the bra notation; together with the ket it forms the bra-ket (you know: bracket) symbol. The bra:

⟨·|

bra notation, the dot inside can be anything

The bra-ket:

⟨·|·⟩

The bra and the ket together, the dots inside can be anything

The bra and the bra-ket above actually have their own mathematical meaning, but we don’t really need it for this story. This is just to show why we use this notation. At the end of the day, the ket notation is just a different way to write a vector. Enough intermezzo.

We said a qubit state is a 2D vector. What does that mean? It means it’s represented by an array of two numbers, like this:

[ a ]
[ b ]

2D vector, the vector elements a and b are numbers

Let’s take an example of a qubit state. There are two special qubit states, which correspond to the “0” and “1” states of the classical bit. The qubit state that corresponds to “0” of the classical bit is usually denoted |𝟶⟩; it is called a computational basis state:

|𝟶⟩ =
[ 1 ]
[ 0 ]

Which position holds the 1 and which holds the 0 is only a convention, but by convention ket 0 is this

As you may expect, the other computational basis state is |𝟷⟩, which corresponds to the “1” of the classical bit:

|𝟷⟩ =
[ 0 ]
[ 1 ]

For simplicity, you can think of the number in the first row as the vector’s length along the x-axis, and the number in the second row beneath it as its length along the y-axis, like you would normally do with a vector as a quantity that has both magnitude and direction. As you can see, these computational basis states are orthogonal (perpendicular) to each other.
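If it helps to see the two basis states as plain arrays, here is a minimal sketch using NumPy (our choice of tool for illustration, not something this story depends on). The variable names are ours. It checks that the two vectors are perpendicular and that each has length 1.

```python
import numpy as np

# Computational basis states as arrays of two numbers.
# The first entry plays the role of the top row, the second the bottom row.
ket0 = np.array([1, 0], dtype=complex)  # |0>
ket1 = np.array([0, 1], dtype=complex)  # |1>

# np.vdot conjugates its first argument and sums the products,
# which is the inner product between the two vectors.
overlap = np.vdot(ket0, ket1)   # 0 means the vectors are orthogonal

# The length (norm) of each vector.
length0 = np.linalg.norm(ket0)
length1 = np.linalg.norm(ket1)

print("overlap:", overlap)
print("lengths:", length0, length1)
```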

Superposition

Now let’s talk about superposition. What does superposition have to do with the vectors we’ve talked about above? First, the computational basis states |𝟶⟩ and |𝟷⟩ are just two possible states of a qubit, no different from the classical bit values 0 and 1 respectively, but there are more states a qubit can be in than just those special ones. In general a qubit state is a 2D vector, so the state can be, for example, like this:

Computational basis states |𝟶⟩ and |𝟷⟩ are orthonormal (read: perpendicular to each other and ‘normalized’, meaning each vector’s length is 1); we call such a vector a unit (normalized) vector (image by author)

For the above example, the qubit state or the state vector can be written as:

You may remember how to do calculations like this with vectors

The qubit state above is what we call a superposition; it’s basically a linear combination of |𝟶⟩ and |𝟷⟩. That’s it! Now, is it really correct to say the state above is simultaneously in |𝟶⟩ and |𝟷⟩?

In general, a qubit state can be written as

|ψ⟩ = a|𝟶⟩ + b|𝟷⟩

Qubit state in general. That fork-like Greek letter in the ket notation is read psi.
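The “linear combination” idea is easy to try out with arrays. Below is a small sketch, again using NumPy as an assumed tool; the helper name `superposition` and the example numbers 0.6 and 0.8 are ours, picked only for illustration.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)  # |0>
ket1 = np.array([0, 1], dtype=complex)  # |1>

def superposition(a, b):
    """Return the vector a|0> + b|1> as an array of two numbers."""
    return a * ket0 + b * ket1

# Example coefficients chosen for illustration only.
psi = superposition(0.6, 0.8)

# Multiplying each basis vector by a number and adding the results
# simply places a in the top entry and b in the bottom entry.
print(psi)
```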

Superposition and Amplitudes

A bit more information: the coefficients a and b of the general qubit state above are called amplitudes. The amplitude for |𝟶⟩ is a, and the amplitude for |𝟷⟩ is b. The name ‘amplitude’ comes from the fact that a quantum state is basically a wave, and a wave can be characterized by its amplitude. Note that we cannot measure or observe these amplitudes directly, but we can manipulate them (by the way, manipulating these amplitudes is useful in quantum computation). Why is this important? Because the square of each amplitude gives the probability of getting the corresponding computational basis state when the qubit is measured (in the end, when we measure the qubit, we want it to “collapse” to either “1” or “0” like a classical bit). For example, take a qubit in the superposition state

(1/√2)|𝟶⟩ + (1/√2)|𝟷⟩

When we measure the qubit above, the probability of getting the computational basis state |𝟶⟩ is the square of 1/√2, which is ½ or 50%. The same goes for |𝟷⟩. There’s an equal probability of getting either “0” or “1”; in practice, if we have many qubits in this same superposition state, we would expect to get “0” about 50% of the time and “1” about 50% of the time after measurement. Another example: if the qubit is in the pure basis state |𝟷⟩ from the beginning, the probability of measuring it as |𝟷⟩ is 100%.
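The two examples above can be checked numerically. This is a minimal sketch assuming NumPy; the function name `measurement_probabilities` is ours. It squares the size of each amplitude to get the probability of each outcome.

```python
import numpy as np

def measurement_probabilities(state):
    """Probability of each computational-basis outcome: |amplitude| squared."""
    return np.abs(state) ** 2

# Equal superposition with amplitudes 1/sqrt(2) and 1/sqrt(2).
equal_superposition = np.array([1, 1], dtype=complex) / np.sqrt(2)

# Pure basis state |1>.
ket1 = np.array([0, 1], dtype=complex)

p_equal = measurement_probabilities(equal_superposition)
p_one = measurement_probabilities(ket1)

print("equal superposition:", p_equal)
print("basis state |1>:", p_one)
```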

We can conclude that the squared amplitudes must add up to 1 (100%), because it doesn’t make sense otherwise, right? Think about it. That means

|a|² + |b|² = 1

This is what we usually call the normalization constraint

This also means the length of the state vector is always 1. If quantum mechanics is the correct description of our physical reality, we will never find any quantum state that violates this constraint.
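Here is a small sketch of the constraint in code, assuming NumPy. The helper names `is_normalized` and `normalize` are ours. It shows how an arbitrary pair of numbers can be rescaled into a valid qubit state by dividing by the vector’s length.

```python
import numpy as np

def is_normalized(state, tol=1e-9):
    """True if the squared sizes of the amplitudes add up to 1."""
    return abs(np.sum(np.abs(state) ** 2) - 1.0) < tol

def normalize(state):
    """Rescale a nonzero vector so that its length becomes 1."""
    state = np.asarray(state, dtype=complex)
    length = np.linalg.norm(state)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return state / length

raw = np.array([3, 4], dtype=complex)  # length 5, not a valid qubit state
psi = normalize(raw)                   # amplitudes 0.6 and 0.8

print(is_normalized(raw), is_normalized(psi))
```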

Now let’s ask ourselves: if someone gives us a single qubit (prepared in a superposition state with amplitudes a and b), can we find the values of a and b by measuring the qubit, without that someone telling us the amplitudes? The answer is no, because we can’t draw a conclusion from measuring that single qubit. We need many qubits in the same state, and we need to measure them all to estimate the probabilities and hence, using the constraint formula above, deduce the sizes of a and b with enough confidence.
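A quick simulation makes this concrete. The sketch below assumes NumPy and uses its random number generator to mimic measurements; the state (probabilities 0.2 and 0.8), the seed, and the number of copies are arbitrary choices of ours. One measurement gives a single 0 or 1, while many identically prepared copies let us estimate the probabilities.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def measure_many(state, shots, rng):
    """Simulate measuring `shots` identical copies in the computational basis."""
    probs = np.abs(state) ** 2
    return rng.choice(len(state), size=shots, p=probs)

# A state whose amplitudes we pretend not to know.
psi = np.array([np.sqrt(0.2), np.sqrt(0.8)], dtype=complex)

# A single measurement yields just one outcome, 0 or 1.
one_shot = measure_many(psi, 1, rng)

# Many copies: the fraction of 1s estimates the probability of "1".
outcomes = measure_many(psi, 10_000, rng)
estimated_p1 = outcomes.mean()
estimated_size_b = np.sqrt(estimated_p1)  # estimate of the size of b

print("single outcome:", one_shot[0])
print("estimated probability of 1:", estimated_p1)
```

Note that this only recovers the sizes of the amplitudes, which is all that the squared values can tell us.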

One thing we haven’t mentioned so far is that the amplitudes a and b are actually complex numbers. A complex number has a real part and an imaginary part (a multiple of i, the square root of −1). If you’re wondering, that’s why we use the absolute value (modulus) symbols for the amplitudes in the normalization constraint formula: they are complex numbers. Why does the state involve complex numbers? Complex numbers are useful for describing the “phase” of the wave nature of a quantum state. For example, a sine wave and a cosine wave have different phases; look at where each one starts “waving” along the wave. If you have ever studied signal processing, or if you have a background in electrical engineering, you may be familiar with using complex numbers to describe wave phase and rotation.
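To see why the modulus matters, here is a short sketch assuming NumPy; the two example states are ours. Both have the same measurement probabilities in the computational basis, yet they are different vectors because one amplitude carries a phase of i.

```python
import numpy as np

s = 1 / np.sqrt(2)

psi_real = np.array([s, s], dtype=complex)        # amplitudes 1/sqrt(2), 1/sqrt(2)
psi_phase = np.array([s, 1j * s], dtype=complex)  # amplitudes 1/sqrt(2), i/sqrt(2)

# Squaring a complex amplitude directly can give a negative number,
# which cannot be a probability.
naive_square = (1j * s) ** 2

# Taking the modulus first always gives a real, non-negative value.
probs_real = np.abs(psi_real) ** 2
probs_phase = np.abs(psi_phase) ** 2

# Same probabilities, but not the same state vector.
same_vector = np.allclose(psi_real, psi_phase)

print(naive_square, probs_real, probs_phase, same_vector)
```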

One more piece of math jargon: because these quantum state vectors can have infinite dimensions and complex coefficients, along with some other related properties and rules, mathematicians usually say they live in a vector space called a Hilbert space (after the mathematician David Hilbert). But don’t worry too much about the complex numbers; by now we should understand what a qubit state and a superposition are.

Computational/Measurement Basis

You can skip this part if you already know what a computational (measurement) basis is and what the “collapse” of a superposition means. Here we’ll talk about what it means and how it’s done, mathematically and (a bit) physically.

First, we can think of a qubit in the computational basis state |𝟶⟩ or |𝟷⟩ as an electron in the spin down or spin up state. A superposition state means that the state vector can be anywhere between |𝟶⟩ and |𝟷⟩. When we say we measure the qubit state (the electron spin in this case) in the computational basis, we “force” the final measured state to be either “0” or “1”, with the probability of each outcome given by the square of its amplitude, as we’ve seen before. Take a look at these illustrations.

The left side is the vector representation when the qubit is in the pure basis state |𝟷⟩. When measured, it “collapses” to the state |𝟷⟩ 100% of the time. The right side is the illustration of an electron in the spin up state. The measurement device is set to the “vertical” basis: if the spin aligns with the “arrow”, it’s spin up, otherwise spin down. In this case, the spin after measurement is always spin up. (image by author)

The left side is the vector representing the pure basis state |𝟶⟩; when measured, it’s guaranteed (100%) to be in state |𝟶⟩. On the right side we can see the electron spin being measured; the result is always |𝟶⟩. (image by author)

Left side: the qubit is in a superposition state with equal probability of measuring spin up and spin down; we can see this from the vector’s projection onto the computational basis state |𝟷⟩ too. Right side: what the vector represents, the electron spin when measured in the “vertical” basis. (image by author)

As you can see in the illustrations of the physical measurement above, the measurement device (usually a magnetic device that deflects the electron based on its spin) “forces” the electron into one of the measurement basis states (spin up or spin down). The measurement outcome is spin up if the electron spin aligns with the measurement “arrow”, and spin down otherwise. This is what we call the “collapse” of the qubit superposition: once measured, we lose the superposition information. The qubit is not in the superposition state anymore, of course, because we forced it that way. The measurement outcome is completely random, again with the probability of each basis state determined by the superposition amplitudes. You can choose whatever computational (measurement) basis you like, which is like changing the orientation of the measurement device “arrow”.
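Choosing a different measurement basis can also be sketched with arrays. The code below assumes NumPy, and the helper name `probabilities_in_basis` is ours. As an example of “another basis” we use the pair (|0⟩ + |1⟩)/√2 and (|0⟩ − |1⟩)/√2, which is our own illustrative choice; the probability of each outcome is the squared size of the state’s projection onto that basis vector.

```python
import numpy as np

def probabilities_in_basis(state, basis):
    """Outcome probabilities when measuring `state` in an orthonormal `basis`.

    Each probability is the squared size of the projection of the state
    onto one basis vector.
    """
    return np.array([abs(np.vdot(e, state)) ** 2 for e in basis])

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# A second orthonormal basis, chosen here only as an example.
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# |1> measured in the computational basis: always outcome "1".
p_computational = probabilities_in_basis(ket1, [ket0, ket1])

# The same |1> measured in the other basis: a 50/50 outcome.
p_other = probabilities_in_basis(ket1, [plus, minus])

print(p_computational, p_other)
```

The same state is certain in one basis and completely random in the other, which is why the basis always has to be stated along with the result.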

Many-Qubit States

This is a prerequisite for understanding quantum entanglement, so let’s check it out. Suppose we have two qubits. It’s natural to want to write their states in a single math expression. Fortunately we can do this using a vector operation called the tensor product (tensor multiplication). Don’t worry about this new term. Let’s see an example of the tensor product of two 2D vectors: suppose qubit A is in state |𝟶⟩ and qubit B is in state |𝟷⟩. The combined state can be written like this:

|𝟶⟩_A ⊗ |𝟷⟩_B = |𝟶𝟷⟩_AB

Or, for simplicity, we can just write it as |𝟶𝟷⟩, without the subscript AB. In this example qubit A has nothing to do with qubit B, so when we measure the state of qubit A, it doesn’t “collapse” qubit B. We call these individual states uncorrelated.
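For arrays, the tensor product of two vectors can be computed with NumPy’s `np.kron` (Kronecker product). A minimal sketch, with variable names of our own choosing:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)  # state of qubit A
ket1 = np.array([0, 1], dtype=complex)  # state of qubit B

# Tensor product |0>_A (x) |1>_B. The result has 2 * 2 = 4 entries,
# ordered as the amplitudes of |00>, |01>, |10>, |11>.
ket01 = np.kron(ket0, ket1)

print(ket01)  # the single 1 sits in the |01> position
```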

In general, two qubit states can be expressed like this:

First qubit state

Second qubit state

Those two qubit states in one expression:

It becomes a 4D vector. Ask yourself: does the order of the operation matter here?

Continuing from the expression above, it can also be written:

Now it has four amplitudes. If we have n qubits, we’ll need to keep track of 2ⁿ amplitudes. The vectors grow exponentially with the number of qubits. If you want to simulate general quantum states of, say, 100 qubits, you should just give up: that is 2¹⁰⁰, or roughly 1.27 × 10³⁰, amplitudes, which is out of reach even for the best supercomputer we have today on this planet, let alone pencil and paper.
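Here is a sketch of the general case, assuming NumPy. The amplitude names a, b, c, d, the example values, and the helper names are ours, chosen only for illustration. It builds the 4D vector for two qubits, swaps the order of the product to see what happens, and counts the memory a full state vector would need, assuming 16 bytes per complex amplitude (NumPy’s `complex128`).

```python
import numpy as np

def product_state(*qubits):
    """Tensor product of any number of single-qubit states, left to right."""
    state = np.array([1], dtype=complex)
    for q in qubits:
        state = np.kron(state, q)
    return state

# Hypothetical amplitudes: (a, b) for the first qubit, (c, d) for the second.
a, b = 0.6, 0.8
c, d = 1 / np.sqrt(2), 1j / np.sqrt(2)
first = np.array([a, b], dtype=complex)
second = np.array([c, d], dtype=complex)

combined = product_state(first, second)  # amplitudes ac, ad, bc, bd
swapped = product_state(second, first)   # amplitudes ca, cb, da, db

def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store all 2**n amplitudes of an n-qubit state."""
    return (2 ** n_qubits) * bytes_per_amplitude

print(combined)
print(swapped)
print(statevector_bytes(30), statevector_bytes(100))
```

With these numbers, 30 qubits already need 16 GiB, and every extra qubit doubles that.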

These combined states also follow the normalization constraint: the total probability of all possible states must equal 1, otherwise it doesn’t make any sense.

Normalization constraint for general states of two qubits
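For readers who like to see the algebra spelled out, here is a worked version. The amplitude names c and d for the second qubit are our own labels, assumed for illustration. It shows why the combined state of two separately normalized qubits is automatically normalized.

```latex
\begin{align*}
|\psi\rangle &= a|0\rangle + b|1\rangle, \qquad
|\phi\rangle = c|0\rangle + d|1\rangle \\
|\psi\rangle \otimes |\phi\rangle
  &= ac\,|00\rangle + ad\,|01\rangle + bc\,|10\rangle + bd\,|11\rangle \\
|ac|^2 + |ad|^2 + |bc|^2 + |bd|^2
  &= \left(|a|^2 + |b|^2\right)\left(|c|^2 + |d|^2\right) = 1 \cdot 1 = 1
\end{align*}
```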

As you can see, those two qubits, even if they are both in superposition states, don’t affect each other; in fact we wrote the combined state as |ψϕ⟩ (separate states) to begin with. But there’s more to these combined states, and we’ll explore this in the next section.

Entanglement

Ah yes, quantum entanglement. It is sometimes popularized as something magical: spooky action at a distance, faster-than-light communication, and so on. Quantum entanglement describes a correlation between two or more qubits. Some of us may have heard the classical correlation example of a pair of socks. If I put the left and the right sock into two different boxes, opening one box and seeing which sock is inside automatically tells us which sock is in the other box without opening it. The quantum correlation from entanglement is different from (some would say “stronger” than) this classical correlation, where the information about which sock is in which box was already there in the first place, just hidden. Quantum entanglement has been tested many times to check whether such hidden information, like in our pair of socks example, exists. The results so far show no sign of it. So, what is quantum entanglement, and what does it even mean? Let’s continue in the next paragraph, this one is getting a bit long.

In the previous section we learned how to express two combined states. Now consider this two-qubit state (we can make it in the lab):

(1/√2)|𝟶𝟶⟩ + (1/√2)|𝟷𝟷⟩

Can you split this into two separate qubit states? No, there’s no way. In the previous section about two-qubit states, we saw combined states that can be expressed as two separate qubit states, where measuring one qubit has nothing to do with the other. But the state above is interesting. It even has a name: it’s one of the so-called Bell states (after the physicist John Bell).
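The “no way” can be checked with a little arithmetic. If a two-qubit state were a product of two single-qubit states with amplitudes (a, b) and (c, d), its four amplitudes would be ac, ad, bc, bd, so the first times the last minus the middle two multiplied together would be ac·bd − ad·bc = 0. The sketch below assumes NumPy, uses a helper name of our own (`is_product_state`), and assumes the state in question is (|00⟩ + |11⟩)/√2, which matches the equal |00⟩ and |11⟩ probabilities described in this section.

```python
import numpy as np

def is_product_state(state, tol=1e-9):
    """Check whether a two-qubit state can be split into two single-qubit states.

    For amplitudes ordered as |00>, |01>, |10>, |11>, a product state always
    satisfies amp00 * amp11 - amp01 * amp10 = 0.
    """
    amp00, amp01, amp10, amp11 = np.asarray(state, dtype=complex)
    return abs(amp00 * amp11 - amp01 * amp10) < tol

# Assumed form of the entangled state: equal amplitudes on |00> and |11>.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

# An ordinary product state for comparison: |0> (x) |1> = |01>.
ket01 = np.kron(np.array([1, 0], dtype=complex),
                np.array([0, 1], dtype=complex))

print(is_product_state(bell), is_product_state(ket01))
```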

We can see it’s in a superposition state, yes, with equal probability of getting |𝟶𝟶⟩ or |𝟷𝟷⟩. The measurement outcome is completely random, but as we can see from the state above, measuring one qubit somehow has something to do with the other qubit: once the state of one qubit is known after measurement, the state of the other is automatically known (when measured in the same computational basis). We call these qubits correlated; this is entanglement. That’s it! “Wait, that’s it?” you may ask. Yes. When we get |𝟶⟩, we know that the other qubit’s state is |𝟶⟩ too (see the entangled state above). The same goes for |𝟷⟩. So where is the spooky action at a distance thingy? It’s the fact that measuring one qubit collapses the state of the other qubit no matter how far apart they are. There’s no communication though, it’s just correlation.
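A small simulation shows the pattern. This sketch assumes NumPy and again assumes the entangled state is (|00⟩ + |11⟩)/√2; the seed and the number of repetitions are arbitrary choices of ours. Each repetition stands for preparing a fresh entangled pair and measuring both qubits in the computational basis.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Assumed entangled state, amplitudes ordered as |00>, |01>, |10>, |11>.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
probs = np.abs(bell) ** 2  # 0.5 for |00>, 0.5 for |11>, 0 for the rest

# Each sample is an index 0..3, standing for outcome 00, 01, 10 or 11.
outcomes = rng.choice(4, size=1000, p=probs)
qubit_a = outcomes // 2  # first bit of the outcome
qubit_b = outcomes % 2   # second bit of the outcome

always_equal = bool(np.all(qubit_a == qubit_b))
fraction_ones = qubit_a.mean()

print("results always agree:", always_equal)
print("fraction of 1s on qubit A:", fraction_ones)
```

Each qubit on its own looks like a fair coin, yet the two results agree every time.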

So what’s the difference between the quantum correlation and the classical one? On the surface they look kind of similar to our pair of socks example. The difference is that the measurement outcome of the qubit state is random: there are no predefined states right after the two qubits get entangled. This has been tested many times with some clever experiments (the Bell experiments), and at this point you just need to trust me™. Explaining these experiments, their results, and the consequences is worth another story.

The correlation in quantum entanglement has no analog in everyday experience. It’s bizarre. It’s different from the pair of socks example, again because each sock is already there in its box and we just lack the knowledge. One thing to note: the quantum correlation of the entangled state only shows up if both qubits in the example above are measured in the same computational basis. If one qubit is measured in the “vertical” basis with the measurement device and the other is measured in the “horizontal” basis, there would be no correlation. Can you see why?
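One way to explore that question is to compute the joint probabilities directly. The sketch below assumes NumPy and assumes the entangled state is (|00⟩ + |11⟩)/√2. It also assumes that the second (“horizontal”) basis corresponds to the pair (|0⟩ ± |1⟩)/√2; that mapping, like the helper name `joint_probabilities`, is our own labeling for illustration.

```python
import numpy as np

def joint_probabilities(state, basis_a, basis_b):
    """Table of outcome probabilities when qubit A is measured in basis_a
    and qubit B in basis_b. Entry [i, j] is the probability of outcome i
    on qubit A together with outcome j on qubit B."""
    return np.array([[abs(np.vdot(np.kron(ea, eb), state)) ** 2
                      for eb in basis_b] for ea in basis_a])

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

first_basis = [ket0, ket1]    # the computational basis
second_basis = [plus, minus]  # assumed stand-in for the other orientation

bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

same_basis = joint_probabilities(bell, first_basis, first_basis)
different_bases = joint_probabilities(bell, first_basis, second_basis)

print(same_basis)       # only matching outcomes occur
print(different_bases)  # all four combinations equally likely
```

When the bases differ, every combination of outcomes has probability ¼, so knowing one result tells us nothing about the other.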

Conclusion

It’s been a long read. There are a lot of details I didn’t cover in this story, like the Bloch sphere, the phase difference, the modulus of complex numbers, the Bell inequality, and so on. Explaining all of these would take too long, and they aren’t too hard to grasp once you have the basics. I’m also sure that some mathematicians reading this story will be irritated by the “non-rigorous” mathematical stuff here, but this is not a rigorous textbook. I hope that after reading this story you understand more about what a qubit is, and what qubit states, superposition, and entanglement mean.