The magic of TLS, X.509 and mutual authentication explained
A story that explains the problems solved by TLS and X.509 certificates
More recently, I had to set up mutual TLS authentication between a MySQL server and a replica, which gave me my first chance to really dive into setting up and running a CA and implementing mutual authentication.
It was a cool learning experience, and I’d like to recap and expand on some of what I learned. First, let me state: this isn’t designed to be a 100% accurate reflection of the specification. Rather, it’s simplified in some places to build abstractions. Hopefully, where terms are mentioned, there are links which can be used to find a more concrete meaning.
The problem of data safety
When conceptualising how computers communicate, it’s reasonable to assume that messages are sent directly from one computer to another. Computer “Alice” sends a website to computer “Bob”:
However, that’s not how it happens. It’s extremely rare that two computers are connected directly to each other; normally, there are many intermediary computers (often termed “routers” or “firewalls” or any number of other appliance-like names). It looks more like:
This presents computers Alice and Bob with two problems. First, they cannot be sure that the computers between them haven’t recorded what is being said. Second, if Alice sends a message to Bob, Bob cannot be sure that the message seemingly sent by Alice was actually sent by Alice, nor that it arrived unmodified.
Luckily, there is a mechanism to solve this exact problem.
Transport Layer Security
Transport layer security (or “TLS”) and its predecessor secure sockets layer (or “SSL”) have existed since around 1996 to solve this problem. As identified earlier, there are two sets of problems that need to be solved:
Ensuring data is not readable by intermediaries
In order to ensure that data is not read by the computers that sit along a network connection, the data is encrypted. Encryption is the process of encoding a message such that only those who should be able to read and understand it can do so. In order to pass the information back and forth, the two computers need to decide how they’re going to encode the data such that only they can understand it, and the intermediary computers cannot.
In order to start sharing secrets with Bob, Alice needs to know some way of encoding the data such that only Bob can decode it. This is a chicken and egg problem: if the connection were already encoded, agreeing on an encoding would be easy! However, we have neither chickens nor eggs, and it’s still a difficult problem. The solution lies in a process called “public key cryptography”.
Without going too much into the depth of how this works, know that Bob has a reference to decode secret information. Bob’s reference is split into two halves:
- A “public” half, which contains the reference to encode the data, and
- A “private” half, which contains the reference to decode the data
This allows Bob to send the public half of this reference to Alice, who can use it to encode and send information that only Bob can read. These references are called “keys” — a public key, and a private key.
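The two halves of Bob’s reference can be illustrated with a toy example. The sketch below uses textbook RSA with deliberately tiny primes; the numbers, the function names `encode` and `decode`, and the choice of RSA itself are assumptions made for illustration, and nothing here is secure enough for real use.

```python
# Toy textbook RSA with tiny primes: for illustration only, NOT secure.
p, q = 61, 53                 # Bob's secret primes (assumed toy values)
n = p * q                     # modulus, shared by both halves of the key
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

bob_public_key = (e, n)       # Bob hands this half to anyone, including Alice
bob_private_key = (d, n)      # Bob keeps this half to himself


def encode(message: int, public_key: tuple) -> int:
    """Encode a number (smaller than n) so only the private key holder can read it."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decode(ciphertext: int, private_key: tuple) -> int:
    """Recover the original number using the private half of the key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


# Alice only ever sees bob_public_key.
secret = 65
ciphertext = encode(secret, bob_public_key)

# Only Bob, holding bob_private_key, can turn the ciphertext back into the secret.
recovered = decode(ciphertext, bob_private_key)
```

An intermediary who sees `ciphertext` and `bob_public_key` would need to work out `d` to read the message; with real key sizes that is impractical, which is what makes handing out the public half safe.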
In the first step, Alice needs to ask for Bob’s public key. This process, called asymmetric encryption, looks something like this:
However, this process has a weakness: it does not allow Bob to send messages back to Alice that only she can read. In order to send messages back, Bob needs a key that only Alice understands.
Alice knows she can send information to Bob that only Bob can read. So, she can take advantage of this to send Bob some secret information that both of them can then use to encode information sent back and forth to each other.
This information allows the formation of the symmetric key. This key is shared between both Alice and Bob, and can be used to both encode and decode messages on either side. This creates the two-way, secret connection.
At the end of this long process, Alice and Bob can both send secrets to each other all day long without worrying about whether anyone is listening in on their connection. They cannot — only Alice and Bob have a copy of the reference symmetric key that’s needed to decode these messages!
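The whole exchange can be sketched end to end. This is a simplified illustration rather than what TLS literally does: the toy RSA key pair, the `derive_key` helper and the hash-based `xor_stream` cipher are all assumptions made for the example, and real TLS negotiates vetted algorithms instead.

```python
import hashlib
import secrets

# Bob's toy RSA key pair (tiny primes, illustration only, NOT secure).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # Bob's private exponent (Python 3.8+)


def derive_key(shared_secret: int) -> bytes:
    """Turn the shared secret number into a 32-byte symmetric key."""
    return hashlib.sha256(str(shared_secret).encode()).digest()


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream.

    The same call both encodes and decodes, because XOR undoes itself.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# 1. Alice picks a secret and encodes it with Bob's public key (e, n).
alice_secret = secrets.randbelow(n - 2) + 2
wrapped_secret = pow(alice_secret, e, n)

# 2. Only Bob can unwrap it, using his private key (d, n).
bob_secret = pow(wrapped_secret, d, n)

# 3. Both sides now derive the same symmetric key.
alice_key = derive_key(alice_secret)
bob_key = derive_key(bob_secret)

# 4. Messages can flow in both directions under the shared key.
to_bob = xor_stream(alice_key, b"hello Bob")
to_alice = xor_stream(bob_key, b"hello Alice")
```

Calling `xor_stream(bob_key, to_bob)` gives Bob the original message, and `xor_stream(alice_key, to_alice)` does the same for Alice. The asymmetric step is used only once, to bootstrap the shared key.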
However, this process has a flaw: How do we know Bob is actually Bob?
Ensuring Bob is Bob
In the examples described, we know that Alice is talking to Bob through a network of computers. However, what happens if one of those computers suddenly starts pretending to be Bob?
Without knowing Bob beforehand, it’s impossible for Alice to know “Bob” is “Bob”. Alice will simply start an encrypted connection with whomever pretends to be Bob. Indeed, although it’s not shown here, the intermediary can pretend to be Bob to Alice, and Alice to Bob! This is termed a “Man in the middle attack”.
However, there is a part of the TLS standard that is also designed to solve this problem. Specifically, when Alice first indicates to Bob that she’d like to start talking over an encoded connection, she not only asks for his public key but also for him to provide a certificate (in the X.509 format) proving who he is. She then asks a set of trusted advisers called “certificate authorities” whether Bob seems legit, and decides whether to proceed based on what those authorities have to say.
Where a certificate is not vouched for by an authority, Alice will simply reject the connection.
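In practice, Alice’s side of this check is usually handled by a TLS library. As a minimal sketch using Python’s standard `ssl` module, the context below trusts the certificate authorities installed on the system and refuses any server whose certificate they do not vouch for. The helper names `make_client_context` and `fetch_peer_certificate` are made up for this example.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build Alice's TLS settings: trust the system CAs and verify the server."""
    # create_default_context() loads the system's trusted CA certificates
    # and turns on both certificate and hostname verification.
    return ssl.create_default_context()


def fetch_peer_certificate(hostname: str, port: int = 443) -> dict:
    """Connect to a server and return its verified certificate details.

    Raises ssl.SSLCertVerificationError if no trusted authority vouches
    for the certificate, or if it was issued for a different name.
    """
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


context = make_client_context()
```

Calling `fetch_peer_certificate` with a real hostname needs network access; on success it returns a dictionary describing the certificate, including its subject and issuer.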
Ensuring Alice is Alice
For Alice, the connection is now happy and fairly secure. She knows the Bob she’s talking to is the real Bob, and that only she and Bob can see the messages being exchanged. However, Bob has no such assurance that Alice is Alice.
There are two sides to each connection:
- The “Client”. In this case, that’s Alice — she sends the first message.
- The “Server”. In this case, that’s Bob — he responds to (or “serves”) the messages.
Verifying Bob is Bob is an extremely common operation. Indeed, while viewing this post it’s extremely likely your browser verified that the blog website you see before you is the blog website it claims to be. Verifying Alice is actually Alice is a much less common operation, but is generally called “Mutual TLS authentication” as both Alice and Bob are verified.
Consider the scenario in which Bob is expecting some sensitive data from Alice, perhaps medical or similar. Bob will process that data and then make a diagnosis about Alice’s condition. In this case, Bob definitely wants to be sure that Alice is the real Alice, and is not making up fake diagnostic data!
Luckily, the aforementioned TLS standard can be easily extended to include the same verification process for Alice as for Bob:
Now that Alice and Bob both have strong guarantees that they are who they say they are (vouched for by their certificate authorities), and the connection is encrypted, this connection can be said to be very secure.
Transport Layer Security (TLS) and the X.509 certificate can seem, when first encountered, like essentially magical things that somehow provide security, though it’s not clear exactly how or why. After implementing them a couple of times and going through the required debugging to get everything talking correctly, it becomes a simpler task. Hopefully this post has gone some way towards making that debugging process easier.
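As a rough sketch of what mutual verification looks like in configuration, the Python `ssl` module lets the server side (Bob) demand a certificate from the client (Alice) as well. The function names and any file paths passed to `load_identity` are hypothetical; the certificates themselves would come from whichever CA both parties have agreed to trust.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Bob's settings: refuse any client that cannot present a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Servers do not ask for client certificates by default;
    # CERT_REQUIRED is the setting that makes the authentication mutual.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def make_client_context() -> ssl.SSLContext:
    """Alice's settings: verify Bob's certificate and hostname (the defaults)."""
    return ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)


def load_identity(ctx: ssl.SSLContext, cert_file: str, key_file: str, ca_file: str) -> None:
    """Give one side its own certificate and key, plus the CA it should trust."""
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # who I am
    ctx.load_verify_locations(cafile=ca_file)                  # whom I trust


bob_context = make_server_context()
alice_context = make_client_context()
```

Each side would then call `load_identity` with its own certificate, private key and the shared CA certificate before wrapping a socket with its context. If either certificate is not vouched for by the trusted CA, the handshake fails.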
If you’ve found this useful, hit me up on Twitter! I want to do more digging into the X.509 standard, but I hit exhaustion writing this post, and felt like this was a reasonable point to finish it up for now. I will write more if people are interested in reading it.
- Daniel Nettleton for their early review and feedback
- Tomasz Kapłoński for their early review and feedback
- Antonius Koch for their early review
- Vinai Kopp for review