How Does HTTPS Actually Work?

Himanish Munjal
Published in CodeX
9 min read, Jul 11, 2022
Image from x-cart.com

Most of you have probably heard of HTTPS. This article claims that 79 percent of websites currently use it. Simply put, HTTPS is the encrypted version of HTTP, the most widely used protocol for transferring data between a client (such as a browser) and a server. HTTPS is a layer 7 protocol that encrypts data in transit to strengthen its security.

Check out this series of articles to learn more about HTTP.

The advantages of HTTPS and underlying technologies like TLS, TLS handshake, certificates, and certificate authorities will all be covered in this article.

1. What Is HTTPS

HTTPS is the secure variant of HTTP. To keep things brief and straightforward, HTTPS is simply the HTTP protocol plus data encryption and additional features offered by SSL/TLS.


2. Why HTTPS

The majority of data exchanged online uses HTTP. By default, this data is not encrypted, so anyone positioned between the source and the destination can read it by mounting a man-in-the-middle attack. The information you receive may also be altered in transit, or a malicious actor may answer in place of the genuine server.


In order to address these issues, HTTPS augments the HTTP protocol with encryption, authentication, and integrity.

I’ll provide a high level overview of how it works in this section, and then I’ll go into more specifics on the underlying technology and procedure in the following section.

Encryption:- HTTPS protects data transferred over the internet from being intercepted and read by a third party by adding SSL/TLS encryption. Public-key cryptography and the TLS handshake make this possible. We’ll discuss both below.

Before encryption:

This is a string of text that is completely readable

After encryption:

ITM0IRyiEhVpa6VnKyExMiEgNveroyWBPlgGyfkflYjDaaFf/Kn3bo3OfghBPDWo6AfSHlNtL8N7ITE

Authentication:- Unlike HTTP, HTTPS provides strong authentication through the TLS protocol and CA certificates. As part of the TLS handshake, when an HTTPS connection is being established, the server sends a chain of certificates that the client can use to verify that it is talking to the intended server. If the server’s certificate has been signed by a publicly recognised certificate authority (CA), the client can trust that the identifying information in the certificate has been verified by a reputable third party. We’ll talk about certificates and CAs below.

Integrity:- Everything an HTTPS web server sends to the client (a web page, image, JavaScript file, or any other data) is protected by a message authentication code (MAC) that the client can use to determine that the data has not been altered by a third party or otherwise corrupted in transit. A MAC is computed with a keyed one-way cryptographic hash function (effectively a checksum that only the key holders can produce), and the keys are negotiated by both peers during the TLS handshake. Whenever a TLS record is sent, a MAC value is generated and appended to it. The receiver computes its own MAC over the record it received and compares the two values; if they match, the record is authentic and unmodified.
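To make the MAC idea concrete, here is a minimal sketch using Python’s standard hmac module. The key and message are made up for illustration; in real TLS the MAC key comes out of the handshake and the MAC also covers a sequence number and record header, which this sketch leaves out.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, message: bytes) -> bytes:
    """Sender side: compute an HMAC-SHA256 tag over the message."""
    return hmac.new(mac_key, message, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    expected = make_tag(mac_key, message)
    return hmac.compare_digest(expected, tag)


# Hypothetical values; a real MAC key is derived during the TLS handshake.
mac_key = b"illustrative-shared-mac-key"
message = b"GET /index.html HTTP/1.1"
tag = make_tag(mac_key, message)
```

Calling verify_tag(mac_key, message, tag) succeeds, while changing either the message or the key makes the check fail, which is exactly the property the receiver relies on.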

3. Diving Deep Into TLS

Transport Layer Security (TLS) is a cryptographic protocol that provides end-to-end security for data sent between applications over the Internet. TLS normally runs on top of TCP to encrypt application layer protocols such as HTTP, FTP, SMTP, and IMAP, although it can also run over UDP, DCCP, and SCTP.

As the most extensively used version at the moment, TLS 1.2 will be the main focus of this article. Its predecessors (SSL 1.0, SSL 2.0, SSL 3.0, TLS 1.0, and TLS 1.1) are deprecated, and its successor, TLS 1.3, is in the adoption phase.

TLS is what actually provides the three capabilities mentioned above to all applications running on top of it: encryption, authentication, and data integrity. In the case of HTTPS, the application running on top of it is HTTP.

The encrypted tunnel must be established between the client and the server before application data can be transferred via TLS. To do this, the client and the server must agree on the TLS protocol version, select a cipher suite, create session keys, and, if necessary, verify certificates. This is all carried out as part of the TLS handshake.
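As a rough illustration of these negotiated parameters, the sketch below uses Python’s standard ssl module. The helper names make_client_context and describe_connection are our own, and describe_connection needs network access, so it is only defined here and not called.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()  # turns on CERT_REQUIRED and hostname checking
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def describe_connection(host: str, port: int = 443) -> dict:
    """Open a TCP connection, run the TLS handshake, and report what was negotiated.

    Needs network access; `host` is whichever server you want to inspect.
    """
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # wrap_socket performs the TLS handshake on top of the TCP connection.
        with ctx.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            name, _protocol, secret_bits = tls_sock.cipher()
            return {
                "tls_version": tls_sock.version(),
                "cipher_suite": name,
                "secret_bits": secret_bits,
            }


# Cipher suites this client would be willing to use.
ctx = make_client_context()
offered_suites = [suite["name"] for suite in ctx.get_ciphers()]
```

Calling describe_connection with a host name of your choice returns the protocol version and cipher suite that the two sides settled on for that connection.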

3.1. TLS Handshake

The handshake is one of the most important parts of TLS, as this is where each connection begins and where the technical underpinnings of TLS are established.

The TLS handshake happens on top of the underlying transport layer handshake. So if TLS is running on top of TCP, the TCP handshake happens between client and server first. Once that is done, the TLS handshake is initiated.

TLS handshake accomplishes 3 main things:

  • Exchanging cipher suites, TLS version and other parameters.
  • Authenticating one or both parties.
  • Creating/Exchanging symmetric session keys.

The steps that take place in the handshake are listed below.

  1. The first message is called the “ClientHello.” This message lists the client’s capabilities so that the server can pick the cipher suite that the two will use to communicate. It also includes a 32-byte random value called the “client random.”
  2. The server responds with a “ServerHello” message. In this message it tells the client which connection parameters it has selected from the provided list and returns its own 32-byte random value called the “server random.” If the client and server do not share any capabilities, the connection terminates unsuccessfully.
  3. In the “Certificate” message, the Server sends its TLS certificate chain (which includes its leaf certificate and intermediate certificates) to the client. We’ll discuss certificates and certificate chain later.
  4. The “ServerKeyExchange” message is optional and only needed for certain key exchange methods (such as Diffie-Hellman) that require the server to provide additional data.
  5. The “ServerHelloDone” message tells the client that the server has sent all of its messages.
  6. The client then provides its contribution to the session key in the “ClientKeyExchange” message. The specifics of this step depend on the key exchange method that was decided on in the initial “Hello” messages, which is usually RSA or Diffie-Hellman.
  7. The “Change Cipher Spec” message lets the other party know that it has generated the session key and is going to switch to encrypted communication.
  8. The “Finished” message is then sent to indicate that the handshake is complete on the client side. The Finished message is encrypted, and is the first data protected by the session key. The message contains data (MAC) that allows each party to make sure the handshake was not tampered with.
  9. Now it’s the server’s turn to do the same. With RSA key exchange, it decrypts the pre-master secret sent by the client and computes the session key. Then it sends its “Change Cipher Spec” message to indicate that it is switching to encrypted communication.
  10. The server sends its “Finished” message, encrypted with the symmetric session key it just generated. It also performs the same check to verify the integrity of the handshake.
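Step 6 is easiest to picture with a toy Diffie-Hellman exchange. The numbers below are deliberately tiny and purely illustrative (real TLS uses very large groups or elliptic curves), and the variable names are ours rather than part of any TLS API.

```python
# Public parameters both sides agree on (toy-sized for readability).
p = 23  # prime modulus
g = 5   # generator

client_private = 4  # kept secret by the client
server_private = 3  # kept secret by the server

# Each side sends only its public value over the wire.
client_public = pow(g, client_private, p)
server_public = pow(g, server_private, p)

# Each side combines its own private value with the other side's public value.
client_shared = pow(server_public, client_private, p)
server_shared = pow(client_public, server_private, p)
```

Both sides end up with the same shared value even though neither private value ever crossed the network. An eavesdropper sees only p, g, and the two public values.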

After these steps the TLS handshake is complete. Both parties now have a session key and will begin to communicate with an encrypted and authenticated connection.
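For the curious, here is a sketch of how TLS 1.2 turns the pre-master secret, the client random, and the server random into the 48-byte master secret from which the session keys are then expanded. It follows the PRF construction in the TLS 1.2 specification and assumes a SHA-256 based cipher suite; the input values are random placeholders, not captured from a real handshake.

```python
import hashlib
import hmac
import secrets


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash expansion: chain HMAC outputs until enough bytes are produced."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def derive_master_secret(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """master_secret = PRF(pre_master_secret, "master secret", client_random + server_random)."""
    return p_sha256(pre_master, b"master secret" + client_random + server_random, 48)


# Placeholder inputs: both randoms are 32 bytes, as exchanged in the Hello messages.
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)
pre_master = secrets.token_bytes(48)

# Client and server run the same computation on the same inputs.
client_master = derive_master_secret(pre_master, client_random, server_random)
server_master = derive_master_secret(pre_master, client_random, server_random)
```

Because both randoms feed into the derivation, every connection gets a fresh master secret even if the same pre-master secret were somehow reused.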

At this point, the first bytes of “application” data (the data belonging to the actual service the two parties will communicate about — i.e. the website’s HTML, Javascript, etc) can be sent.

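In TLS 1.2, the full handshake described above takes two round trips before application data can flow (previously established sessions can be resumed with a shorter handshake). TLS 1.3 streamlines the full handshake to a single round trip.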

4. Diving Deep Into TLS Certificates And Authentication

Every TLS connection must first go through authentication. TLS offers authentication through the use of certificates, as we have repeatedly said throughout the article.
In this section, we’ll first go over some terminology and then look at the process of authentication itself.

4.1 Certificate

So, first of all, what is a certificate?

A certificate is nothing but a simple document containing a public key and some information about the organisation that owns the certificate.

Let’s take the example of Medium’s certificate

A certificate, as you can see, records various data. It records information about the certificate’s owner, in this case medium.com. It also identifies the signing authority, in this case Cloudflare, Inc.
Finally, the certificate gives you access to the owner’s public key. This public key may be used during the TLS handshake’s key exchange.
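To show what these fields look like in code, the sketch below summarises a certificate in the dictionary shape that Python’s SSLSocket.getpeercert() returns. The sample data is hand-written and hypothetical (it is not Medium’s certificate), and summarize_cert is our own helper. Note that this dictionary form does not expose the public key itself; reading that requires a certificate-parsing library.

```python
def summarize_cert(cert: dict) -> dict:
    """Pull the owner, issuer, DNS names, and expiry out of a getpeercert()-style dict."""

    def first_value(rdns, field):
        # subject/issuer are tuples of RDNs, each a tuple of (name, value) pairs.
        for rdn in rdns:
            for name, value in rdn:
                if name == field:
                    return value
        return None

    return {
        "owner": first_value(cert.get("subject", ()), "commonName"),
        "issuer": first_value(cert.get("issuer", ()), "organizationName"),
        "dns_names": [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"],
        "expires": cert.get("notAfter"),
    }


# Hypothetical certificate data shaped like getpeercert() output.
sample_cert = {
    "subject": ((("commonName", "example.com"),),),
    "issuer": ((("countryName", "US"),), (("organizationName", "Example CA, Inc."),)),
    "subjectAltName": (("DNS", "example.com"), ("DNS", "www.example.com")),
    "notAfter": "Jun  1 12:00:00 2030 GMT",
}
summary = summarize_cert(sample_cert)
```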

But let’s keep in mind that a certificate by itself is not enough to authenticate the server (medium.com). I could create my own certificate naming the owner (which is me) as medium.com and send it back to the client. We will discuss below how authentication actually happens with certificates.

4.2 Chain Of Trust

When you get the certificate from the server, you actually get a chain of certificates.


The chain of trust aims to prove that a particular certificate originates from a trusted source. With this chain of certificates, we validate the authenticity of each certificate in turn until we reach the root certificate.

The chain consists of 3 parts:-

Server Certificate:- The server certificate is the one issued to the specific domain the client is communicating with.

Intermediate Certificate:- Intermediate certificates branch off root certificates like branches of a tree. They act as middlemen between the protected root certificates and the server certificates issued to the public. A chain will usually contain at least one intermediate certificate, and it can contain more than one.

Root Certificate:- A root certificate is a digital certificate that belongs to the issuing Certificate Authority. It comes pre-downloaded in most browsers and is stored in what is called a “trust store.” The root certificates are closely guarded by the Certificate Authorities.

If the certificate is legitimate and links back to a root CA in the client browser’s trust store, the browser displays its trust indicators (such as the padlock icon) so the user knows that the connection to the website is secure.


In the above figure, medium.com is the server certificate, Cloudflare Inc is the intermediate certificate, and Baltimore CyberTrust Root is the root certificate.

4.3 What Is a CA?

A Certificate Authority (CA) is an entity that issues digital certificates conforming to the ITU-T’s X.509 standard for Public Key Infrastructures (PKIs). Digital certificates certify the public key of the owner of the certificate (known as the subject), and that the owner controls the domain being secured by the certificate. A CA therefore acts as a trusted third party that gives clients (known as relying parties) assurance they are connecting to a server operated by a validated entity.

In the chain of certificates explained above, every certificate is signed by a CA. The server certificate (medium.com) is signed by Cloudflare, which is a certificate authority. Cloudflare’s certificate is in turn signed by Baltimore CyberTrust. As the Baltimore CyberTrust certificate is the root certificate, it is self-signed.

4.4 Authentication

Now that we understand the basics of certificates, let’s understand how authentication actually works.

Below are the steps involved in server authentication:-

  1. Get the chain of certificates from the server.
  2. Start with the first certificate in the chain (the server certificate).
  3. Decrypt the certificate’s signature using the public key from the issuer’s certificate (the next certificate in the chain). This yields the hash that the issuer signed.
  4. Compute the hash of the certificate’s contents yourself and compare it with the decrypted value.
  5. Repeat steps 2–4 for each certificate until you reach the root certificate, which must be present in the client’s trust store.
  6. If the comparison fails at any step, fail the authentication process.

Once the whole chain of certificates has been validated, we can safely assume that the certificate with the given details is valid and that we are communicating with the right server.
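The walk up the chain can be sketched with toy certificates. Everything here is an assumption made for illustration: ToyCert is not the X.509 format, the names are invented, and the signatures use unpadded textbook RSA over small well-known primes, which is not secure. Real validation also checks validity dates, host names, and revocation.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys


def make_keypair(p: int, q: int):
    """Toy RSA keypair from two primes: returns ((n, e), (n, d))."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return (n, E), (n, d)


@dataclass
class ToyCert:
    subject: str
    issuer: str
    public_key: tuple   # (n, e) belonging to the subject
    signature: int = 0  # issuer's signature over the fields above


def digest(cert: ToyCert, n: int) -> int:
    """Hash the signed fields and reduce the result into the issuer's modulus."""
    data = f"{cert.subject}|{cert.issuer}|{cert.public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(cert: ToyCert, issuer_private) -> ToyCert:
    n, d = issuer_private
    cert.signature = pow(digest(cert, n), d, n)
    return cert


def signature_ok(cert: ToyCert, issuer_public) -> bool:
    n, e = issuer_public
    # "Decrypt" the signature with the issuer's public key and compare hashes.
    return pow(cert.signature, e, n) == digest(cert, n)


def validate_chain(chain, trust_store) -> bool:
    """chain is ordered server certificate -> intermediate(s) -> root."""
    for i, cert in enumerate(chain):
        issuer = chain[i + 1] if i + 1 < len(chain) else cert  # the root signs itself
        if cert.issuer != issuer.subject or not signature_ok(cert, issuer.public_key):
            return False
    root = chain[-1]
    return trust_store.get(root.subject) == root.public_key


# Build a three-level toy chain from Mersenne primes.
root_pub, root_priv = make_keypair(2**61 - 1, 2**89 - 1)
inter_pub, inter_priv = make_keypair(2**107 - 1, 2**127 - 1)
leaf_pub, _ = make_keypair(2**19 - 1, 2**31 - 1)

root = sign(ToyCert("Toy Root CA", "Toy Root CA", root_pub), root_priv)
inter = sign(ToyCert("Toy Intermediate CA", "Toy Root CA", inter_pub), root_priv)
leaf = sign(ToyCert("example.com", "Toy Intermediate CA", leaf_pub), inter_priv)

trust_store = {"Toy Root CA": root_pub}  # what a browser would ship with
chain = [leaf, inter, root]
```

With this setup, validate_chain(chain, trust_store) succeeds. Swapping in a certificate that reuses the server certificate’s signature but carries a different public key fails at the hash comparison, and an empty trust store fails at the final root check.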

5. Summary

Going through the topics that we covered in this article

  1. What is HTTPS.
  2. Security limitations with HTTP
  3. How HTTPS solves the security limitations
  4. TLS and TLS Handshake
  5. Certificates, Chain of certificates and Certificate authority.
  6. How certificates are used for server authentication.

So here’s my take on an introduction to HTTPS. I do believe that there needs to be a lot more detail given for a number of concepts, such as key exchange with Diffie-Hellman, the entire CipherSpec exchange, the production of TLS certificates, etc., and HTTPS can most definitely not be addressed in a single page. I’ll try to cover these topics in more detail in later articles, though, as this one is currently over ten minutes long.

Meanwhile, please drop a clap and follow me if you think this helped you in any way and drop a comment if I missed anything. Cheers :)


Hi, working as an SDE 3 at Amazon. I write about tech with low level details. Please reach out for any recommendation and suggestion.