What happens when you visit “https://www.holbertonschool.com” in your browser?

Kristen Loyd
Aug 9, 2017 · 6 min read

When you want to visit a website, you type its address into your preferred browser and the page usually loads almost instantly. It may seem simple, but in reality, many components and processes go into viewing a webpage; let’s take a look at them.

What happens if you want to visit “https://www.holbertonschool.com”?

First, our computer needs to know what IP address is associated with “www.holbertonschool.com”. This is because humans use easy-to-remember names for websites, called domain names, but computers use a numerical label called an Internet Protocol address (or IP address). The process of finding the IP address associated with a domain name is known as a Domain Name System request (DNS request). (1)

To find “www.holbertonschool.com”, the browser first checks its cache to see if it has the IP address. If the address is not cached, or the browser does not cache DNS results, the browser asks the Operating System (OS) to resolve “www.holbertonschool.com”. The OS checks its own cache and then the `/etc/hosts` file for an entry associated with the domain name. If the domain name is not in the `/etc/hosts` file, the OS asks a recursive DNS server, called a resolver, to find the IP address. The resolver is usually operated by your Internet Service Provider (ISP), or it is your network’s gateway to the Internet, such as a router.
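The lookup order described above can be sketched as a chain of fallbacks. This is only an illustration, not how any particular browser or OS implements it: the function names (`parse_hosts`, `resolve`), the dictionary caches, and the addresses (taken from ranges reserved for documentation) are assumptions made up for the example.

```python
from typing import Callable, Dict, Optional


def parse_hosts(text: str) -> Dict[str, str]:
    """Parse /etc/hosts-style text into a {hostname: ip} mapping."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # first entry wins
    return table


def resolve(
    name: str,
    browser_cache: Dict[str, str],
    os_cache: Dict[str, str],
    hosts: Dict[str, str],
    ask_resolver: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try each source in the order the article describes."""
    name = name.lower()
    for source in (browser_cache, os_cache, hosts):
        if name in source:
            return source[name]
    ip = ask_resolver(name)  # stand-in for the recursive DNS server
    if ip is not None:
        os_cache[name] = ip  # remember the answer for next time
        browser_cache[name] = ip
    return ip


# Hypothetical hosts file content
HOSTS_TEXT = """
127.0.0.1   localhost
# hypothetical local override
203.0.113.10  intranet.example
"""
hosts_table = parse_hosts(HOSTS_TEXT)
```

The resolver is passed in as a plain function so that the fallback order can be exercised without touching the network.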

The resolver checks its cache for the needed IP address; if it is not there, the resolver asks authoritative name servers to do the resolution. An authoritative name server is a DNS server that holds the definitive records for a domain or zone. The hierarchy of authoritative name servers starts at the “root zone”, which is handled by root servers. The resolver knows where to find the root servers because it ships with a list of their addresses, often called the root hints file. A zone file is what an authoritative name server uses to save its configuration for the domains or zones it controls. (2)

The resolver’s request to the root server travels over TCP/IP (Transmission Control Protocol/Internet Protocol), the suite of protocols computers use to communicate over the Internet. The TCP layer splits the information to be sent into packets and sends them to the receiving server, which reassembles them. The IP layer manages the addressing of the packets to ensure they are delivered to the correct IP address. (3) Note that DNS queries themselves are usually carried by UDP, a lighter-weight alternative to TCP, with TCP used as a fallback for large responses.
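The division of labor between packaging and addressing can be sketched with a toy model. This is a simplified illustration, not real TCP or IP: the `Packet` type, the 8-byte packet size, and the destination address are assumptions chosen for the example.

```python
import random
from typing import List, NamedTuple


class Packet(NamedTuple):
    dest_ip: str    # addressing: the job the IP layer handles
    seq: int        # sequence number: lets the receiver restore order
    payload: bytes


def packetize(message: bytes, dest_ip: str, size: int = 8) -> List[Packet]:
    """Split a message into numbered packets of at most `size` bytes."""
    return [
        Packet(dest_ip, seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the message, whatever order the packets arrived in."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))


packets = packetize(b"GET / HTTP/1.1", dest_ip="203.0.113.10")
random.shuffle(packets)  # simulate out-of-order arrival
message = reassemble(packets)
```

Because every packet carries a sequence number, the receiver can rebuild the original message even when packets arrive out of order.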

Upon receiving the resolver’s request for the IP address of “www.holbertonschool.com”, the root server tells the resolver the location of the authoritative name server for the next level of domain. The next level down from the root zone is called a Top-level Domain, or TLD for short. Since “www.holbertonschool.com” is the target, the TLD is .COM. Before asking the .COM server for the IP address of “www.holbertonschool.com”, the resolver saves the address of the .COM TLD server so it won’t need to ask the root server again the next time it handles a request for a “.com” domain name. (4)

The .COM TLD server normally does not hold the IP address for “www.holbertonschool.com” in its zone file; instead, it gives the resolver the addresses of the authoritative name servers for the domain. There is usually more than one authoritative name server for each domain or zone level, to safeguard against the failure of any one name server. The .COM name server tells the resolver that the authoritative name servers for holbertonschool.com are ns-792.awsdns-35.net, ns-176.awsdns-22.com, ns-1455.awsdns-53.org, and ns-1619.awsdns-10.co.uk, because its zone file lists these as the name servers for the domain. The list is given to the resolver in random order, and the resolver asks one of the servers for the IP address of the domain name in question.
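The benefit of having several name servers can be sketched as follows. This is an illustration under stated assumptions: `ask_any` and `flaky_ask` are hypothetical helpers, the returned IP address is made up, and real resolvers use more refined server-selection and retry logic.

```python
import random
from typing import Callable, List, Optional

NAME_SERVERS = [
    "ns-792.awsdns-35.net",
    "ns-176.awsdns-22.com",
    "ns-1455.awsdns-53.org",
    "ns-1619.awsdns-10.co.uk",
]


def ask_any(
    servers: List[str],
    ask: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try the servers in shuffled order until one of them answers."""
    order = list(servers)  # copy so the caller's list is left untouched
    random.shuffle(order)
    for server in order:
        answer = ask(server)  # None stands for a timeout or failure
        if answer is not None:
            return answer
    return None


def flaky_ask(server: str) -> Optional[str]:
    # Pretend every server except one is down (made-up IP address)
    return "203.0.113.10" if server == "ns-1455.awsdns-53.org" else None


result = ask_any(NAME_SERVERS, flaky_ask)
```

Whatever order the shuffle produces, the lookup succeeds as long as at least one name server is reachable.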

The authoritative name server that was asked then tells the resolver the IP address of the domain name being searched for, in our case “www.holbertonschool.com”. The resolver saves the IP address, then passes it to the OS that made the request. The OS then gives the IP address to the browser.
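The whole walk from the root zone down to the final answer can be simulated with toy data. The sketch below is a simplified model, not a DNS implementation: the server labels (`root-server`, `com-tld-server`, `ns-holberton`) and the final IP address are invented for the example, and only the final answer is cached.

```python
from typing import Dict, List, Tuple

# Toy zone data. Server labels and the final IP address are made up.
ZONES: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root-server": {"com": ("referral", "com-tld-server")},
    "com-tld-server": {"holbertonschool.com": ("referral", "ns-holberton")},
    "ns-holberton": {"www.holbertonschool.com": ("answer", "203.0.113.10")},
}


def query(server: str, name: str) -> Tuple[str, str]:
    """Return the record for the longest suffix of `name` the server knows."""
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in ZONES[server]:
            return ZONES[server][suffix]
    raise LookupError(f"{server} knows nothing about {name}")


def resolve_iteratively(name: str, cache: Dict[str, str]) -> Tuple[str, List[str]]:
    """Follow referrals from the root down; return (ip, servers asked)."""
    if name in cache:
        return cache[name], []  # cached: no server needs to be asked
    server = "root-server"
    asked: List[str] = []
    while True:
        asked.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            cache[name] = value  # the resolver saves the result
            return value, asked
        server = value  # referral: ask the next server down


cache: Dict[str, str] = {}
ip, path = resolve_iteratively("www.holbertonschool.com", cache)
```

Calling `resolve_iteratively` a second time with the same cache returns the saved address without asking any server, which is why repeat visits resolve faster.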

Finally we have an IP address for “www.holbertonschool.com”! Before any content can be loaded, however, more steps need to be taken.

Figure: An example of a web infrastructure with one load balancer and two webservers

Your browser then asks the server at the IP address for “www.holbertonschool.com” for the webpage. Usually, this server is a load balancer, which distributes requests coming to the website across two or more webservers. There are many ways to set up a load balancer. Say the load balancer for “www.holbertonschool.com” uses two webservers, A and B. It could be set up to send all requests to webserver A, sending requests to webserver B only in the event of a failure or maintenance. Alternatively, it could send a percentage of the traffic to webserver A and the remaining traffic to webserver B. In either case, it sends the request to a webserver that has the same content as all the other associated webservers (unless one of them is being updated or modified for maintenance).
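The two load-balancing setups described above can be sketched in a few lines. This is an illustrative model only: the function names, the health flags, and the 70/30 split are assumptions, and a production load balancer would add health checks, connection tracking, and more.

```python
import random
from typing import Dict, List


def pick_failover(servers: List[str], healthy: Dict[str, bool]) -> str:
    """Active/passive: use the first healthy server in priority order."""
    for server in servers:
        if healthy.get(server, False):
            return server
    raise RuntimeError("no healthy webserver available")


def pick_weighted(weights: Dict[str, float], rng: random.Random) -> str:
    """Weighted split: each server gets a share proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Failover: B only receives traffic while A is down
primary = pick_failover(["A", "B"], {"A": True, "B": True})
backup = pick_failover(["A", "B"], {"A": False, "B": True})

# Weighted: roughly 70% of requests go to A and 30% to B (assumed split)
rng = random.Random(0)
counts = {"A": 0, "B": 0}
for _ in range(10_000):
    counts[pick_weighted({"A": 70, "B": 30}, rng)] += 1
```

In the failover setup webserver B sits idle until A fails, while the weighted setup keeps both servers busy at all times.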

If the web traffic to and from the load balancer needs to be encrypted (which it does, since we specified “https://”), the information is sent and received via Hyper Text Transfer Protocol Secure (HTTPS), and a Secure Sockets Layer (SSL) certificate is stored on the load balancer. (SSL has since been succeeded by Transport Layer Security, or TLS, but the older name is still widely used.) When your browser requests a connection to the webpage, the load balancer sends back the SSL certificate, which contains the public key your browser needs to connect securely. Your browser verifies the certificate and uses the public key to complete an “SSL handshake” that establishes a secure connection. From then on, all communication between the load balancer and your browser is encrypted. (5)
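From the client side, the handshake can be observed with Python's standard `ssl` module. The sketch below is a minimal example, assuming network access when the function is actually called; `fetch_certificate` is a hypothetical helper name, and the fields it returns were chosen only for illustration.

```python
import socket
import ssl


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, complete the handshake, and return details of the session."""
    context = ssl.create_default_context()  # verifies the chain and hostname
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake runs inside wrap_socket; it raises an ssl.SSLError
        # subclass if the certificate cannot be verified
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),
                "subject": tls.getpeercert().get("subject"),
            }


# The default context already enforces the checks a browser would make
context = ssl.create_default_context()
```

Calling `fetch_certificate("www.holbertonschool.com")` requires an Internet connection; the protocol version and certificate subject it reports depend on how the server is configured at the time.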

It is worth noting here that the load balancer will also have a firewall in place to protect against malicious requests. For example, a user might send hundreds of requests to your webpage at a time in an attempt to make your webserver crash. You could then modify your firewall configuration to disallow any requests from that user’s IP address.
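The blocking rule described above can be sketched as a small rate limiter. This is a toy model, not a real firewall: the class name, the limit of 100 requests per second, and the IP addresses are assumptions made for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Set


class RateLimitingFirewall:
    """Block an IP once it exceeds `limit` requests within `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.blocked: Set[str] = set()
        self._recent: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        if ip in self.blocked:
            return False
        times = self._recent[ip]
        while times and now - times[0] >= self.window:
            times.popleft()  # forget requests outside the window
        times.append(now)
        if len(times) > self.limit:
            self.blocked.add(ip)  # disallow all further requests
            return False
        return True


firewall = RateLimitingFirewall(limit=100, window=1.0)
# One client sends 150 requests within 0.15 seconds
flood = [firewall.allow("198.51.100.7", now=0.001 * i) for i in range(150)]
# A different client sends a single request
normal = firewall.allow("192.0.2.44", now=0.5)
```

Here the block is applied automatically once the threshold is crossed; the article's scenario of an administrator editing the firewall configuration by hand amounts to adding the address to `blocked` directly.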

The load balancer then sends the request for “www.holbertonschool.com” to a webserver. Webservers have firewalls of their own, often with rules stating that traffic may only be exchanged with the load balancer and other associated webservers. This also helps prevent attacks on the webservers.
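An allowlist rule of this kind can be sketched with Python's standard `ipaddress` module. The private addresses below are hypothetical placeholders for the load balancer and the webserver subnet.

```python
import ipaddress

# Hypothetical private addresses, invented for this example
ALLOWED_SOURCES = [
    ipaddress.ip_network("10.0.0.5/32"),  # the load balancer
    ipaddress.ip_network("10.0.1.0/24"),  # the subnet holding the webservers
]


def is_allowed(source_ip: str) -> bool:
    """Accept traffic only from explicitly listed networks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_SOURCES)


from_balancer = is_allowed("10.0.0.5")
from_outside = is_allowed("203.0.113.99")
```

Anything not on the list is rejected by default, so a request that bypasses the load balancer never reaches the webserver's application code.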

On receiving a request from the load balancer, the webserver reads the static information in the codebase for the specified webpage (“www.holbertonschool.com”). This static information is the HTML and CSS content of the webpage. If the HTML requires information from the dynamic portion of the codebase, the application server steps in. Application servers handle requests for information that may be accessed, modified, or removed. The application server accesses a database management system (DBMS) to provide the webserver with the necessary dynamic page information. A DBMS is a management tool to organize, retrieve, modify, or delete information in a database.

If the static HTML content includes a script for dynamic information, the application server processes that script by using the DBMS to access the database. When the application server has the requested information, it translates the information into HTML before sending it to the webserver.
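The hand-off between webserver, application server, and database can be sketched with Python's built-in `sqlite3` module standing in for the DBMS. Everything specific here is an assumption made for illustration: the `{{courses}}` placeholder, the `courses` table, and the course names are not taken from the real site.

```python
import html
import sqlite3

# Static template with a placeholder for dynamic content (made up)
STATIC_PAGE = "<html><body><h1>Courses</h1>{{courses}}</body></html>"


def application_server(db: sqlite3.Connection) -> str:
    """Query the database and translate the rows into HTML."""
    rows = db.execute("SELECT name FROM courses ORDER BY name").fetchall()
    items = "".join(f"<li>{html.escape(name)}</li>" for (name,) in rows)
    return f"<ul>{items}</ul>"


def webserver(db: sqlite3.Connection) -> str:
    """Serve the static page, filling in dynamic content if it asks for any."""
    page = STATIC_PAGE
    if "{{courses}}" in page:
        page = page.replace("{{courses}}", application_server(db))
    return page


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE courses (name TEXT)")
db.executemany("INSERT INTO courses VALUES (?)", [("Systems",), ("Algorithms",)])
response = webserver(db)
```

The webserver never talks to the database itself; it only sees the finished HTML fragment that the application server hands back.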

Once the webserver has the requested content, it will then send it to the load balancer, which sends it to your browser, which loads the contents of “holbertonschool.com” to your screen!

And to think, all these steps take milliseconds.

Bonus Content:

  • If you want to know the authoritative name servers for your domain, a `whois` query is your friend! To do this, open a terminal and type `whois holbertonschool.com` and look for the lines that read “Name Server:”. These are the name servers for “holbertonschool.com”! You can also find many whois tools online.
  • If you want to see how DNS resolves a domain name, open a terminal and type `dig <domain name> +trace`.

Sources:

  1. https://en.wikipedia.org/wiki/IP_address
  2. https://en.wikipedia.org/wiki/Zone_file
  3. http://searchnetworking.techtarget.com/definition/TCP-IP
  4. https://howdns.works/episodes/
  5. https://www.instantssl.com/ssl-certificate-products/https.html