What Happens When:

type https://www.holbertonschool.com in your browser and press Enter.

Nicolás
6 min read · Aug 28, 2019

In this article we review some basic ideas behind web infrastructure by following what happens when you type www.holbertonschool.com in your browser. Before we start this network journey, it is important to keep the following concepts in mind:

  • TCP/IP: An Internet Protocol (IP) address is the number that identifies each device connected to the network, from users' devices to the servers that provide information. Transmission Control Protocol (TCP) is the protocol that establishes a connection between two machines and controls the transmission so that data arrives reliably and in order.
  • DNS: Domain Name System. It is like a directory that stores the IP addresses behind domain names (e.g. google.com, medium.com, holbertonschool.com). We will go deeper into it soon.
  • TLD: Top Level Domain is the domain extension that helps distinguish and classify domains. It is what follows the domain name (.com, .edu, .org, .gov, .uk, etc.).
  • URI: Uniform Resource Identifier is a string of characters used to identify a name or a resource on the Internet, either by location, name or both.
  • URL: Uniform Resource Locator is a subset of the URI that specifies where an identified resource is available and the mechanism for retrieving it. A URL defines how the resource can be obtained. It does not have to be an HTTP URL (http://); a URL can also start with ftp:// or smb://.
  • URN: Uniform Resource Name is another subset of the URI that does not imply availability of the identified resource.
  • HTTPS: HTTP stands for HyperText Transfer Protocol, and the S means it is the secure version of it. It is the way clients and servers exchange information and communicate with each other.
  • SSL: Secure Sockets Layer is a security protocol that turns HTTP into HTTPS through data encryption. Its modern successor is called TLS (Transport Layer Security).
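
To make the URL and TLD definitions concrete, here is a minimal Python sketch that splits the address from this article into its parts using the standard library's urllib.parse. The function name describe_url and the second example URL are made up for illustration; only the holbertonschool.com address comes from the article.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the pieces discussed in the list above."""
    parts = urlsplit(url)
    host = parts.hostname
    # The TLD is the last dot-separated label of the host name, e.g. "com".
    tld = host.rsplit(".", 1)[-1]
    return {
        "scheme": parts.scheme,     # the retrieval mechanism: http, https, ftp...
        "host": host,               # the domain name that DNS will resolve
        "tld": tld,
        "path": parts.path or "/",  # an empty path means the site root
    }


print(describe_url("https://www.holbertonschool.com"))
print(describe_url("ftp://example.org/file.txt"))  # hypothetical non-HTTP URL
```

The scheme tells the client how to fetch the resource, and the host is the part that still has to be turned into an IP address.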

Now that we know some of the terms involved in this venture, we are ready to start. The first thing that happens when you press Enter is that the browser prepares a request, and that request has to go to whichever server sits behind the domain name “holbertonschool.com”. A request can use one of several methods (GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE), each of which asks for a specific kind of action; by default the browser sends a GET, which simply fetches some data. Since we know what we want, which is to see (get) the www.holbertonschool.com web page, our browser asks the operating system whether it already knows the IP of the server that has that information. If it does not, it is time to ask the Internet where to find that server (its IP). This takes us to the DNS. The lookup starts at the root (see the image below), the root points the query to the servers for the TLD, in this case .com, and those point it to the name servers for the domains below that extension, until one of them returns the IP for the domain in the URL.

DNS hierarchy. Taken from Cisco.

Once all of that is done, the DNS sends the IP we need all the way back until we have that server's address. Our OS then saves that information for the next time we type www.holbertonschool.com and press Enter.
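
The lookup and the caching step can be sketched as a toy, in-memory model in Python. This is not how a real resolver is implemented: the three dictionaries, the server labels and the address 192.0.2.10 (taken from a range reserved for documentation) are all made up for illustration.

```python
# Toy stand-ins for the three levels of the DNS hierarchy.
# Every name and address below is hypothetical.
ROOT = {"com": "tld-com"}
TLD_SERVERS = {"tld-com": {"holbertonschool.com": "ns-holberton"}}
AUTHORITATIVE = {"ns-holberton": {"www.holbertonschool.com": "192.0.2.10"}}

cache = {}  # plays the role of the operating system's cache


def resolve(hostname):
    """Return (ip, source), walking root -> TLD -> authoritative on a cache miss."""
    if hostname in cache:
        return cache[hostname], "cache"
    labels = hostname.split(".")
    tld = labels[-1]                   # "com"
    domain = ".".join(labels[-2:])     # "holbertonschool.com"
    tld_server = ROOT[tld]             # the root refers us to the .com servers
    name_server = TLD_SERVERS[tld_server][domain]  # .com refers us to the domain's name server
    ip = AUTHORITATIVE[name_server][hostname]      # the name server knows the IP
    cache[hostname] = ip               # remember the answer for next time
    return ip, "dns"


print(resolve("www.holbertonschool.com"))  # first call walks the hierarchy
print(resolve("www.holbertonschool.com"))  # second call is answered from the cache
```

In a real program none of this is written by hand: the operating system's resolver is reached through a call such as socket.getaddrinfo in Python's standard library.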

Load Balancer:

Now that we have the IP, the browser's request travels to the server's IP, and the first thing it finds is the load balancer's firewall. The firewall is like a shell that protects devices from attacks and blocks requests from specific IPs. Since our request is neither blocked nor dangerous to the system, we can continue. In this case the firewall protects the load balancer, a very important part of the web infrastructure because it acts like a control center that decides which server each user request is sent to. This is handled by algorithms that decide the order and number of requests sent to each server behind the load balancer:

  • Round Robin Algorithm: Requests are distributed across the group of servers sequentially.
  • Least Connections Algorithm: A new request is sent to the server with the fewest current connections to clients. The relative computing capacity of each server is factored into determining which one has the least connections.
  • IP Hash Algorithm: The IP address of the client is used to determine which server receives the request.
Load Balancer Structure: R1 to R10 are the requests from clients. The hardware load balancer runs software that applies one of these algorithms to distribute the incoming network traffic across the backend servers S1 to S4.
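
The three algorithms can be sketched in a few lines of Python. This is a simplified illustration rather than the code of any real load balancer: the server names S1 to S4 come from the figure, while the connection counts and the client IP are invented, and the least-connections version ignores the relative capacity of each server.

```python
import hashlib
from itertools import cycle

SERVERS = ["S1", "S2", "S3", "S4"]  # backend names taken from the figure


def round_robin(servers):
    """Return a function that hands out servers in order, wrapping around."""
    rotation = cycle(servers)
    return lambda: next(rotation)


def least_connections(active):
    """Pick the server with the fewest active connections.

    `active` maps a server name to its current connection count.
    """
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server; the same IP always gets the same server."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


pick = round_robin(SERVERS)
print([pick() for _ in range(6)])
print(least_connections({"S1": 3, "S2": 1, "S3": 2, "S4": 5}))
print(ip_hash("203.0.113.7", SERVERS))
```

Round robin spreads requests evenly without looking at load, least connections reacts to how busy each server is, and IP hash keeps a given client on the same server for as long as the list of servers does not change.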

Web server

Once we pass that filter we reach the web server, which takes the request, looks for the response and sends it back. We need to go back for a moment to see what is going on in the browser. As you can see in the image below (and at the top of this same web page), the browser shows a padlock and a protocol: HTTPS, a secure protocol meaning that the browser and the server communicate with each other securely, through encrypted information that can only be read by those who hold the right key. This is possible thanks to SSL, the protocol that turns HTTP into HTTPS and leaves the user happier and safer.
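
On the client side, the secure setup behind the padlock can be sketched with Python's standard ssl module. The sketch only builds and inspects the settings; the connection itself is left as a comment because it needs network access, and nothing here is taken from how any particular browser works.

```python
import ssl

# A client-side context with the standard library's recommended defaults.
context = ssl.create_default_context()

# With these defaults the server must present a valid certificate that
# matches the host name we asked for; otherwise the handshake fails.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)

# Connecting would look like this (not executed here, it needs the network):
#   import socket
#   host = "www.holbertonschool.com"
#   with socket.create_connection((host, 443)) as raw:
#       with context.wrap_socket(raw, server_hostname=host) as tls:
#           print(tls.version())
```

Only after this handshake succeeds does the browser send the HTTP request, now encrypted, over the same connection.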

The browser's address bar.

By default, HTTP requests go through port 80 and HTTPS requests through port 443. Ports are like gates: a system has a great many of them and has to control who passes through each of these doors. Secure protocols like HTTPS pass through the 443 door.
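
As a rough sketch, this is how a client could work out which port to knock on and what the text of a minimal GET request looks like. The function names target_port and build_get_request, and the 8443 example, are made up for illustration; real browsers send many more headers.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def target_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL wins; otherwise use the scheme's default.
    return parts.port or DEFAULT_PORTS[parts.scheme]


def build_get_request(url):
    """Build the text of a minimal HTTP/1.1 GET request for the URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


print(target_port("http://www.holbertonschool.com"))
print(target_port("https://www.holbertonschool.com"))
print(target_port("https://www.holbertonschool.com:8443"))  # hypothetical explicit port
print(build_get_request("https://www.holbertonschool.com"))
```

With HTTPS, this request text is sent inside the encrypted connection instead of in the clear.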

Application server

If the content is dynamic we need the help of the application server. The application server is software that uses other resources, such as file systems or external services, to serve an application whose content changes constantly and in real time (something like Twitter, for example).
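
A very small example of dynamic content is a Python WSGI application whose body is built again on every request. The article does not say which stack the site uses, so WSGI is only an assumption chosen for illustration, and the application is called by hand here instead of being run behind a real web server.

```python
from datetime import datetime, timezone
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A tiny WSGI application: the page is generated on each request."""
    now = datetime.now(timezone.utc).isoformat()
    body = f"<html><body><p>Generated at {now}</p></body></html>".encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


# Call the application by hand, the way a web server would.
environ = {}
setup_testing_defaults(environ)  # fills in a fake GET request
captured = {}


def start_response(status, headers):
    """Record what the application reports instead of sending it to a client."""
    captured["status"] = status
    captured["headers"] = headers


response_body = b"".join(application(environ, start_response))
print(captured["status"])
print(response_body.decode("utf-8"))
```

A static file would give the same bytes every time; here the timestamp changes with each call, which is the simplest form of content produced by application code.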

Both the application server and the web server have to generate an HTTP response containing information such as HTML code. If everything went well, the application server takes care of putting 200 in the response's status, but it can also choose any other status code, as long as it uses it according to its specification. Status codes have three digits and fall into five classes: 1xx (informational responses), 2xx (successful responses), 3xx (redirections), 4xx (client errors) and 5xx (server errors). In this case we should get a 200 response, with the rest of the information passed from the code files and the database to the app server, and then to the web server, which is the one that sends the final response to the client.
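
The five classes can be read straight off the first digit of the code, as this short Python sketch shows. The helper name status_class is made up for illustration; the reason phrases come from the standard library's http.HTTPStatus.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational response",
    2: "successful response",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Return the class of a three-digit HTTP status code from its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 301, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```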

From here the server sends the response to the load balancer, and the load balancer sends it out to the Internet, where it takes some path back to your browser with the information you requested. That, briefly, is what happens when you type www.holbertonschool.com in your browser and press Enter.

For a really nice explanation on DNS and HTTPS see:

and

respectively.

