You might have heard this popular question, or simply been curious: what happens when you type “google.com” in your browser? Before answering it, let’s modify it to use my favorite, “holbertonschool.com”.
The answer entails the details of how your computer interacts with another computer over the internet, and covers the following sections:
- Keyboard input
- Event handling
- Client and Server
- DNS Lookup
- OSI model
- Web server
When you type holbertonschool.com in the browser, you type it on your computer’s keyboard. As you type, every key press makes the keyboard emit an event, that is, it signals the operating system (OS) that a state has changed, and the OS records this change and responds to it. It’s like touching a hot pan: the moment you touch it, your brain processes the signal and prepares your body to respond, in this case by pulling your hand away.
There are flags and key codes that tell the OS when a key is pressed and which key it is, so that it can generate the right response. Different keys on your keyboard evoke different responses. When a key is pressed, the keyboard hardware raises an interrupt to signal that it needs immediate attention. The CPU (central processing unit, aka the brain of the computer) responds by suspending its current activity, saving its state and executing the OS’s interrupt handler function. If you had typed “holbertonschool.com” in a text editor, your OS would have let the text editor program handle the input. But since you typed it in the browser, the OS lets the browser application handle it.
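To make the key handling a bit more concrete, here is a toy sketch of the idea: a key code arrives, gets mapped to a character, and is forwarded to whichever application has focus. The `ToyOS` and `App` classes and the small key code table are my own illustration, not how any real OS is implemented.

```python
from dataclasses import dataclass, field

# Illustrative key code table; real systems use much larger maps
# and also track modifiers such as Shift.
KEYCODE_TO_CHAR = {0x23: "h", 0x18: "o", 0x26: "l", 0x30: "b"}


@dataclass
class App:
    """A hypothetical application that collects the characters it receives."""
    name: str
    buffer: list = field(default_factory=list)

    def on_key(self, char):
        self.buffer.append(char)

    def text(self):
        return "".join(self.buffer)


class ToyOS:
    """Toy stand-in for the OS: maps key codes and dispatches to the focused app."""

    def __init__(self):
        self.focused = None

    def focus(self, app):
        self.focused = app

    def handle_interrupt(self, keycode, pressed):
        # The "pressed" flag separates key-down from key-up events.
        if not pressed:
            return
        char = KEYCODE_TO_CHAR.get(keycode)
        if char is not None and self.focused is not None:
            self.focused.on_key(char)


toy_os = ToyOS()
browser = App("browser")
editor = App("text editor")

toy_os.focus(browser)  # the browser window has focus
for code in (0x23, 0x18, 0x26):
    toy_os.handle_interrupt(code, pressed=True)
    toy_os.handle_interrupt(code, pressed=False)

print(browser.text(), repr(editor.text()))
```

Only the focused application sees the characters, which is why the same key presses end up in the browser and not in the text editor.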
What is a browser?
Let’s say you want to order food and you choose to order through Doordash. While your food is getting ready, a Doordash driver goes to the restaurant, grabs your food and delivers it to you. Voila! In minutes you have your food.
But who is actually serving you, the restaurant or Doordash? Well, Doordash is the middleman that lets the restaurant serve you.
Similarly, the browser is the medium that lets you make a request and lets the server serve you. It’s software installed and running on your computer that lets you browse the web. It takes your input, creates and sends a request, receives the response and presents it to you.
But wait, how does Doordash know which restaurant to go to and how to find it? Google Maps, of course. Now, how does your browser know which server to send the request to? You guessed it right: it needs to find the server’s address. So it queries the DNS (Domain Name System) to find the IP address.
DNS is the internet’s version of Google Maps. It tells you the address of your destination. Your computer or your router knows the address of a DNS server. When you type the URL in the browser for the first time, your computer sends a query to that DNS server, which responds with the IP address of the web server hosting holbertonschool.com. This value is usually cached for a while, so your browser doesn’t have to do this lookup every time.
Now that your browser knows the IP address of the server, it needs to find a way to pass the request all the way to the server. When you were placing the order, it wasn’t just you interacting with Doordash; there’s another end it’s managing. It needs to check with the restaurant if it’s ready to accept the order, handle billing and payments, find the most reliable driver, and so on. Similarly, there’s a lot of stuff that needs to be managed for smooth communication between browser and server. Let’s get to the meat of it all.
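Here is a minimal sketch of the "takes your input, creates a request" step. The `build_request` helper is my own illustration, and it assumes the browser fills in `https://` when you leave the scheme out; real browsers do much more (history lookups, search fallback, extra headers).

```python
from urllib.parse import urlsplit


def build_request(typed):
    """Turn what was typed in the address bar into (host, port, raw HTTP request)."""
    # Assumption: default to https when the user typed no scheme.
    url = typed if "://" in typed else "https://" + typed
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")


host, port, request = build_request("holbertonschool.com")
print(host, port)
print(request.decode("ascii"))
```

The function only builds the request text; actually sending it needs the server's address and a connection, which is what the rest of this post is about.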
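The lookup-then-cache behaviour can be sketched in a few lines. This is a simplified illustration: the `CachingResolver` class and its fixed five-minute lifetime are my own assumptions (real DNS answers carry their own time-to-live), and the example uses a fake lookup function with a made-up address so it runs without touching the network. By default the class would call Python's `socket.gethostbyname`.

```python
import socket
import time


class CachingResolver:
    """Resolve hostnames to IP addresses, remembering answers for a while."""

    def __init__(self, lookup=socket.gethostbyname, ttl_seconds=300.0,
                 clock=time.monotonic):
        self._lookup = lookup      # function that does the real DNS query
        self._ttl = ttl_seconds    # assumed fixed lifetime for every answer
        self._clock = clock
        self._cache = {}           # hostname -> (ip, expiry time)

    def resolve(self, hostname):
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: no DNS query needed
        ip = self._lookup(hostname)
        self._cache[hostname] = (ip, now + self._ttl)
        return ip


calls = []


def fake_lookup(name):
    # Stand-in for a DNS server; the address is from a documentation-only range.
    calls.append(name)
    return "203.0.113.10"


resolver = CachingResolver(lookup=fake_lookup)
first = resolver.resolve("holbertonschool.com")
second = resolver.resolve("holbertonschool.com")
print(first, second, len(calls))
```

The second call is answered from the cache, so the fake DNS server is only asked once.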
There’s something called the OSI (Open Systems Interconnection) model that standardizes communication between different computing machines (ref. Wikipedia), that is, it describes the flow of information from one computer to another. It defines 7 layers, and the interplay of these layers magically brings holbertonschool.com from the server to your machine. Both ends, client and server, follow these layers, but they differ in which layer kicks in first. When your browser sends the request, communication starts at the application layer and goes down to the physical layer, whereas on the server, receiving the request starts at the physical layer and goes up. Conversely, when the server responds to your browser’s request, the response goes from the application layer down to the physical layer, and when your computer receives it, it goes from the physical layer all the way back up to the application layer.
7. Application layer: consists of protocols that directly interact with the end user’s applications. A protocol defines how applications on different machines communicate with each other. That is, if you are requesting a web page, HTTP (Hypertext Transfer Protocol) will handle it, while if you are sending an email, SMTP (Simple Mail Transfer Protocol) will handle it. So, in the case of holbertonschool.com, your browser generates an HTTP request. Don’t mistake the browser itself for part of the application layer; the application layer comes into play when your browser creates an HTTP request. This HTTP request is part of the application layer.
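6. Presentation layer: depending on what you requested (image, video, text, GIF, etc.), this layer converts the data between the format used on the network and a format the application can read, handling things like character encoding and compression. In the case of holbertonschool.com, when your machine receives the response, the presentation layer kicks in to decode it so the browser can render it as an HTML page.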
5. Session Layer: is responsible for establishing, maintaining, and terminating the session between devices. For example, when you are in a video chat, the time from when you entered the chat to when you left it is one complete session, given there were no interruptions during that interval. However, in the case of holbertonschool.com, HTTP relies on a lower-layer protocol instead of session-layer protocols.
4. Transport Layer: takes care of the reliable delivery of data between the two ends. Here, the transportation, delivery and reassembly of data take place. When you request holbertonschool.com, you send very little data, so the role of this layer is more evident when you receive the response. The data your machine receives arrives divided into segments, each carrying a sequence number. This layer makes sure that you have received all of them and reassembles them in order. As I mentioned above, HTTP uses TCP (Transmission Control Protocol) instead of session-layer protocols to establish and maintain a connection from your machine to the server and ensure reliable delivery. For security, it uses SSL (Secure Sockets Layer) or, these days, its successor TLS (Transport Layer Security), which encrypts all data passed between the browser and the web server, keeping communications private and protected against tampering. For HTTP requests, it’s the job of TCP to make delivery reliable and efficient. Doordash is receiving hundreds of requests. Now imagine if the same driver delivered first to you, then to your friends, then to everybody else. It would be a lot slower and highly inefficient. So Doordash has to make sure that all the requests are served well and are distributed across drivers.
3. Network layer: addresses and routes the data. In the case of holbertonschool.com, IP (Internet Protocol) addresses identify your machine and the web server, and the routers along the way use them to choose the path the data takes between the two.
2. Data Link layer: in this layer, data is packaged into frames for the trip across a single link. When the server sends you holbertonschool.com, it doesn’t send you the entire page all at once; the page has already been divided into smaller pieces by the layers above, and the data link layer encapsulates each piece in a frame and transmits it through the physical layer. Also, the packets are not necessarily delivered directly to your machine; they may travel from network to network, passing through many machines, before reaching you. At each of these hops, the IP address of the next machine is translated to a hardware (MAC) address at the data link layer.
1. Physical Layer: deals with the actual connectivity between your machine and the server. The hardware, signaling and encoding mechanisms required to form the actual connection are defined at this layer, and the data received from the server is in the form of raw bits. Try “ifconfig” (or “ip addr” on newer Linux systems) in your terminal to check out the network interface configuration of your system.
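The way data gets wrapped on its way down the layers and unwrapped on its way back up can be sketched as follows. This is a heavily simplified illustration, not a real network stack: the function names, the dictionary "headers", and the IP and MAC addresses are all made up for the example, and real TCP sequence numbers count bytes, not segments.

```python
def segment(data, size):
    """Transport layer: split the data into numbered segments."""
    return [(seq, data[start:start + size])
            for seq, start in enumerate(range(0, len(data), size))]


def to_packet(seg, src_ip, dst_ip):
    """Network layer: wrap a segment with source and destination IP addresses."""
    return {"src_ip": src_ip, "dst_ip": dst_ip, "segment": seg}


def to_frame(packet, src_mac, dst_mac):
    """Data link layer: wrap a packet with hardware addresses for one hop."""
    return {"src_mac": src_mac, "dst_mac": dst_mac, "packet": packet}


def send(data, size):
    """Sender: go down the layers, one wrapper per layer."""
    frames = []
    for seg in segment(data, size):
        packet = to_packet(seg, "203.0.113.10", "192.0.2.1")
        frames.append(to_frame(packet, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))
    return frames


def receive(frames):
    """Receiver: go up the layers, then reorder by sequence number."""
    segments = [frame["packet"]["segment"] for frame in frames]
    segments.sort(key=lambda seg: seg[0])
    return b"".join(payload for _, payload in segments)


page = b"<html><body>Holberton School</body></html>"
frames = send(page, size=8)
frames.reverse()  # pretend the pieces arrived out of order
received = receive(frames)
print(received == page)
```

Even though the frames arrive in the wrong order, the sequence numbers let the receiver put the page back together exactly as it was sent.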
So far, I have mainly talked from the client’s perspective. It’s time to understand what happens at the server’s end.
Let’s say that, like you, your friends are hungry too and are ordering the same food from the same restaurant. But this restaurant is hugely popular, and not just you and your friends but many other customers are ordering too. What if the restaurant manager is overloaded and not able to handle all the requests? That would be a huge business loss.
Now, since Holberton School is becoming popular and everybody wants to know about it, how does the web server make sure that holbertonschool.com is always accessible, and what if there are too many requests for the server to handle? No worries, there are load balancers for that.
Popular websites have to serve several thousand concurrent requests and return the correct text, image and video responses to them. To serve this large number of requests, the content is usually distributed across multiple servers. A load balancer sits in front of these servers and acts as a traffic cop, directing traffic to the right server. It makes sure that no server is overloaded, and it provides high availability and reliability by ensuring that requests keep being served. If a server goes down, it starts redirecting requests to other servers that are online.
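One simple way a load balancer can spread requests is round robin: hand each new request to the next server in the list, skipping any that are down. The sketch below illustrates that idea only; the `RoundRobinBalancer` class and the server names are made up, and real load balancers offer other strategies and run their own health checks.

```python
class RoundRobinBalancer:
    """Hand out servers in turn, skipping the ones marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Try each server at most once, starting where we left off.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
before = [balancer.pick() for _ in range(3)]

balancer.mark_down("web-02")  # pretend this server just went offline
after = [balancer.pick() for _ in range(4)]

print(before)
print(after)
```

Once `web-02` is marked as down, requests simply flow to the remaining two servers until it is marked as up again.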
Web servers use a firewall to protect the system against breaches and attacks. The human equivalent of this would be skin: it doesn’t kill the foreign objects trying to enter our bodies, it simply obstructs their path. Similarly, a firewall guards a system against incoming connections that are meant to harm it. For example, if a bad source starts flooding the web server with a large number of concurrent requests, the firewall can detect that and block requests from that IP address from reaching the web server.
Now visit www.holbertonschool.com. The page loads in a fraction of a second, yet so much happens to get it to you. Cheers to the power of the internet.
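The flood-blocking example can be sketched as a simple rate limit: count each address's recent requests and block the address once it exceeds a threshold. The `RateLimitFirewall` class, the limit of 3 requests per second and the IP addresses are assumptions for illustration; real firewalls work on packets and connections with far richer rules.

```python
from collections import defaultdict, deque


class RateLimitFirewall:
    """Block an IP address once it sends too many requests in a time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self._recent = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked = set()

    def allow(self, ip, now):
        if ip in self.blocked:
            return False
        times = self._recent[ip]
        # Forget requests that fell out of the time window.
        while times and now - times[0] >= self.window:
            times.popleft()
        times.append(now)
        if len(times) > self.max_requests:
            self.blocked.add(ip)
            return False
        return True


firewall = RateLimitFirewall(max_requests=3, window_seconds=1.0)

# One address sends 5 requests within half a second ...
flood = [firewall.allow("198.51.100.7", now=i * 0.1) for i in range(5)]
# ... while another address sends a single request.
normal = firewall.allow("192.0.2.1", now=0.5)

print(flood, normal)
```

The flooding address gets through three times and is then blocked, while the well-behaved address is not affected.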