What happens when you type a URL in the browser and hit Enter?
Magic! When you search for a social network and hit Enter, you almost instantly see your friends’ photos, along with other friends’ comments, beautifully formatted in your web browser.
This is a URL: https://google.com
Let’s see what happens behind the scenes when you type a URL on your computer: what information your browser sends to the internet, how that information is processed and sent back, and how it reaches you in a form you can understand.
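To make the starting point concrete, here is a minimal Python sketch of how a program can split a URL into the pieces a browser works with. The function name `describe_url` and the small default-port table are assumptions made for this illustration; real browsers do much more.

```python
from urllib.parse import urlsplit

# Default ports for the two web schemes (assumption: only these two matter here).
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into scheme, host, port and path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # urlsplit gives None when the URL has no explicit port,
        # so we fall back to the default port of the scheme.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
    }


print(describe_url("https://google.com"))
```

The host is the part that has to be translated into a number, and the scheme tells the browser which port to knock on.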
When you want to call someone, you remember their name. Then you, or your phone’s contact list, must find the right number to place the call.
If you don’t have the number, you ask a mutual friend to share it with you, and then you call that number.
Internet connections work the same way.
You have the name of your web page, called the DOMAIN, and your browser needs to find the number associated with it. That number is called the INTERNET PROTOCOL address, or IP address, and your computer searches several address books, first its own and then those of other servers, until it finds it.
In our case, the browser starts with the domain www.google.com and searches different address books until it finds the associated IP address.
What does it mean that my browser searches multiple address books?
Let’s see which address books exist, who owns them, and in what order the browser asks them for help when the IP address is not in the browser cache.
- Your browser calls a function such as ‘gethostbyname’, which first checks whether the domain or hostname is registered in the local ‘hosts’ file of the operating system.
- Your browser sends a request to a DNS server. The Domain Name System server searches first in its own cache, and then:
  - In the cache of your internet service provider.
  - In a root server. Root servers know which servers are responsible for each suffix. There are 13 root server addresses, each one answered by many machines around the world.
  - In a top-level domain (TLD) server. It knows where to find every domain with a specific suffix such as ‘.com’. Some suffixes are associated with countries (.co, .es, .mx, .tv) and others with the purpose of the web page (.museum, .aero, .coop).
A record can contain the IP address directly, or it can contain the name of another domain. In that case, the browser starts the search again with the new domain.
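The chain of lookups described above can be imitated with a few dictionaries. This is only a simulation: the list `ADDRESS_BOOKS`, the function `lookup`, and every name and address in it are made up for the example (192.0.2.0/24 is a range reserved for documentation), and a real resolver talks to servers over the network instead.

```python
from typing import Optional

# Each "address book" maps a name to a record:
#   ("A", ip)             -> the record holds the IP address directly
#   ("CNAME", other_name) -> the record points to another domain name
ADDRESS_BOOKS = [
    ("browser cache", {}),
    ("hosts file", {"localhost": ("A", "127.0.0.1")}),
    ("resolver cache", {"example.com": ("A", "192.0.2.10")}),
    ("authoritative server", {"www.example.com": ("CNAME", "example.com")}),
]


def lookup(name: str, max_hops: int = 5) -> Optional[str]:
    """Return the IP for `name`, following CNAME records, or None if unknown."""
    for _hop in range(max_hops):
        # Ask each address book in order and stop at the first one that answers.
        for _source, records in ADDRESS_BOOKS:
            if name in records:
                kind, value = records[name]
                break
        else:
            return None  # no address book knows this name
        if kind == "A":
            return value
        name = value  # CNAME: start the search again with the new name
    return None  # too many CNAME hops, give up
```

For a real lookup, Python's `socket.getaddrinfo("www.google.com", 443)` hands the question to the operating system, which walks through its own caches and DNS servers.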
Ways to send the information
There are multiple ways to send and receive data. Some of them focus on the reliability of the data, and others focus on the speed of transmission.
Think about it for a minute. If you are making a video call, you don’t mind losing a few frames along the way, but you need the conversation to feel as if the other person were in the same room. The protocol that favors speed in this way is called UDP.
On the other hand, when we load a web page, every single word and pixel must be the same on the server that sends it and on the machine that receives it.
Web pages are generally transmitted through TCP.
The Transmission Control Protocol is a reliable, ordered way to send and receive information over a network connection between two IP addresses.
The receiver confirms every packet it gets, and the sender keeps a timer to detect when a packet is lost or damaged so it can send it again. This guarantees that the bytes received are identical to the bytes sent.
If the two machines were gentlemen, their conversation would sound like this:
1. Establish the connection:
- Good morning, Sir. Do you have a second?
- Good morning, Sir. I am available. How can I help you?
2. Request the web page
- Could you please send me the web page?
3. Send the different parts or files that form the web page.
- Yes, of course, here you have the first part.
- I have received the first part, thanks.
- Perfect, here is the second part.
- I have received it too, thanks.
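The conversation above can be tried out on a single computer with Python's `socket` module. This is a sketch under some assumptions: the request text `GET page`, the two "parts", and the helper names `serve_once` and `fetch_page` are invented for the example. The confirmations and resends are handled by the operating system, so they do not appear in the code.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read the request, and send two parts back."""
    conn, _addr = listener.accept()      # 1. the connection is established
    with conn:
        request = conn.recv(1024)        # 2. the client asks for the page
        if request == b"GET page":
            conn.sendall(b"part one, ")  # 3. the page is sent in parts
            conn.sendall(b"part two")
    # Leaving the `with` block closes the connection.


def fetch_page() -> bytes:
    """Start a throwaway server on this machine and fetch the 'page' from it."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks one
    listener.settimeout(5)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    worker.start()

    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(b"GET page")
        while True:
            data = client.recv(1024)
            if not data:                 # an empty read means the server closed
                break
            chunks.append(data)
    worker.join()
    listener.close()
    return b"".join(chunks)


print(fetch_page())
```

The client receives the bytes complete and in the order they were sent, which is exactly the promise TCP makes.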
Just as a company has a main telephone number and extensions to reach different areas of the organization, a server has one IP address and many ports.
Each port is configured to receive a certain type of information, process it in a particular way, and respond using a specific protocol or format.
Servers have a lot of private and confidential information, business logic, and credentials to connect with databases.
Servers usually close most of their ports and try to receive all requests through a secure connection.
Incoming connections are allowed or rejected according to a group of rules called the FIREWALL.
A firewall is like the security guard of an office. It inspects each request to see which port it wants to use, the IP address of the sender, and the content or actions requested.
Web pages are normally transmitted through port 443, and some older web pages with lower security are transmitted through port 80.
In an HTTPS connection, requests are listened to and answered on port 443, and traffic to other ports is blocked to prevent attacks, information theft, and someone else taking control of the server.
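A firewall rule set can be pictured as a short checklist applied to every incoming request. The sketch below is a simplification: the `Request` class, the `is_allowed` function, and the blocked range are assumptions made for this example (203.0.113.0/24 is a documentation range, not a real threat list), and real firewalls work on network packets rather than Python objects.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

ALLOWED_PORTS = {443}  # assumption: only the HTTPS port is open
BLOCKED_NETWORKS = [ip_network("203.0.113.0/24")]  # example range only


@dataclass
class Request:
    source_ip: str
    port: int


def is_allowed(request: Request) -> bool:
    """Apply the rules in order; anything not explicitly allowed is rejected."""
    if request.port not in ALLOWED_PORTS:
        return False  # the request is knocking on a closed port
    source = ip_address(request.source_ip)
    if any(source in network for network in BLOCKED_NETWORKS):
        return False  # the sender belongs to a blocked range
    return True


print(is_allowed(Request("198.51.100.7", 443)))
print(is_allowed(Request("198.51.100.7", 80)))
```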
How does my web browser know I have a secure connection? What does a secure connection mean?
Among the many ways and ports used to transmit information over the internet, there are many protocols whose traffic can be intercepted and read in clear text by someone other than the sender and the intended receiver.
HTTPS, with the S at the end, was defined to provide a secure connection.
It is based on HTTP, which stands for “HyperText Transfer Protocol” and was initially created to send plain text, such as HTML markup or CSS.
HTTP delivers all the contents of a web page when a request is sent to port 80, with one problem: security. Anyone could see the data sent between the server and the client, until the Secure Sockets Layer (SSL) appeared.
SSL is a protocol that uses encryption to secure data (such as passwords, personal data, and credit card numbers) sent between two machines. Together with its successor, TLS, it is used on almost every web page today, and especially in online commerce. It uses the server’s IP address and port 443 to establish the network connection.
You can tell whether a web page uses it by looking at the address bar of your web browser: the address starts with https://, and many browsers show a padlock icon next to it.
It works much like an email account. Everyone knows your email address and can send you messages, but only you know the password needed to open them.
This is the logic: the server gives its public key to everyone so they can send it requests, but the content can only be read by the server, which holds the private key.
The content is not protected by a password. Instead, it is encrypted into text that no human can read, and the private key is what translates it back into readable, meaningful text.
A web page receives its SSL certificate after the organization that owns the page has been verified, to establish who is responsible for its content.
The server has two keys, a public one and a secret one. The public key is in the certificate, and the client uses it to encrypt the content it is going to send. That encrypted content can only be opened and read with the secret key, which only the server has.
In the classic handshake, what the client sends this way is a secret key it has just generated. From then on, both sides use that key to encrypt the information they exchange, so it is readable only by the two of them.
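The idea of a public key that locks and a private key that unlocks can be tried with very small numbers. This is textbook RSA with toy primes, chosen here only to show the principle; it is not what a real certificate contains and it is not secure. The names `encrypt` and `decrypt` and all the values are assumptions for the example.

```python
# Toy key pair built from two tiny primes.
# Real keys are built from primes that are hundreds of digits long.
P, Q = 61, 53
N = P * Q                # 3233, shared by both keys
PHI = (P - 1) * (Q - 1)  # 3120
E = 17                   # public exponent, given to everyone
D = pow(E, -1, PHI)      # private exponent, kept on the server (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can do this with the public key (E, N)."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key (D, N) can do this."""
    return pow(ciphertext, D, N)


secret = 65  # the message must be a number smaller than N
locked = encrypt(secret)
print(locked != secret, decrypt(locked) == secret)
```

Knowing `E` and `N` is enough to lock a message, but unlocking it requires `D`, which never leaves the server.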
Web pages cannot depend on a single machine to serve their content.
Normally they have many servers, and the server that receives our request forwards it to another one. That first server is called the load balancer.
A load balancer uses an algorithm to decide which server will receive each request. For example, it can hand requests to each server in turn (round robin), send the next request to the server with the fewest active connections (least connections), or send all the requests from the same IP address to the same server (IP hash).
In any case, the request ends up at a web server, which is the one that generates the answer and serves the page.
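The three strategies mentioned above fit in a few lines each. The server names and function names below are hypothetical, and a production load balancer also tracks server health, timeouts, and much more; this sketch only shows how the choice is made.

```python
import itertools
import zlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical backend names


def round_robin():
    """Return a function that hands out the servers in turns."""
    cycle = itertools.cycle(SERVERS)
    return lambda: next(cycle)


def least_connections(active: dict) -> str:
    """Pick the server with the fewest active connections."""
    return min(SERVERS, key=lambda server: active.get(server, 0))


def ip_hash(client_ip: str) -> str:
    """Map a client IP to a server so the same IP always gets the same one."""
    # crc32 gives the same number on every run, unlike Python's built-in
    # hash() for strings, which changes between processes.
    return SERVERS[zlib.crc32(client_ip.encode()) % len(SERVERS)]


pick = round_robin()
print([pick() for _ in range(4)])
print(least_connections({"server-a": 5, "server-b": 2, "server-c": 9}))
```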
Finally! This server will answer your request.
A web server is software that receives and answers HTTP or HTTPS requests from clients across the World Wide Web.
It holds all the static content (the content that doesn’t change between users, countries, or IPs), such as images, the general HTML structure, CSS style sheets, and plain text files.
Two of the most common web server programs on the internet are Apache and Nginx.
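Serving static content means returning a file exactly as it is stored. As a small experiment, Python's built-in `http.server` module can stand in for Apache or Nginx. The helper `serve_and_fetch` and the file it serves are assumptions for this sketch, and the built-in server is meant for testing, not for production.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading


def serve_and_fetch(filename: str, content: str) -> tuple:
    """Serve one static file from a temporary folder and request it once."""
    with tempfile.TemporaryDirectory() as folder:
        pathlib.Path(folder, filename).write_text(content, encoding="utf-8")

        # SimpleHTTPRequestHandler returns the files in `directory` unchanged.
        handler = functools.partial(
            http.server.SimpleHTTPRequestHandler, directory=folder
        )
        server = http.server.HTTPServer(("127.0.0.1", 0), handler)  # OS picks port
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            connection = http.client.HTTPConnection(
                "127.0.0.1", server.server_address[1], timeout=5
            )
            connection.request("GET", "/" + filename)
            response = connection.getresponse()
            body = response.read().decode("utf-8")
            connection.close()
            return response.status, body
        finally:
            server.shutdown()
            server.server_close()


print(serve_and_fetch("index.html", "<h1>Hello</h1>"))
```

Every visitor who asks for `index.html` gets the same bytes, which is what makes the content static.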
Not all pages are static or show the same information to everyone. Facebook shows me my friends, my data, and my photos, which are different from yours.
Dynamic web pages require calculations to fetch and process data from other APIs or databases.
Dynamic websites let users save their own data and interact with the page content: save purchase orders in a shopping cart, link their user information, keep a profile, and much more.
For this, the web page content needs to be generated by an application server that operates applications, communicates with databases through queries, and manages user information.
The application server identifies what information is needed from the database or API, makes the call, receives the data, and processes it.
Then it returns the result to the web server, which puts that personalized data into the rest of the web page and sends the response to the browser!
Common choices for writing application servers include PHP, Node.js, Python, and C++.
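Here is a minimal sketch of that dynamic step, written as a Python WSGI application (WSGI is the standard interface between Python web servers and applications). The `FRIENDS` dictionary stands in for a database, and the users, the page template, and the helper `render_for` are all invented for the example.

```python
from html import escape
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

# Stand-in for the result of a database query (hypothetical users).
FRIENDS = {"ana": ["Luis", "Marta"], "bob": ["Chen"]}

PAGE = "<html><body><h1>Hello, {user}!</h1><p>Friends: {friends}</p></body></html>"


def application(environ, start_response):
    """WSGI application: build a page that depends on who is asking."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    user = query.get("user", ["guest"])[0]
    friends = ", ".join(FRIENDS.get(user, [])) or "none yet"
    # escape() keeps user-supplied text from being interpreted as HTML.
    body = PAGE.format(user=escape(user), friends=escape(friends)).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "text/html; charset=utf-8"),
         ("Content-Length", str(len(body)))],
    )
    return [body]


def render_for(user: str) -> str:
    """Call the application directly, the way a web server would."""
    environ = {"QUERY_STRING": "user=" + user}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    chunks = application(environ, lambda status, headers: None)
    return b"".join(chunks).decode("utf-8")


print(render_for("ana"))
```

Two different users get two different pages from the same code, which is the difference between dynamic and static content.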
The database stores information.
The information can be stored in relational or non-relational databases. A database receives queries and returns results organized as requested.
Widely used database management systems include MySQL, Oracle, MongoDB, and PostgreSQL, and SQL is the query language that the relational ones share.
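To see a query and its organized answer, here is a small sketch using SQLite, a relational database that ships with Python. The table, its rows, and the function `friends_of` are made up for the example; the systems named above are used in a similar way through their own drivers.

```python
import sqlite3


def friends_of(owner: str) -> list:
    """Create a small in-memory relational database and query it."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE friends (owner TEXT, friend TEXT)")
    db.executemany(
        "INSERT INTO friends VALUES (?, ?)",
        [("ana", "Marta"), ("ana", "Luis"), ("bob", "Chen")],
    )
    # The query says what we want and in which order; the database does the rest.
    rows = db.execute(
        "SELECT friend FROM friends WHERE owner = ? ORDER BY friend", (owner,)
    ).fetchall()
    db.close()
    return [friend for (friend,) in rows]


print(friends_of("ana"))
```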
Now… Continue enjoying the internet magic!