Web Development Fundamentals Every Developer Needs to Know

Learn key concepts to understand web mechanics and how the internet actually works

Natasha Ferguson
9 min read · Feb 25, 2023

In this blog post, I’ll explain how the internet actually works. We’ll answer questions like ‘How does the browser find the HTML file for a requested web page?’, ‘How does the HTML file turn into a user interface?’, ‘What can we do to make that process quicker?’, and ‘How is communication with the server established and maintained?’. The concepts we’ll cover include:

Client-Server Model

The client-server model is a generic term for a setup in which two computers talk to each other: one of them is the client, the computer requesting information, and the other is the server, the computer sending that information back. This model can be used in a variety of different types of projects, and its most common use case is the internet.

When we try to go to some website, say linkedin.com, what actually happens is that our browser becomes the client. It makes a request to the server, which in this case is set up by LinkedIn. Whenever someone creates a website, they need to host it on some server. Often that means using a cloud provider such as AWS or Azure, but you can also run a server yourself.

The server is going to send HTML back to the client, and then the client, which again is simply a web browser, is going to interpret that HTML and generate a user interface. As you continue using linkedin.com and interacting with the page, your browser continues to communicate with the server.

Sometimes a single computer can be both — a client and a server in two different interactions. For instance, a single machine could act as a server for end users and a client for a database.
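To make the model concrete, here’s a minimal sketch of a server written in TypeScript with Node’s built-in http module; the port number and the HTML it returns are arbitrary choices for the example.

```typescript
import http from "node:http";

// A tiny server: it waits for requests from clients and answers each one with HTML.
const server = http.createServer((req, res) => {
  console.log(`Client requested ${req.method} ${req.url}`);
  res.writeHead(200, { "Content-Type": "text/html" });
  res.end("<h1>Hello from the server</h1>");
});

// Port 3000 is an arbitrary choice for local experimentation.
server.listen(3000, () => {
  console.log("Listening on http://localhost:3000");
});
```

Pointing a browser at http://localhost:3000 makes the browser the client in this exchange: it sends the request, and the server answers with HTML that the browser turns into a page.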

Webpage Request Lifecycle

So what happens when you go to linkedin.com? What is the browser doing to request HTML for this page from the LinkedIn server? To answer these questions, we first need to look at its URL. A full URL might look something like https://www.linkedin.com:443/feed. Let’s break it down.

  • HTTPS is the protocol and it specifies how this request is going to be formatted. HTTPS and HTTP are the standard protocols for communication on the internet.
  • Next, after the protocol, we have a colon and two slashes, followed by a subdomain and the actual domain name. linkedin.com is the domain name, the name of the website, and www is a subdomain, which you can think of as a domain inside the linkedin.com domain. The domain name itself is broken down into the name of the website and a top-level domain, the .com part. Some top-level domains, such as .uk or .jp, are associated with a specific country, while others, like .com and .org, are generic.
  • Next, we have the port, 443, which can be left off because each protocol has a default port: with HTTPS the default is 443, and with HTTP it is 80. You almost never need to include a port in the URL.
  • And then finally we have the path to the resource that we are requesting. If you are just requesting the homepage, most servers allow us to not include any path. However, if we need some specific resource, for instance, if you want to go to your LinkedIn feed, you would use the slash and feed path.

We’ve covered the key parts of a URL, but there are some other things we can include, such as query parameters: key-value pairs we can pass to the server to give it more information, usually for filtering down some content. You can also add a fragment: it uses the hash sign (#) followed by the ID of some HTML element, and by default browsers handle this by scrolling to that element.
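As a quick illustration, the standard URL API (available in browsers and in Node) splits a URL into exactly these parts. The ?sort=recent query parameter and #top fragment below are made-up examples.

```typescript
// Parse a URL into its components.
const url = new URL("https://www.linkedin.com/feed?sort=recent#top");

console.log(url.protocol);                  // "https:"
console.log(url.hostname);                  // "www.linkedin.com"
console.log(url.port);                      // "" (the default port 443 is implied)
console.log(url.pathname);                  // "/feed"
console.log(url.searchParams.get("sort"));  // "recent"
console.log(url.hash);                      // "#top"
```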

Now that we’ve broken down the URL, we need a way to actually locate the server and send it a request for the HTML. When you want to send a request to a server, you might know that the domain name is linkedin.com, but you need to know the address of that server. And we do this by using an IP address. It’s a unique identifier for a computer on the internet. To find the IP address of any given domain, we use the Domain Name System (DNS).

When the browser is looking for the IP address of a website, the first thing it does is check the local cache. The operating system on your machine keeps a local cache of every IP address that has recently been resolved via DNS, so we don’t have to make repeated network requests. If the IP address is not found in the local cache, a DNS request is issued on the network.

This request starts by going to the resolving name server, which checks its own cache. The resolving name server is usually operated by your Internet Service Provider. If the resolving name server doesn’t know the IP address either, it turns to a root name server. The root name server holds a mapping of top-level domains such as .com to the IP addresses of top-level domain name servers. For each top-level domain, there’s a server that knows how to locate the IP addresses of all registered domains in that top-level domain.

So now the root name server can tell the resolving name server where the correct top-level domain name server (TLD name server) is. The resolving name server uses that information to make a request to the TLD name server, which in turn points it to the authoritative name server for the domain. The authoritative name server knows the IP address of the domain we are looking for, so it sends that address back to the resolving name server, which returns it to our computer, and our browser now knows where to look.
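If you want to watch this resolution happen from code, Node’s dns module exposes both the cached path and the network path; here’s a rough sketch using linkedin.com from the example above (run it as an ES module so top-level await works).

```typescript
import { lookup, resolve4 } from "node:dns/promises";

// lookup() goes through the operating system's resolver, so it can be
// answered from the local cache described above.
const cached = await lookup("linkedin.com");
console.log("via OS resolver:", cached.address);

// resolve4() always performs a real DNS query for IPv4 addresses.
const addresses = await resolve4("linkedin.com");
console.log("via DNS query:", addresses);
```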

Now we have the IP address of linkedin.com, and the next question we need to answer is how the browser actually connects to linkedin.com. What is the method by which the two computers connect to each other? This connection happens over TCP, which stands for Transmission Control Protocol. It’s a network protocol used to establish a connection between two computers on the internet, and it’s the primary mechanism by which HTTP requests are delivered. The way we send information from one computer to the other is in what’s called a packet: a small portion of a larger piece of data. The packets are reassembled on the other end to form the larger piece of data that is being sent.

For a TCP connection to be created, the client needs to initiate the connection. It does this by sending a synchronization (SYN) packet, and the server responds with a synchronization acknowledgment (SYN-ACK). Finally, the client responds back to the server with an acknowledgment (ACK) confirming that it got the SYN-ACK. We call this verification by both the client and the server the three-way handshake.

From this point on the two computers are connected to each other via TCP, so they can start sending HTTP messages back and forth. When the user navigates away from the website, the connection will end. To end the TCP connection, either side sends a finish (FIN) packet to the other side, which is followed by an acknowledgment from the other side. This terminates the connection.
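Here’s a rough sketch of what opening and closing a TCP connection looks like from code, using Node’s net module; the operating system performs the handshake and the FIN exchange for us, and example.com on port 80 is just a convenient public endpoint for the example.

```typescript
import net from "node:net";

// createConnection() triggers the three-way handshake under the hood.
const socket = net.createConnection({ host: "example.com", port: 80 }, () => {
  console.log("Connected (handshake complete)");
  // Send a small piece of data over the established connection.
  socket.write("GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n");
});

// The response arrives as packets, surfaced here as 'data' events.
socket.on("data", (chunk) => console.log(chunk.toString()));

// When the server closes the connection, FIN/ACK packets are exchanged.
socket.on("end", () => console.log("Connection closed"));
```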

Hypertext Transfer Protocol (HTTP)

HTTP is a common network protocol used for sending requests and responses on the web. HTTP requests generally have three components:

  1. Request line — includes the method, path, and HTTP version
  2. Headers — contain key-value pairs of extra information for the server
  3. Body — contains the contents of the request, such as new data being uploaded in a POST request.

HTTP responses follow the same general format as requests but in the top line (status line) there’s no method or path. Instead, this line contains a status code and message. For instance, the status code of 200 with a message of OK will be included in the response to a successful GET request.
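To tie this structure to code, here’s a small sketch using the fetch API (available in browsers and modern Node): the method and path make up the request line, the headers object supplies the headers, and the response exposes the status line’s code and message.

```typescript
// Request: method + path (the request line) and headers; a GET has no body.
const response = await fetch("https://www.linkedin.com/feed", {
  method: "GET",
  headers: { Accept: "text/html" },
});

// Response: status line (code + message), headers, and body.
console.log(response.status);      // e.g. 200
console.log(response.statusText);  // e.g. "OK"
console.log(response.headers.get("content-type"));
const body = await response.text();
console.log(body.slice(0, 200));   // the first part of the HTML body
```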

Common request methods are:

  • GET request is a request for the server to send back some information.
  • POST request is used when the client is sending information to the server.
  • PUT replaces data on the server
  • DELETE deletes data from the server
  • PATCH partially updates data on the server
  • HEAD is the same as GET but without the body
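As a small illustration of a non-GET method, the sketch below sends a POST with a JSON body; the /api/posts endpoint and the payload are hypothetical.

```typescript
// POST: the client sends new data to the server in the request body.
const created = await fetch("https://example.com/api/posts", {
  method: "POST", // could just as well be PUT, PATCH, or DELETE
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ title: "Hello", text: "My first post" }),
});

console.log(created.status); // many APIs answer a successful POST with 201 Created
```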

Common status codes that a server can send back as a response:

  • 200: OK (request completed)
  • 201: created (often with POST requests)
  • 301: moved permanently (redirection)
  • 302: found (moved temporarily)
  • 400: bad request
  • 401: unauthorized (not authenticated)
  • 403: forbidden (you don’t have access to what you are trying to use)
  • 404: not found (the path you gave doesn't have a resource)
  • 500: internal server error
  • 503: service unavailable (if the server is down for planned maintenance)
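In client code you typically branch on these codes; here’s a minimal sketch (note that fetch follows 301/302 redirects automatically unless you tell it not to).

```typescript
// Map a status code to a rough description, mirroring the list above.
function describeStatus(status: number): string {
  if (status >= 200 && status < 300) return "success";
  if (status === 301 || status === 302) return "redirect";
  if (status === 401) return "not authenticated";
  if (status === 403) return "not allowed to access this resource";
  if (status === 404) return "resource not found";
  if (status >= 500) return "server-side error";
  return "other";
}

const res = await fetch("https://example.com/");
console.log(res.status, describeStatus(res.status));
```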

HTTP provides no privacy. Routers and other intermediate devices controlled by your ISP, as well as the websites you visit, can read and modify all of the information transmitted over HTTP, including the HTTP headers, the originating and destination IP addresses, and even the response data. This is because HTTP itself isn’t encrypted.

The Hypertext Transfer Protocol Secure (HTTPS) is an extension of HTTP that’s used for secure communication online. It requires servers to have trusted certificates and uses Transport Layer Security (TLS), a security protocol built on top of TCP to encrypt data communicated between a client and a server.

HTTPS wraps HTTP messages in encrypted envelopes before they are transported over the network. HTTPS conceals the message body and the HTTP headers, but not the origin and destination IP addresses, which still reveal which nodes are talking to each other.
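If you want to peek at the TLS layer from code, Node’s tls module can open an encrypted connection and show the server’s certificate; a rough sketch, with example.com again standing in as a public endpoint.

```typescript
import tls from "node:tls";

// tls.connect() performs the TLS handshake on top of a TCP connection.
const socket = tls.connect({ host: "example.com", port: 443, servername: "example.com" }, () => {
  console.log("Certificate trusted:", socket.authorized);
  const cert = socket.getPeerCertificate();
  console.log("Issued to:", cert.subject?.CN);
  console.log("Valid until:", cert.valid_to);
  socket.end();
});
```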

How Browsers Render Content

Once the browser has received the HTML file from the server, what is it going to do with it? How does it turn that HTML into a page that a user can interact with? The process of taking an HTML file and converting it into a user interface is known as the critical rendering path. The critical rendering path has five steps:

  1. Parse HTML and create a DOM tree and request any resources found such as images, scripts, fonts, and stylesheets.
  2. Parse CSS into CSS Object Model (CSSOM) tree.
  3. Combine DOM and CSSOM into a render tree containing information about the nodes that are going to be rendered to the page.
  4. Calculate the layout (width, height, location) of nodes based on the viewport size.
  5. Paint the screen using the render tree and layout calculations.
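You can observe some of these milestones from the page itself with the browser’s Performance API; a small sketch to run in a page’s script or console (paint entries may not be available in every browser).

```typescript
// Navigation timing: when the HTML arrived, was parsed, and finished loading.
const [nav] = performance.getEntriesByType("navigation") as PerformanceNavigationTiming[];
console.log("HTML received:", nav.responseEnd);
console.log("DOM parsed (DOMContentLoaded):", nav.domContentLoadedEventEnd);
console.log("Page fully loaded:", nav.loadEventEnd);

// Paint timing: when pixels actually appeared on the screen.
for (const entry of performance.getEntriesByType("paint")) {
  console.log(entry.name, entry.startTime); // "first-paint", "first-contentful-paint"
}
```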

Dynamic Changes

What happens when JavaScript changes some elements on the page? We can divide such changes into three types:

  1. Color change: the node will be repainted and it’s a very fast operation.
  2. Position change: reflow (layout recalculation) and repaint of the changed node and the nodes affected by it, such as its siblings and children.
  3. Major changes: reflow and repaint the entire page.
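Here’s a small sketch of the difference, assuming a page that has an element with the (made-up) id box.

```typescript
const box = document.getElementById("box") as HTMLElement;

// 1. Color change: the browser only needs to repaint the element.
box.style.backgroundColor = "tomato";

// 2. Geometry change: the browser has to reflow (recalculate layout) for the
//    element and the nodes around it, then repaint.
box.style.width = "300px";

// Reading a layout property right after a change forces that reflow to happen
// immediately instead of being batched.
console.log(box.offsetHeight);
```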

During the HTML parsing phase, the browser requests different resources as it finds them, images for example. But for high-priority resources such as stylesheets and scripts that are necessary for the page to work, there’s something called the pre-load scanner. As the HTML is being parsed, the pre-load scanner goes through that HTML, looks for any high-priority resources, and makes HTTP requests for them while parsing is still happening. As a result, we don’t have to wait as long for these resources as we would if the requests were only made once the parser reached them.

Optimizing For Critical Rendering Path

So what can we do to optimize our code? Here are some ideas for you to implement:

  • Use defer/async scripts — the idea is to make our scripts not be render-blocking.
  • Minimize the size of the DOM — the deeper our HTML code goes, the more complicated the DOM tree has to be, and the longer it’s going to take to parse that HTML. However, we should never compromise on accessibility.
  • Reduce file sizes with compression/minification.
  • Lazy-loading — figure out the minimum amount of content your page needs and don’t request everything at once. When the page becomes interactive, start requesting additional resources in the background (see the sketch after this list).
  • Hardware-accelerated animations — animations are expensive, and they can slow down a page a lot. If an animation is slowing down the page, one thing we can do to improve performance is add a CSS rule of transform: translate3d(0, 0, 0). It’s one of the ways we can tell the browser that we need to use compositing: when we use three-dimensional space, the browser has to layer elements, and while doing this layering it will promote the animated element to its own layer and hardware accelerate it, handling it on the GPU instead of the CPU, which generally gives a performance increase. You can use it when you have performance issues with an animation.
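As one example of lazy-loading, the sketch below only starts downloading images once they scroll near the viewport. It assumes the images are written with a data-src attribute instead of src, which is a convention made up for this example.

```typescript
// Load an image only when it approaches the viewport.
const observer = new IntersectionObserver((entries, obs) => {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;
    const img = entry.target as HTMLImageElement;
    img.src = img.dataset.src ?? ""; // start the real download
    obs.unobserve(img);              // each image only needs this once
  }
}, { rootMargin: "200px" });         // begin loading a bit before it's visible

document.querySelectorAll<HTMLImageElement>("img[data-src]")
  .forEach((img) => observer.observe(img));
```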

This concludes an overview of the key concepts necessary to understand how the web works. I hope you found it helpful.
