How complicated could it be?

Source: Tumblr

So we’ve all typed an address into our browser and had that moment where the page doesn’t immediately load and we think to ourselves “Come on machine, how hard could it be?” Well, after some extensive research, it turns out it can be very very complicated. So let’s break down what happens and why it’s important to know what’s happening.

Note: Aside from the code being run on your computer and the processes that exist on the internet, there is a whole slew of electrical and physical processes that happen along the way, like keyboard circuitry and data packets sent to pins, but that might even be boring to a computer. Also, we will be acting under the assumption that we have never visited the website we are trying to reach, so there is nothing in the browser or OS cache. That is a whole different bag of beans.

Phase 1: Initialization (Method)

So, the easy part, the part we all know: entering the URL (Uniform Resource Locator, or the web address) and hitting enter. This is where, if we experience a delay, we end up like this:

Source: SpookySkeletons.net, ok no it’s actually from Reddit.

After this, our browser parses the URL to pick out the protocol identifier, the host name, the port number, and the path. Without the IP address (Internet Protocol address, a numbered label for a machine on a network) in a cache, a request is made to the DNS server our computer is configured to use (DNS is the Domain Name System, which translates domain names into IP addresses) so the browser knows which machine to contact. During this process the request can be lost or the lookup can fail, which may be when you have seen that “request failed” message.
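To make the URL-splitting step a little more concrete, here is a minimal sketch using Python's standard library. The URL, the helper names split_url and lookup_ip, and the DEFAULT_PORTS table are my own illustrations, not what any particular browser does internally.

```python
import socket
from urllib.parse import urlsplit

# Default ports for the two schemes a browser most commonly sees.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Break a URL into the pieces needed before a connection can be made."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        # An explicit port wins; otherwise fall back to the scheme's default.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
    }


def lookup_ip(host, port):
    """Ask the operating system's configured resolver for the host's addresses.

    Not called below because it needs a working network connection.
    """
    results = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address string is the first item of sockaddr.
    return sorted({sockaddr[0] for *_, sockaddr in results})


pieces = split_url("http://www.example.com/blog/post.html")
print(pieces)
```

With a network connection, calling lookup_ip(pieces["host"], pieces["port"]) would perform the DNS step described above. The addresses it returns depend on your resolver, so none are shown here.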

Phase 2: You down with HTTP?

Now that the browser has an IP address the browser forms an HTTP request which is formatted like this:

Source: tutorialspoint.com
Request-Line = Method SP Request-URI SP HTTP-Version CRLF

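So a general breakdown of the Request-Line would be: the request method, such as GET, POST, or DELETE, followed by the Request-URI and the HTTP version. The methods have different intentions; GET, for example, is used to retrieve information from the given server. The Request-URI (Uniform Resource Identifier) identifies the actual resource on which we want to apply the request, and the HTTP-Version tells the server which version of the protocol we are speaking. SP is a single space and CRLF is the carriage return and line feed that ends the line.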

After the Request-Line come the headers. The headers are actually very important because they tell the server what kind of browser is making the request, as well as what kinds of responses it can handle. If you have visited the website before, this is also where saved cookies get sent along. And finally there is a message body if necessary, which is where form data like a username and password goes.
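Here is a rough sketch of how those pieces line up into the text that actually goes over the wire. The build_request helper, the header values, and the "ExampleBrowser/1.0" user agent are made up for illustration; a real browser sends many more headers.

```python
CRLF = "\r\n"


def build_request(method, path, host, headers=None, body=""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    # Request-Line = Method SP Request-URI SP HTTP-Version (CRLF is added by join)
    request_line = f"{method} {path} HTTP/1.1"

    # Hypothetical header values describing the client and what it accepts.
    all_headers = {
        "Host": host,
        "User-Agent": "ExampleBrowser/1.0",
        "Accept": "text/html",
    }
    all_headers.update(headers or {})
    if body:
        # The server needs to know how many bytes of body to expect.
        all_headers["Content-Length"] = str(len(body.encode("utf-8")))

    header_lines = [f"{name}: {value}" for name, value in all_headers.items()]
    # The empty string produces the blank line that separates headers from body.
    return CRLF.join([request_line, *header_lines, "", body])


print(build_request("GET", "/blog/post.html", "www.example.com"))
```

A POST with a body works the same way, for example build_request("POST", "/login", "www.example.com", body="user=me"), where the body lands after the blank line.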

Phase 3: Shipping faster than the Post Office, not like that’s hard to do.

Source: HowStuffWorks.com

So now our request is shipped off to the server in packets (formatted units of data), and they are routed across the network in the same way as our earlier DNS query for the IP address. Once the server receives the packets, the exciting part happens: the server generates a response, in our case a webpage, and sends that response back to the client browser.
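To watch a request go out and a response come back without leaving your own machine, here is a small sketch that starts a toy server on the loopback address and then plays the browser's part. The PAGE content, the PageHandler class, and the round_trip helper are all invented for this example; a real web server does far more work to generate its response.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# The "webpage" our toy server generates for every GET request.
PAGE = b"<html><body><h1>Hello</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Status line and headers first, then the body.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def round_trip(path="/"):
    """Start the server, send one GET as the client, return (status, body)."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        status, body = response.status, response.read()
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


status, body = round_trip()
print(status, body.decode("utf-8"))
```

Everything here stays on 127.0.0.1, so there is no real routing involved, but the shape of the exchange (request in, response out) is the same.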

Phase 4: HTMLOL

Source: FormLogic.com

Now it’s not as simple as the server sending us a webpage that we can immediately read (we will assume that the server sends back HTML and not some other form of data). Our browser does the grunt work here, parsing that HTML that we all love so much so it can render the actual page. Since webpages are made up of more than just text, the browser will find where external resources like images, stylesheets, and scripts are located, fetch them through a series of separate requests, and render them on the page. The parsing itself is complicated. Basically, HTML cannot be parsed using a regular top-down or bottom-up parser, so a special HTML parsing algorithm, which works in two stages called tokenization and tree construction, is used.
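As a rough illustration of how a browser might discover the extra things it needs to fetch, this sketch scans some HTML for images, scripts, and stylesheets and turns their addresses into full URLs. The ResourceFinder class, the sample page, and the example.com URLs are assumptions for the demo, and Python's html.parser is a much simpler parser than the one inside a real browser.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceFinder(HTMLParser):
    """Collect the URLs of external resources referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Images and scripts point at their resource with "src",
        # while <link> (used for stylesheets) uses "href".
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))


SAMPLE_PAGE = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="app.js"></script></head>'
    '<body><img src="images/cat.png"></body></html>'
)

finder = ResourceFinder("http://www.example.com/blog/post.html")
finder.feed(SAMPLE_PAGE)
finder.close()
for url in finder.resources:
    print(url)
```

Each URL in finder.resources would then need its own request, which is part of why a single page load can involve so many round trips.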

Source: W3.org

Tokenization is a complicated and lengthy process so I’ll include a link, but in essence the tokenizer turns the HTML into tokens and then hands these tokens to the tree constructor, which builds a parse tree. The parse tree is a tree of DOM elements (Document Object Model, a programming interface for HTML), which means that each HTML element can be represented as an object. From there the browser creates a frame tree (some browsers call it a render tree), which is where it actually sifts through the DOM nodes and calculates the styling and layout of the webpage to be displayed.
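For a feel of what "tokens in, tree out" means, here is a toy version built on Python's html.parser. It assumes tidy, well-formed HTML with every tag closed, which the real HTML parsing algorithm does not get to assume, and the Node, TreeBuilder, and outline names are just for this sketch.

```python
from html.parser import HTMLParser


class Node:
    """A very small stand-in for a DOM element."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Record each token as it is seen and grow a tree from the tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag))
        # A start tag opens a new child and makes it the current node.
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))
        # An end tag closes the current node, so step back up to its parent.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.tokens.append(("text", text))
            self.current.text += text


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<html><body><h1>Hello</h1><p>World</p></body></html>")
builder.close()
print(builder.tokens)
print("\n".join(outline(builder.root)))
```

The token list shows the flat stream the tokenizer produces, and the outline shows the nesting that tree construction recovers from it.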

In conclusion…

I guess the most valuable lesson I gleaned from writing this blog wasn’t anything specifically technical (not that I didn’t enjoy smashing my head on the table as I learned how complicated tokenization could get). It was more a deeper appreciation of how intricate our computers and their relationship with the internet are. Something that I always viewed as slow if it took longer than a second has thousands of processes involved in it, so I guess next time my webpage takes a little while to load, I’ll be able to appreciate how hard my machines are working.
