How Do We Connect to the Internet and Access the Web?
It would be almost impossible to go a day in our current age without interacting with the internet in some meaningful way. It's everywhere you look and everywhere you don’t look. It's hard to believe that as little as 30 years ago the internet was used only by a handful of scientists, universities, and government institutions, and the World Wide Web had just been created! This article isn’t going to dig too deep into the history of the internet and the web; instead, it takes a look at the inner workings of how we access the two.
The Internet vs The World Wide Web
Let’s clear up one thing: the difference between the internet and the World Wide Web. The two terms are often used interchangeably, but they are not synonymous. Even in the opening paragraph of this article, I struggled with which terms to use. The internet has existed for much longer than 30 years, while the World Wide Web was only created in 1989.
In broad terms, the internet is the global network of networks, and the web is a collection of information that is accessed over the internet. The internet sends and receives data (files, emails) and also provides the paths along which it all travels. The web is a service (or application) that sits atop the internet infrastructure and holds information that we can access via the internet. Just remember that it is still very possible to send, receive, and access data via the internet without ever interacting with the web.
How do you know if you’re using the web? If the Hypertext Transfer Protocol (HTTP), Hypertext Markup Language (HTML), or a Uniform Resource Locator (URL) is involved, that's a good indication you’re interacting with the web (via the internet, of course).
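As a rough illustration of that rule of thumb, the sketch below uses Python's standard urllib.parse module to pull a URL apart and check whether its scheme is HTTP or HTTPS. The helper names (describe_url, looks_like_web) and the example.com addresses are made up for this example and are not part of any standard.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts a browser cares about."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol, e.g. "https"
        "host": parts.hostname,      # domain name that DNS will later resolve
        "port": parts.port,          # None when the URL relies on the default port
        "path": parts.path or "/",   # resource requested from the server
    }


def looks_like_web(url: str) -> bool:
    """True when the URL uses HTTP or HTTPS, a good hint that the web is involved."""
    return urlsplit(url).scheme in ("http", "https")


# Hypothetical addresses, used only for illustration.
info = describe_url("https://www.example.com/articles/internet.html")
print(info)
print(looks_like_web("https://www.example.com/"))
print(looks_like_web("ftp://files.example.com/report.pdf"))
```

The FTP address still travels over the internet, but because no HTTP is involved, the check treats it as outside the web.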
Side note on capitalization: the internet and the web should be lowercased, while the World Wide Web should always be capitalized. All three terms were originally capitalized, but as of 2016, lowercase has become the new recommendation for the first two. They are all still considered to be proper nouns; an interesting read on how this came to be, here.
How your computer accesses the internet
The first connection your computer makes when getting on the internet is to your LAN, or Local Area Network. A LAN is a single network that multiple devices in one location can share. The obvious example is your home setup: a modem provides the internet connection, and a router (most likely Wi-Fi) lets multiple devices like your phone, computer, and TV share it. Your LAN then connects to an even larger network known as a WAN, or Wide Area Network. A WAN links many LANs together, and the one you reach first is most likely run by your ISP, or Internet Service Provider. Your immediate WAN is probably located in your neighborhood and is in turn connected to a regional WAN, perhaps for your entire city or town. This cycle is repeated, and after multiple “hops”, you are connected to the “backbone” of the internet.
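To make the idea of devices sharing one LAN a little more concrete, here is a small sketch using Python's standard ipaddress module. The address range and the device addresses are assumptions chosen for illustration; a real home router may hand out a different range.

```python
import ipaddress

# Hypothetical home LAN: the router hands out addresses from this range.
HOME_LAN = ipaddress.ip_network("192.168.1.0/24")

# Hypothetical devices sharing the same home connection.
DEVICES = {
    "phone": "192.168.1.23",
    "laptop": "192.168.1.42",
    "tv": "192.168.1.77",
}


def on_home_lan(address: str) -> bool:
    """True when the address belongs to the home LAN's range."""
    return ipaddress.ip_address(address) in HOME_LAN


for name, address in DEVICES.items():
    print(name, address, on_home_lan(address))

# An address outside the home range (taken from a block reserved for documentation).
print(on_home_lan("203.0.113.9"))
```

Traffic between the three devices can stay inside the LAN, while anything addressed outside that range has to leave through the router toward the ISP.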
The backbone of the internet moves data over links that commonly run at 100 gigabits per second (Gbps) or more, and it is made up of the networks of Tier 1 Internet Service Providers such as AT&T and Verizon. Tier 1 ISPs exchange traffic with each other without charging one another, either over private links or at Internet Exchange Points (IXPs). IXPs are physical facilities, usually housed inside data center buildings, with their own network switches where many networks interconnect. These interconnections are what tie the entire world together over the internet. Tier 2 and Tier 3 Internet Service Providers cannot reach the whole internet on their own; they pay a larger provider for access to the rest of it and then sell connectivity to their own base of customers.
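For a sense of scale, the short calculation below estimates how long it would take to move one gigabyte over a single link. It assumes a backbone link rated at 100 gigabits per second (bits, not bytes) and a home connection of 100 megabits per second; both figures are illustrative assumptions, and the result ignores latency and protocol overhead.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Ideal time to move size_bytes over a link, ignoring latency and overhead."""
    return (size_bytes * 8) / link_bits_per_second


ONE_GIGABYTE = 1_000_000_000       # bytes (decimal gigabyte)
BACKBONE_LINK = 100_000_000_000    # assumed: 100 gigabits per second
HOME_LINK = 100_000_000            # assumed: 100 megabits per second

print(transfer_seconds(ONE_GIGABYTE, BACKBONE_LINK))  # 0.08
print(transfer_seconds(ONE_GIGABYTE, HOME_LINK))      # 80.0
```

Under these assumptions the same file that crosses the backbone link in under a tenth of a second needs over a minute on the home connection.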
How the web is accessed via the internet
You probably navigated to this article via a Google search, or perhaps by clicking a link to it on another website. But how did your browser actually come to display the contents of this page?
Let’s break it down:
Domain Name System (DNS)
When you navigated to the URL for this article, your browser (aka the client) first needed to conduct a DNS, or Domain Name System, lookup. The DNS is like the phonebook of the internet: it maps a website’s domain name to the IP address of a server that hosts it. Your browser first checks the DNS cache on your own device to see whether the domain name has been looked up recently and the IP address is already known. If it isn't found there, the lookup goes to a DNS resolver, usually one run by your Internet Service Provider, which either has the answer cached or asks other DNS servers for it. The IP address is then cached locally on your machine for a quicker lookup next time. This IP address will be used for the HTTP request.
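The "check the local cache first, otherwise ask a resolver" flow can be sketched in a few lines of Python. This is a simplified model, not how any particular browser or operating system implements it: the CachingResolver class, the fake_upstream function, and the address in FAKE_RECORDS are all invented for the example, and real caches also expire entries after a time limit set by the DNS record.

```python
import socket
from typing import Callable, Dict


class CachingResolver:
    """Tiny model of 'check the local cache, otherwise ask an upstream resolver'."""

    def __init__(self, upstream: Callable[[str], str] = socket.gethostbyname):
        self._upstream = upstream          # who to ask on a cache miss
        self._cache: Dict[str, str] = {}   # domain name -> IP address
        self.upstream_calls = 0            # counts real lookups, for demonstration

    def lookup(self, domain: str) -> str:
        domain = domain.lower().rstrip(".")
        if domain in self._cache:          # cache hit: no upstream query needed
            return self._cache[domain]
        self.upstream_calls += 1
        address = self._upstream(domain)   # cache miss: ask the upstream resolver
        self._cache[domain] = address      # remember the answer for next time
        return address


# Stand-in for the upstream resolver so the example runs offline.
# The address is hypothetical, taken from a range reserved for documentation.
FAKE_RECORDS = {"example.com": "203.0.113.10"}


def fake_upstream(domain: str) -> str:
    return FAKE_RECORDS[domain]


resolver = CachingResolver(upstream=fake_upstream)
first = resolver.lookup("example.com")    # miss: goes upstream
second = resolver.lookup("Example.com")   # hit: served from the local cache
print(first, second, resolver.upstream_calls)
```

Creating the resolver with no arguments would fall back to socket.gethostbyname, which asks the operating system to perform a real lookup.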
An HTTP request is then sent to the server at that IP address. HTTP requests and responses have traditionally traveled over Transmission Control Protocol (TCP), although the newest version, HTTP/3, uses a different transport called QUIC. TCP is one of the most basic standards of the internet, and it provides reliable, ordered, end-to-end delivery of data. So your browser opened a TCP connection to the server hosting the requested data, using the IP address that the DNS lookup returned for the URL’s domain name. The server then processed the request and sent back a response with a “200 OK” HTTP status. The 200 code tells your browser that the request succeeded and that the requested data is included in the response.
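To show roughly what travels over that connection, the sketch below builds the text of a minimal HTTP/1.1 GET request and reads the status line of a response. The function names and the sample response are made up for illustration, and nothing is actually sent over the network here.

```python
from typing import Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",         # the domain name travels in the Host header
        "Connection: close",
        "",                      # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> Tuple[int, str]:
    """Return (status code, reason phrase) from the first line of a response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    parts = status_line.split(" ", 2)    # e.g. ["HTTP/1.1", "200", "OK"]
    reason = parts[2] if len(parts) > 2 else ""
    return int(parts[1]), reason


request = build_get_request("www.example.com", "/articles/internet.html")
print(request.decode("ascii"))

# A hand-written sample response, standing in for what a server might send back.
sample_response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(sample_response))
```

In a real program the request bytes would be written to a TCP connection, for example one opened with socket.create_connection, and the response would be read back from the same connection.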
The transfer of data is made possible by the internet, not the web. Remember, the internet is the infrastructure. Each data packet must conform to a standard known as Internet Protocol (IP). Similar to how packages are sent through the mail, each packet must carry the IP address of where it’s going (along with the address it came from) and must also stay within a size limit. In your case, the packets coming back from the server had your computer’s IP address on them as the destination. If the data being sent (a file, for example) is large, it is split into many smaller packets, which may travel along different routes with different hops and are then reassembled when they reach the client (your computer).
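The split-and-reassemble idea can be modeled in a few lines. This is a deliberately simplified sketch: the Packet type, its fields, the 8-byte size limit, and the destination address are assumptions for illustration, and real IP and TCP headers carry many more fields than shown here.

```python
import random
from typing import List, NamedTuple


class Packet(NamedTuple):
    destination: str   # address of the receiving machine
    sequence: int      # position of this piece in the original data
    payload: bytes     # the slice of data carried by this packet


def split_into_packets(data: bytes, destination: str, max_payload: int) -> List[Packet]:
    """Cut data into pieces no larger than max_payload, numbering each piece."""
    return [
        Packet(destination, seq, data[start:start + max_payload])
        for seq, start in enumerate(range(0, len(data), max_payload))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Put packets back in order by sequence number and join their payloads."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)


message = b"<html><body>Hello from the server</body></html>"
# Hypothetical client address and an artificially small size limit.
packets = split_into_packets(message, destination="198.51.100.7", max_payload=8)
random.shuffle(packets)   # packets may arrive in a different order than they were sent
print(len(packets), reassemble(packets) == message)
```

The sequence numbers are what allow the receiver to rebuild the original data even when the pieces take different routes and arrive out of order.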
More on “hops”
A request might take 2 hops to get onto the backbone of the internet, 2 hops on the backbone, and then 3 hops down to reach the server, for 7 hops in total. The data sent back might then be split into smaller packets, each taking multiple hops to get to the backbone, then multiple hops to get off the backbone and down to your local machine, where they are all reassembled into the original file.
Parsing the website
Thanks for reading! I hope this gave you some insight into how the internet and the web work.