A Racket of Packets
Recently a close friend asked me a series of questions that shook me to the core of my being. I had been living under the impression that I knew myself: who I was, what I was made of (physically, emotionally, spiritually), what was important to me, etc. Yet after three questions I felt like a stranger in this body of mine. The culprits:
- “How do packets work?”
- “How do they get where they’re going?”
- “How do you know they’re still ok when they get there?”
After a disgraceful response to the first and a silent grimace at the others as my stomach tied itself into knots, I knew solace would only come through learning everything I could, immediately. In an effort to save others from the agony of ego death, I recount my plunge.
So what’s a packet?
A packet is the protocol data unit of the internet layer, the second layer from the bottom in the Internet Protocol Suite model of computer networking. You may be thinking “Gee, Connor, that sentence means nothing to me and I hate you.” I feel you. The thing is, packets are complicated, and to do them justice we need a bit of background info about the Internet Protocol Suite.
…so what’s the IP suite?
It’s the networking model employed by the internet that characterizes how data should be packaged, addressed, routed and transmitted. The model describes the structure of a system capable of sending a message across a network as a stack of layers in which protocols reside. As a message is passed down the stack, each protocol does something to the message to help it get to where it needs to go. By the time it gets to the bottom layer the message has all the information it needs and begins its journey. Those layers, from bottom to top, are: the link layer (re: intra-network communication), internet layer (re: inter-network comm.), transport layer (re: host-to-host comm.), and application layer (re: process-to-process comm.). Each layer has its own protocol data unit (PDU), which is essentially its unit of measurement, or the thing it creates. The lowest layer, the link layer, calls its PDU a frame, which is essentially a wrapper around the internet layer’s PDU, a packet(!). The two biggest protocols on the transport layer name their PDUs differently: TCP calls it a segment while UDP and most others call it a datagram. The application layer’s PDU is just data.
As a side note, the IP suite and the Open Systems Interconnection (OSI) model have a lot in common. While the IP suite isn’t trying to be OSI compliant, they do overlap in many places, and the IP suite could easily be mistaken for a subset of OSI by an unwary novice (i.e., me). I highly suggest checking out OSI as well, and looking into the differences between the two. The main difference I’ve found is that some IP suite protocols don’t fit neatly into any one layer, and the IP suite is fine with that. OSI is more of a clean freak about its protocols. Also, OSI concerns itself with a physical layer below the link layer, which the IP suite just kind of assumes exists and doesn’t really talk about.
Now that we have some terminology to work with, let’s go through the life cycle of some data being transferred via the IP suite from one computer to another. We’ll be talking about a stack of HTTPS/TCP/IP/Ethernet, because that’s most likely the stack you’re viewing this with. Your browser wants to send some data (the HTTPS GET request for this page). TCP works out how many segments the data needs in order to fit over IPv4, breaks it up into that many chunks, and attaches a TCP header to each chunk. The header contains a bunch of info, including the ports it’s being sent to and from. The TCP segments get handed off to IP (the internet layer), which wraps them in packets that look similar to segments (a header attached to a payload), except the header contains the source and final destination IP addresses instead of ports. Those packets are handed to the link layer (Ethernet), which frames them with a header containing the next hop’s MAC address and transmits the frames to the node (router, firewall, host) with that address.
When a frame is received, Ethernet unwraps it and passes the payload up to IP, which checks whether the packet is supposed to go somewhere else or stop there. If it’s supposed to stop there (because “there” is the Medium server hosting this page), it keeps unwrapping up through the rest of the stack. If it’s supposed to go further, IP looks up the next stop in a routing table, updates the packet’s header info, and passes it back down to Ethernet to be framed with the MAC address of the “next stop”, and so on to the next node until it gets to where it’s going.
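To make the wrapping and unwrapping a bit more concrete, here’s a toy sketch in Python. Everything in it is an assumption made for illustration: the function names are mine, the addresses and ports are placeholders, and the headers are stripped down to just ports, IP addresses and MAC addresses (real TCP, IP and Ethernet headers carry many more fields).

```python
import ipaddress
import struct

def tcp_segment(src_port: int, dst_port: int, payload: bytes) -> bytes:
    # Toy transport header: source and destination ports, 2 bytes each.
    return struct.pack("!HH", src_port, dst_port) + payload

def ip_packet(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    # Toy internet header: 4-byte source and destination addresses.
    src = ipaddress.IPv4Address(src_ip).packed
    dst = ipaddress.IPv4Address(dst_ip).packed
    return src + dst + segment

def ethernet_frame(src_mac: bytes, dst_mac: bytes, packet: bytes) -> bytes:
    # Toy link header: destination MAC, then source MAC, 6 bytes each.
    return dst_mac + src_mac + packet

def unwrap(frame: bytes) -> bytes:
    # The receiving stack peels the layers off in reverse order.
    packet = frame[12:]    # strip the 12-byte toy link header
    segment = packet[8:]   # strip the 8-byte toy internet header
    return segment[4:]     # strip the 4-byte toy transport header

data = b"GET / HTTP/1.1\r\n\r\n"
segment = tcp_segment(49152, 443, data)
packet = ip_packet("192.0.2.10", "198.51.100.7", segment)
frame = ethernet_frame(bytes(6), b"\x02\x00\x00\x00\x00\x01", packet)

print(len(data), len(segment), len(packet), len(frame))
print(unwrap(frame) == data)
```

Each layer only ever looks at its own header and treats everything after it as an opaque payload, which is the whole point of the stack.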
Thusly a packet lives and dies on the internet.
How do we get to your house?
So above we’ve got packets and frames, and they have addresses that tell nodes where they should go next as well as where the last stop is. And the next-hop addresses get updated each hop. “But Connor, how does my router know your router’s IP address?” It turns out your router is really outgoing and friendly. Routers talk to each other via routing protocols like Open Shortest Path First in order to build routing tables, and they use Address Resolution Protocol (or Neighbor Discovery Protocol on IPv6) to find the MAC address that goes with a next hop’s IP address. The routing tables are used to look up the shortest path for a packet, and the next node on that path. The routers play hot potato along that path until eventually the packet gets to a node that recognizes the destination IP as belonging to its network, and sends the packet to the intended recipient.
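Here’s a minimal sketch of what a routing table lookup might look like, using Python’s standard ipaddress module. The table itself is made up (the networks and next-hop addresses are documentation placeholders), and I’m assuming the usual rule that when several entries match a destination, the most specific one (the longest prefix) wins.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop address).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "203.0.113.1"),          # default route
    (ipaddress.ip_network("198.51.100.0/24"), "192.0.2.254"),
    (ipaddress.ip_network("198.51.100.128/25"), "192.0.2.253"),
]

def next_hop(destination: str) -> str:
    addr = ipaddress.ip_address(destination)
    # Keep every route whose network contains the destination...
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # ...and pick the most specific one (the longest prefix).
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop

print(next_hop("198.51.100.200"))  # falls inside the /25
print(next_hop("198.51.100.5"))    # only the /24 and the default match
print(next_hop("8.8.8.8"))         # only the default route matches
```

The default route is why a router never has to give up on a packet: anything it doesn’t recognize gets handed to a neighbor that hopefully knows more.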
I’d say the most impressive part of that process is how routing tables are made. There are a few different approaches to calculating the shortest paths between nodes:
- link-state routing has each node build a map (a graph) of the network it’s part of. Every node floods information about its own links to every other node and collects the other nodes’ announcements, assembling all the results into a path-of-least-resistance map. The map is used to create a routing table of optimal next hops.
- optimized link-state routing has every node broadcasting its info like link-state routing, but uses topology control to find and distribute path resistance info to each node. Every node then builds its own tree graph of lowest cost paths. This is mostly interesting for wireless networks, where topology control rehashing can dynamically reduce energy expenditure and interference, but is also relevant to prolonged traffic spikes in any kind of network.
- distance vector routing starts off dumb, with each node assigning a ‘cost’ to each direct link it has to another node and using that as its routing table. Then the nodes begin broadcasting their routing tables to their neighbors, who incorporate them into their own costs to reach second-step nodes. Rinse and repeat until all nodes in a network know the aggregate cost to each other node. Distance vector routing protocols are typically used within a single network rather than between networks.
- path vector routing is similar to distance vector routing except it’s for inter-networking. It makes the assumption that one node labeled a gateway in a network can serve as an entry point to/for the rest of the nodes in the network. The gateway nodes go about creating a routing table with other networks’ gateway nodes using something similar to distance vector routing. Path vector routing is a big part of why your home router doesn’t need to know many other IP addresses.
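Distance vector routing is the easiest of these to sketch in code. What follows is a simplified simulation under my own assumptions, not how any real routing protocol is implemented: the function name, the four-node network and its link costs are all invented, links are treated as two-way with positive costs, and the nodes exchange tables in tidy synchronized rounds.

```python
INF = float("inf")

def distance_vector(links):
    """links maps an (a, b) pair of node names to the cost of their direct link."""
    nodes = sorted({name for pair in links for name in pair})
    neighbors = {n: {} for n in nodes}
    for (a, b), cost in links.items():
        neighbors[a][b] = cost
        neighbors[b][a] = cost

    # Each table maps destination -> (total cost, next hop).
    # A node starts out knowing only itself and its direct links.
    tables = {n: {n: (0, n)} for n in nodes}
    for n in nodes:
        for nb, cost in neighbors[n].items():
            tables[n][nb] = (cost, nb)

    changed = True
    while changed:
        changed = False
        # Snapshot so every node hears its neighbors' tables from the last round.
        heard = {n: dict(table) for n, table in tables.items()}
        for n in nodes:
            for nb, link_cost in neighbors[n].items():
                for dest, (cost, _) in heard[nb].items():
                    candidate = link_cost + cost
                    if candidate < tables[n].get(dest, (INF, None))[0]:
                        tables[n][dest] = (candidate, nb)
                        changed = True
    return tables

# Hypothetical network: A-B costs 1, B-C costs 2, A-C costs 5, C-D costs 1.
tables = distance_vector({("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5, ("C", "D"): 1})
print(tables["A"])
```

Note that node A ends up routing to C through B (total cost 3) even though it has a direct link to C (cost 5), and it learns about D without ever talking to D directly.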
To summarize, a huge concern of the internet layer is building and maintaining these routing tables so your packets of kittens-in-mittens.png can not only make it from your laptop to Instagram’s servers, but do so as quickly as possible.
Kitty Incomplete: Resend Pawcket
Now that we’ve got a better understanding of routing, it’s time to dive deeper into packet composition to answer the last question about data integrity. Recall that a frame is a wrapper around a packet which is a wrapper around a segment/datagram, and each protocol has its own set of headers it’s adding. Most protocols include a header for performing some sort of data integrity check.
At the link layer our wrapper includes a checksum generated by a cyclic redundancy check (CRC). The maths behind CRCs are more complicated than I can do justice to here, but they boil down to converting a data block into a polynomial and doing some long division to get a remainder, which is your checksum. A given link layer protocol will be using a particular CRC (they come in different sizes!) at both ends, so doing the same math on the block when you receive it should tell you if there’s been any data corruption (from interference, typically).
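For the curious, here’s a small sketch of that long division done bit by bit. It computes CRC-32, the variant built on the same polynomial Ethernet uses for its frame check, and compares the result against Python’s built-in zlib.crc32. The function name is mine, and this skips wire-level details like the order bits are transmitted in, so treat it as an illustration of the math rather than an Ethernet implementation.

```python
import zlib

def crc32_bitwise(data: bytes) -> int:
    # 0xEDB88320 is the CRC-32 polynomial in bit-reversed form.
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # One step of polynomial long division over bits:
            # "subtract" (XOR) the polynomial whenever the low bit is set.
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF

block = b"123456789"
print(hex(crc32_bitwise(block)))               # 0xcbf43926
print(crc32_bitwise(block) == zlib.crc32(block))

# Flip a single bit, as interference might, and the checksum no longer matches.
corrupted = bytes([block[0] ^ 0x01]) + block[1:]
print(crc32_bitwise(corrupted) == crc32_bitwise(block))
```

In practice you’d just call zlib.crc32 (or let the network hardware do it), but the loop shows there’s no secret ingredient: anyone with the data can compute the checksum.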
CRCs don’t protect at all against malicious behavior, since a block that gets modified en route has the checksum attached to it, and it’s no secret how to do the math to generate a new checksum. If some jerk wanted to modify your block and then put a new checksum on it, they could. When the receiver gets the modified block, it will see the new checksum attached (which is correct for the new block) and have no idea the block has been tampered with. But that’s ok, because we’ve got encryption higher up in the stack (which we’ll touch on later).
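A quick sketch of that weakness, using zlib.crc32 as a stand-in for whatever CRC a link layer protocol actually uses. The block layout (payload followed by a 4-byte checksum), the helper names and the messages are all assumptions for the sake of the example.

```python
import zlib

def make_block(payload: bytes) -> bytes:
    # Append a 4-byte CRC to the payload, roughly like a frame check sequence.
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def looks_intact(block: bytes) -> bool:
    # The receiver redoes the math and compares it with the attached checksum.
    payload, checksum = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == checksum

original = make_block(b"pay alice $10")

# Accidental corruption: the payload changes but the old checksum stays.
corrupted = b"pay alice $90" + original[-4:]

# Deliberate tampering: the attacker simply recomputes the checksum.
forged = make_block(b"pay mallory $10")

print(looks_intact(original))   # True
print(looks_intact(corrupted))  # False
print(looks_intact(forged))     # True
```

The receiver can’t tell the forged block from a legitimate one, because nothing in the check depends on a secret.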
CRCs involve some awesome maths, and I highly suggest learning more about them and exploring their various uses if they sound nifty to you.
Internet and Transport Layers
Inside the link layer frame sits our IPv4/6 packet, which has its own headers. IPv4 headers contain a header checksum, but the newer IPv6 has done away with it and leaves the bit-checking for the layers below and above it to handle. IPv4 still accounts for most internet traffic though, and TCP segments (i.e. the next layer) use the same type of checksum, so it’s worth looking at.
TCP and IPv4’s checksums are different from the link layer CRC checksum in a couple of ways. First, IPv4’s checksum only checks the integrity of its own header, not the payload (TCP’s covers its header and payload, plus a few fields borrowed from the IP header). Second, the method of generating the checksum is a much simpler algorithm. The data gets split into 16-bit numbers and added using ones’ complement arithmetic. The ones’ complement of that sum is the checksum that gets attached to the packet. Similarly to CRCs, these checksums are about guarding against data corruption in transit, and do nothing to protect against intentional malicious behavior. But unlike before in the link layer, at the internet layer we do have some cryptographic options in IPsec.
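Since the algorithm really is that simple, here’s a short sketch of it. The function name is mine, and the sample header is just an example I put together: a 20-byte IPv4 header for a UDP packet from 192.168.0.1 to 192.168.0.199, with the checksum field zeroed out before computing.

```python
def internet_checksum(data: bytes) -> int:
    # Pad to an even number of bytes so the data splits into 16-bit words.
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Ones' complement addition: fold any carry back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    # The checksum is the ones' complement of the folded sum.
    return ~total & 0xFFFF

# Sample IPv4 header with the checksum field (bytes 10 and 11) set to zero.
header = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
checksum = internet_checksum(header)
print(hex(checksum))  # 0xb861

# The receiver runs the same sum over the header with the checksum filled in.
# If nothing was corrupted, the result comes out to zero.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(internet_checksum(filled))  # 0
```

That zero-on-arrival trick is what makes the check cheap for routers: they don’t need to pull the checksum out and compare, they just sum the whole header.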
Internet Protocol Security
IPsec is a suite of protocols in the internet layer that modify or wrap IP packets before they get passed off to the link layer. There are two modes of use: transport, where only the payload gets modified (authenticated and/or encrypted); and tunnel, where the entire packet gets wrapped in a new, encrypted/authenticated IPsec packet (which somewhat resembles and is treated like an IP packet).
“Is tunnel mode related to virtual private networking?” Yep, that’s how you can stream Netflix in China: by tunneling your traffic to a border gateway in Japan so their servers will respond to you. Anyone sniffing the IPsec packet en route to the border gateway won’t be able to see the IP packet inside or that packet’s intended final destination. And once it’s unwrapped, it acts just like a regular packet and continues on to Netflix.
IPsec is a vast topic of its own, so I’m glossing over a bunch of stuff like the cryptographic hash functions it uses and the gritty details about the various protocol implementations. Like CRCs, if this piques your interest I highly suggest looking deeper into IPsec.
Last stop on the encryption train
After the internet and transport layers there’s just the application layer left. In it we’ve got a handful of options for encryption, but in terms of web traffic, Transport Layer Security and its predecessor Secure Sockets Layer (and their interaction with HTTP) are the stack we’re looking at. SSL has some differences, but we’re going to ignore them and focus on TLS in the interest of brevity (Ha! Too late for that).
Despite the name, TLS is in the application layer. It establishes a trusted connection through a handshake protocol. After establishing the connection, TLS encrypts HTTP requests with a symmetric cipher (hash functions come in for checking integrity, not for the encryption itself) and then sends them down the stack. TLS records are payloads wrapped with a handful of TLS headers (content type, protocol version, and record length). The headers are used by the receiving host to determine the purpose of the record (whether it’s part of a handshake, some type of alert, or just an encrypted payload). If it’s wrapped data, it gets decrypted with the cipher and secret keys agreed on during the handshake before being passed up to HTTP. The ciphers and cryptographic hash functions are another massive topic on their own, so we’ll sidestep them along with some implementation details of TLS that we’ve missed, but hopefully the gist of how it works is clear. Cryptography is hugely important (and awesome) though; I’m just utterly unqualified to write about it.
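The record headers are at least something I can show without embarrassing myself. A TLS record starts with five bytes: one for the content type, two for the protocol version, and two for the length of what follows. Here’s a sketch that reads them. The function name and the sample bytes are made up, and it doesn’t attempt any of the actual handshake or decryption.

```python
import struct

# Content type values from the TLS record layer.
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}

def parse_record_header(record: bytes):
    # 1 byte content type, 2 bytes version (major, minor), 2 bytes length.
    content_type, major, minor, length = struct.unpack("!BBBH", record[:5])
    kind = CONTENT_TYPES.get(content_type, "unknown")
    return kind, (major, minor), length

# Made-up record: a handshake record, version bytes 3.3, 4-byte body.
sample = bytes([22, 3, 3, 0, 4]) + b"\x01\x02\x03\x04"
kind, version, length = parse_record_header(sample)
print(kind, version, length)
print(sample[5:5 + length])
```

The version bytes are a historical quirk: TLS 1.2 identifies itself as 3.3 on the wire, a leftover from counting up from SSL 3.0.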
There we have it
Gone is the pain in my gut. With a better understanding of the fabric of my world I sleep better at night. I hope the tale of my tribulation and deliverance was helpful. Of course if you happen upon a more succinct (or contradictory) accounting of packets, I would greatly appreciate a comment from you. Or if you feel the urge to elaborate on any of the areas covered in haste above I’d also enjoy that tremendously.
Until doubt shakes me silly again, cheers!