Networking in a Hurry 💨

Omar Hussein
The Modern Scientist
28 min read · Jan 16, 2024

You don’t have to be a DevOps engineer or even a network engineer to benefit from understanding networking. As a machine learning engineer with limited networking experience, I’ve relied on a handful of fundamental concepts to navigate this domain. Gaining a fuller understanding of the purposes and mechanisms behind networking has proven invaluable: it helps with immediate problem-solving and makes it faster to pick up new knowledge on the job. The material here is designed to give you a solid understanding for most use cases. It is by no means all of networking, just the parts you need to understand.

We’ll begin with a relatable story to set the stage, followed by a detailed exploration of its components. If you are not ready for the story yet, go through the Basics section first. Be patient: it is okay not to understand all the lingo in one go. Ready to dive in? Let’s get started, since your time is valuable.

DISCLAIMER: In this post, my aim is to give you a clear and concise breakdown of networking concepts in a practical way. That is why I explain the TCP/IP model: it reflects how real computer networks are implemented, which makes it more relevant to learn than OSI’s theoretical vision.

The OSI model is a theoretical framework that describes a complete range of functions required for network communications.

Basics

The client-server model is all about requests and responses between two components — a client and a server.

A client is an application or computer that makes requests to access resources or services. This could be a web browser, mobile app, etc.

A server is a computer that receives these requests and provides the requested resource or performs a task to fulfill the client’s request. Servers are usually more powerful computers.

For example, when you type a website URL in your web browser (client):

  1. The browser sends an HTTP request message to the web server asking it to send the website page.
  2. The server receives the request, processes it, and finds or generates the page data.
  3. The server then sends an HTTP response back to the browser with the requested website page content (usually HTML, CSS, JS files).
  4. The browser displays the pages, images, etc. on your screen.
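The four steps above can be tried on a single machine. The following is a minimal sketch, not a production setup: it starts a throwaway web server on the loopback address and then plays the client once. The names PageHandler and fetch_from_local_server are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PageHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello from the server</body></html>"
        self.send_response(200)                       # status line: 200 OK
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                        # the page content

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_from_local_server():
    """Start a local server, act as the client once, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        client.request("GET", "/")                    # step 1: the request
        response = client.getresponse()               # step 3: the response
        status, body = response.status, response.read().decode()
        client.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_from_local_server())
```

A real browser would then parse the returned HTML and render it (step 4); here the body is simply returned as text.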

IP

Here is a simplified explanation of IP addresses:

IP addresses are like home addresses for devices on the internet. They uniquely identify each device so that data can be sent to the correct destination.

When you type a website name like etherealai.io or apple.com into your browser, a DNS service looks up the website’s IP address. This allows your device to make a request to that address to load the site.

The servers behind apple.com and etherealai.io can then send the website data back to your device’s IP address in response. This allows two-way communication over the internet between your device and the website’s servers.

So IP addresses act like postal addresses that ensure internet data reaches the intended recipients. DNS services help map domain names we can remember to numerical IP addresses that devices use to route traffic. Together, they allow seamless data transfer to the right destinations.

Requests

A request is like asking someone a question. It is how a client program initiates communication and demands resources from a server.

For example, when you type youtube.com in the browser (the client program) to visit the YouTube website:

  1. The browser constructs an HTTP GET request message for the YouTube web server, asking: “please give me the homepage for youtube.com”.
  2. The request travels in IP packets that carry the address of your device as the source, so YouTube knows where to send the response.
  3. The YouTube web server receives this request.

So in essence, a web request is a message from a client asking a web server to provide some resource like a webpage, image, file or data. This initiation of communication by the client is termed a request.

Here is a simple explanation of some common HTTP requests:

GET — Fetches data from a server. For example, getting a web page, image or other file from a website:

“Hey server, can you GET me the homepage file?”

POST — Sends user-generated data to the server. For example, submitting a form on a website:

“Hey server, here’s some form data I want you to POST to your database.”

PUT — Uploads data to update something on the server. For example, uploading a file or updating a database entry:

“Hey server, please PUT this new file in place of the old one.”

DELETE — Deletes data from the server. For example, deleting a file:

“Hey server, please DELETE that document you have stored.”

So in short:

  • GET fetches data
  • POST submits user data
  • PUT updates data
  • DELETE erases data

These requests allow the client and server to perform basic CRUD (create, read, update, delete) operations on data over the web.
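To make the four methods less abstract, here is a rough sketch of what such a request looks like as text on the wire, plus the usual method-to-CRUD mapping. The helper build_request and the CRUD dictionary are illustrative names, and the request is deliberately minimal (real browsers send many more headers).

```python
def build_request(method, path, host, body=""):
    """Return the raw text of a minimal HTTP/1.1 request."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body.encode())}")
    # A blank line separates the headers from the (optional) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


# The conventional mapping between HTTP methods and CRUD operations.
CRUD = {"POST": "create", "GET": "read", "PUT": "update", "DELETE": "delete"}

if __name__ == "__main__":
    for method, operation in CRUD.items():
        print(f"--- {method} ({operation}) ---")
        print(build_request(method, "/notes/1", "example.com"))
```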

HTTP

HTTP (Hypertext Transfer Protocol) is the set of communication rules between the client and the server. When Alice types a website URL into her browser (the client):

  • The browser uses HTTP to send a GET request message to the server asking for the website content
  • The server understands this HTTP request, fetches the website content, and constructs an HTTP response with the content
  • The server then transmits the HTTP response back to Alice’s browser with the website files
  • Alice’s browser receives this response and displays the website to Alice using the HTML, image files etc. included in it.

Now repeat this 7 times. I mean it. Do it.

HTTP/HTTPS sits in the application layer.

TCP sits in the transport layer.

End of Basics

Preface

TCP/IP Model Overview

Before we dive into the story of Alice and Bob, let’s look at the layers of the TCP/IP model that will be referenced. Don’t worry too much about all of them; we will mainly focus on a few. The important thing is that we are using HTTP from the application layer, TCP from the transport layer, and IP from the internet layer.

Each layer of the TCP/IP model has specific protocols and types of data units it handles, which play a crucial role in how data is transmitted across a network. With this model in mind, we will follow the journey of data as it travels from Alice to Bob, encapsulated and transformed at each step of the process. We will walk through this top-down: from the application layer (HTTP), through the transport layer (where a lot of the action happens, starting with TCP), and so on.

The Story of Alice and Bob in the World of Networking

Client-Server Model: A model of interaction in a network where a client (Alice) requests resources or services, and a server (Bob) provides them.

Beginning the Journey: Alice’s Request

  1. Alice’s Intent: Alice wants to access a webpage hosted on Bob’s server.
  2. Private IP and MAC Address: Alice’s device has a private IP address (e.g., 192.168.1.101) in her LAN and a unique MAC address (e.g., A1:B2:C3:D4:E5:F6).
  3. DNS Query: Alice’s device checks her DNS cache for Bob’s server IP. If it’s not there, a DNS query is sent to resolve Bob’s domain name to a public IP address.
  4. DNS Cache Update: Once resolved, the DNS information is stored in Alice’s cache for future use.

Networking Layers in Action

  1. Application Layer (HTTP Request): Alice’s browser creates an HTTP GET request.
  2. Transport Layer (TCP Segmentation): This request is segmented by TCP. Each segment is labelled with sequence numbers for ordered reassembly and includes control information for reliable delivery.
  3. Network Layer (IP Datagram): Each TCP segment is then encapsulated in an IP datagram. This datagram includes Alice’s private IP as the source and Bob’s public IP as the destination.
  4. Data Link Layer (Ethernet Frame): The IP datagram is further encapsulated into an Ethernet frame, which includes Alice’s MAC address and her router’s MAC address.
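The layering above is essentially nesting: each layer wraps the data handed down from the layer above and adds its own addressing. Here is a small sketch of that idea using plain data classes. Bob’s IP (203.0.113.10), the router’s MAC address, and the port numbers are placeholders I made up; only Alice’s IP and MAC come from the story.

```python
from dataclasses import dataclass


@dataclass
class TcpSegment:          # transport layer
    src_port: int
    dst_port: int
    seq: int
    payload: bytes         # the HTTP request bytes


@dataclass
class IpDatagram:          # network layer
    src_ip: str
    dst_ip: str
    payload: TcpSegment


@dataclass
class EthernetFrame:       # data link layer
    src_mac: str
    dst_mac: str
    payload: IpDatagram


def encapsulate(http_request: bytes) -> EthernetFrame:
    """Wrap an HTTP request in TCP, then IP, then Ethernet."""
    segment = TcpSegment(src_port=50000, dst_port=80, seq=1, payload=http_request)
    datagram = IpDatagram("192.168.1.101", "203.0.113.10", segment)
    # The frame is addressed to the router's MAC, not to Bob's server.
    return EthernetFrame("A1:B2:C3:D4:E5:F6", "AA:BB:CC:DD:EE:FF", datagram)


if __name__ == "__main__":
    print(encapsulate(b"GET / HTTP/1.1\r\nHost: bob.example\r\n\r\n"))
```

Real stacks serialize each layer into bytes with a header in front of the payload; nested objects just make the wrapping easier to see.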

On the Road: NAT and Routing

  1. NAT at Alice’s Router: Alice’s router uses Network Address Translation (NAT) to replace Alice’s private IP with its own public IP in the datagrams.
  2. Routing Through the Internet: These datagrams traverse the internet, hopping through routers using protocols like BGP. Here, stateless decisions are made solely based on destination IP, ignoring what the packet contains (payload).

Entering Bob’s Domain: VPC and Security

  1. Bob’s VPC: Bob’s server is in an Amazon VPC, an isolated network within AWS’s cloud infrastructure.
  2. Security Groups: The VPC uses security groups, acting as stateful firewalls, to manage incoming traffic. These groups allow HTTP requests from any public IP, including Alice’s.
  3. NAT in AWS: Upon reaching AWS, the public destination IP of Alice’s request is translated to the private IP of Bob’s server (in a VPC, the internet gateway performs this one-to-one NAT), allowing communication with Bob’s server.

Bob’s Server Processing

  1. Data Reception: Bob’s server receives the request, where the Ethernet frame is stripped to reveal the IP datagram, which is then processed to retrieve the TCP segment.
  2. TCP Reassembly: The server’s TCP stack reassembles segments using sequence numbers, ensuring the data is in the correct order and complete.
  3. HTTP Request Processing: The HTTP request reaches the application layer on Bob’s server, where the requested webpage is prepared.

Response Journey Back to Alice

  1. Response Creation: Bob’s server creates an HTTP response.
  2. Encapsulation Again: The response goes through the same encapsulation on its way out: TCP segments, IP datagrams, and finally Ethernet frames.
  3. Return Route: These frames travel back through AWS’s infrastructure, public internet, and Alice’s router.

Final Steps: Completing the Cycle

  1. NAT at Alice’s Router: Alice’s router translates the public IP back to Alice’s private IP and forwards the Ethernet frame to Alice’s device.
  2. Response Delivery: Alice’s device receives the frame, extracts the IP datagram and TCP segment, and finally, the HTTP response is rendered in her web browser.

Networking Concepts in the Alice and Bob Story

1. Private IP and Public IP

  • Private IP: Alice’s device has a private IP (192.168.1.101) used within her local network. It’s not routable on the internet.
  • Public IP: The router translates Alice’s private IP into a public IP to communicate over the internet.
  • Alternative: In some setups, VPNs could be used for secure communication over public networks while maintaining private IP addresses.
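If you ever need to tell the two kinds of address apart in code, Python’s standard ipaddress module already knows the reserved private ranges. A quick sketch (the function name describe is my own):

```python
import ipaddress


def describe(address: str) -> str:
    """Classify an IP address as 'private' or 'public'."""
    ip = ipaddress.ip_address(address)
    return "private" if ip.is_private else "public"


if __name__ == "__main__":
    for address in ("192.168.1.101", "93.184.216.34"):
        print(address, "is", describe(address))
```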

2. DNS Query and Cache

  • DNS Query: Alice’s device sends a DNS query to resolve Bob’s domain name to an IP address.
  • DNS Cache: The resolved IP address is stored locally to speed up future access.
  • Alternative: Sometimes, local hosts files might be used for small-scale, manual DNS-like functionality.
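The query-then-cache behaviour can be sketched in a few lines. This is a simplified illustration, not how an operating system actually implements its resolver cache: the DnsCache class and the fixed 60-second lifetime are assumptions (real caches honour the TTL returned with each DNS record).

```python
import time


class DnsCache:
    """Remember resolved names for a while to avoid repeated lookups."""

    def __init__(self, resolver, ttl_seconds=60.0, clock=time.monotonic):
        self._resolver = resolver      # function: hostname -> IP string
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # hostname -> (ip, expiry_time)

    def lookup(self, hostname):
        entry = self._entries.get(hostname)
        if entry and entry[1] > self._clock():
            return entry[0]                          # cache hit
        ip = self._resolver(hostname)                # cache miss: real query
        self._entries[hostname] = (ip, self._clock() + self._ttl)
        return ip
```

In practice you could pass socket.gethostbyname as the resolver, for example DnsCache(socket.gethostbyname).lookup("example.com").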

3. TCP Segmentation and Reassembly

  • Segmentation: TCP breaks down data into manageable segments, each with a sequence number for orderly reassembly.
  • Reassembly: At the destination, TCP reassembles these segments in the correct order.
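Here is a toy version of segmentation and reassembly. It assumes the sequence number is simply the byte offset of each chunk, which mirrors how TCP numbers bytes but leaves out everything else (acknowledgments, retransmission, random initial sequence numbers).

```python
import random


def segment(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs; seq is the byte offset."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def reassemble(segments):
    """Put the chunks back in order using their sequence numbers."""
    return b"".join(chunk for _, chunk in sorted(segments))


if __name__ == "__main__":
    message = b"GET /index.html HTTP/1.1"
    pieces = segment(message, 5)
    random.shuffle(pieces)             # simulate out-of-order arrival
    print(reassemble(pieces) == message)
```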

4. IP Datagram and Routing

  • IP Datagram: Each TCP segment is encapsulated in an IP datagram, containing source and destination IP addresses.
  • Routing: Routers direct the datagram through the network based on the destination IP.
  • Alternative: MPLS (Multiprotocol Label Switching) can be used in some networks for more efficient data routing.

5. Ethernet Frame and MAC Address

  • Ethernet Frame: The datagram is further encapsulated in an Ethernet frame, containing MAC addresses of source and destination hardware.
  • MAC Address: Unique to each device, used for local network communication.

6. NAT (Network Address Translation)

  • NAT: Translates Alice’s private IP to a public IP for internet communication, and vice versa for incoming traffic. (Why? Cloudflare has an explainer on this.)
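A home router typically implements this with a translation table keyed on ports (often called port address translation). Below is a rough sketch of that table; the class NatRouter, the starting port 40000, and the example addresses are assumptions for illustration only.

```python
class NatRouter:
    """Toy NAT: maps (private IP, private port) to a port on the public IP."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self._next_port = 40000
        self._outbound = {}   # (private_ip, private_port) -> public_port
        self._inbound = {}    # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, public_port):
        """Find the private destination for a reply; None means no mapping."""
        return self._inbound.get(public_port)


if __name__ == "__main__":
    router = NatRouter("203.0.113.5")
    print(router.translate_outbound("192.168.1.101", 51000))
    print(router.translate_inbound(40000))
```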

7. VPC and Security Groups

  • VPC: Virtual Private Cloud, providing an isolated network within the cloud.
  • Security Groups: Act as firewalls to control inbound and outbound traffic.

8. Handshakes and Connection Establishment (Detailed Story)

  • Initiation: Alice’s browser initiates a connection to Bob’s server by sending a TCP segment with a SYN (synchronize) flag set, indicating the start of a new connection.
  • Acknowledgment: Bob’s server acknowledges this request by sending back a TCP segment with SYN-ACK (synchronize-acknowledge) flags set.
  • Final Agreement: Alice’s device responds with an ACK (acknowledge) flag set, completing the handshake and establishing the connection.
  • Purpose: This handshake ensures both parties are ready for communication and agree on parameters like sequence numbers.

Basically translates to :

you good ?

I am good, you good ?

yeah, I am good.
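If you want to see the numbers behind that little conversation, here is a sketch of the three segments and how the sequence and acknowledgment numbers relate. The initial sequence numbers are passed in as arguments purely for illustration; real TCP stacks pick them themselves.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as (sender, flags, seq, ack)."""
    # 1. Client proposes its initial sequence number.
    syn = ("client", "SYN", client_isn, None)
    # 2. Server answers with its own number and acknowledges the client's + 1.
    syn_ack = ("server", "SYN-ACK", server_isn, client_isn + 1)
    # 3. Client acknowledges the server's number + 1; the connection is open.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for step in three_way_handshake(client_isn=100, server_isn=300):
        print(step)
```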

9. Congestion Control

  • TCP Congestion Control: Adjusts the rate of data transmission based on network capacity, preventing congestion.
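One classic way to "adjust the rate" is additive increase, multiplicative decrease (AIMD): grow the sending window a little every round, and cut it sharply when loss signals congestion. The sketch below simulates only that idea; real TCP implementations add slow start, timeouts, and more, and the function name and parameters here are my own.

```python
def aimd(rounds, loss_rounds, increase=1.0, decrease=0.5, start=1.0):
    """Simulate a congestion window over several round trips.

    loss_rounds is the set of round indexes in which a loss is detected.
    """
    window = start
    history = []
    for round_index in range(rounds):
        if round_index in loss_rounds:
            window = max(1.0, window * decrease)   # back off on congestion
        else:
            window += increase                     # probe for more capacity
        history.append(window)
    return history


if __name__ == "__main__":
    print(aimd(rounds=5, loss_rounds={3}))
```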

10. Firewalls and Protocols

  • Firewalls: Software or hardware-based systems that enforce security rules.
  • Protocols: PPP (Point-to-Point Protocol) is an alternative for direct connections between two nodes, often used in WANs.

Optional

Here are the networking types in brief:

  • PAN: Connects small personal devices within a short range
  • LAN: Connects computers within a limited area like a building or office
  • CAN: Interconnects LANs into a campus-wide network
  • MAN: Covers a city area, connecting nearby buildings
  • WAN: Spans a large geographical region, such as countries

The types vary by coverage area from smallest to largest:

PAN < LAN < CAN < MAN < WAN

The key difference is the geographic span: from a few feet between individual devices in a PAN to thousands of kilometers connecting nations across a WAN.

11. Stateful vs. Stateless

  • Stateful (Security Groups in AWS): These remember the state of previous packets, allowing or denying new packets based on established connections.
  • Stateless (IP Routing): Makes decisions based purely on the current packet, without memory of past interactions.
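The difference is easiest to see side by side. In this sketch, packets are plain dictionaries and both filters are heavily simplified; stateless_allow and StatefulFirewall are illustrative names, not an AWS API.

```python
def stateless_allow(packet, allowed_ports):
    """Stateless: decide from this packet alone."""
    return packet["dst_port"] in allowed_ports


class StatefulFirewall:
    """Stateful: remember outgoing flows and only admit matching replies."""

    def __init__(self):
        self._connections = set()

    def outbound(self, packet):
        flow = (packet["src"], packet["src_port"], packet["dst"], packet["dst_port"])
        self._connections.add(flow)        # remember the conversation
        return True

    def inbound(self, packet):
        # A reply is the remembered flow with source and destination swapped.
        reverse = (packet["dst"], packet["dst_port"], packet["src"], packet["src_port"])
        return reverse in self._connections


if __name__ == "__main__":
    request = {"src": "192.168.1.101", "src_port": 51000,
               "dst": "93.184.216.34", "dst_port": 80}
    reply = {"src": "93.184.216.34", "src_port": 80,
             "dst": "192.168.1.101", "dst_port": 51000}
    firewall = StatefulFirewall()
    print(firewall.inbound(reply))     # no state yet
    firewall.outbound(request)
    print(firewall.inbound(reply))     # now recognised as a reply
```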

12. Fragments and Bytes

  • Fragments: Large IP datagrams can be broken into smaller fragments to fit the MTU (Maximum Transmission Unit) of the network.
  • Bytes: Each fragment and packet consists of bytes, the basic units of digital data.
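As a rough worked example of fragmentation, the sketch below splits a payload so that each piece plus a 20-byte IPv4 header fits the MTU. It assumes a header without options and records offsets in 8-byte units, as IPv4 does; the dictionary layout is just for illustration.

```python
def fragment(payload: bytes, mtu: int, header_len: int = 20):
    """Split an IP payload into fragments that fit within the MTU."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        fragments.append({
            "offset": start // 8,                          # in 8-byte units
            "more_fragments": start + max_data < len(payload),
            "data": chunk,
        })
    return fragments


if __name__ == "__main__":
    for piece in fragment(b"x" * 4000, mtu=1500):
        print(piece["offset"], piece["more_fragments"], len(piece["data"]))
```

With a 1500-byte MTU, a 4000-byte payload ends up as pieces of 1480, 1480, and 1040 bytes.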

13. Header Information

  • Metadata and Headers: Both TCP and IP headers contain critical information (like source/destination IP, port numbers, sequence numbers) essential for routing and reassembly of data.
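Headers are just fixed-layout bytes in front of the data. As a sketch, here is the basic 20-byte TCP header packed and unpacked with Python’s struct module. It assumes no TCP options and leaves the checksum at zero instead of computing it, so the result is for reading, not for sending on a real network.

```python
import struct

# Source port, destination port, sequence number, acknowledgment number,
# data offset + flags, window, checksum, urgent pointer (network byte order).
TCP_HEADER = struct.Struct("!HHIIHHHH")
FLAG_SYN, FLAG_ACK = 0x02, 0x10


def pack_tcp_header(src_port, dst_port, seq, ack, flags, window=65535):
    offset_and_flags = (5 << 12) | flags   # data offset 5 words = 20 bytes
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, 0, 0)


def unpack_tcp_header(data):
    src, dst, seq, ack, off_flags, window, _checksum, _urgent = \
        TCP_HEADER.unpack(data[:TCP_HEADER.size])
    return {"src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
            "syn": bool(off_flags & FLAG_SYN),
            "ack_flag": bool(off_flags & FLAG_ACK),
            "window": window}


if __name__ == "__main__":
    header = pack_tcp_header(50000, 80, seq=100, ack=0, flags=FLAG_SYN)
    print(len(header), unpack_tcp_header(header))
```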

14. Sequence Numbers

  • Role in TCP: Used to order segments correctly and ensure data integrity. Also crucial in congestion control and for recovery from packet loss.

15. Ethernet Frames

  • PPP Alternative: While Ethernet is dominant in LANs, PPP is used for point-to-point direct connections in WANs. PPP frames differ from Ethernet frames, particularly in how they encapsulate data.

16. Firewalls

  • Network Security: Firewalls act as barriers between networks, inspecting incoming and outgoing traffic based on predefined security rules.

17. Congestion Control

  • TCP’s Role: TCP dynamically adjusts the rate of data transmission to prevent congestion on the network, ensuring reliable and efficient communication.

18. Handshakes — A Closer Look

  • SYN, SYN-ACK, ACK Process: This is the cornerstone of establishing a reliable connection in TCP. It’s a mutual agreement on connection parameters and a check for the readiness of both parties to communicate.
  • UDP’s Lack of Handshake: UDP, being connectionless and faster, skips this process, which is why it’s preferred for real-time applications where speed is more critical than reliability, like live streaming.

Now, this is actually very good: you have gotten this far. For the most part, you are now more than functional. Want to learn more?

The initial story between Alice and Bob was based on TCP (Transmission Control Protocol). Let’s first explore the key differences between TCP and UDP (User Datagram Protocol), and then reimagine the story using UDP.

TCP vs. UDP

TCP (Transmission Control Protocol)

  • Reliability: Ensures data delivery with acknowledgments and retransmissions if necessary.
  • Ordered Delivery: Maintains the order of data packets.
  • Error Checking: Performs extensive error checking and recovery.
  • Congestion Control: Adjusts transmission rate based on network capacity.
  • Handshake Protocol: Employs a three-way handshake (SYN, SYN-ACK, ACK) for establishing connections.
  • Use Cases: Ideal for applications where reliability is crucial, like web browsing, email, file transfers.

UDP (User Datagram Protocol)

  • Speed and Simplicity: Less overhead than TCP, faster transmission.
  • No Guarantee of Delivery: No acknowledgments or retransmissions.
  • No Ordered Delivery: Packets may arrive out of order.
  • No Congestion Control: Does not adjust for network capacity; can overwhelm network with rapid transmission.
  • Stateless: No connection establishment or maintenance.
  • Use Cases: Suited for real-time applications where speed is prioritized over reliability, such as live video streaming, online gaming, VoIP.

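Comparison of TCP and UDP

Understanding the differences between TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) is crucial for our story. Here’s a quick comparison:

| Feature | TCP | UDP |
| --- | --- | --- |
| Connection | Three-way handshake (SYN, SYN-ACK, ACK) | None; stateless |
| Delivery | Acknowledgments and retransmissions | No guarantee of delivery |
| Ordering | Maintains the order of data packets | Packets may arrive out of order |
| Congestion control | Adjusts rate to network capacity | None |
| Overhead | Higher | Lower; faster transmission |
| Use cases | Web browsing, email, file transfers | Live video streaming, online gaming, VoIP |

This table sets the stage for the nuances that will come into play in Alice and Bob’s digital communication adventure.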

The Story Reimagined with UDP

In this version, Alice is streaming a live video from Bob’s server, a scenario where UDP’s speed and real-time data transfer are more beneficial than the reliability offered by TCP.

Alice’s Request

  1. Streaming Intent: Alice wants to watch a live video hosted on Bob’s server.
  2. Private IP and MAC Address: Alice’s device still has a private IP and a unique MAC address.

UDP in Action

  1. DNS Resolution: Bob’s domain name is resolved (or read from the cache) exactly as before; name resolution does not depend on which transport protocol the stream uses.
  2. Datagram Creation: Alice’s device sends a UDP datagram to request the stream. Unlike TCP, UDP adds no sequence numbers and does not split a byte stream into segments.
  3. No Handshake: There’s no SYN, SYN-ACK, ACK handshake. The request is sent directly to Bob’s server.

Data Transmission

  1. UDP Datagram and Routing: The UDP datagram, encapsulated in an IP packet and then an Ethernet frame, is sent through Alice’s router, which performs NAT (replacing her private IP with its public IP) and sends it over the internet.
  2. Speed over Reliability: The datagrams travel to Bob’s server without the need for acknowledgments or retransmissions. This means some packets might be lost or arrive out of order, but the stream continues uninterrupted, favoring real-time viewing over completeness.
  3. No Congestion Control: UDP doesn’t adjust its sending rate based on network conditions. If the network is congested, packet loss may increase, affecting the quality of the stream.

Bob’s Server Processing

  1. Handling Datagram: Bob’s server receives the UDP datagrams. Since UDP doesn’t guarantee order, the server is designed to handle packets arriving out of sequence.
  2. Stream Delivery: The server starts streaming the video. It sends out UDP datagrams containing the video data back to Alice.

Response Journey Back to Alice

  1. Continuous Stream: Unlike TCP, there are no acknowledgments or retransmissions from Bob’s server. Lost packets result in minor glitches in the stream rather than delays.
  2. NAT at Alice’s Router: The datagrams are translated back to Alice’s private IP and delivered to her device.

Viewing Experience

  1. Real-Time Streaming: Alice experiences the live stream with minimal buffering. However, she might notice occasional glitches due to packet loss.
  2. Stateless Communication: The connection is not “maintained” in any way. If Alice pauses or stops the stream, there’s no formal “disconnection” process as in TCP.

In short, the UDP version of the story highlights how UDP’s characteristics suit applications where speed and real-time data transfer are more important than ensuring every single packet is received and in order. This makes UDP ideal for live streaming, online gaming, and VoIP, where occasional packet loss is preferable to the latency that would be introduced by TCP’s reliability mechanisms.

Key Takeaways in the UDP Story:

  1. Efficiency and Speed: UDP’s lack of a connection establishment process, absence of acknowledgments, and simpler header structure contribute to faster data transmission.
  2. Loss Tolerance: Applications using UDP are generally designed to tolerate some level of data loss. For instance, in video streaming, losing a few frames might not significantly impact the viewer’s experience.
  3. No Congestion Handling: UDP does not slow down transmissions in response to network congestion, which can lead to increased packet loss but ensures continual data flow.
  4. Stateless Nature: UDP does not track the state of the connection or the data being transmitted, making it suitable for broadcasting where data is sent to multiple recipients simultaneously.

So, while TCP is about reliability and ordered delivery, UDP prioritizes speed and efficiency, making it the protocol of choice for time-sensitive applications where occasional data loss is an acceptable trade-off.
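To get a feel for how little ceremony UDP involves, here is a small loopback sketch: one socket plays Bob’s server, another plays Alice, and a single datagram goes each way with no connection setup. The function name udp_round_trip is made up, and on a real network either datagram could simply be lost (hence the timeouts).

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local 'server' socket and return its reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))          # port 0 = any free port
    server.settimeout(2)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2)
    try:
        # No connect() and no handshake: just address the datagram and send.
        client.sendto(message, server.getsockname())
        data, client_address = server.recvfrom(2048)
        server.sendto(data.upper(), client_address)
        reply, _ = client.recvfrom(2048)
        return reply
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(udp_round_trip(b"frame 1"))
```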

A Story of Routing: Alice Sends an Email to Bob

The Setup: Alice’s Email

Alice wants to send an email to Bob. Her email is a data packet with the address of Bob’s email server (let’s say bobsemail.com which translates to the IP address 93.184.216.34). Her own device has a private IP address (192.168.1.5), assigned by her router.

Breaking Down the Data: MTU in Play

Her email is a bit large, so her device breaks it down into smaller packets to comply with the MTU limit of her local network, which is 1500 bytes. Each packet is now small enough to travel without getting “stuck” due to size restrictions.

Router’s Role with Addresses

Alice’s router (Router A) has a public IP address (203.0.113.5) that represents Alice's network on the internet. It receives the packets and notes the destination address (93.184.216.34), which includes a route prefix (93.184) that indicates the general location of Bob's server in the network topology.

The First Hop: Leaving the Local Network

Router A sends the packets to the next hop, which is the ISP's router (Router B). Router B reads the route prefix and knows it needs to send the packets towards a specific direction, closer to Bob's server's network.

Address Resolution and Further Hopping

As packets move through the internet, each router along the way (Router C, Router D, etc.) uses the route prefix to forward the packets toward their destination. They don't need the exact address just yet, just like a letter doesn't need the exact house address until it gets to the local post office.
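Routers make this choice with what is usually called longest-prefix matching: among all known prefixes that contain the destination, the most specific one wins. Here is a sketch using the standard ipaddress module. The routing table, the prefix lengths (/16 and /24), and the router names attached to them are assumptions made up to match the story.

```python
import ipaddress

# (network prefix, next hop) pairs; purely illustrative.
TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "Router B (default)"),
    (ipaddress.ip_network("93.184.0.0/16"), "Router C"),
    (ipaddress.ip_network("93.184.216.0/24"), "Router X"),
]


def next_hop(destination, routing_table):
    """Pick the next hop whose prefix matches the destination most specifically."""
    address = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in routing_table
               if address in network]
    if not matches:
        return None
    # The longest prefix is the most specific route.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


if __name__ == "__main__":
    print(next_hop("93.184.216.34", TABLE))
    print(next_hop("8.8.8.8", TABLE))
```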

Reaching Bob’s Network

The packets finally arrive at the router (Router X) responsible for Bob's server network. Router X recognizes the full IP address and directs the packets to the specific server (93.184.216.34).

Bob’s Server Response

Bob’s server receives the packets, reassembles the email, and sends an acknowledgment back to Alice. The acknowledgment packets have Alice’s router’s public IP address (203.0.113.5) as the destination.

The Journey Back

The acknowledgment takes a similar path back. Routers on the return path use the route prefix associated with Alice’s ISP to guide the packets. Each router examines the destination IP, updating the route as needed based on network conditions.

Final Delivery

The packets reach Alice’s router, which translates the public IP back to Alice’s private IP (192.168.1.5) using NAT, and delivers the acknowledgment to her device.

You more or less get it now. But if you have more time?

Let’s dive a bit more into the application layer. Instead of HTTP, we will now look at other protocols. Let’s first briefly explain HTTPS and FTP, and then retell Alice and Bob’s story to illustrate these protocols in action.

HTTPS (Hypertext Transfer Protocol Secure)

  • What is HTTPS?: HTTPS is the secure version of HTTP, used for secure communication over a computer network, commonly the internet. It encrypts data to keep it confidential, which is crucial for sensitive transactions like online banking or shopping.

When you visit a website over HTTPS, extra protection is added by a protocol historically called SSL (Secure Sockets Layer) and today known as TLS. It does two main things:

  1. Encrypts the data being sent: if anyone tries to intercept the traffic, they can’t read it because it appears scrambled and meaningless.
  2. Verifies website identity: it makes sure you are sending data to and receiving data from exactly the website you tried to visit, and not some imposter.

This prevents attackers from stealthily stealing data or pretending to be another website.

  • Encryption: It uses SSL/TLS protocol to encrypt the data during transit, ensuring that the information sent between the client and server is unreadable to anyone else.

This is a sufficient understanding of HTTPS for most cases. If you want more, read on; otherwise, skip ahead to FTP.

HTTPS uses TLS (Transport Layer Security) protocol to encrypt communication and provide authentication.

When you first visit an HTTPS website:

  1. Your browser starts the TLS handshake, asking the site to identify itself.
  2. The website sends its signed certificate (still commonly called an SSL certificate), which contains its public key and identity. Your browser verifies that it was issued by a trusted authority and matches the domain.
  3. In the classic RSA key exchange, your browser then uses the public key from the certificate to securely send randomly generated key material to the server. Modern TLS versions typically agree on the shared key with a Diffie-Hellman exchange instead.
  4. Both sides derive the same symmetric session key from this exchange.
  5. All further HTTP communication is encrypted in both directions with this session key via the TLS record protocol.

So in summary:

  • TLS provides the authenticated key exchange that HTTPS relies on.
  • The certificate provides authentication and ensures the key exchange is with the right server.
  • Session keys are used for fast symmetric encryption of bulk application data sent after the handshake.
  • HTTPS is simply HTTP sent over the authenticated, encrypted channel that TLS provides.
  • TLS is the successor to SSL; TLS 1.0 was essentially SSL version 3.1. TLS supports newer, stronger encryption algorithms than SSL, such as AES.
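You rarely perform any of these steps by hand. As a sketch, Python’s standard ssl module can set up the client side: its default context loads the system’s trusted certificate authorities and turns on certificate and hostname checks. The function names below are my own, and negotiated_tls_version needs internet access, so it is only called in the usage example.

```python
import socket
import ssl


def make_client_context():
    """Create a TLS context with certificate and hostname verification on."""
    return ssl.create_default_context()


def negotiated_tls_version(hostname, port=443):
    """Connect, run the TLS handshake, and report the protocol version."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=5) as raw_socket:
        # wrap_socket performs the handshake and verifies the certificate.
        with context.wrap_socket(raw_socket, server_hostname=hostname) as tls:
            return tls.version()


if __name__ == "__main__":
    print(negotiated_tls_version("example.com"))
```

If the certificate does not match the hostname or is not signed by a trusted authority, the handshake raises an ssl.SSLError instead of silently continuing.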

FTP (File Transfer Protocol)

  • What is FTP?: FTP is a standard network protocol used for the transfer of computer files between a client and server on a computer network. It’s often used for uploading or downloading large files from a server.
  • Modes of Operation: FTP can operate in active and passive modes, which define how the connection between a client and server is established.

More? Let’s code then.

Again, if you’re in a hurry, you’re good at this point, but we can still do more if you want. Perhaps a change of pace: some code?

Using Python, we’ll focus on some of the most critical parts, such as making HTTP requests, handling DNS lookups, and demonstrating TCP connections.

1. Making an HTTP Request (GET)

Python’s requests library is commonly used for making HTTP requests. Here's a simple example of sending an HTTP GET request:

import requests

url = 'http://example.com'
response = requests.get(url)

print(response.text) # Prints the HTML content of the requested page

2. DNS Lookup

Python’s socket library can be used for DNS lookups to resolve hostnames to IP addresses:

import socket

hostname = 'example.com'
ip_address = socket.gethostbyname(hostname)

print(f"The IP address of {hostname} is {ip_address}")

3. Establishing a TCP Connection

Creating a basic TCP client in Python involves using the socket library. Here's a simple example to establish a TCP connection:

import socket

host = 'example.com'
port = 80 # HTTP port

# Create a socket object with IPv4 addressing and TCP protocol
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Connect to the server
client_socket.connect((host, port))
print(f"Connected to {host} on port {port}")

# Close the connection
client_socket.close()

The decision to use a low-level TCP connection (as shown in point 3) versus an abstracted HTTP request (as in points 1 and 4) depends on your specific needs and the level of control required over the communication process. That said, I almost never need the low-level approach in practice and mainly use the requests library.

When to Use a Direct TCP Connection:

  1. Custom Protocol Implementation: If you are working with a custom protocol that operates over TCP but isn’t HTTP.
  2. Lower-Level Network Operations: For educational purposes or when you need to manipulate aspects of the TCP protocol, such as custom timeout settings, packet sizes, or direct streaming of data.
  3. Non-HTTP Services: Interacting with services that don’t communicate over HTTP, like SMTP for email, FTP for file transfers, or other TCP-based protocols.

Example Scenario:

  • Direct TCP Connection: Suppose you’re developing a client that communicates with a custom TCP server for a real-time data streaming service. Using a direct TCP connection would be more appropriate as it gives you the ability to fine-tune the connection and data handling according to the protocol’s requirements.

4. HTTPS Request

For HTTPS requests, Python’s requests library handles SSL/TLS encryption automatically:

import requests

response = requests.get('https://example.com')
print(response.text) # Prints the HTML content, fetched over an encrypted connection

5. Using FTP

Python’s ftplib library can be used for FTP operations. Here's an example of connecting to an FTP server:

from ftplib import FTP

ftp_server = 'ftp.example.com'
ftp = FTP(ftp_server)
ftp.login() # login with anonymous user

ftp.retrlines('LIST') # List directory contents

ftp.quit()

It’s particularly useful for handling large files or batches of files, and for situations where HTTP might not be as efficient or appropriate. Here are some common use cases for FTP:

  1. Website Management: Uploading or downloading web files (like HTML, CSS, images) to and from a web server.
  2. Data Transfer: Sharing or backing up large datasets that might be too big for email attachments or other methods.
  3. Software Distribution: Distributing software and updates, especially in a corporate environment or for server-to-server transfers.
  4. Remote File Management: Managing files on a remote server, including renaming, deleting, or changing file permissions.

For example

from ftplib import FTP

def download_file(ftp_server, username, password, remote_file_path, local_file_path):
    ftp = FTP(ftp_server)
    ftp.login(username, password) # Replace with actual username and password

    with open(local_file_path, 'wb') as file:
        ftp.retrbinary(f'RETR {remote_file_path}', file.write)

    ftp.quit()
    print(f"Downloaded {remote_file_path} to {local_file_path}")

# Example usage
download_file('ftp.example.com', 'user', 'pass', '/path/to/remote/file.txt', 'local_file.txt')

Here is most of the code that you will probably need

from ftplib import FTP

# Connect to an FTP server
def connect_to_ftp(host, username, password):
    ftp = FTP(host)
    ftp.login(username, password)
    return ftp

# List files in the current directory on the server
def list_files(ftp):
    files = ftp.nlst()
    for file in files:
        print(file)

# Upload a file to the server
def upload_file(ftp, file_path, server_path):
    with open(file_path, 'rb') as file:
        ftp.storbinary(f'STOR {server_path}', file)
    print(f"Uploaded {file_path} to {server_path}")

# Download a file from the server
def download_file(ftp, server_path, local_path):
    with open(local_path, 'wb') as file:
        ftp.retrbinary(f'RETR {server_path}', file.write)
    print(f"Downloaded {server_path} to {local_path}")

# Change directory on the server
def change_directory(ftp, path):
    ftp.cwd(path)
    print(f"Changed directory to {path}")

# Disconnect from the server
def disconnect_from_ftp(ftp):
    ftp.quit()

# Main function to demonstrate FTP operations
def main():
    host = "ftp.example.com"
    username = "user"
    password = "pass"

    ftp = connect_to_ftp(host, username, password)

    print("Files in the root directory:")
    list_files(ftp)

    # Change to a specific directory
    change_directory(ftp, '/path/to/directory')

    # Upload a file
    upload_file(ftp, 'local_file.txt', 'server_file.txt')

    # Download a file
    download_file(ftp, 'server_file.txt', 'downloaded_file.txt')

    # Disconnect
    disconnect_from_ftp(ftp)

if __name__ == "__main__":
    main()

Breakdown of the FTP Example

Host

  • Host: The host is the address of the FTP server that you are trying to connect to. It can be a domain name (like ftp.example.com) or an IP address. It's similar to a physical address for a house; it tells your FTP client where to find the FTP server on the internet.

connect_to_ftp Function

  • Purpose: This function establishes a connection to the FTP server.
  • Parameters:
  • host: The address of the FTP server.
  • username and password: Credentials for logging into the FTP server. These are required because most FTP servers have some level of access control.

list_files Function

  • Purpose: Lists all files and directories in the current directory on the FTP server.
  • Implementation: Uses the nlst() method of the FTP object, which sends a command to the server to list directory contents.

upload_file Function

  • Purpose: Uploads a file from the local machine to the FTP server.
  • Parameters:
  • file_path: Path of the file on the local machine.
  • server_path: Path where the file will be stored on the server.
  • Implementation: Opens the local file in binary mode and uses storbinary to upload it, which reads the local file and sends it to the FTP server.
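If you want to watch that block-by-block sending happen, here is a minimal sketch. It relies on the optional blocksize and callback arguments of ftplib's storbinary, where the callback receives each block after it is sent. UploadProgress and upload_with_progress are hypothetical names I made up for illustration, not part of ftplib.

```python
import os


class UploadProgress:
    """Hypothetical helper: counts bytes as ftplib reports each sent block."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.sent_bytes = 0

    def __call__(self, block):
        # storbinary calls this once per block, passing the bytes just sent.
        self.sent_bytes += len(block)

    @property
    def percent(self):
        if self.total_bytes == 0:
            return 100.0
        return 100.0 * self.sent_bytes / self.total_bytes


def upload_with_progress(ftp, file_path, server_path, blocksize=8192):
    """Upload file_path and return the progress tracker (assumed helper)."""
    progress = UploadProgress(os.path.getsize(file_path))
    with open(file_path, 'rb') as file:
        ftp.storbinary(f'STOR {server_path}', file,
                       blocksize=blocksize, callback=progress)
    return progress
```

With a logged-in FTP object you could call upload_with_progress(ftp, 'local_file.txt', 'server_file.txt') and read progress.percent afterwards, or print it inside __call__ for a live counter.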

download_file Function

  • Purpose: Downloads a file from the FTP server to the local machine.
  • Parameters:
  • server_path: Path of the file on the server.
  • local_path: Path where the file will be saved on the local machine.
  • Implementation: Uses retrbinary to retrieve the file. It writes the data to the local file as it's received from the server.
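Because retrbinary hands you the data block by block, the callback does not have to be file.write. As a rough sketch, the callback below writes each block and feeds it into a SHA-256 hash at the same time, so you can compare the result against a checksum published by the server owner. download_with_checksum is a hypothetical helper name, and whether a reference checksum is available is an assumption about your setup.

```python
import hashlib


def download_with_checksum(ftp, server_path, local_path):
    """Download a file and return its SHA-256 hex digest (assumed helper)."""
    digest = hashlib.sha256()
    with open(local_path, 'wb') as file:

        def handle_block(block):
            # Called by retrbinary for every block received from the server.
            file.write(block)
            digest.update(block)

        ftp.retrbinary(f'RETR {server_path}', handle_block)
    return digest.hexdigest()
```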

change_directory Function

  • Purpose: Changes the current working directory on the FTP server.
  • Parameter:
  • path: The directory path on the server to change to.
  • Implementation: Uses cwd (change working directory) command to change the directory.

disconnect_from_ftp Function

  • Purpose: Closes the connection to the FTP server.
  • Implementation: Calls quit() on the FTP object, which properly closes the connection.
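One thing the example above does not handle: if an upload or download raises an exception, quit() is never reached. A minimal sketch of one way around that is a small context manager. ftp_session and its ftp_factory parameter are hypothetical names introduced here so the idea can be tried without a real server.

```python
from contextlib import contextmanager
from ftplib import FTP, all_errors


@contextmanager
def ftp_session(host, username, password, ftp_factory=FTP):
    """Yield a logged-in FTP connection and always close it (assumed helper)."""
    ftp = ftp_factory(host)
    try:
        ftp.login(username, password)
        yield ftp
    finally:
        try:
            # Polite goodbye: sends QUIT and closes the connection.
            ftp.quit()
        except all_errors:
            # If the server is already gone, just drop the socket.
            ftp.close()
```

Usage would look like: with ftp_session("ftp.example.com", "user", "pass") as ftp: list_files(ftp).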

Now let us put things together

I want you to think about creating a service for uploading and streaming videos. It involves several considerations, especially regarding the application layer and transport layer protocols. ….. Done thinking?

Let’s outline how such a service might be structured. In a simplified scenario, you might use Python for the backend with Flask as a web framework, and an FTP server for file storage. For streaming, HTTP Live Streaming (HLS) can be an effective method.

[ But Omar, I don’t know what HLS is, and after all of this I would have thunk I would use UDP…. You’ve made it this far, so be patient. ]

Below is a conceptual example of how this could be implemented.

1. Flask Web Service for Video Upload

NOTE : In a real-world scenario, you might have multiple videos to upload, and you could implement batch processing or parallel uploads to improve efficiency. However, the core concept of uploading videos remains the same.

I will be using Flask for this:

  • Flask and os: These are Python modules. Flask is a web framework used to create web applications, and os is a module for interacting with the operating system.
  • Flask App Instance: app = Flask(__name__) creates an instance of the Flask class. This instance acts as the web application.

Here’s how you might set up a Flask application to handle video uploads:

from flask import Flask, request, jsonify, send_from_directory
from werkzeug.utils import secure_filename
import os

app = Flask(__name__)
UPLOAD_FOLDER = '/path/to/uploaded/videos'
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

@app.route('/upload', methods=['POST'])
def upload_file():
    # The multipart form field must be named 'file'
    if 'file' not in request.files:
        return jsonify({'error': 'No file part'}), 400

    file = request.files['file']
    if file.filename == '':
        return jsonify({'error': 'No selected file'}), 400

    # Strip path components such as '../' from the client-supplied name
    filename = secure_filename(file.filename)
    if filename == '':
        return jsonify({'error': 'Invalid filename'}), 400

    filepath = os.path.join(app.config['UPLOAD_FOLDER'], filename)
    file.save(filepath)
    return jsonify({'success': f'File {filename} uploaded successfully.'})

if __name__ == '__main__':
    # debug=True is for local development only
    app.run(debug=True)

In this setup, the video is uploaded as a binary file through a multi-part form request, which is a standard way to upload files over HTTP. A binary file is a type of computer file that is not a text file. Unlike text files, which store data in a format that is readable by humans (like ASCII characters), binary files store data in binary form, which is typically readable only by computers.
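To make "multipart form request" less abstract, here is a rough sketch that builds such a request body by hand using only the standard library. In practice your browser or HTTP client does this for you. build_multipart_body is a hypothetical helper, and the field name 'file' is chosen to match the upload endpoint above.

```python
import uuid


def build_multipart_body(field_name, filename, file_bytes,
                         content_type='application/octet-stream'):
    """Return (headers, body) for a one-file multipart/form-data request."""
    # The boundary is a random marker that separates the parts of the body.
    boundary = uuid.uuid4().hex
    lines = [
        f'--{boundary}'.encode(),
        (f'Content-Disposition: form-data; name="{field_name}"; '
         f'filename="{filename}"').encode(),
        f'Content-Type: {content_type}'.encode(),
        b'',            # blank line ends the part headers
        file_bytes,     # raw binary content, sent as-is
        f'--{boundary}--'.encode(),  # closing boundary
        b'',
    ]
    body = b'\r\n'.join(lines)
    headers = {
        'Content-Type': f'multipart/form-data; boundary={boundary}',
        'Content-Length': str(len(body)),
    }
    return headers, body
```

The server reads the boundary from the Content-Type header, splits the body on it, and hands the file part to your code, which is what request.files['file'] exposes in Flask.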

Route Decorator: @app.route('/upload', methods=['POST']) tells Flask that this function (upload_file) should be called when a POST request is made to the /upload URL.

As for the actual process:

  • Suppose an uploaded file video.mp4 is to be saved in the directory /uploads. If the UPLOAD_FOLDER in the Flask app is set to /uploads, the filepath would be /uploads/video.mp4.
  • Calling file.save('/uploads/video.mp4') would save the uploaded video file to that location on the server.
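One caveat worth a quick sketch: the filename comes from the client, so a name like ../../etc/passwd could point outside the upload folder once it is joined. Werkzeug (the library underneath Flask) ships a secure_filename helper for this; the version below is a simplified standard-library illustration, and build_upload_path is a hypothetical name.

```python
import os


def build_upload_path(upload_folder, filename):
    """Join folder and filename, keeping only the last path component."""
    # Treat backslashes as separators too, then drop any directory parts
    # so '../' tricks cannot escape the upload folder.
    safe_name = os.path.basename(filename.replace('\\', '/'))
    if safe_name in ('', '.', '..'):
        raise ValueError(f'Unusable filename: {filename!r}')
    return os.path.join(upload_folder, safe_name)


# Joins to video.mp4 inside the upload folder
print(build_upload_path('/uploads', 'video.mp4'))
# The directory parts are discarded, only 'passwd' is kept
print(build_upload_path('/uploads', '../../etc/passwd'))
```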

Networking Protocol for Upload: In the code provided, we are using HTTP for video uploads. HTTP (Hypertext Transfer Protocol) is a widely used protocol for data transfer over the web. It’s often used for file uploads because it’s supported by most web servers and can handle binary data like videos.

We are not using FTP (File Transfer Protocol) in this example for simplicity, but FTP is another protocol specifically designed for file transfers. FTP provides more features and control over file transfers than HTTP, making it a good choice for scenarios where you need fine-grained control, such as transferring large numbers of files or managing directories on a remote server. FTP is not as widely supported in web applications as HTTP, which is why HTTP is often preferred for simple file uploads.

2. File Transfer to Remote Server

After the video is uploaded to the Flask server, it can be transferred to a remote server (e.g., FTP server) for storage:

from ftplib import FTP

def transfer_to_ftp(filepath, filename):
    ftp_server = 'ftp.example.com'
    username = 'user'
    password = 'password'

    ftp = FTP(ftp_server)
    ftp.login(username, password)

    with open(filepath, 'rb') as file:
        ftp.storbinary(f'STOR {filename}', file)

    ftp.quit()

This function should be called after the file is saved in the upload_file endpoint.

3. Video Streaming Service

To stream videos, you’ll need to convert them into a format suitable for streaming (like HLS) and then serve them via HTTP. This part can be complex and might involve using additional tools like FFmpeg for video processing and a more advanced setup for serving the content.
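As a rough idea of what the FFmpeg step might look like, the sketch below only builds the command line as a Python list; it does not run anything. The options used (-f hls, -hls_time, -hls_playlist_type, -hls_segment_filename) belong to FFmpeg's HLS muxer. The function name, the file names index.m3u8 and segment_%03d.ts, and the 6-second segment length are my own assumptions, and -c copy assumes the video is already encoded in a codec HLS players accept (for example H.264 with AAC audio).

```python
import os


def build_hls_command(input_path, output_dir, segment_seconds=6):
    """Build an ffmpeg command that splits a video into HLS segments."""
    playlist = os.path.join(output_dir, 'index.m3u8')
    segment_pattern = os.path.join(output_dir, 'segment_%03d.ts')
    return [
        'ffmpeg',
        '-i', input_path,                     # source video
        '-c', 'copy',                         # copy streams, no re-encoding
        '-f', 'hls',                          # use the HLS muxer
        '-hls_time', str(segment_seconds),    # target segment length
        '-hls_playlist_type', 'vod',          # complete, on-demand playlist
        '-hls_segment_filename', segment_pattern,
        playlist,                             # the .m3u8 index file
    ]


command = build_hls_command('video.mp4', 'hls_output')
print(' '.join(command))
```

With FFmpeg installed and the output directory created, you could hand the list to subprocess.run(command, check=True).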

A simple example to serve video files using Flask:

@app.route('/video/<filename>')
def stream_video(filename):
    return send_from_directory(app.config['UPLOAD_FOLDER'], filename)

This example provides a basic framework for a service that uploads videos and allows them to be streamed. In a real-world scenario, you’d need to consider many more details, such as robust error handling, data encryption, efficient storage and retrieval, video transcoding for different resolutions, and bandwidth optimization. Additionally, the actual streaming of video would likely be done using specialized streaming servers or services that can handle the demands of video streaming better than a simple Flask app.

For a scalable, production-grade system, you would explore cloud-based solutions like AWS S3 for storage, Elastic Transcoder for video processing, and CloudFront for content delivery. These services provide the necessary infrastructure to handle large-scale video streaming applications.

Now back to the choice of HLS over UDP

HLS (HTTP Live Streaming)

  • Protocol Layer: HLS operates at the application layer and is built on top of HTTP.
  • How It Works: HLS streams multimedia content using a series of small HTTP-based file downloads. It breaks the overall stream into a sequence of small HTTP-based file segments, each containing a short interval of playback time.
  • Adaptive Streaming: HLS supports adaptive bitrate streaming, meaning it can adjust the quality of video streams in real time based on the viewer’s available bandwidth and device capabilities (think of the quality option under the Settings button on YouTube).
  • Compatibility and Reach: HLS is widely supported on various platforms and devices, including browsers, smartphones, and smart TVs.
  • Reliability: Since HLS uses HTTP, it benefits from the reliability of TCP (Transmission Control Protocol), which ensures that all data packets are delivered correctly and in order, something UDP does not guarantee.
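To make the adaptive part concrete, here is a simplified sketch of what a player does with an HLS master playlist: read the available quality levels and pick the best one that fits the measured bandwidth. The playlist text, the paths in it, and the function names are made up for illustration, and real players use more elaborate heuristics than this.

```python
import re

# Hypothetical master playlist listing three quality levels.
MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/index.m3u8
"""


def parse_variants(master_playlist):
    """Return (bandwidth, uri) pairs sorted from lowest to highest bitrate."""
    lines = [ln.strip() for ln in master_playlist.splitlines() if ln.strip()]
    variants = []
    for index, line in enumerate(lines):
        if not line.startswith('#EXT-X-STREAM-INF:'):
            continue
        # The URI of a variant is the line right after its tag.
        match = re.search(r'[:,]BANDWIDTH=(\d+)', line)
        if match and index + 1 < len(lines):
            variants.append((int(match.group(1)), lines[index + 1]))
    return sorted(variants)


def pick_variant(variants, measured_bps):
    """Pick the highest bitrate that fits, else fall back to the lowest."""
    affordable = [v for v in variants if v[0] <= measured_bps]
    return affordable[-1] if affordable else variants[0]


variants = parse_variants(MASTER_PLAYLIST)
print(pick_variant(variants, 3_000_000))  # (2500000, '720p/index.m3u8')
```

Each chosen variant is itself a playlist of short segments fetched over plain HTTP, which is why the player can switch quality between segments.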

So, to answer the question you never asked: why not use UDP instead of HLS?

  • Reliability over Speed: For many streaming applications, particularly those delivered over the internet to a diverse range of devices, the reliability and adaptability of HLS outweigh the speed advantage of UDP. HLS ensures that all viewers receive a continuous and adaptable stream, regardless of their network conditions.
  • Network Compatibility: UDP can face challenges with firewalls and network address translation (NAT), as many networks are optimized for TCP-based traffic.
  • Error Correction: HLS’s use of TCP means lost or corrupted packets are detected and retransmitted, ensuring that the video and audio data are received correctly, which is crucial for providing a smooth viewing experience.
  • Browser Support: HLS is widely supported in web browsers and does not require additional plugins or software, unlike some UDP-based streaming solutions.

UDP and FTP (File Transfer Protocol), as well as the secure alternative SFTP (SSH File Transfer Protocol), still have their use cases, but they are less common in modern media streaming and content delivery scenarios, especially for large-scale streaming services. BUT MAKE NO MISTAKE: UDP is faster, so it is up to you to make that judgment call if need be.
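If you want to feel where UDP's speed comes from, here is a tiny sketch that sends one datagram over the loopback interface. Notice there is no connect, no handshake, and no acknowledgement: the sender just fires the packet. udp_roundtrip is a hypothetical helper, and on a real network (unlike loopback) nothing guarantees the datagram arrives.

```python
import socket


def udp_roundtrip(message, timeout=2.0):
    """Send one datagram to a local receiver and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 lets the operating system pick any free port.
        receiver.bind(('127.0.0.1', 0))
        receiver.settimeout(timeout)
        # No handshake: the datagram is sent straight to the address.
        sender.sendto(message, receiver.getsockname())
        data, _address = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_roundtrip(b'frame-1'))
```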

Troubleshooting on Mac

Troubleshooting network connectivity issues in the terminal on a Mac can be a useful skill for diagnosing and resolving networking problems. Here are some common troubleshooting steps you can perform using terminal commands:

Check Network Interfaces

  • To view a list of available network interfaces, open Terminal and type:
ifconfig
  • This command will display information about each network interface, including their IP addresses and status.

Check IP Configuration

  • To check the IP configuration of a specific network interface (e.g., Wi-Fi or Ethernet), use the ifconfig command followed by the interface name. For example:
ifconfig en0

Ping Test

  • You can use the ping command to test network connectivity to a remote host. For example:

ping google.com

This command sends ICMP echo requests to the specified host (in this case, google.com) and displays the response time.

Try adding -c followed by a number (for example, ping -c 4 google.com) and see what happens.
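If you would rather check reachability from Python, note that sending real ICMP pings usually needs elevated privileges, so the sketch below swaps in a different technique: it tries to open a TCP connection to a host and port. can_connect is a hypothetical helper, and a False result only tells you that this particular port did not answer in time.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection performs the TCP handshake for us.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, can_connect('google.com', 443) checks whether the HTTPS port of that host is reachable from your machine.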

DNS Resolution

  • To troubleshoot DNS resolution issues, you can use the nslookup or dig command to query DNS servers. For example:
dig google.com
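The same kind of lookup can be sketched in Python with the standard socket module. One difference to keep in mind: getaddrinfo goes through your system's resolver (including the local hosts file), whereas dig queries DNS servers directly, so the two can occasionally disagree. resolve is a hypothetical helper name.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses for a hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return sorted({sockaddr[0] for *_rest, sockaddr in results})


print(resolve('localhost'))
```

Swap in resolve('google.com') to look up a real domain; a socket.gaierror here usually points to a DNS problem.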

Traceroute

traceroute is a network diagnostic tool that helps you trace the route that packets take from your computer to a destination host or IP address. It provides information about each intermediate router or hop along the path and measures the time it takes for packets to reach each hop. This can be useful for diagnosing network latency or routing issues. Here's how to use traceroute in the terminal on a Mac:

traceroute [destination]

For example:

traceroute etherealai.io

Finally

Here’s a table summarizing the commonly used networking protocols and technologies in modern applications, including web services, media streaming, and various other scenarios:

nslookup

nslookup [domain or IP]

A while ago, I saw this GIF on LinkedIn

Thank you for reading
