Networking 101: What are sockets?

K.M Ahnaf Zamil
8 min read · Jul 16, 2022


If you are a programmer and don’t live under a rock, you have probably heard the term “sockets” quite a lot. In this post, I will attempt to explain what sockets are and how they work in a simple manner. I won’t be walking through a full tutorial for any specific language, because there are better resources for that. Just search “Sockets in (insert programming language)” on Google and you will find a lot of tutorials.

What’s the fuss about sockets?

Well, sockets are a way to speak to other programs on the same computer, or to other computers on a network. That’s basically what they are, but we haven’t even scratched the surface yet. A newbie programmer would consider a socket to be some sort of abstraction or magic for talking to servers or computers on a network (no offence). But there’s more to it, A LOT more, and that’s what I want to tell you about.

Sockets are mostly a UNIX/Linux concept, but they work in a similar way across all operating systems (so even if you are a Windows user, the knowledge might still be helpful). This might sound surprising to you, but on UNIX-like systems every socket is simply referred to by a file descriptor. A file descriptor is an integer that is associated with an open file. Programs communicate over a socket by reading from and writing to its file descriptor. And it’s not just sockets that work this way; in fact, every UNIX program does I/O by reading or writing to a file descriptor. That includes network connections as well, which is normally what we would use sockets for.
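To make the “a socket is just a file descriptor” idea concrete, here is a minimal sketch using Python’s socket module. It assumes a UNIX-like system, and the variable names are only for illustration. It creates a connected pair of sockets, prints the integer behind one of them, and then uses the generic os.write()/os.read() calls on the descriptors instead of any socket-specific function.

```python
import os
import socket

# socketpair() returns two sockets that are already connected to each other.
left, right = socket.socketpair()

fd = left.fileno()        # the integer file descriptor behind the socket object
print(type(fd), fd)       # an int; the exact value depends on the process

# Because it is just a file descriptor, the generic write()/read() calls work on it.
os.write(left.fileno(), b"hello")
received = os.read(right.fileno(), 1024)
print(received)

left.close()
right.close()
```

The same read/write calls are what you would use on a regular file, which is the whole point.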

In this post, I will only be talking about internet/network sockets, and not other types of sockets (that would take me ages to write about).

How do you use sockets?

Sockets are managed by the OS kernel, and you create and use them through system calls. The socket() syscall returns a file descriptor, which you can refer to as a “socket descriptor”. This socket descriptor is simply an integer, as I stated above. Once the socket is connected, you can use the send() and recv() syscalls to send and receive data through it (there are a lot more functions, such as bind(), listen(), accept() and connect(), but these three are enough as examples).
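As a rough sketch of those calls in action, the snippet below opens a TCP connection from a program to itself over the loopback interface. It uses Python’s socket module, which wraps the same syscalls; the variable names and the choice of 127.0.0.1 with an OS-picked port are my own assumptions for the sake of a runnable example.

```python
import socket

# Server side: socket() gives us a descriptor, then we bind and listen on it.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0 asks the OS for any free port
server.listen(1)
address = server.getsockname()

# Client side: another socket, connected to the server's address.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(address)

# accept() returns a new socket that represents this one connection.
connection, peer = server.accept()

client.sendall(b"Hi")                # wraps send(), retrying until all bytes are handed over
message = connection.recv(1024)      # recv() returns up to 1024 bytes
print(message)

for s in (connection, client, server):
    s.close()
```

Both ends live in one process here only to keep the example short; normally the client and server would be separate programs.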

You might be wondering, “Isn’t there an easier way to do it?” Well, yes. Most of these functions are available for C in the sys/socket.h header file (this is only for UNIX-like operating systems; Windows needs winsock2.h). Other programming languages have libraries that abstract away all this complex stuff (e.g. Python has the socket module). Use whichever language/library you want. BUT, if you want to get a deeper understanding of how sockets work at the lowest level and mess around with them, I recommend you use C/C++ for this purpose. Later on, you can use any other programming language or library for building applications with the knowledge you gather.

Types of Sockets

All sockets have a common API for easier usability. The sockets API was introduced by BSD for network and inter-process communication, which is why it is often called the Berkeley sockets API. It is now implemented and used by UNIX and other operating systems.

The major distinction between sockets is that some use a network to communicate and some do not. The former are called network/internet sockets and the latter are called UNIX domain sockets. Both are used through the same API, often called Berkeley sockets, and Windows implemented that API in its Winsock library. In this post, I will only talk about internet sockets and not UNIX domain sockets.

There are several types of internet sockets, but in practice you will mostly run into four of them. So, what are the four types?

Stream sockets

Also known as SOCK_STREAM, these are connection-oriented sockets that provide reliable delivery: data either arrives intact and in order, or you get an error. Connection-oriented simply means that the client and server establish a logical connection through some sort of handshake and acknowledge each other before they start passing packets.

If you were to send 2 items through the socket, “H, i”, they will arrive in the same order: “H, i”. Stream sockets use TCP for transmitting data. The data is a plain stream of bytes with no record boundaries though, which means that the receiver doesn’t know where one message ends and the next begins. To tackle this, applications add their own framing: for example, a terminating character (such as a null byte or a newline) at the end of each message, or a length prefix in front of it.
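Here is one way the framing idea could look, sketched in Python with a length prefix rather than a terminator. The helper names (send_message, recv_exact, recv_message) and the 4-byte big-endian header are my own choices for illustration, not a standard.

```python
import socket
import struct

def send_message(sock, payload):
    """Send one message, prefixed with its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, count):
    """Read exactly `count` bytes, looping because recv() may return fewer."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)

def recv_message(sock):
    """Read the 4-byte length header, then exactly that many payload bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

# Loopback TCP connection so the example runs on one machine.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
sender = socket.create_connection(listener.getsockname())
receiver, _ = listener.accept()

send_message(sender, b"H")
send_message(sender, b"i")
messages = [recv_message(receiver), recv_message(receiver)]
print(messages)

for s in (sender, receiver, listener):
    s.close()
```

Without the length header, a single recv() on the receiving side could just as well hand back both bytes glued together, and there would be no way to tell that they were sent as two messages.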

Web browsers use HTTP, an application-level protocol that traditionally runs on top of TCP. This means web browsers also use stream sockets for fetching pages. If you want to know more about TCP, I suggest you read the TCP RFC.
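To show that HTTP really is just text written to a stream socket, here is a small sketch that plays the browser by hand. Everything in it is an assumption made for the example: a throwaway local server built with Python’s http.server, a hypothetical HelloHandler class, and an HTTP/1.0 request so that the server closes the connection when it is done.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny text body."""

    def do_GET(self):
        body = b"hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

# Throwaway local HTTP server that handles exactly one request.
httpd = HTTPServer(("127.0.0.1", 0), HelloHandler)
worker = threading.Thread(target=httpd.handle_request)
worker.start()

# The "browser": a plain stream socket that speaks HTTP by writing text.
sock = socket.create_connection(httpd.server_address)
sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")

response = b""
while True:
    chunk = sock.recv(4096)
    if not chunk:              # empty bytes means the server closed the connection
        break
    response += chunk

sock.close()
worker.join()
httpd.server_close()

status_line = response.split(b"\r\n", 1)[0]
print(status_line)
```

A real browser does a lot more (TLS, caching, connection reuse), but underneath it is still bytes going in and out of a stream socket.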

Datagram sockets

Also known as SOCK_DGRAM, these are connection-less sockets where messages might not arrive at all (yikes). Datagram sockets use UDP for data transmission. With this protocol, the client doesn’t make a logical connection with the server; instead it just throws packets at the server and prays for their delivery.

The order of messages isn’t maintained either, so if you send “H, i”, you might get “i, H”. This might sound bad, but it has its advantages. Since it doesn’t need to establish a connection with the server or acknowledge every single packet, it has less overhead and lower latency than stream sockets. This is usually used for video streaming or conferences, where some frame drops are acceptable as long as playback stays fast. Message boundaries are preserved as well, since every send is delivered as one self-contained message (a datagram) rather than as part of a byte stream.
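A minimal sketch of datagram sockets in Python, again over the loopback interface so it runs on one machine. The names and the 2-second timeout are my own assumptions. On loopback you will almost certainly see both datagrams arrive, but nothing in the protocol promises that.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # port 0 asks the OS for any free port
receiver.settimeout(2.0)             # a lost datagram raises a timeout instead of hanging forever

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No connect(), no handshake: each sendto() is one self-contained datagram.
sender.sendto(b"H", receiver.getsockname())
sender.sendto(b"i", receiver.getsockname())

datagrams = []
for _ in range(2):
    data, source = receiver.recvfrom(1024)   # one call returns exactly one datagram
    datagrams.append(data)
print(datagrams)

sender.close()
receiver.close()
```

Notice that there is no framing code at all: each recvfrom() hands back exactly what one sendto() sent.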

You can read the UDP RFC for more info on the protocol.

[Image: Datagram/UDP in a nutshell]

Raw sockets

Also known as SOCK_RAW, this type of socket is fun to play around with (unless you don’t know what you are doing, obviously). Essentially, raw sockets allow you to send and receive IP packets directly, without the kernel applying a transport protocol such as TCP or UDP for you the way stream and datagram sockets do. You can still use a protocol on top if you wish (you would have to implement it yourself). Opening a raw socket usually requires root/administrator privileges.

Other socket types don’t give you access to the protocol headers that are sent along with the payload, but raw sockets hand you the headers together with the payload. Raw sockets are wonderful for building your own transport-layer protocols that are not natively supported by the kernel. They are also used to implement protocols such as OSPF (a routing protocol) and ICMP (the protocol behind ping).

You can try reading the Internet Protocol RFC for more info about IP (since that’s what raw sockets run on).
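To give a feel for “implementing the protocol yourself”, here is a hedged sketch that hand-builds an ICMP echo request (the packet ping sends) and tries to push it through a raw socket. The function names are mine, and the snippet assumes IPv4 and a UNIX-like system. Without elevated privileges the socket simply fails to open, which the code reports instead of crashing.

```python
import os
import socket
import struct

def internet_checksum(data):
    """16-bit one's-complement checksum used by IP, ICMP, UDP and TCP headers."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier, sequence, payload):
    """Hand-build an ICMP echo request: type 8, code 0, checksum, id, sequence."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

def try_send(packet):
    """Send the packet to localhost over a raw socket; return a received packet or None."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP) as raw:
            raw.settimeout(2.0)
            raw.sendto(packet, ("127.0.0.1", 0))   # the port is ignored for raw sockets
            reply, _ = raw.recvfrom(1024)
            return reply
    except OSError as error:                 # includes PermissionError and timeouts
        print("raw socket attempt failed:", error)
        return None

packet = build_echo_request(os.getpid() & 0xFFFF, 1, b"ping")
reply = try_send(packet)
if reply is not None:
    print("received", len(reply), "bytes, IP header included")
```

The interesting part is that nothing here is done for you: the header layout and the checksum are your job, which is exactly what stream and datagram sockets normally hide.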

Sequenced Packet sockets

Some people don’t consider this a “legitimate” socket type, and it’s quite unpopular. It’s also known as SOCK_SEQPACKET. These are connection-oriented sockets, similar to stream sockets (TCP). The main difference is that sequenced packet sockets maintain message boundaries, which can come in pretty handy. A record/message can be sent using one or more “send” operations and read using one or more “read” operations, but a single operation will never transfer parts of more than one record/message. People like to call it the “programmer-friendly” socket type.
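A quick way to see the message-boundary behaviour is the sketch below. One big caveat: it uses a local AF_UNIX socket pair on Linux, not an internet socket, because that is the easiest place to get SOCK_SEQPACKET without extra protocol support (over IP this socket type is typically provided by SCTP). Treat it as an illustration of the semantics only.

```python
import socket

# Assumption: Linux, using a local AF_UNIX pair so no special protocol support is needed.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)

a.send(b"first record")
a.send(b"second")

# Each recv() returns one whole record, never a mix of two.
records = [b.recv(1024), b.recv(1024)]
print(records)

a.close()
b.close()
```

With a stream socket, the first recv(1024) could have returned both records glued together.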

How does it work under the hood?

If you want to know how data gets encapsulated and sent to other computers, I suggest you first take a look at the OSI model (I could write a whole post about this, but better ones exist). Since this post is about sockets, I won’t be talking about how data is sent over the network through tons of layers, protocols, serialisers, encoders, etc. Rather, I want to go over how the (Linux) kernel creates and manages sockets.

When you use the socket() syscall (directly or through some library), it ends up triggering a whole bunch of internal functions. These mostly run some preliminary validations and allocate a socket struct, which holds information about the socket. An internal function called sock_create drives this step: it calls sock_alloc(), which allocates the actual socket for us (it also allocates an inode, but that stuff is boring), and then lets the chosen protocol family fill in the rest. Part of what gets filled in is the socket struct’s proto_ops table, a set of function pointers implementing the operations (bind, connect, sendmsg and so on) for the protocol that the socket will use. The address family (IPv4, IPv6, etc.) is fixed at this point; addresses and ports get attached later through calls like bind() and connect() (I don’t wanna go too deep into this).

Now you might be wondering, how do these sockets work? How does the kernel do all this network stuff? It’s pretty simple, actually.

Data is passed in the form of packets, so the NIC (network interface card) doesn’t receive your data as one piece. Rather, it gets it as multiple packets, each carrying metadata. When a packet is received by the NIC, the kernel is interrupted (or it just polls the NIC for new data). After the kernel receives a packet from the NIC, it decodes the packet and looks at the metadata (in the packet headers). It checks the source IP/port, destination IP/port, etc. The kernel uses that information to look up the socket in memory that matches that IP/port combination. Normally there can’t be more than one network socket listening on the same IP/port combo for a given protocol, and an established connection is matched on the full source and destination IP/port combination, so each match is unique.

Each socket has buffers for both reading and writing, known as the “receive buffer” and the “send buffer” (or write buffer) respectively. Once the kernel figures out which socket the received data belongs to, it writes it to that socket’s receive buffer (and holds it there). When a program runs the read() syscall to read data from that socket, the kernel copies the data from the socket’s receive buffer to the buffer that was supplied when calling read() (this user-supplied buffer might be some allocated space or variable in your code). The data that has been “read” is now removed from the socket’s receive buffer so that new data can come in.

Writing to a socket works similarly. When you call the write() syscall, it copies data from the buffer you supplied and puts it in the socket’s send (write) buffer. The kernel will then take the data from that buffer, wrap it in the necessary protocol headers (check out the OSI model for more info about the different network layers) and hand it to the NIC, which will actually send the data out over the network.
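Here is a small sketch that pokes at those buffers from Python. It asks the kernel how big the two buffers are, writes ten bytes, and then reads them back in two steps to show that unread data stays in the receive buffer. It uses a local socket pair to stay deterministic; the names are only illustrative, and the sizes printed will differ between systems.

```python
import socket

writer, reader = socket.socketpair()

# The kernel reports the size of each socket's buffers through socket options.
recv_size = reader.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
send_size = writer.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print("receive buffer:", recv_size, "bytes; send buffer:", send_size, "bytes")

writer.sendall(b"0123456789")    # returns once the kernel has taken a copy of the bytes

first = reader.recv(4)           # copies up to 4 bytes out of the receive buffer
rest = reader.recv(1024)         # whatever was left is still waiting there
print(first, rest)

writer.close()
reader.close()
```

Note that sendall() returning only means the kernel has accepted the bytes, not that the other side has read them.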

Reading and writing can be delayed a bit after the program makes the syscall, for many reasons, e.g. the NIC might be busy, the protocol window might be full, the kernel might be busy with something else, etc.

I could write a whole post about this single topic and go deeper into listening, accepting connections, backlogs, semantics, etc. But I won’t do it in this post. Let me know if you want a post on those topics :)

Conclusion

Networking was the most interesting thing for me when I first got into backend development, after distributed systems of course! I’ve been wanting to write a post about networking for a long time, so I finally did. I hope that this post was helpful, and gave you some insight into how sockets work. If there are any mistakes, PLEASE let me know. I’m more than happy to get a mistake cleared up by an experienced developer :)

With that said, I wish you a great day! Happy networking…
