Yet another guide to Sockets Programming

Supratik Chatterjee
5 min read · Nov 16, 2020

The basic construct behind the internet is the socket: a virtual counterpart of a real-world socket, an abstraction to which other programs can connect.

As with many widely used concepts, there are different ways to classify sockets.

Sockets can be divided based on their functions :

  1. Server Sockets : A socket that listens for incoming communications (imagine a receptionist at a hotel).
  2. Client Sockets : A socket that initiates a connection to a server socket and sends requests over it.

They can be divided based on the way they exchange information :

  1. Stream Sockets : Reliable, connection-oriented sockets that deliver a stream of bytes in order, without loss or duplication. The most common protocol used with these is the Transmission Control Protocol (TCP); the Stream Control Transmission Protocol (SCTP) can also back a stream socket. The Datagram Congestion Control Protocol (DCCP), by contrast, is unreliable and message-oriented, so it does not belong in this group.
  2. Datagram Sockets : Connectionless sockets that send self-contained ‘packets’ of information without confirming that they were received. Packets may arrive out of order, or not at all. A popular protocol used with these is the User Datagram Protocol (UDP).

You can imagine a socket as a tiny file residing in memory that two running programs can attach to in order to exchange information.

Usually the program on at least one end of a socket is a service. That is not a requirement, however, and just about any program can create a socket to expose some or all of its functionality to another process.
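To make the ‘tiny file’ picture concrete, here is a minimal Python sketch. It assumes a Unix-like system where `socket.socketpair()` is available, and it keeps both endpoints in one process purely for brevity; in practice each end would live in a different program. The message text is made up for illustration.

```python
import socket

# socketpair() returns two endpoints that are already connected to
# each other; normally each end would belong to a different program.
left, right = socket.socketpair()

# Each endpoint is backed by an OS file descriptor (a small integer).
left_fd, right_fd = left.fileno(), right.fileno()

# Whatever is written to one end can be read from the other.
left.sendall(b"hello from left")
received = right.recv(1024)

left.close()
right.close()

print(left_fd, right_fd, received)
```

The two file descriptors are what the operating system uses to track each end of the conversation.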

Life cycle of a socket

States of a socket

The life cycles of a server and a client socket are different, as the diagram on the left shows.

Some operations are common to all sockets :

  1. Create : Define the operational parameters of a socket file descriptor, such as the addressing method (IPv4, IPv6 or a file path) and the method of information exchange (Stream or Datagram).
  2. Write/Send : Pass information to the socket so that the socket at the other end can read it.
  3. Read/Receive : Receive information that was passed to the socket.
  4. Close : Close the socket file descriptor and release its resources.
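The four common operations can be sketched in Python as follows. This is only an illustration under a few assumptions: it uses a single UDP socket on the loopback address that sends a datagram to itself, and it calls `bind` only so that the socket has a known address to send to. The payload `b"ping"` and the 2 second timeout are arbitrary choices.

```python
import socket

# Create: IPv4 addressing (AF_INET) with datagram exchange (SOCK_DGRAM).
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(2.0)  # do not block forever if the datagram is lost

# Give the socket a local address; port 0 lets the OS pick a free port.
sock.bind(("127.0.0.1", 0))
address = sock.getsockname()

# Write/Send: pass data to the socket, addressed here to itself.
sock.sendto(b"ping", address)

# Read/Receive: get the data back, along with the sender's address.
data, sender = sock.recvfrom(1024)

# Close: release the file descriptor.
sock.close()
closed_fd = sock.fileno()  # -1 once the socket has been closed

print(data, sender, closed_fd)
```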

Operations specific to server sockets :

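  1. Bind : Attach the socket to a specific port on a particular ‘local’ IP address.
  2. Listen : Mark the bound socket as a listening socket, so that the operating system queues incoming connection requests for it.
  3. Accept : Wait for an incoming connection request and return a new socket dedicated to that connection, along with the address of the client.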

Operations specific to client sockets :

  1. Connect : Connect to a remote IP address and port on which a server is listening and ‘accepting’ connections.
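Here is a rough sketch of both life cycles in Python, running the server in a background thread so that the whole thing fits in one script. The helper name `serve_once`, the loopback address and the upper-casing ‘service’ are all assumptions made for illustration; a real server would loop over `accept` and handle errors.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Accept one connection and echo back what it sends, upper-cased."""
    conn, _peer = server.accept()  # blocks, then returns a NEW socket
    with conn:
        conn.sendall(conn.recv(1024).upper())


# Server side: create, bind, listen.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
server.listen(1)
host, port = server.getsockname()

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: create and connect, then send and receive.
with socket.create_connection((host, port), timeout=2.0) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply)
```

Note that the listening socket itself never carries data: `accept` hands back a separate socket for each client, which is why one server socket can serve many clients.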

Usages of sockets

Sockets can be used in one of the two ways shown below :

Sockets usage types

Local communications occur through UNIX sockets. A UNIX socket is essentially a file that can be accessed through the file system of the operating system. This is used for inter-process communication (IPC).
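A small Python sketch of such a local socket, assuming a Unix-like operating system. The file name `demo.sock` and the temporary directory are hypothetical, and both ends live in one process only to keep the example short.

```python
import os
import socket
import stat
import tempfile

# Hypothetical socket path inside a fresh temporary directory.
directory = tempfile.mkdtemp()
path = os.path.join(directory, "demo.sock")

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(path)  # creates the socket file at `path`
server.listen(1)

# The socket now shows up in the file system as a special file.
is_socket_file = stat.S_ISSOCK(os.stat(path).st_mode)

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)  # the request waits in the listen queue
conn, _ = server.accept()

client.sendall(b"ping over IPC")
message = conn.recv(1024)

for s in (conn, client, server):
    s.close()
os.unlink(path)  # the socket file is not removed automatically
os.rmdir(directory)

print(is_socket_file, message)
```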

Network communications either use a predetermined addressing scheme or are ‘raw’. They are used for remote procedure calls (RPC) and for our favorite HTTP-based communications.

A Closer look at socket parameters

Sockets in computing require 3 pieces of information :

  1. Addressing Format
  2. Packet Format
  3. Transport scheme

Addressing Format specifies the scheme used to address the endpoints. Some of the common types are :

  1. AF_INET : IPv4 addressing
  2. AF_INET6 : IPv6 addressing
  3. AF_UNIX/AF_LOCAL : Addressing through the OS file system
  4. AF_IPX : Addressing for Internetwork Packet Exchange (IPX)
  5. AF_PACKET : Provides access to packets for data-link layer manipulation (Linux-specific)

In case you are wondering, these are Address Families (AF) as named in the Berkeley sockets API. The list above is far from complete; there are others, such as AF_PUP and AF_APPLETALK, and many more.

Protocol Families (PF) exist as well; however, since we rarely use them directly in programming, I shall not get into them.
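Which families you actually get depends on the platform. As a quick, hedged illustration, the Python snippet below checks which of the names mentioned above are defined by the local `socket` module; the list `names` is just my selection, not an exhaustive one.

```python
import socket

names = ["AF_INET", "AF_INET6", "AF_UNIX", "AF_IPX", "AF_PACKET", "AF_APPLETALK"]

# Not every platform defines every family, so look each one up defensively.
available = {
    name: int(getattr(socket, name)) for name in names if hasattr(socket, name)
}

for name, value in available.items():
    print(f"{name} = {value}")

# The family chosen at creation time is recorded on the socket object.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    family = sock.family
```

The numeric values printed are operating system specific, so code should always use the named constants.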

Packet Format is the layout of the individual units of data, each called a ‘packet’.

A datagram is a type of packet with a header and a body: the header names the source and the destination of the unit, and the body carries the information. The datagram can then travel along just about any path, with the risk of packet loss along the way.

Example : UDP. You have data and the address of the client you want to reach. UDP adds a header carrying the source and destination ports, and IP then adds its own header carrying the source and destination IP addresses. The packet is handed to the next router, and each router along the way forwards it according to its routing table (which a routing protocol such as OSPF may have built), till it reaches the client or gets lost in the process of doing so. How it gets lost is a subject for another article.
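To show what ‘header plus body’ looks like in bytes, here is a sketch that builds a UDP datagram by hand in Python, following the 8-byte header layout from RFC 768 (source port, destination port, length, checksum). The operating system normally does this for you; the function name `build_udp_datagram` and the port numbers are assumptions for illustration, and the IP header with the actual addresses is added by the layer below and is not shown.

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with an 8-byte UDP header (RFC 768 layout)."""
    length = 8 + len(payload)  # header plus body, in bytes
    checksum = 0               # 0 means 'no checksum' for UDP over IPv4
    # '!' = network byte order, 'H' = unsigned 16-bit field
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


datagram = build_udp_datagram(50000, 53, b"hello")  # illustrative ports

# The receiver splits the datagram back into header fields and body.
src, dst, length, checksum = struct.unpack("!HHHH", datagram[:8])
body = datagram[8:]
```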

Streams are ‘like’ a connected pipe that sends information to a location. The pipe is an abstraction, though: underneath, the protocol splits the data into packets of a bounded size and passes them to the next lower layer for addressing.

Example : TCP. You send the data, the protocol breaks it up into chunks and passes them to the IP layer below for addressing. There are very specific packet formats for it, since TCP ensures ‘reliable transmission of ordered, error-checked octets’.
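The practical difference between the two shows up in how messages arrive. The Python sketch below uses local socket pairs rather than real UDP and TCP connections, purely to stay self-contained; it assumes a Unix-like system, and the variable names are mine. A datagram socket hands over one message per receive call, while a stream socket only hands over bytes.

```python
import socket

# Datagram pair: every send() is delivered as one self-contained message.
d_tx, d_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
d_tx.send(b"one")
d_tx.send(b"two")
datagrams = [d_rx.recv(1024), d_rx.recv(1024)]

# Stream pair: the receiver just sees bytes; message boundaries are gone.
s_tx, s_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
s_tx.sendall(b"one")
s_tx.sendall(b"two")
s_tx.close()  # signals end-of-stream to the reader

stream = b""
while True:
    chunk = s_rx.recv(1024)  # may return any number of bytes
    if not chunk:            # empty result: the peer has closed
        break
    stream += chunk

for s in (d_tx, d_rx, s_rx):
    s.close()

print(datagrams, stream)
```

This is why protocols built on streams need their own framing, such as a length prefix or a delimiter, to tell where one message ends and the next begins.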

It is recommended not to define your own packet format unless absolutely necessary. The format is normally implied by the Address Family you choose, with the exception of AF_PACKET, which allows you to define your own data-link units.

Transport schemes are of the following types :

  1. SOCK_STREAM : Stream type socket
  2. SOCK_DGRAM : Datagram type socket
  3. SOCK_RAW : Raw type socket, provides the packet as-is
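As a short illustration in Python, the type is passed as the second argument when creating a socket, and the operating system can be asked for it afterwards. This sketch assumes IPv4 and a platform that exposes `SO_TYPE`; the raw socket attempt is guarded because it usually needs elevated privileges, and the flag name `raw_allowed` is mine.

```python
import socket

# Stream and datagram sockets over IPv4; the protocol defaults to TCP and UDP.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Ask the OS what type each socket really is.
tcp_type = tcp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE)
udp_type = udp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE)

tcp_sock.close()
udp_sock.close()

# SOCK_RAW usually requires elevated privileges, so only try it guardedly.
try:
    raw_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    raw_sock.close()
    raw_allowed = True
except OSError:
    raw_allowed = False

print(tcp_type == socket.SOCK_STREAM, udp_type == socket.SOCK_DGRAM, raw_allowed)
```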

These types mean the same thing no matter which programming language you implement them in, although the names used to select them may differ.

This tutorial is for people who have done some sockets programming before and wish to understand it better. It should also help beginners looking for a clearer explanation of the subject.

For tutorials on implementing the concepts :

  1. Python
  2. Java
  3. C/C++

Understanding the OSI Model is essential to working effectively with networks.

Sockets programming underlies nearly all communication between processes on a network. Stream and datagram sockets work at the transport layer (Layer 4), while raw and AF_PACKET sockets allow control down to Layers 3 and 2 of the OSI Model.

Hopefully this will enable you to approach sockets programming more effectively.

Supratik Chatterjee

Software Developer, Researcher and Mentor. Aeromodelling and Robotics hobbyist.