40 years of Berkeley Sockets
Programming in current FreeBSD versus programming in 1983’s 4.2BSD
Introduction
Low-level Socket programming is rarely needed today, because Sockets are usually buried under higher protocol layers, but the need still arises occasionally when high performance and low latency are required. These days I am analyzing an end-to-end institutional foreign exchange trading pipeline, and once again I need selected parts to work directly with the BSD Socket implementation.
The year is 1983 and 4.2BSD is announced. 4.2BSD represents a milestone in the timeline of UNIX, as Berkeley Sockets — the implementation funded by DARPA to implement their new set of protocols — are incorporated into UNIX®.
Back then it was not clear that the DARPA protocol stack was meant to become the de facto standard, and TCP/IP was often seen as an interim solution until the OSI protocols were fully settled and both hardware and software implementations became available. Other architectures such as the Xerox Network System (XNS) and IBM Systems Network Architecture (SNA) were considered serious contenders. IBM SNA actually was one, within its own perimeter, for a while. XNS is barely described in Wikipedia now.
Sockets have withstood the passing of time. In this post, I cover how 4.2BSD material, which is 40 years old, can still be used to program FreeBSD. There have not been significant changes, and the C source code of that period can be compiled with very minor adjustments after the natural conversion from K&R to ANSI C.
Official notes and tutorials
As part of the 4.2BSD documentation we can find the Networking Implementation Notes. This is a fairly technical document that provides good insight into how Berkeley Sockets were implemented in the UNIX® operating system. The implementation notes are not light reading, but they are worth the effort if you want an overview of how the stack is implemented.
A more practical and extremely well-crafted article was provided by Bill Tuthill in the April 1985 issue of UNIX Review. The article describes the new functionality of Sockets as a means of IPC, which is what Sockets are really about. It is very informative and educational, and it covers a simple TCP example. Other than the K&R C style and the usage of the deprecated BSD bzero and bcopy functions, the examples are usable. It is advisable to replace bzero and bcopy with their standard counterparts memset and memcpy (or memmove when the buffers may overlap, which bcopy tolerated). Note that bcopy takes the source first, while memcpy takes the destination first. memset and memcpy are part of ISO C and come from System V, while bzero and bcopy come from BSD; POSIX.1-2001 marked them as legacy and POSIX.1-2008 removed them.
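To make the conversion concrete, here is a minimal sketch of what a modernized TCP example of that era might look like. It is not Tuthill's code: the helper names `loopback_addr` and `tcp_loopback_send` are made up for illustration, and both ends of the connection live in one process on 127.0.0.1 only to keep the example short.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper. In K&R style the definition would have read:
 *     loopback_addr(sin, port)
 *     struct sockaddr_in *sin; unsigned short port;
 *     { ... }
 */
void loopback_addr(struct sockaddr_in *sin, unsigned short port)
{
    memset(sin, 0, sizeof(*sin));        /* was: bzero((char *)sin, sizeof(*sin)) */
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);
    sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
}

/* Hypothetical helper: sends msg (including its NUL) over a TCP connection
 * on 127.0.0.1 and copies what the accepting side read into out.
 * Returns the number of bytes received, or -1 on error. */
int tcp_loopback_send(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    char buf[256];
    size_t msglen = strlen(msg) + 1, got = 0;
    int lsock, csock, asock = -1, result = -1;
    ssize_t r;

    assert(msglen <= sizeof(buf) && msglen <= outlen);

    lsock = socket(AF_INET, SOCK_STREAM, 0);
    csock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock < 0 || csock < 0)
        goto done;

    /* Server side: bind to port 0 so the kernel picks a free port. */
    loopback_addr(&sin, 0);
    if (bind(lsock, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(lsock, 1) < 0 ||
        getsockname(lsock, (struct sockaddr *)&sin, &len) < 0)
        goto done;

    /* Client side: connect to the port the kernel chose. */
    if (connect(csock, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        goto done;
    asock = accept(lsock, NULL, NULL);
    if (asock < 0)
        goto done;

    if (write(csock, msg, msglen) != (ssize_t)msglen)
        goto done;
    while (got < msglen) {               /* TCP is a byte stream: loop on read */
        r = read(asock, buf + got, msglen - got);
        if (r <= 0)
            goto done;
        got += (size_t)r;
    }
    memcpy(out, buf, got);               /* was: bcopy(buf, out, got) */
    result = (int)got;
done:
    if (asock >= 0) close(asock);
    if (csock >= 0) close(csock);
    if (lsock >= 0) close(lsock);
    return result;
}
```

Calling `tcp_loopback_send("hello", out, sizeof(out))` from a `main` should return 6 (five characters plus the terminating NUL) and leave "hello" in `out`. The socket, bind, listen, connect and accept calls are the same ones a 1985 program would have used; only the function prototypes and the memory helpers changed.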
It surprised me that even the header file locations and filenames are still the same in modern FreeBSD, 40 years later.
A very concise but informative tutorial on 4.2BSD sockets, including how to implement a complete UDP client-server tool, is given in the Journal of Computer Communications, Volume 10, Issue 1, February 1987. This article, written by David Coffield and Doug Shepherd, is extremely good, even today, if you need a quick and practical introduction to programming BSD Sockets. The usage of UDP instead of TCP is particularly useful, as most material focuses on TCP.
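For comparison with TCP, a UDP exchange can be sketched as follows. This is not the code from the Coffield and Shepherd article; `udp_loopback_echo` is a hypothetical helper that plays both the client and the server role in a single process over 127.0.0.1, purely to show the sendto and recvfrom calls side by side.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: the "client" socket sends msg (including its NUL) as
 * one datagram to a "server" socket, which echoes it back to the sender.
 * Returns the length of the reply seen by the client, or -1 on error. */
int udp_loopback_echo(const char *msg, char *reply, size_t replylen)
{
    struct sockaddr_in srv, peer;
    socklen_t slen = sizeof(srv), plen = sizeof(peer);
    char buf[512];
    size_t msglen = strlen(msg) + 1;
    int ssock, csock, result = -1;
    ssize_t n;

    assert(msglen <= sizeof(buf) && msglen <= replylen);

    ssock = socket(AF_INET, SOCK_DGRAM, 0);
    csock = socket(AF_INET, SOCK_DGRAM, 0);
    if (ssock < 0 || csock < 0)
        goto done;

    /* Server: bind to 127.0.0.1, port 0 (kernel picks a free port). */
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_port = htons(0);
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(ssock, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
        getsockname(ssock, (struct sockaddr *)&srv, &slen) < 0)
        goto done;

    /* Client: no connect() and no listen()/accept(); the kernel assigns
     * an ephemeral source port on the first sendto(). */
    if (sendto(csock, msg, msglen, 0,
               (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto done;

    /* Server: recvfrom() returns the datagram and the sender's address,
     * which is all that is needed to send the reply. */
    n = recvfrom(ssock, buf, sizeof(buf), 0, (struct sockaddr *)&peer, &plen);
    if (n < 0)
        goto done;
    if (sendto(ssock, buf, (size_t)n, 0, (struct sockaddr *)&peer, plen) < 0)
        goto done;

    /* Client: read the echoed datagram. */
    n = recvfrom(csock, reply, replylen, 0, NULL, NULL);
    if (n < 0)
        goto done;
    result = (int)n;
done:
    if (csock >= 0) close(csock);
    if (ssock >= 0) close(ssock);
    return result;
}
```

Unlike the TCP case there is no connection to establish and no read loop: each recvfrom returns exactly one datagram. A real client would also need a timeout and retry policy, since UDP does not guarantee delivery outside the loopback-friendly conditions assumed here.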
Are 40-year-old articles and documentation still useful? In the case of Sockets, they are. The reason is that Sockets have not substantially changed since their introduction. I have found this material easier to read as an introduction than modern documentation, maybe because back then networking was not considered a commodity and hence the explanations were aimed at people without a previous background in it.
By going through this material I also discovered that 4.3BSD implemented a fairly interesting Socket type, SOCK_SEQPACKET, which was back then planned for the Xerox Network System. A review of the current FreeBSD man page reveals that the Socket type is still there, or at least still documented. It provides sequenced, reliable, connection-based communication (as SOCK_STREAM does) but preserves message boundaries, carrying datagrams of a fixed maximum length, a feature especially interesting for implementing messaging systems.
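The boundary-preserving behavior is easy to observe with a pair of connected local sockets. The sketch below assumes the system supports SOCK_SEQPACKET in the AF_UNIX domain (current FreeBSD and Linux do); `first_read_size` is a hypothetical helper name, not part of any API.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: sends messages a and b back to back on one end of a
 * socketpair of the given type, then returns how many bytes the first
 * recv() on the other end delivers, or -1 on error. */
int first_read_size(int type, const char *a, const char *b)
{
    int sv[2];
    char buf[128];
    size_t la = strlen(a), lb = strlen(b);
    ssize_t n = -1;

    assert(la + lb <= sizeof(buf));

    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;

    /* Two separate sends; both are queued before anything is read. */
    if (send(sv[0], a, la, 0) == (ssize_t)la &&
        send(sv[0], b, lb, 0) == (ssize_t)lb)
        n = recv(sv[1], buf, sizeof(buf), 0);

    close(sv[0]);
    close(sv[1]);
    return (int)n;
}
```

With `first_read_size(SOCK_SEQPACKET, "BUY", "SELL")` the first recv delivers only the 3 bytes of the first message, even though the buffer has room for both. Passing SOCK_STREAM instead gives no such guarantee, and the two messages may arrive merged into a single read, which is why stream-based messaging systems need their own framing.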
Books
Douglas Comer’s classic three-volume Internetworking with TCP/IP is still a reputable source. The volumes were all updated during the 90s, so they cover ANSI and POSIX programming. Originally the Socket implementation book (Volume III) was published in two versions, one for AT&T System V and the other for BSD sockets. They can often be found on Amazon at a very reduced price through the usual used-book vendors.
Another set of classics worth reading are the books written by William Stallings in the early 80s describing the DoD DARPA protocols (do not be fooled by the DoD codes, they are the same as the IETF ones). The books are pretty old but constitute a good introduction to the Internet Protocol Stack. Some of them describe protocols that are no longer mainstream (such as FTP and Telnet), but they are still pretty informative for understanding how the Internet Networking Protocol Stack was conceived and how each layer is intended to cooperate with adjacent ones. William Stallings’s books are in general highly recommended, as he manages to make technical books not boring, something that is not easy to achieve.
A last book, and likely the most practical of the ones described here, is UNIX Network Programming by W. Richard Stevens, who wrote the first two editions before he unfortunately passed away in 1999; the third edition was completed by Bill Fenner and Andrew Rudoff. The books are an authoritative source and were reviewed by Dennis Ritchie (among others). The third edition is reasonably up to date. The first edition contains a very interesting condensed history of AT&T and BSD UNIX versions, since it describes all the IPC methods.