I had already decided to use the very elegant ocaml-ctypes module for the SRT binding, so I went with it and created an ocaml-sys-socket module using it as well. It was a very interesting experience that I would like to describe here!
The idea behind OCaml ctypes is to create a binding against a C library without having to write C code, or at least as little as possible. The most straightforward way of using it is via libffi, providing access to dynamically-loaded libraries.
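To give a feel for that first mode, here is a minimal sketch, assuming the ctypes and ctypes.foreign libraries are installed. It binds the C library's strlen, which is my choice for illustration and has nothing to do with the socket binding itself:

```ocaml
(* Requires the ctypes and ctypes.foreign (libffi-based) libraries. *)
open Ctypes
open Foreign

(* Bind the C standard library's strlen(3) at run time, without writing C.
   strlen is only an illustrative choice, not part of ocaml-sys-socket. *)
let c_strlen : string -> Unsigned.size_t =
  foreign "strlen" (string @-> returning size_t)

let () =
  Printf.printf "strlen(\"socket\") = %d\n"
    (Unsigned.Size_t.to_int (c_strlen "socket"))
```

No C code is written here: the function is looked up and called through libffi at run time.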
The second way of using it is by letting the module generate the basic C stubs required to build and link against a shared library. This is the mode that we’re going to use here. In this mode, the programmer has to describe the C headers of the library they intend to bind to using dedicated OCaml modules, operators and types. From that description, ocaml-ctypes is able to generate the required glue for the binding.
One advantage of using ocaml-ctypes is that the created bindings make as few assumptions as possible about the OCaml C interfacing API. This is pretty nice, in particular since the OCaml compiler is moving pretty quickly these days (which is awesome!) and also if, perhaps one day, support for multi-core is added to the compiler, which will undoubtedly change the C interface API quite a bit.
dune (formerly jbuilder) is a build system for OCaml projects that has recently risen to great popularity, particularly due to its tight integration with the rest of the OCaml ecosystem, such as
My personal motto in programming in general is that “Simple things should be simple, but complex things should be possible”.
dune certainly does not fit into that category but, rather, makes some complex things extremely easy to set up. It’s the kind of tool that will make your life much easier when what you intend to do fits well within its workflow but might not be easy to bend to some very specific niche use. We will see one such case below.
At any rate, it’s been an amazing experience getting to learn how to use dune, and the resulting code and build system are remarkably short and elegant, yet very powerful.
socket.h is the Unix header that describes the C API for various socket operations, covering IP versions 4 and 6 as well as Unix file sockets. There is also a Windows API mimicking it, which makes most code using it easily portable to Windows.
Most network-based C libraries refer to
socket.h to describe the type of socket that can be used with their API so it’s an important entry point for a lot of network operations and one that would be nice to support as generically as possible in OCaml.
The catch, though, is that, most likely for historical reasons¹, the POSIX specification only partially defines some of the required data structures and types. This makes it possible to write C code using them but does not give enough information to write C bindings without having to use the compiler to parse the actual system-specific headers of the running host.
For instance, here’s how the
sockaddr structure is specified:
The <sys/socket.h> header defines the sockaddr structure that includes at least the following members:
    sa_family_t  sa_family  address family
    char         sa_data[]  socket address (variable-length data)
Likewise, here’s what is specified about the size of the
socklen_t data type:
<sys/socket.h> makes available a type, socklen_t, which is an unsigned opaque integral type of length of at least 32 bits.
Thus, in order to know the exact offset of
sa_family inside the
sockaddr structure or the actual size of a
socklen_t integer, one has to include the OS-specific header, parse its definitions for that specific OS and, only then, is it possible to compute that offset or data size. Let’s see how it’s done in our binding now!
Putting it together
The C binding requires 4 separate passes:
- constants pass, which computes and exports some specific constants and data sizes, computed from the C headers.
- types pass, which, given the system-specific constants and sizes exported in the previous phase, defines the actual C data structure bindings.
- stubs pass, where we define the actual bindings to the C functions that we wish to export in our API.
- Finally, the last pass does a cleanup of the stubs pass to export a relevant and OCaml- (and ocaml-ctypes) specific public API that is to be used by users of the module.
dune makes each of these steps fairly easy to integrate into the next one, defining compilation elements and binaries to build before moving to the next pass.
During that pass, we compute and export all required C values defined in the headers. We also add our own constants, which give us the sizes that the POSIX specifications leave up to the OS. Here’s the OCaml code for it:
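The exact list of constants is best read from the ocaml-sys-socket repository. As a rough sketch only, assuming the Cstubs.Types.TYPE interface from ocaml-ctypes, such a definition could look like this. Apart from SOCKLEN_T_LEN, the constants picked here (AF_INET, AF_INET6, AF_UNIX) are my own selection for illustration:

```ocaml
(* Requires the ctypes and ctypes.stubs libraries.
   Illustrative sketch, not the actual Sys_socket_constants module. *)
module Def (S : Cstubs.Types.TYPE) = struct
  (* Constants provided by the system headers. *)
  let af_inet = S.constant "AF_INET" S.int
  let af_inet6 = S.constant "AF_INET6" S.int
  let af_unix = S.constant "AF_UNIX" S.int

  (* Custom constant: assumed to be #defined by the generator's C prelude,
     for instance as sizeof(socklen_t). *)
  let socklen_t_len = S.constant "SOCKLEN_T_LEN" S.int
end
```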
Pretty straightforward! Some of these constants are defined by the POSIX headers and some are custom-defined for our needs, for instance SOCKLEN_T_LEN. Here’s how they are extracted, using the
dune build configuration for
This OCaml code makes use of ocaml-ctypes to build a binary that exports the OCaml interface defined by Sys_socket_constants.Def. Once compiled, its output looks like this:
The files used to describe how to build this binary using
dune are located in a separate
generator directory. Here’s the entry to build this one:
This executable is compiled during the next phase. Let’s move into it now!
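To recap this pass, here is a hedged sketch of what such a generator can look like. It assumes Cstubs.Types.write_c from ocaml-ctypes; the one-constant Def module and the C prelude defining SOCKLEN_T_LEN via sizeof are simplified stand-ins, not the repository's actual code:

```ocaml
(* Requires the ctypes and ctypes.stubs libraries. *)

(* Simplified stand-in for the constants description. *)
module Def (S : Cstubs.Types.TYPE) = struct
  let socklen_t_len = S.constant "SOCKLEN_T_LEN" S.int
end

(* C prelude placed at the top of the generated program. The custom
   constant is defined here so that the C compiler computes its value
   (assumption: this mirrors how the custom constants are provided). *)
let c_headers = {|
#include <sys/socket.h>
#define SOCKLEN_T_LEN (sizeof(socklen_t))
|}

(* Emit the C program on the given formatter. *)
let write_generator fmt =
  Format.fprintf fmt "%s@\n" c_headers;
  Cstubs.Types.write_c fmt (module Def);
  Format.pp_print_flush fmt ()

let () = write_generator Format.std_formatter
```

Running this program prints a C program. Compiling and running that C program against the target's headers in turn prints the OCaml module holding the constant values.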
During that phase, we use the constants exported during the previous phase to describe the various C structures and types. This is by far the most complex part of the code, making use of first-class modules and several OCaml tricks.
First, let’s look at how we tell
dune that we need to generate the
.ml file exporting our required constants from the previous pass:
With only this information, if the code refers to a
dune will know that this module needs to be generated and how to do it. We will explain later the use of the
exec.sh wrapper here.
Now that we can make use of the exported constants in our OCaml code, let’s see how we define the
Socklen module, exporting abstract types and interface to use
As you can see, we make use of first-class modules and the size of the socklen_t integer to define the right API for the compiling host. Now let’s see how we define the
Here, too, we make use of the size of
sa_family as exported previously to define the right structure fields.
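The size-driven selection used for socklen_t and sa_family can be sketched with the standard library alone. In this simplified illustration, Int_like, I32, I64 and int_of_size are hypothetical names, Int32 and Int64 stand in for the unsigned integer modules a real binding would use, and the size is hard-coded instead of coming from the generated constants:

```ocaml
(* Interface shared by every candidate integer implementation. *)
module type Int_like = sig
  type t
  val of_int : int -> t
  val to_int : t -> int
  val byte_size : int
end

(* Stand-ins built on the standard library (signed, for simplicity). *)
module I32 : Int_like = struct
  type t = int32
  let of_int = Int32.of_int
  let to_int = Int32.to_int
  let byte_size = 4
end

module I64 : Int_like = struct
  type t = int64
  let of_int = Int64.of_int
  let to_int = Int64.to_int
  let byte_size = 8
end

(* Pick an implementation, as a first-class module, from a size in bytes. *)
let int_of_size (size : int) : (module Int_like) =
  match size with
  | 4 -> (module I32)
  | 8 -> (module I64)
  | n -> failwith (Printf.sprintf "unsupported integer size: %d bytes" n)

(* Hypothetical value: in the binding it comes from the generated constants. *)
let socklen_t_len = 4

(* Unpack the selected module; Socklen.t stays abstract for users. *)
module Socklen = (val int_of_size socklen_t_len : Int_like)

let () =
  let len = Socklen.of_int 16 in
  Printf.printf "%d-byte socklen_t holding %d\n"
    Socklen.byte_size (Socklen.to_int len)
```

Because Socklen.t is abstract, code using it cannot accidentally depend on the integer width of one particular host.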
Next step, we need to compile this interface again to export the right offset for the various structures that have been defined. That’s
dune’s job again!
First, the generator code:
And the build instructions:
Once compiled, the exported .ml looks like this:
As you can see, this exports all the offsets required to access the fields inside a
sockaddr_t structure. We’re now ready to move to the final stage, which is the actual binding stubs!
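For reference, the kind of structure description that produces such offsets can be sketched as follows, assuming the Cstubs.Types.TYPE interface. The field type is hard-coded to uint16_t here for brevity, which is an assumption that does not hold on every system; as described above, the binding derives it from the exported size instead:

```ocaml
(* Requires the ctypes and ctypes.stubs libraries.
   Illustrative sketch, not the repository's actual types module. *)
module Def (S : Cstubs.Types.TYPE) = struct
  open Ctypes
  open S

  (* Phantom type tagging the C struct. *)
  type sockaddr

  let sockaddr : sockaddr structure typ = structure "sockaddr"

  (* Assumption: sa_family is a 16-bit unsigned integer on this system. *)
  let sa_family = field sockaddr "sa_family" uint16_t

  (* Sealing ends the description: the struct size and the field offsets
     are then taken from the C compiler rather than computed in OCaml. *)
  let () = seal sockaddr
end
```

Nothing in this description states where sa_family lives inside the structure: ocaml-ctypes asks the C compiler for that.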
First step in this pass, just like with the previous ones, we need to configure
dune to be able to build the exported
.ml code from the
And we can now define the proper bindings. Here’s what it looks like:
As you can see, we’re exporting the
getnameinfo function, taking various arguments, including a pointer to a
sockaddr_t structure and a couple of
socklen_t integers, making use of all the various data types and structures previously defined. The exact specifications of this function can be found here. We can now define our top-level API.
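As a minimal sketch of what such a description can look like, assuming the Cstubs.FOREIGN interface, here is getnameinfo following its POSIX prototype. The sockaddr and socklen_t definitions below are local stand-ins (with socklen_t assumed to be 32 bits wide), where the binding reuses the types built during the previous pass:

```ocaml
(* Requires the ctypes and ctypes.stubs libraries.
   Illustrative sketch, not the repository's actual stubs module. *)
module Def (F : Cstubs.FOREIGN) = struct
  open Ctypes
  open F

  (* Local stand-ins for the types defined during the types pass. *)
  type sockaddr
  let sockaddr : sockaddr structure typ = structure "sockaddr"

  (* Assumption: socklen_t is a 32-bit unsigned integer on this system. *)
  let socklen_t = uint32_t

  (* int getnameinfo(const struct sockaddr *sa, socklen_t salen,
                     char *node, socklen_t nodelen,
                     char *service, socklen_t servicelen, int flags); *)
  let getnameinfo =
    foreign "getnameinfo"
      (ptr sockaddr @-> socklen_t   (* address and its length *)
       @-> ptr char @-> socklen_t   (* host buffer and its size *)
       @-> ptr char @-> socklen_t   (* service buffer and its size *)
       @-> int                      (* flags *)
       @-> returning int)           (* 0 on success, error code otherwise *)
end
```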
Building upon the previous modules, we export various OCaml idiomatic APIs that the binding user can now use to build new bindings against the
Just like with the previous steps, first we need to configure the build system:
This time, we need
ocaml-ctypes to generate two compilation units: a
.ml file describing the API exported during the
stubs phase, as well as the C code to glue it with the C APIs. Here’s the code for that generator:
.c files are omitted here for simplicity but the reader can generate them themselves from the ocaml-sys-socket repository if they are curious about their actual content.
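For the curious, a generator of this kind can be sketched as follows, assuming Cstubs.write_c and Cstubs.write_ml from ocaml-ctypes. The strlen binding, the prefix and the command-line convention are placeholders of mine, not the repository's code:

```ocaml
(* Requires the ctypes and ctypes.stubs libraries. *)

(* Small stand-in binding description (strlen is only a placeholder). *)
module Def (F : Cstubs.FOREIGN) = struct
  open Ctypes
  open F
  let strlen = foreign "strlen" (string @-> returning size_t)
end

(* Hypothetical prefix shared by the generated C and OCaml symbols. *)
let prefix = "sys_socket_stubs"

(* C side: the glue functions calling into the C library. *)
let write_c fmt =
  Format.fprintf fmt "#include <string.h>@\n";
  Cstubs.write_c fmt ~prefix (module Def);
  Format.pp_print_flush fmt ()

(* OCaml side: the externals matching the generated C glue. *)
let write_ml fmt =
  Cstubs.write_ml fmt ~prefix (module Def);
  Format.pp_print_flush fmt ()

(* Hypothetical command line: "c" or "ml" selects what to print. *)
let () =
  match Sys.argv with
  | [| _; "c" |] -> write_c Format.std_formatter
  | [| _; "ml" |] -> write_ml Format.std_formatter
  | _ -> prerr_endline "usage: gen (c|ml)"
```

Both outputs must be generated with the same prefix, since the OCaml externals refer to the C functions by name.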
We can now export our top-level API:
That’s it! We now have
ocaml-ctypes specific data types and structures that can be used to interface with the host’s native
socket.h APIs. Note that we also worked on top of the original low-level binding to
getnameinfo to export a higher-level function more idiomatic to the OCaml language.
Lagniappe: cross-compilation to Windows
On Windows platforms, liquidsoap is compiled using ocaml-cross-windows and, since Windows does have compatible socket APIs, we wanted to also look at cross-compiling for the Windows target, which is where we hit a snag on the current
The problem is that, at each intermediate step of a cross-compilation, the compiled binaries need to use the target OS’s headers and not the host’s. Otherwise, we end up using offsets specific to e.g. Debian in a Windows binary.
In this case, this means that the compiled
.exe binaries need to be Windows binaries and that we need to execute them as Windows native binaries, using
dune has truly amazing support for cross-compiling, which we do not cover here, but, unfortunately, its primitives for building and executing binaries do not yet cover this use case. Thus we had to trick it into compiling things the way we wanted, which is why we are using the
exec.sh wrapper. Here’s its code:
Now, you can go back to the previous dune files and see how this wrapper allows us to execute binaries according to the system that the corresponding ocamlopt compiler has been configured to build for.
It’s been a fun time working on this binding! It’s amazing to see the level of detail that can be achieved through ocaml-ctypes using its provided primitives. Ultimately, the binding is very clean and elegant, with very few low-level assumptions.
Likewise, the simplicity and power of the dune build system make this very fluid to build. Without it, each of the steps described above would have been much more painful to execute and compile.
¹: My bet is that, at the time the POSIX specifications were being written, there were already several inconsistent socket.h headers out in the wild among the various historical UNIX flavors.