Serialization series — Do you speak Erlang ETF or BERT? (part 1)

Mathieu Kerjouan
30 min read · Jul 17, 2017

Sometimes one piece of software needs to talk to another. Sometimes it even needs to talk to hardware or remote equipment. So how do you communicate with a device when there is no defined norm or standard? Well, you could make your own, but encoding data in a home-made format is not so easy…

ETF (External Term Format) and BERT (Binary ERlang Term) are answers to this common issue. If you know a little about serialization, or if you have ever used protobuf (2001), msgpack (2009), nanopb (2011) or Apache Thrift (2007), this post will only show you another way to do the same thing. For everyone else, I will try to teach you how to talk to software or hardware with a binary format.

ETF and BERT are like JSON or XML: smaller, faster and easy to use, but a bit harder to understand because of their inherent binary nature. The External Term Format was made in 1997 by Bjorn Gustavsson at Ericsson. BERT (Binary ERlang Term) was specified by Tom Preston-Werner, who released BERT-1.0 in 2009.

The main purpose of a binary format is to transparently transform any kind of structure from a language N into a term in another language M. In this article, I will show you how to communicate to and from a C node. This binary format is also used internally by Erlang and gives us a simple way to talk to other nodes. Indeed, you can also talk to another language, like Java, Perl, Python, or whatever you want.

Are ETF and BERT used in production? Currently, GitHub uses BERT in production and, on the open-source side, the Erlang N2O framework uses it too. Obviously, a lot of Erlang software uses ETF internally or to communicate with ports, nodes, Jinterface or NIFs.

Serialization

In computer science, in the context of data storage, serialization is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer) or transmitted (for example, across a network connection link) and reconstructed later (possibly in a different computer environment).

Encoding and Decoding ETF in Erlang

Before starting to write some C code, I will show you how to encode and decode ETF in Erlang. Two built-in functions do most of the work: term_to_binary/1 and binary_to_term/1.

[Listing: etf.erl, a simple Erlang module where encoding and decoding need only 2 functions.]

Copy/paste this snippet into a file named etf.erl, compile it with erlc and execute it. You can also compile it directly in erl (with the c/1 function).

$ erl -noshell -run etf encoding
term: [{key,"value"},1,2,3]
encoded: <<131,108,0,0,0,4,104,2,100,0,3,107,
101,121,107,0,5,118,97,108,117,101,
97,1,97,2,97,3,106>>

The etf:encoding/0 function outputs a binary (the encoded Erlang term). Your list has been converted to a binary format.

$ erl -noshell -run etf decoding
term: [{key,"value"},1,2,3]
encoded: <<131,108,0,0,0,4,104,2,100,0,3,107,
101,121,107,0,5,118,97,108,117,101,
97,1,97,2,97,3,106>>
decoded: [{key,"value"},1,2,3]

The etf:decoding/0 function outputs Erlang terms (it decodes the ETF binary). This code takes the binary returned by etf:encoding/0 and decodes it to rebuild a data structure Erlang understands: a list containing multiple items.
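To make the printed bytes less mysterious, here is a small C sketch that reads the header of that binary by hand. It does not use the Erlang libraries; the tag values come from the External Term Format specification, and the helper names (etf_tag_name, etf_read_u32, etf_list_length) are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag values taken from the External Term Format specification. */
enum {
    ETF_VERSION       = 131,
    ETF_SMALL_INTEGER = 97,
    ETF_ATOM          = 100,
    ETF_SMALL_TUPLE   = 104,
    ETF_NIL           = 106,
    ETF_STRING        = 107,
    ETF_LIST          = 108
};

/* Hypothetical helper: map a tag byte to a readable name. */
const char *etf_tag_name(unsigned char tag) {
    switch (tag) {
    case ETF_VERSION:       return "version";
    case ETF_SMALL_INTEGER: return "small integer";
    case ETF_ATOM:          return "atom";
    case ETF_SMALL_TUPLE:   return "small tuple";
    case ETF_NIL:           return "nil";
    case ETF_STRING:        return "string";
    case ETF_LIST:          return "list";
    default:                return "unknown";
    }
}

/* Hypothetical helper: read a 32-bit big-endian unsigned integer. */
uint32_t etf_read_u32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return the element count of a top-level list, or -1 when the
 * buffer does not start with the version byte followed by a list. */
long etf_list_length(const unsigned char *buf, size_t len) {
    if (len < 6 || buf[0] != ETF_VERSION || buf[1] != ETF_LIST)
        return -1;
    return (long)etf_read_u32(buf + 2);
}
```

Applied to the binary printed above, etf_list_length reports 4 elements. The bytes that follow start with 104 (a small tuple of arity 2), then 100 (an atom of length 3, key), 107 (a string of length 5, value), three small integers (97) and finally 106, the empty list that closes the list.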

Interesting, isn’t it? We can do a lot more with that! For example, we can take a lambda function (known as a fun in Erlang) or any arbitrary term, encode it in binary format and store it in a database, send it through a socket or keep it in some other secure place… Obviously, you can retrieve it and execute it directly somewhere else. Pretty amazing! One more question: how does it work?

binary_to_term/1 and term_to_binary/1 are BIFs (Built-In Functions). BIFs are written in C and shipped with the Erlang release. Most of the encoding and decoding functions live in external.c and external.h.

Communicate with Ports

Now we can encode and decode Erlang terms on the fly, but we need another thing… If we want to communicate with another piece of software, how do we start it from Erlang? This question has several answers, but in this article I will only show you the port method. You can obviously use any other method to talk to your software (e.g. a TCP, UDP or Unix socket).

[Figure: Ports schema from Erlang User’s Manual.]

A port is a program running outside the BEAM (the Erlang virtual machine). Its standard input and output are handled by the BEAM through a special dedicated process-like entity named a “port”. If your external piece of code crashes, or something goes wrong outside the VM, the VM itself will not be affected. Even better, the Erlang process that owns the port can restart it automatically if you want.

For this first example I will show you how to execute arbitrary Unix commands. You can start erl and execute these functions:

% execute the ls command
erlang:open_port({spawn, "ls"}, []).
% everything the command writes to stdout is sent as
% messages to the process connected to the port, in our
% case, our shell process
flush().

The first function opens a port. Its first argument is the method used to start the port; in our case, we want to spawn an instance of ls without arguments. The second argument is a list of extra options.

Once created, the port is automatically connected to our shell process, and all messages (from the ls command) are sent to the shell mailbox. The flush here empties the mailbox and shows what was stored in it. Normally, we have received the listing of the current directory. If you want to check for yourself, you can just type ls() and see whether the same files are printed.

Erlang gives us a lot of useful functions for managing ports. Here is a non-exhaustive list of them:

  • erlang:ports/0 returns a list of all ports opened on the current Erlang virtual machine;
  • erlang:open_port/2 opens a new port. The first argument is the method used to start the port, followed by the command (and its optional arguments). The second argument is a list of options;
  • erlang:port_close/1 closes the port given as argument. If you only have the textual form of a port, erlang:list_to_port/1 can convert it back to a port identifier, which is handy when debugging;
  • erlang:port_info/1 returns information about the port given as argument.

GNU Makefile

Building a C project from scratch is a little tricky: there are many things to understand and learn about compilers and other tools. This quick and dirty GNU Makefile will help you compile all the associated C code and play with ETF. If you want to create your own snippet, just use etf.h; this header contains all you need.

[Listing: GNU Makefile]

This Makefile is configured with sensible options for the C compiler, including the Erlang library and header paths by default (it works with clang and gcc on Linux and FreeBSD). I’ve enabled debugging (-g) and all warnings (-Wall). To compile a hypothetical file named yourfile.c, just invoke make like this:

make build TARGET=yourfile.c

This command will create a directory named _build and an executable named yourfile in it. If you are using BSD or any operating system other than Linux, don’t forget to set the ERLANG_PATH variable to the right Erlang library location (on FreeBSD, Erlang libraries are stored in /usr/local/lib/erlang, on NetBSD in /usr/pkg/lib/erlang) and to use GNU make (gmake):

gmake build TARGET=yourfile.c ERLANG_PATH=/usr/local/lib/erlang

Headers and Helpers

Like any other framework or library, the Erlang External Interface requires at least one header: ei.h. This file contains all the function prototypes, data structures and macros associated with ETF.

In our case, the file ei.h is already included in etf.h. I added some other functions to help programmers see what’s going on in the encoded data structure and to print raw data from the buffer.

  • void ei_encoded_printf(char *, size_t) prints the buffer given as first argument, up to the size given as second argument;
  • void ei_encoded_fprintf(FILE *, char *, size_t) does the same job as ei_encoded_printf, but you choose the output stream with the first argument;
  • void ei_raw_print(char *, size_t) prints all data from the buffer, raw, to stdout;
  • void ei_x_vsizeN_fprintf(FILE *, ei_x_buff *) prints the size of an ei_x_buff data structure following the Erlang port protocol, where N is an integer (1, 2 or 4) giving the number of bytes used for the size;
  • void ei_x_vsize_fprintf(FILE *, ei_x_buff *, int) is another way to print the data structure size; the last argument is the number of bytes used for the size;
  • void ei_x_printf_raw(FILE *, ei_x_buff *) prints the raw data structure to the specified stream.

All functions using write are currently not working well. Please don’t use them ;)

I use ei.h because it’s the recommended interface for a C node, but you can also use erl_interface.h. This interface has some interesting macros like ERL_IS_* which can sometimes help you with type validation.

Encoding Version

ETF and BERT support several versions. The ETF standard isn’t static and can change when a new OTP version is released. To make your code safer and easier to understand across all your applications, you need to write the current format version at the start of your encoded buffer. To do so, you can use these functions:

  • int ei_encode_version(char *, int *) takes your buffer as first argument and a pointer to the buffer index as second argument. After the call, the index has been incremented and the buffer contains the right value;
  • int ei_decode_version(const char *, int *, int *) takes your buffer, followed by a pointer to the index; the last argument is a pointer to an integer that will receive the format version.

[Listing: Encoding ETF version in C on buffer using standard C API]

Encoding/Decoding Atom

Here we are. One of the first useful terms to know is the atom. In Erlang, atoms are unique terms used mainly for tagging messages. The following snippet shows what an atom looks like in Erlang:

MyAtom = 'atom'.
% or simply
MyAtom = atom.

In C, 3 functions are used for encoding and decoding atoms:

  • int ei_encode_atom(char *, int *, const char *) takes your buffer as first argument, a pointer to the index of this buffer as second argument, and a string containing the name of your atom as last argument;
  • int ei_encode_atom_len(char *, int *, const char *, int) is like ei_encode_atom, but here we can set a limit on the size. An atom name shouldn’t exceed MAXATOMLEN, generally set to 256;
  • int ei_decode_atom(const char *, int *, char *) decodes a buffer containing an atom. The first argument is your buffer containing the encoded pattern, the second argument is a pointer to the buffer index, and the last argument is an array of characters (which will receive the atom name).

On the C side, an atom is simply a string: a list of characters ended by the NUL character (\0). In ETF, it is stored as a tag followed by the length of the name and its characters.
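As a rough illustration of that layout, the sketch below writes an atom by hand using the classic ATOM_EXT tag (100) followed by a 2-byte big-endian length. The function name sketch_encode_atom is hypothetical, and the real ei_encode_atom may choose a different atom tag depending on the OTP release, so treat this as a picture of the wire format rather than a replacement for the library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of ei.h: write `name` as ATOM_EXT at
 * buf[*index] and advance *index. Returns 0 on success, -1 when the
 * name is longer than 255 characters. The caller must provide a
 * buffer with at least strlen(name) + 3 free bytes. */
int sketch_encode_atom(unsigned char *buf, int *index, const char *name) {
    size_t len = strlen(name);
    unsigned char *p = buf + *index;

    if (len > 255)
        return -1;

    p[0] = 100;                           /* ATOM_EXT tag */
    p[1] = (unsigned char)(len >> 8);     /* length, high byte */
    p[2] = (unsigned char)(len & 0xff);   /* length, low byte */
    memcpy(p + 3, name, len);             /* the name, without \0 */
    *index += 3 + (int)len;
    return 0;
}
```

Encoding the atom key this way yields the bytes 100,0,3,107,101,121, the same sequence that appears in the binary printed by etf:encoding/0.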

The OTP-20 release gives you the possibility to encode atoms in UTF-8. It’s a pretty new feature and I will not show you how to use it here.

[Listing: etf.h will help you to build C source code with Erlang External Interfaces]

Encoding/Decoding Numbers

Numbers are generally easy to use. In Erlang, only 2 types of numbers exist: integers and floats. In C, it’s a little different… But before attacking the C side, I will show you how Erlang represents numbers:

MyInteger = 1234.  % define a positive integer
MyFloat = 123.123. % define a float

In C, several functions can be used, depending on the size and the type of the number:

  • int ei_encode_long(char *, int *, long) encodes a long integer (64 bits on most 64-bit Unix systems). The first argument is the buffer, the second a pointer to the index of this buffer and the last one a signed long value;
  • int ei_encode_ulong(char *, int *, unsigned long) encodes an unsigned long integer. The first argument is the buffer, the second a pointer to the index of this buffer and the last one an unsigned long value;
  • int ei_encode_longlong(char *, int *, long long) encodes a 64-bit integer; this function works like ei_encode_long;
  • int ei_encode_ulonglong(char *, int *, unsigned long long) encodes an unsigned 64-bit integer; this function works like ei_encode_ulong;
  • int ei_encode_double(char *, int *, double) encodes a double-precision floating point number (64 bits). The first argument is your buffer, the second a pointer to the index of the buffer and the last one a double value;
  • int ei_encode_bignum(char *, int *, mpz_t) uses the GNU MP (Multiple Precision) library. I will not show you here how to use it or how to enable its support in Erlang;
  • int ei_decode_NUMBER(const char *, int *, NUMBER *): every number encoding function has a decoding counterpart. The first argument is your buffer, followed by a pointer to the buffer index, and the last argument is a pointer to a variable of the right type.

[Listing: Encoding ETF number in C on buffer using standard C API]
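To give an idea of what ends up in the buffer, here is a hand-written sketch of integer encoding. It is not the code of ei_encode_long: the name sketch_encode_long is hypothetical, it only covers the SMALL_INTEGER_EXT (97) and INTEGER_EXT (98) tags from the ETF specification, and it simply refuses values that would need a bignum.

```c
#include <stdint.h>

/* Hypothetical helper, not the real ei_encode_long: write an integer
 * at buf[*index] and advance *index. Returns 0 on success, -1 when
 * the value would need a bignum, which this sketch leaves out. */
int sketch_encode_long(unsigned char *buf, int *index, long value) {
    unsigned char *p = buf + *index;
    uint32_t bits;

    if (value >= 0 && value <= 255) {
        p[0] = 97;                        /* SMALL_INTEGER_EXT: 1 byte */
        p[1] = (unsigned char)value;
        *index += 2;
        return 0;
    }
    if (value < INT32_MIN || value > INT32_MAX)
        return -1;

    bits = (uint32_t)(int32_t)value;      /* two's complement bits */
    p[0] = 98;                            /* INTEGER_EXT: 4 bytes, big-endian */
    p[1] = (unsigned char)(bits >> 24);
    p[2] = (unsigned char)(bits >> 16);
    p[3] = (unsigned char)(bits >> 8);
    p[4] = (unsigned char)bits;
    *index += 5;
    return 0;
}
```

With this sketch, 1 becomes 97,1 and 1234 becomes 98,0,0,4,210 (1234 is 4 × 256 + 210).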

Encoding/Decoding Tuples

Erlang supports several terms for organizing data. A tuple is one of them, and it is a fixed-length data structure. To use it properly we need to know its arity. Here is the high-level Erlang representation of a tuple:

MyTuple = {1, 2, 3}.
MyTupleArity = erlang:size(MyTuple).

In C, we have 2 functions mainly used to encode and decode tuples:

  • int ei_encode_tuple_header(char *, int *, int) encodes a tuple header. The first argument is a pointer to char, your buffer. The second argument is a pointer to the index, where the data will be encoded in the buffer. The last argument is the size of this tuple (its arity);
  • int ei_decode_tuple_header(const char *, int *, int *) decodes a tuple header from a binary encoded string. The first argument is your encoded buffer, the second argument is a pointer to the buffer’s index, and the last argument is a pointer to an integer which will receive the arity of the tuple.

[Listing: Encoding ETF atom in C on buffer using standard C API]
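For tuples, the header is all there is: a tag, the arity, then the elements one after another. The sketch below builds {1, 2, 3} by hand. The function names are hypothetical, and the tags (104 for SMALL_TUPLE_EXT, 105 for LARGE_TUPLE_EXT, 97 for small integers) come from the ETF specification.

```c
#include <stdint.h>

/* Hypothetical helper: write a tuple header at buf[*index].
 * Tuples with up to 255 elements use a 1-byte arity, larger
 * ones use a 4-byte big-endian arity. */
void sketch_encode_tuple_header(unsigned char *buf, int *index,
                                uint32_t arity) {
    unsigned char *p = buf + *index;

    if (arity <= 255) {
        p[0] = 104;                       /* SMALL_TUPLE_EXT */
        p[1] = (unsigned char)arity;
        *index += 2;
    } else {
        p[0] = 105;                       /* LARGE_TUPLE_EXT */
        p[1] = (unsigned char)(arity >> 24);
        p[2] = (unsigned char)(arity >> 16);
        p[3] = (unsigned char)(arity >> 8);
        p[4] = (unsigned char)arity;
        *index += 5;
    }
}

/* Encode the term {1, 2, 3}, version byte included.
 * Returns the number of bytes written (buf needs 9 bytes). */
int sketch_encode_example_tuple(unsigned char *buf) {
    int index = 0;
    unsigned char n;

    buf[index++] = 131;                   /* format version */
    sketch_encode_tuple_header(buf, &index, 3);
    for (n = 1; n <= 3; n++) {
        buf[index++] = 97;                /* SMALL_INTEGER_EXT */
        buf[index++] = n;
    }
    return index;
}
```

The result is 131,104,3,97,1,97,2,97,3, which is what term_to_binary({1,2,3}) prints in an Erlang shell.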

Encoding/Decoding List

Another important term in Erlang is the list. Lists are everywhere in Erlang and can be used to build almost anything. A list is a dynamic data structure and can contain multiple items of multiple types. By default, a list is encoded like a tuple, with a length fixed in advance, but we can hack around this thanks to the way Erlang defines lists.

Erlang, like other functional programming languages, can represent lists in different ways. The classic way (syntactic sugar) is to write elements separated by commas and surrounded by square brackets. But you can also build a list by prepending one element at a time to the empty list. The final result is the same: you have made a list.

% syntactic sugar representation:
MyList = [1,2,3].
% or equivalent representation:
MyList = [1|[2|[3|[]]]].

With the Erlang Interface API in C, we have 3 functions to encode and decode lists:

  • int ei_encode_empty_list(char *, int *) encodes an empty list; this is always the last term used to close the list. The first argument is your buffer, the second is a pointer to the index of the buffer;
  • int ei_encode_list_header(char *, int *, int) encodes the header that defines the list. The first argument is your buffer, the second argument is a pointer to the index of your buffer and the last one is an integer giving the size of your list (the number of elements it contains);
  • int ei_decode_list_header(const char *, int *, int *) decodes a binary as a list. The first argument is your buffer, the second one is a pointer to the buffer’s index and the last one is a pointer to an integer which will receive the size of the list.

[Listing: Encoding ETF list in C on buffer using standard C API]
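The “hack” mentioned earlier is easier to see on raw bytes. This sketch, with hypothetical helper names, writes the list [1,2,3] twice: once with a single header announcing 3 elements, and once as three nested one-element lists closed by a single empty list, which mirrors [1|[2|[3|[]]]]. Tag 108 is LIST_EXT and tag 106 is NIL_EXT in the ETF specification.

```c
#include <stdint.h>

/* Hypothetical helpers writing raw ETF bytes at buf[*index]. */
void sketch_list_header(unsigned char *buf, int *index, uint32_t n) {
    unsigned char *p = buf + *index;
    p[0] = 108;                           /* LIST_EXT */
    p[1] = (unsigned char)(n >> 24);      /* element count, big-endian */
    p[2] = (unsigned char)(n >> 16);
    p[3] = (unsigned char)(n >> 8);
    p[4] = (unsigned char)n;
    *index += 5;
}

void sketch_empty_list(unsigned char *buf, int *index) {
    buf[(*index)++] = 106;                /* NIL_EXT closes the list */
}

void sketch_small_int(unsigned char *buf, int *index, unsigned char v) {
    buf[(*index)++] = 97;                 /* SMALL_INTEGER_EXT */
    buf[(*index)++] = v;
}

/* [1,2,3] with the length known in advance (12 bytes). */
int sketch_list_static(unsigned char *buf) {
    int index = 0;
    unsigned char n;
    sketch_list_header(buf, &index, 3);
    for (n = 1; n <= 3; n++)
        sketch_small_int(buf, &index, n);
    sketch_empty_list(buf, &index);
    return index;
}

/* The same list built one cell at a time (22 bytes): each header
 * announces one element, and the tail is simply the next list. */
int sketch_list_dynamic(unsigned char *buf) {
    int index = 0;
    unsigned char n;
    for (n = 1; n <= 3; n++) {
        sketch_list_header(buf, &index, 1);
        sketch_small_int(buf, &index, n);
    }
    sketch_empty_list(buf, &index);
    return index;
}
```

The second form costs 5 extra bytes per element, but it lets you emit elements without knowing how many there will be. The real-life example later in this article relies on the same idea with ei_x_encode_list_header(buffer, 1).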

Encoding/Decoding String

In Erlang, a string is simply a list of character codes, here printable ASCII characters. You can represent it like this:

MyString = "abcd".
% is equivalent to
MyString = [97,98,99,100].
% or
MyString = [97|[98|[99|[100|[]]]]].

In C, you can use these functions to encode and decode strings in ETF:

  • int ei_encode_string(char *, int *, const char *) encodes a string. The first argument is your buffer, the second one is a pointer to the buffer index. The last argument is a C string, a list of characters ended by NUL;
  • int ei_encode_string_len(char *, int *, const char *, int) encodes a string with a defined length. The first argument is your buffer, the second argument is a pointer to the buffer index. The third argument is your C array of char; this time, your string can end with whatever you want, because the last argument defines the size of the string;
  • int ei_decode_string(const char *, int *, char *) decodes a string. The first argument is your buffer, followed by a pointer to the buffer index. The last argument is an array of char which will receive the decoded string.

[Listing: Encoding ETF string in C on buffer using standard C API]
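On the wire, a short string gets its own compact tag instead of being stored as a list of integers. Here is a minimal sketch of that STRING_EXT layout (tag 107, 2-byte big-endian length, raw bytes). The name sketch_encode_string is hypothetical, and the sketch deliberately ignores the special cases a real encoder has to handle, such as strings longer than 65535 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the real ei_encode_string: write `s` as
 * STRING_EXT at buf[*index] and advance *index. Returns 0 on
 * success, -1 when the string does not fit in a 2-byte length. */
int sketch_encode_string(unsigned char *buf, int *index, const char *s) {
    size_t len = strlen(s);
    unsigned char *p = buf + *index;

    if (len > 65535)
        return -1;

    p[0] = 107;                           /* STRING_EXT */
    p[1] = (unsigned char)(len >> 8);     /* length, high byte */
    p[2] = (unsigned char)(len & 0xff);   /* length, low byte */
    memcpy(p + 3, s, len);                /* characters, without \0 */
    *index += 3 + (int)len;
    return 0;
}
```

Encoding "value" gives 107,0,5,118,97,108,117,101, which is the string part of the binary shown at the beginning of this article.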

Encoding/Decoding Binary

Erlang also supports bitstrings (a binary is a bitstring made of whole bytes), one of its best features. To keep things simple, the bit syntax is a kind of syntactic sugar dedicated to binary terms. You can easily split such a term and pattern match on it, which is pretty useful for decomposing raw data from the wild world. Erlang represents a binary like this:

MyBinary = <<0, 1, 2, 3, 4>>.

In C, you can use 2 functions to encode and decode binaries:

  • int ei_encode_binary(char *, int *, const void *, long) encodes a binary term. The first argument is your buffer, the second one is a pointer to the buffer index. The third argument is a pointer to void, so you can pass any type of data here; the last argument is the size of that data (how many bytes need to be encoded);
  • int ei_decode_binary(const char *, int *, void *, long *) decodes a binary term. The first argument is your encoded buffer, the second argument is a pointer to your buffer index. The third argument points to the memory that will receive the decoded data and the fourth is a pointer to a long that will receive its size.

[Listing: Encoding ETF binary in C on buffer using standard C API]
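A binary is probably the simplest term to write by hand: a tag, a 4-byte length and the bytes themselves. The sketch below assumes the BINARY_EXT tag (109) from the ETF specification; sketch_encode_binary is a hypothetical name, not the library function.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write `len` bytes from `data` as BINARY_EXT
 * at buf[*index] and advance *index. The caller must provide at
 * least len + 5 free bytes. */
void sketch_encode_binary(unsigned char *buf, int *index,
                          const void *data, uint32_t len) {
    unsigned char *p = buf + *index;

    p[0] = 109;                           /* BINARY_EXT */
    p[1] = (unsigned char)(len >> 24);    /* length, big-endian */
    p[2] = (unsigned char)(len >> 16);
    p[3] = (unsigned char)(len >> 8);
    p[4] = (unsigned char)len;
    memcpy(p + 5, data, len);             /* payload, copied as is */
    *index += 5 + (int)len;
}
```

For <<0, 1, 2, 3, 4>> this produces 109,0,0,0,5,0,1,2,3,4.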

Encoding/Decoding Maps

In a recent release (OTP-17), the Erlang team introduced the map data structure, experimental at the time. This new Erlang term is equivalent to a hash in Perl or a dictionary in Python; to be clear, it’s a bucket of key/value pairs. It’s a pretty new feature and isn’t compatible with older OTP releases. In Erlang, you can create a map like this:

% create a new empty map
MyNewMap = maps:new().
% create a new map with one key/value
MyMap = #{ key => value }.

In C, to encode and decode a map, you can use these functions:

  • int ei_encode_map_header(char *, int *, int) creates a new map data structure. The first argument is your buffer, the second is a pointer to the buffer index and the last argument is the arity of the map, that is the number of key/value pairs. Each pair is then encoded as two terms, the key followed by the value, so a map of arity N is followed by 2 × N terms;
  • int ei_decode_map_header(const char *, int *, int *) decodes a map header from a binary term. The first argument is your buffer, the next one is a pointer to the buffer index and finally, the last argument is a pointer to an integer which will receive the arity of the map.

[Listing: Encoding ETF map in C on buffer using standard C API]
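The relation between arity and number of terms is worth seeing on bytes. This sketch writes #{ key => value } by hand: one header announcing 1 pair, followed by 2 terms. It assumes the MAP_EXT tag (116) and, for simplicity, the classic ATOM_EXT tag (100); the function names are hypothetical and a recent OTP release may print different atom tags for the same term.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: MAP_EXT header, arity = number of pairs. */
void sketch_map_header(unsigned char *buf, int *index, uint32_t pairs) {
    unsigned char *p = buf + *index;
    p[0] = 116;                           /* MAP_EXT */
    p[1] = (unsigned char)(pairs >> 24);
    p[2] = (unsigned char)(pairs >> 16);
    p[3] = (unsigned char)(pairs >> 8);
    p[4] = (unsigned char)pairs;
    *index += 5;
}

/* Hypothetical helper: short atom written as ATOM_EXT. */
void sketch_atom(unsigned char *buf, int *index, const char *name) {
    size_t len = strlen(name);            /* assumed to be < 256 here */
    unsigned char *p = buf + *index;
    p[0] = 100;                           /* ATOM_EXT */
    p[1] = 0;
    p[2] = (unsigned char)len;
    memcpy(p + 3, name, len);
    *index += 3 + (int)len;
}

/* Encode #{ key => value }: arity 1, then two terms (20 bytes). */
int sketch_encode_example_map(unsigned char *buf) {
    int index = 0;
    buf[index++] = 131;                   /* format version */
    sketch_map_header(buf, &index, 1);    /* one key/value pair */
    sketch_atom(buf, &index, "key");      /* the key */
    sketch_atom(buf, &index, "value");    /* the value */
    return index;
}
```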

Dynamic Buffer Encoding

All encoding functions also exist with an ei_x_ prefix. Those functions require a special data structure named ei_x_buff, used to encode terms. This data structure contains a buffer allocated on the heap with malloc, an index and the size of the buffer.

typedef struct ei_x_buff_TAG {
    char* buff; /* buffer on the heap containing encoded ETF */
    int buffsz; /* size of the buffer */
    int index;  /* index of the buffer */
} ei_x_buff;

  • int ei_x_new(ei_x_buff *) creates a new ei_x_buff data structure, allocating memory from the heap in ei_x_buff.buff. This structure needs to be freed after use;
  • int ei_x_new_with_version(ei_x_buff *) creates a new ei_x_buff with the version already set. Same behavior as ei_x_new;
  • int ei_x_free(ei_x_buff *) frees an ei_x_buff data structure;
  • ei_x_* covers all the encoding functions, using an ei_x_buff instead of a buffer and an index.

[Listing: Encoding ETF with buffer on heap in C using standard C API]
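To picture what such a dynamic buffer does for you, here is a small stand-alone sketch of a growable buffer with the same three fields. It is only an illustration of the idea: the sketch_x_* names, the initial size of 64 bytes and the doubling strategy are assumptions of mine, not a description of how ei_x_buff is implemented.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical look-alike of ei_x_buff, for illustration only. */
typedef struct {
    char *buff;   /* heap buffer holding the encoded bytes */
    int buffsz;   /* allocated size of the buffer */
    int index;    /* next free position in the buffer */
} sketch_x_buff;

/* Allocate a small buffer. Returns 0 on success, -1 on failure. */
int sketch_x_new(sketch_x_buff *x) {
    x->buffsz = 64;
    x->index = 0;
    x->buff = malloc((size_t)x->buffsz);
    return x->buff != NULL ? 0 : -1;
}

/* Append `len` bytes, growing the buffer when it is too small. */
int sketch_x_append(sketch_x_buff *x, const void *data, int len) {
    if (x->index + len > x->buffsz) {
        int newsz = x->buffsz;
        char *bigger;
        while (newsz < x->index + len)
            newsz *= 2;                   /* double until it fits */
        bigger = realloc(x->buff, (size_t)newsz);
        if (bigger == NULL)
            return -1;                    /* old buffer is still valid */
        x->buff = bigger;
        x->buffsz = newsz;
    }
    memcpy(x->buff + x->index, data, (size_t)len);
    x->index += len;
    return 0;
}

/* Release the heap memory and reset the fields. */
void sketch_x_free(sketch_x_buff *x) {
    free(x->buff);
    x->buff = NULL;
    x->buffsz = 0;
    x->index = 0;
}
```

The point of this design is that the caller never computes the final size in advance: every append checks the remaining room and reallocates when needed, which is why the structure must always be freed at the end.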

Get Type, Size and More

Type and size are pretty important values: they help you allocate and deallocate chunks of memory. They are also the first validation step of decoding.

  • int ei_get_type(const char *, const int *, int *, int *) retrieves the type and size of the term currently pointed to by the buffer index. The first argument is the buffer, the second argument is a pointer to the index, the third argument is a pointer to an integer which will receive the type of the term. The last argument is a pointer to an integer which will receive the size of the term.

If you are using an ei_x_buff data structure named buffer, you can use buffer->buff for the first argument and a pointer to your decoding index (for example &buffer->index) for the second.

  • int ei_skip_term(const char *, int *) is used to skip a term in a buffer. This is another way to get the size of a dynamic data structure like a list, tuple, binary or map. If you receive some unsupported encoded term, you can use it to move on to the next term.

[Listing: Get type and size of encoded data with standard C API]
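As an illustration of what “type and size” mean at the byte level, the following sketch inspects the term at a given position and reports its tag plus either its length (atoms, strings, binaries) or its arity (tuples, lists, maps). It is a hand-written approximation covering only a few tags; sketch_get_type and its helpers are hypothetical names and this is not the code of ei_get_type.

```c
#include <stdint.h>

/* Hypothetical helpers reading big-endian integers. */
static uint32_t sketch_u16(const unsigned char *p) {
    return ((uint32_t)p[0] << 8) | (uint32_t)p[1];
}

static uint32_t sketch_u32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Report the tag found at buf[index] and the size stored right
 * after it. Terms without a size field report 0. */
void sketch_get_type(const unsigned char *buf, int index,
                     int *type, int *size) {
    const unsigned char *p = buf + index;

    *type = p[0];
    switch (p[0]) {
    case 104:                             /* SMALL_TUPLE_EXT: 1-byte arity */
        *size = p[1];
        break;
    case 100:                             /* ATOM_EXT: 2-byte length */
    case 107:                             /* STRING_EXT: 2-byte length */
        *size = (int)sketch_u16(p + 1);
        break;
    case 105:                             /* LARGE_TUPLE_EXT */
    case 108:                             /* LIST_EXT */
    case 109:                             /* BINARY_EXT */
    case 116:                             /* MAP_EXT */
        *size = (int)sketch_u32(p + 1);   /* 4-byte length or arity */
        break;
    default:                              /* e.g. small integers, nil */
        *size = 0;
        break;
    }
}
```

On the binary from the first Erlang example, position 1 (just after the version byte) reports tag 108 with size 4, and position 6 reports tag 104 with size 2.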

Real Life Example

Showing some examples is nice, but a real example is more useful. So I will show you here a small piece of code I created recently to extract the Linux namespaces of any running process and push them to the Erlang side. On Linux, all information concerning processes is stored in the procfs filesystem; this is a simple and “elegant” way (unfortunately with a risk of race conditions) of exposing process state without breaking userland. /proc contains all PIDs (in numeric format) and other information concerning the operating system. For example, information about your init (PID 1) is represented by the directory /proc/1.

In this directory, you have access to a lot of information concerning the PID 1 process. In our case, we are just interested in collecting all of its namespaces. Namespaces are located in the /proc/X/ns/* directory, where X is a positive integer and * represents all symlinks. Yes, namespaces are stored as symlinks pointing to non-existing hypothetical files, something like Namespace:[X], where Namespace is one of the 7 Linux namespaces (cgroup, ipc, mnt, net, pid, user and uts) and X is an integer representing the namespace id.

Using symlinks as a key/value data structure is common on Unix. For example, malloc.conf on OpenBSD is also a symlink with a non-existing target, used to configure the behavior of malloc.

Before starting, a little bit of specification is needed to make a minimal list of requirements. Remember, this is an example, please don’t run this code in production! Here is what we need to implement and what we know about our environment:

  • the process PID is a long integer;
  • the process PID path is a string starting with /proc/;
  • namespaces (cgroup, ipc, mnt, net, pid, user and uts) are long integers;
  • the namespace path is composed of the PID path (/proc/PID/), followed by ns/NS where NS is the namespace name;
  • namespaces are stored in a symlink in the format NS:[X], where NS is the name of the namespace and X is the namespace value;
  • some paths can be opened and others not, depending on file owner and modes;
  • we can’t use stdin/stdout (buffered channels), we must explicitly use file descriptor 0 (read) and file descriptor 1 (write);
  • we need to define a simple protocol between the Erlang node and the C node (erlang:open_port/2 with the {packet,4} option will be sufficient: every binary packet carries its size in the first 4 bytes, 32 bits, of the raw message).

We have a lot of constraints and, to make everyone happy, we’ll choose our Erlang data structure first. In our case we want a portable and simple data structure, compatible with all releases and with BERT. A proplist seems to be the best choice: a list of tuples of arity 2, where the first element of each tuple is the key and the second one is the value. It is a good way to start, and you can represent it like this in Erlang:

[{PIDN
,[{cgroup, X}
,{uts, X}, ...]}
,{PIDM
,[{cgroup, X}
,{uts, X}, ...]}
]

Each item of the list is a tuple with a PID as key and another proplist as value. We can see a repetitive pattern: all namespaces have the same structure and use the same data types, atoms and integers.

Well-written C code is split into multiple small functions, each doing one thing and doing it well. I will not show you all the internal code (I’m using a lot of macros), only some of the useful functions I have created. So, a PID is a long integer and a namespace too; I think the first function should be something to check digits:

/*
 * check if a character is a digit
 */
int
is_digit(char c) {
    if (c >= '0' && c <= '9')
        return 1;
    return 0;
}

A string is an array of characters ended by \0. We can now create a function that validates whether a string is a number:

/* take an array of char and evaluate each character in it
 * until you reach \0 or you hit the size limit. If this
 * string contains a non-digit character, return 0, else
 * return 1
 */
int
string_is_number(char *string, size_t size) {
    size_t i;
    for (i = 0; i < size && string[i] != 0; i++)
        if (!is_digit(string[i]))
            return 0;
    return 1;
}

We now have a way to validate that a string is a number. To convert this value to a long integer (in C terms), we’ll use strtol (string to long). The next step is to extract the namespace value from the symlink. Here we need to split the problem in two parts: first get the value the symlink points to, then extract the number from it. The first part is easily solved with the readlink function, the second one requires a new function:

/* we assume the symlink value is well defined and
 * we drop only non-digit characters.
 */
long
get_namespace(char *value,
              size_t size) {
    size_t i = 0;
    size_t b = 0;
    char buffer[size+1];
    memset(buffer, 0, size+1);
    /* keep a leading minus sign, if any */
    if (size > 0 && value[0] == '-') {
        buffer[b] = '-';
        b += 1;
        i = 1;
    }
    /* copy every digit, skip everything else */
    for (; i < size && value[i] != 0; i++) {
        if (is_digit(value[i])) {
            buffer[b] = value[i];
            b += 1;
        }
    }
    return strtol(buffer, 0, 10);
}
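For comparison, here is a stricter variant, offered as a sketch rather than as the code used in this project. Instead of dropping every non-digit character, it relies on the NS:[X] format described above: it looks for the opening bracket, parses the number after it and checks that a closing bracket follows. The name sketch_parse_namespace is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract X from a symlink target shaped like
 * "NS:[X]". Returns the id, or -1 when the format does not match. */
long sketch_parse_namespace(const char *target) {
    const char *open = strchr(target, '[');
    char *end = NULL;
    long id;

    if (open == NULL)
        return -1;                        /* no '[' at all */

    id = strtol(open + 1, &end, 10);
    if (end == open + 1 || *end != ']')
        return -1;                        /* no digits or no closing ']' */
    return id;
}
```

The trade-off is that this version rejects anything unexpected, while get_namespace happily extracts digits from any string.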

We can now read the content of any namespace symlink with readlink. All namespaces are links whose target contains the value, so we can generalize this feature. A read_symlink_namespace function will read the content of the symlink and extract the digits. Please note, we assume this symlink is well formatted: we trust the kernel and the procfs filesystem. So read_symlink_namespace takes a path as a string (const char *) and a limit (size_t) as arguments and returns the namespace ID (long). Note that if the symlink cannot be read, this function returns -1.

long
read_symlink_namespace(const char *symlink_path,
                       size_t name_len) {
    char buffer[name_len+1];
    ssize_t size;
    memset(buffer, 0, name_len+1);
    /* readlink does not add a final \0, the memset above does */
    if ((size = readlink(symlink_path, buffer, name_len)) < 0) {
        return -1;
    }
    return get_namespace(buffer, size);
}

We have access to all namespace IDs! We can now encode them. To do this, we create one function per namespace (7 in total); each of those functions reads a given path based on the PID and returns the value of read_symlink_namespace. These functions are here to help (and protect) us: they build the path automatically from the fixed macro PROCFS_PATH (set to /proc) and the name of the namespace, cgroup in the following example:

long
read_ns_cgroup(long pid, size_t len) {
    char path[len+1];
    memset(path, 0, len+1);
    snprintf(path, len+1, PROCFS_PATH "/%ld/ns/" "cgroup", pid);
    return read_symlink_namespace(path, len+1);
}

We have a simple “API” to get any namespace, with a default value when an error occurs. This function returns a long, and we can now encode our value in a proplist. The encode_ns_* functions take the buffer containing all encoded values (ei_x_buff *) and the associated PID (as a long). The first step is to create a new list header, the second step is to create a tuple of arity 2, and the last two steps encode an atom (cgroup) and the namespace id (returned by the read_ns_cgroup function).

void
encode_ns_cgroup(ei_x_buff *buffer, long pid) {
    ei_x_encode_list_header(buffer, 1);
    ei_x_encode_tuple_header(buffer, 2);
    ei_x_encode_atom(buffer, "cgroup");
    ei_x_encode_long(buffer, read_ns_cgroup(pid, LIMIT_NS));
}

We need 7 of these functions. If you want, you can copy/paste all of them, rename them and… make mistakes (e.g. forget to change a read_ns_cgroup somewhere). Bad time for you: you now have to go through all the other functions again!

Macros are useful in this particular case. If we have multiple functions doing the same thing, with only some trivial changes (like different names or values), we can make templates. A macro is a piece of code expanded by the C preprocessor before compilation. Lines beginning with a # followed by a keyword (e.g. define, include…) are preprocessor directives. You can define object-like macros (a single value like an integer, a string or any other kind of value), or function-like macros (these take one or more arguments and can generate “dynamic” content during preprocessing).

/* This macro template will generate read_ns_* functions
 * You can generate all needed ns record (will return long type)
 */
#define READ_NS(X) \
    long GLUE(read_ns_,X)(long pid, size_t len) { \
        char path[len+1]; \
        memset(path, 0, len+1); \
        snprintf(path, len+1, PROCFS_PATH "/%ld/ns/" G(X), pid); \
        return read_symlink_namespace(path, len+1); }

/* This macro template will generate encode_ns_* functions
 * used to encode namespace in ETF
 */
#define ENCODE_NS(X) \
    void GLUE(encode_ns_,X)(ei_x_buff *buffer, long pid) { \
        ei_x_encode_list_header(buffer, 1); \
        ei_x_encode_tuple_header(buffer, 2); \
        ei_x_encode_atom(buffer, G(X)); \
        ei_x_encode_long(buffer, \
                         GLUE(read_ns_,X)(pid, LIMIT_NS)); }

/* This macro generates read_ns_* and encode_ns_* functions
 */
#define NS(X) \
    READ_NS(X); \
    ENCODE_NS(X)

/* This macro generates read_ns_* and encode_ns_* functions
 * prototypes
 */
#define NS_PROTOTYPE(X) \
    void GLUE(encode_ns_,X)(ei_x_buff *, long); \
    long GLUE(read_ns_,X)(long, size_t);

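These macros call other macros (G and GLUE), common helpers that generate a quoted value or a function name. You can define them like this:

#define Q(X) #X
#define QUOTE(X) Q(X)
/* G(X) turns X into a string literal: G(cgroup) gives "cgroup" */
#define G(X) QUOTE(X)
/* GLUE(A,B) pastes two tokens: GLUE(read_ns_,cgroup) gives read_ns_cgroup */
#define GLUE(A,B) A ## B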
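If stringification (#) and token pasting (##) are new to you, the following stand-alone sketch shows both at work outside of any Erlang code. It assumes a two-argument GLUE macro, and MAKE_NAME_FN and the generated name_of_* functions are invented for this illustration only.

```c
/* Stringification: Q turns its argument into a string literal.
 * QUOTE adds one level of expansion before quoting. */
#define Q(X) #X
#define QUOTE(X) Q(X)

/* Token pasting: GLUE(read_, x) becomes the identifier read_x. */
#define GLUE(A, B) A ## B

/* Hypothetical template: generate a function name_of_X() that
 * returns the string "X". */
#define MAKE_NAME_FN(X) \
    const char *GLUE(name_of_, X)(void) { return QUOTE(X); }

/* These two lines expand to two complete function definitions:
 * name_of_cgroup() and name_of_uts(). */
MAKE_NAME_FN(cgroup)
MAKE_NAME_FN(uts)
```

After preprocessing, the first invocation reads const char *name_of_cgroup(void) { return "cgroup"; }, which is the same mechanism the namespace templates use to build both the function name and the atom name from a single argument.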

We have our macros, right… How do we use them now? We just need to invoke them like standard functions in our code. Note that the semicolons are only there for form, you can remove them if you want.

NS_PROTOTYPE(cgroup);
NS_PROTOTYPE(ipc);
NS_PROTOTYPE(mnt);
NS_PROTOTYPE(net);
NS_PROTOTYPE(pid);
NS_PROTOTYPE(user);
NS_PROTOTYPE(uts);
NS(cgroup);
NS(ipc);
NS(mnt);
NS(net);
NS(pid);
NS(user);
NS(uts);

If you want to see your C file after preprocessing and before compiling, you can use the -E flag with GCC and Clang. This flag outputs (on stdout) the C source as it is before the compilation step, for example: cc -E mysource.c.

How do we represent the main proplist, with the PID as key? We can apply the same procedure as for namespaces: instead of using an atom as id, we use the PID, and instead of an integer as value, we use the namespace proplist generated before:

void
encode_pid(ei_x_buff *buffer,
           long pid) {
    /* a pid struct is contained in list */
    ei_x_encode_list_header(buffer, 1);
    /* and is defined by a tuple of arity 2 */
    ei_x_encode_tuple_header(buffer, 2);
    /* the first element is the pid itself */
    ei_x_encode_long(buffer, pid);
    /* and the rest is list of namespaces */
    encode_ns_cgroup(buffer, pid);
    encode_ns_ipc(buffer, pid);
    encode_ns_mnt(buffer, pid);
    encode_ns_net(buffer, pid);
    encode_ns_pid(buffer, pid);
    encode_ns_user(buffer, pid);
    encode_ns_uts(buffer, pid);
    /* finalize namespaces list here */
    ei_x_encode_empty_list(buffer);
}

We can see the end of the program! We have everything we need to encode all our values. Now, how do we print them? Erlang knows ETF, but doesn’t know how many bytes will be sent by our program. You could define your own protocol, but the Erlang guys already made one for us: the first byte(s) (1, 2 or 4) of the message define its length.

This feature needs to be activated when opening the port with erlang:open_port/2; we’ll see that later in the Erlang code part. So the first step is to print this sequence of 4 bytes containing the size of the message, for which I use a helper function (ei_x_size32_fprintf). In the second step, we call ei_x_fprintf_raw to print the whole data structure containing our encoded values.

void
print_pids(FILE *fd,
           ei_x_buff *buffer) {
    /* protocol used by Erlang port
     * first 32bits (4bytes) defined size of
     * the binary pattern
     */
    ei_x_size32_fprintf(fd, buffer);
    /* we can now print the content of
     * our buffer
     */
    ei_x_fprintf_raw(fd, buffer);
}
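Since the helpers used above are not shown, here is a guess at what the framing boils down to, written as a stand-alone sketch. It builds the frame in memory: 4 bytes holding the payload length in big-endian order, as expected by a port opened with {packet,4}, followed by the payload itself. The name sketch_frame_packet4 is hypothetical and this is not the code of ei_x_size32_fprintf.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a {packet,4} frame in `out`, which
 * must have room for len + 4 bytes. Returns the frame size. */
size_t sketch_frame_packet4(const unsigned char *payload, uint32_t len,
                            unsigned char *out) {
    out[0] = (unsigned char)(len >> 24);  /* length, most significant byte */
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;          /* length, least significant byte */
    memcpy(out + 4, payload, len);        /* then the encoded term */
    return (size_t)len + 4;
}
```

For a 2-byte payload such as 131,106 (an encoded empty list), the frame is 0,0,0,2,131,106. Writing the frame with fwrite and then calling fflush on the output stream should deliver it as one complete message to the process that owns the port.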

Unix and Linux systems are concurrent and multi-user, so, multiple concurrent processes are running on the same host, we need to list them all! To do this, we open /proc with opendir function and list all directory with readdir. If these directories are numbers, we assume their are PID, we extract this PID (we convert the PID name to a long integer with strtol function) and execute encode_pid function with it. When all PID were listed, time to print our buffer is arrived, we can do this action with print_all_pids function defined before.

void
list_pids(FILE *out,
          const char *path) {
  /* open the directory first, and give up if we can't */
  struct dirent *dir;
  DIR *fdir = opendir(path);
  if (fdir == NULL)
    return;
  /* initialize our dynamic buffer */
  ei_x_buff buffer;
  ei_x_new_with_version(&buffer);
  while ((dir = readdir(fdir)) != NULL) {
    /* If the entry name is an integer, it's a pid... */
    if (string_is_number(dir->d_name, 256))
      /* we can encode its content. */
      encode_pid(&buffer, strtol(dir->d_name, 0, 10));
  }
  /* last term is an empty list (it closes our full list) */
  ei_x_encode_empty_list(&buffer);
  /* We print our data to the given output */
  print_pids(out, &buffer);
  /* close directory and free our buffer */
  closedir(fdir);
  ei_x_free(&buffer);
}

Don’t forget to close all remaining file descriptors (risk of file-descriptor leaks) and to free the buffer after its last use (risk of memory leaks)!
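
The string_is_number helper was defined earlier in the article; the sketch below is only one possible way to do the same kind of check in standard C, combined with a stricter conversion. Both name_is_pid and parse_pid are hypothetical names used for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Return 1 when name is a non-empty run of ASCII digits, else 0. */
int name_is_pid(const char *name) {
    assert(name != NULL);
    if (*name == '\0')
        return 0;
    for (const char *p = name; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return 0;
    return 1;
}

/* Convert a /proc entry name to a PID.
 * Returns -1 when the name is not a number or does not fit in a long. */
long parse_pid(const char *name) {
    long pid;
    if (!name_is_pid(name))
        return -1;
    errno = 0;
    pid = strtol(name, NULL, 10);
    if (errno == ERANGE)
        return -1;
    return pid;
}
```

With this, entries like "self" or "cpuinfo" are rejected before strtol ever sees them, and an absurdly long digit string is reported as an error instead of being silently clamped.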

Okay, our program is practically done. The main function is the most important part. We want a long-lived process that prints something in ETF format whenever it receives the right input. Here we only execute list_pids when the 'l' character is matched; any other byte is ignored and we keep waiting.

int
main(void) {
  /* initialize file-descriptor
   * 'in' is stdin set to read-only
   * 'out' is stdout set to write-only
   */
  in = fdopen(0, "r");
  out = fdopen(1, "w");
  /* fread result and one-byte buffer used to catch input */
  size_t s;
  char c[1];
  /* we want to read from standard input until
   * the end of the stream (EOF)...
   */
  while ((s = fread(c, sizeof(char), 1, in)) != 0) {
    /* ... and if we get the 'l' character, we print pids */
    if (c[0] == 'l')
      list_pids(out, PROCFS_PATH);
  }
  /* we don't need those file-descriptor anymore */
  fclose(out);
  fclose(in);
  return 0;
}

Our C port part is done! We could use another method to get the encoded terms, but I think this example is long enough. Now we need to write the Erlang part, which is pretty simple. First, we’ll create some helper functions to start a process and its associated port:

-module(real_example).
-export([start/1, start_link/1, start_monitor/1]).

% start a new process
start(Path) ->
    spawn(fun() -> start_loop(Path) end).

% start a new linked process
start_link(Path) ->
    spawn_link(fun() -> start_loop(Path) end).

% start a new monitored process
start_monitor(Path) ->
    spawn_monitor(fun() -> start_loop(Path) end).

Next we need to initialize our main process loop. In this example, the only state we need is the Port value; I will not add more complexity in this part:

% init loop
start_loop(Path) ->
    % set port options: data is received as binary,
    % and the first 4 bytes of every packet hold
    % the size of that packet
    PortOpts = [binary, {packet, 4}],
    % open port defined by Path
    Port = erlang:open_port({spawn, Path}, PortOpts),
    % Enter in main loop
    loop(Port).

Finally, the main loop. The process waits in receive (looking into its mailbox) and acts on the messages that match one of the defined patterns. We now have a long-running Erlang process connected to an external running process. When it receives the run message, it sends the 'l' character to the external C program with the port_command/2 function; whatever the C program then prints on file descriptor 1 comes back as a message from the port. This last step is handled with pattern matching: messages from a port are tuples of arity 2. The first value is the port itself, the second value is another tuple, which contains a tag and a term (a binary in our case).

% main loop
loop(Port) ->
    receive
        % when we receive a message from Port
        {Port, {data, X}} ->
            % We decode it...
            Decoded = erlang:binary_to_term(X),
            % And print it on stdout
            io:format("receive data: ~p~n", [Decoded]),
            loop(Port);
        % when run command is received
        % we send 'l' command to the port
        run ->
            erlang:port_command(Port, <<"l">>),
            loop(Port);
        % return info concerning connected port
        info ->
            io:format("~p~n", [erlang:port_info(Port)]),
            loop(Port);
        % just quit the process
        exit ->
            ok;
        % if we receive another pattern
        % we just print it
        Else ->
            io:format("Received wrong pattern: ~p~n", [Else]),
            loop(Port)
    end.
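
One detail worth knowing: with {packet, 4}, the Erlang side also puts a 4-byte length in front of everything it sends to the port, so <<"l">> reaches the C program as five bytes. Our C main ignores every byte that is not 'l', so it works anyway, but a stricter program could read whole frames instead. Here is a minimal sketch in plain C, with made-up names (bytes_to_size32, read_packet4) that are not part of erl_interface:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Rebuild a 32-bit size from 4 bytes, most significant byte first. */
uint32_t bytes_to_size32(const unsigned char in[4]) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Read one {packet, 4} frame from in into buf (capacity cap bytes).
 * Returns the payload length, or -1 on EOF, on a short read,
 * or when the announced size does not fit in buf. */
long read_packet4(FILE *in, unsigned char *buf, size_t cap) {
    unsigned char header[4];
    uint32_t size;
    assert(in != NULL && buf != NULL);
    if (fread(header, 1, sizeof header, in) != sizeof header)
        return -1;
    size = bytes_to_size32(header);
    if (size > cap)
        return -1;
    if (size > 0 && fread(buf, 1, size, in) != size)
        return -1;
    return (long)size;
}
```

A main loop built on this would call read_packet4 until it returns -1 and compare the payload with the expected command, rather than scanning the stream byte by byte.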

First step, build your C code with make build TARGET=real_example.c, then build your Erlang code with erlc real_example.erl, start an Erlang shell in the current directory and run:

Pid = real_example:start("./_build/real_example").
Pid ! run.
Pid ! run.

The first message sent does nothing (it’s a little bug; if you want, you can try to fix it yourself). When running these commands, you will see this output:

receive data: [{1,
[{cgroup,-1},
{ipc,-1},
{mnt,-1},
{net,-1},
{pid,-1},
{user,-1},
{uts,-1}]},
{2,
[{cgroup,-1},
{ipc,-1},
{mnt,-1},
{net,-1},
{pid,-1},
{user,-1},
{uts,-1}]},
{3,
[{cgroup,-1},
{ipc,-1},
{mnt,-1},
{net,-1},
{pid,-1},
{user,-1},
{uts,-1}]},
{5,
[{cgroup,-1},
{ipc,-1},
{mnt,-1},
{net,-1},
{pid,-1},
{user,-1},
{uts,-1}]},
{7,
[{cgroup,-1},
{ipc,-1},
{mnt,-1},
{net,-1},
{pid,-1},
{user,-1},
{uts,-1}]},
...
{32469,
[{cgroup,-1},
{ipc,4026531839},
{mnt,4026531840},
{net,4026531957},
{pid,4026531836},
{user,4026531837},
{uts,4026531838}]}]

Well… It’s done! We get our PIDs and our namespaces in a simple and portable data structure! This result, pasted from my terminal, shows us a lot of interesting things.

Firstly, I don’t have the right to read the namespaces of PID 1 and some others; to do this, we need to run this process as root or as a privileged user. Secondly, the cgroup feature is disabled on my test server, so this value is always -1. All other namespaces seem to work as expected.

Data Binary Encoded Pattern

How do you debug your raw binary-encoded data structure? A good way is to read the documentation first. Another is to understand the format of ETF itself. Here is a simple table of the tag macros with their decimal, hexadecimal and character representations, extracted from the source code.

| Macro                   | Dec | Hex  | Char |
|-------------------------|-----|------|------|
| ERL_SMALL_INTEGER_EXT   | 97  | 0x61 | a    |
| ERL_INTEGER_EXT         | 98  | 0x62 | b    |
| ERL_FLOAT_EXT           | 99  | 0x63 | c    |
| NEW_FLOAT_EXT           | 70  | 0x46 | F    |
| ERL_ATOM_EXT            | 100 | 0x64 | d    |
| ERL_SMALL_ATOM_EXT      | 115 | 0x73 | s    |
| ERL_ATOM_UTF8_EXT       | 118 | 0x76 | v    |
| ERL_SMALL_ATOM_UTF8_EXT | 119 | 0x77 | w    |
| ERL_REFERENCE_EXT       | 101 | 0x65 | e    |
| ERL_NEW_REFERENCE_EXT   | 114 | 0x72 | r    |
| ERL_NEWER_REFERENCE_EXT | 90  | 0x5a | Z    |
| ERL_PORT_EXT            | 102 | 0x66 | f    |
| ERL_NEW_PORT_EXT        | 89  | 0x59 | Y    |
| ERL_PID_EXT             | 103 | 0x67 | g    |
| ERL_NEW_PID_EXT         | 88  | 0x58 | X    |
| ERL_SMALL_TUPLE_EXT     | 104 | 0x68 | h    |
| ERL_LARGE_TUPLE_EXT     | 105 | 0x69 | i    |
| ERL_NIL_EXT             | 106 | 0x6a | j    |
| ERL_STRING_EXT          | 107 | 0x6b | k    |
| ERL_LIST_EXT            | 108 | 0x6c | l    |
| ERL_BINARY_EXT          | 109 | 0x6d | m    |
| ERL_SMALL_BIG_EXT       | 110 | 0x6e | n    |
| ERL_LARGE_BIG_EXT       | 111 | 0x6f | o    |
| ERL_NEW_FUN_EXT         | 112 | 0x70 | p    |
| ERL_MAP_EXT             | 116 | 0x74 | t    |
| ERL_FUN_EXT             | 117 | 0x75 | u    |
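
To make the table easier to use while staring at a hex dump, here is a small lookup sketch in plain C. It assumes the binary starts with the version byte 131, as produced by erlang:term_to_binary/1 or by a buffer created with ei_x_new_with_version, and it only knows a subset of the tags above. The function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define ETF_VERSION_MAGIC 131

/* Map a tag byte to the macro name from the table (subset only). */
const char *etf_tag_name(unsigned char tag) {
    switch (tag) {
    case 70:  return "NEW_FLOAT_EXT";
    case 97:  return "ERL_SMALL_INTEGER_EXT";
    case 98:  return "ERL_INTEGER_EXT";
    case 100: return "ERL_ATOM_EXT";
    case 104: return "ERL_SMALL_TUPLE_EXT";
    case 105: return "ERL_LARGE_TUPLE_EXT";
    case 106: return "ERL_NIL_EXT";
    case 107: return "ERL_STRING_EXT";
    case 108: return "ERL_LIST_EXT";
    case 109: return "ERL_BINARY_EXT";
    case 110: return "ERL_SMALL_BIG_EXT";
    case 116: return "ERL_MAP_EXT";
    default:  return "UNKNOWN";
    }
}

/* Name of the outermost term of an encoded binary.
 * Returns NULL when the binary is too short or does not
 * start with the version byte. */
const char *etf_top_level_tag(const unsigned char *bin, size_t len) {
    assert(bin != NULL);
    if (len < 2 || bin[0] != ETF_VERSION_MAGIC)
        return NULL;
    return etf_tag_name(bin[1]);
}
```

For example, the three bytes 131,97,42 (the encoding of the integer 42) are reported as ERL_SMALL_INTEGER_EXT.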

In erl_interface.h (we have seen it earlier in this article), some macros exist to help you validate terms:

#define ERL_IS_INTEGER(x)           (ERL_TYPE(x) == ERL_INTEGER)
#define ERL_IS_UNSIGNED_INTEGER(x)  (ERL_TYPE(x) == ERL_U_INTEGER)
#define ERL_IS_LONGLONG(x)          (ERL_TYPE(x) == ERL_LONGLONG)
#define ERL_IS_UNSIGNED_LONGLONG(x) (ERL_TYPE(x) == ERL_U_LONGLONG)
#define ERL_IS_FLOAT(x)             (ERL_TYPE(x) == ERL_FLOAT)
#define ERL_IS_ATOM(x)              (ERL_TYPE(x) == ERL_ATOM)
#define ERL_IS_PID(x)               (ERL_TYPE(x) == ERL_PID)
#define ERL_IS_PORT(x)              (ERL_TYPE(x) == ERL_PORT)
#define ERL_IS_REF(x)               (ERL_TYPE(x) == ERL_REF)
#define ERL_IS_TUPLE(x)             (ERL_TYPE(x) == ERL_TUPLE)
#define ERL_IS_BINARY(x)            (ERL_TYPE(x) == ERL_BINARY)
#define ERL_IS_NIL(x)               (ERL_TYPE(x) == ERL_NIL)
#define ERL_IS_EMPTY_LIST(x)        ERL_IS_NIL(x)
#define ERL_IS_CONS(x)              (ERL_TYPE(x) == ERL_CONS)
#define ERL_IS_LIST(x)              (ERL_IS_CONS(x) || ERL_IS_EMPTY_LIST(x))

About Compatibility

ETF depends on the Erlang version. OTP-R16 is not totally compatible with OTP-R19 (new terms like maps have been added), but you can safely use standard terms like atom, number, string, binary, list and tuple.

If you want strong compatibility, you can use BERT, which is standardized outside the Erlang community and doesn’t follow the same rules. Here is a small summary table of compatibility across different releases:

| Type        | OTP-16 | OTP-17 | OTP-18 | OTP-19 | OTP-20 | BERT |
|-------------|--------|--------|--------|--------|--------|------|
| atom        | OK     | OK     | OK     | OK     | OK     | OK   |
| atom (utf8) | -      | -      | -      | -      | OK     | -    |
| double      | OK     | OK     | OK     | OK     | OK     | OK   |
| long        | OK     | OK     | OK     | OK     | OK     | OK   |
| bignum      | OK     | OK     | OK     | OK     | OK     | -    |
| string      | OK     | OK     | OK     | OK     | OK     | OK   |
| binary      | OK     | OK     | OK     | OK     | OK     | OK   |
| list        | OK     | OK     | OK     | OK     | OK     | OK   |
| tuple       | OK     | OK     | OK     | OK     | OK     | OK   |
| map         | -      | OK     | OK     | OK     | OK     | -    |
| pid         | OK     | OK     | OK     | OK     | OK     | -    |
| pid (new)   | -      | -      | -      | OK     | OK     | -    |
| ref         | OK     | OK     | OK     | OK     | OK     | -    |
| ref (new)   | -      | -      | -      | OK     | OK     | -    |
Be careful with this table! It was made by diffing and comparing headers and source files. So far, no code has been run to check compatibility across versions; I will probably write it one day…

C is cool, but…

Yes, I know. You want to use it with Go, Rust, Scala, Clojure, Python, Perl or Haskell? You can! BERT was designed to make that possible, and many community frameworks were born! BERT is the same as ETF, only with a reduced set of terms and an RPC feature. Implementations exist in many other languages.

Learn More

  • I use the memset function a lot to sanitize buffers; you can use bzero as an alternative.
  • I’ve hardcoded a lot of values in my code; this was totally unjustified and arbitrary. I’ve read a lot of code, but none of it gave me the right answer. I will update this article if I find a good way to replace all those hardcoded values.
  • When I write C code, I prefer stack over heap memory allocation; in our example, stack allocation is sufficient. An alternative (but not portable) way is to use alloca, or “how to use the stack like a heap”.
  • You can use stdin or stdout directly with Erlang ports, but it’s really not a good idea: I tried it before switching to file descriptors, and it generated some strange output due to buffering. I think I will write something about this behavior.
  • Thanks to valgrind, I’ve found some scary bugs in my code (race conditions, memory leaks…). This tool is really helpful and should always be used when you code in C. Don’t forget to add a debug mode (not explained here) and try to write tests everywhere! Here is my command shortcut to print errors when I code in C:
while sleep 3
do
make build TARGET=myfile.c && valgrind ./_build/myfile
clear
done
  • If you need a good little “test framework” for C, I recommend minunit: it’s just 3 lines of macros, and it will help you test everything! If you want a more complete framework, I’m using ATF and Kyua (used by the FreeBSD and NetBSD projects).
  • No benchmarking and no more information about my code… According to a simple ps, this little (real) example uses ~4MB of virtual memory and ~700kB of non-swapped memory. I have never used gprof or the valgrind profiling tools; I think now is a good time!
  • All this code was written in my free time, and took approximately 1 week (~3h/day). Before this article, I spent 1 week reading documentation and writing notes. The final rush was hard (~6h/day for 3 days): correcting typos and mistakes in the article, debugging on different Linux versions, adding comments… But it was fun! ;)

The End

I have been thinking about this subject for a long time. Erlang is a fabulous language. All the important features are already integrated, the documentation is pretty awesome, and the source code is well written.

For now, this article was only written to learn all those features and master them. I use ETF in some of my projects, to connect embedded devices (ATMEL and ARM Cortex-MX chips) and perhaps for some other high-level programming... The other goal was to show you one important thing: if you want to use Erlang in your team, only one person needs to write Erlang code; the others can use any kind of language! Want to use crazy features in your project without knowing Erlang? Now, you can.

If you have read this far, you know how to connect C software to an Erlang VM and how to talk with it. Like any project, we still need to solve more issues and add features. In our case, the real example doesn’t list the namespaces of other users’ PIDs. To fix that, we could run Erlang as root (wrong answer), we could set the setuid bit on our real_example program (another wrong answer), or we could use isolation and privilege separation:

  1. Isolate the connection: don’t use stdin/stdout, but a Unix or UDP socket;
  2. Split our program in 2 parts: a monitor process running with root privileges, and a forked one running as a restricted user with reduced privileges.

Next Time

I hope you enjoyed this little presentation of ETF. You can see all the source code on my Github account. Next time, I think we will speak about ETF too… It would be great to communicate with JavaScript using bertjs and make a simple web framework. Another important subject is the privilege separation explained previously, and how to implement it in Erlang. I have never coded something like that before, but it seems challenging!

References & Bibliography

Oh! The best part: thanks to everyone! Especially to Nicolas (text correction, and lots of questions), Mickael (questions too, and second official reader), Alarig (first official reader), and everyone I’ve forgotten to mention! A big thanks to the Erlang community! I hope I will see you next year at EUC2018! ;)

http://bit.ly/2tYTYo2

Mathieu Kerjouan

Distributed peasant, plants Erlang and Elixir nodes everywhere he can. Uses fertilized operating system like OpenBSD and FreeBSD. Follow me on twitter! ;)