Static vs Dynamic libraries

Andrew Birnberg
Aug 9, 2017

“Static vs dynamic libraries,” performed by a man and a walrus

When writing a C/C++ program it is almost (maybe completely) impossible to do anything meaningful without using an external library. Besides the standard library functions that are packaged with your compiler, there are additional libraries used by the compiler to generate boilerplate assembly code and to maintain particular calling conventions.

As a programmer you have two choices about how to make those libraries available to your program: dynamic linking or static linking. Actually, in many cases you may not have a choice if, e.g., the standard library is only available as a shared object. And when you are trying to write code that can be easily maintained in the long term, using dynamic libraries can significantly reduce the burden of updating your application.

What is a dynamic library and how does it differ from a static one?

In general, a library is a collection of related, pre-compiled code packaged for reuse. By linking against the library you gain access to its functions without having to write them yourself. A static library is an archive of object files: when you compile against one, all of the code that your program requires is copied into your final executable by the linker, and the size of your program grows accordingly. By contrast, a dynamic library, or shared library, is linked into your program only at run time, which allows the executable to be significantly smaller. Like the man and the walrus doing sit-ups at the beginning of this article, a statically linked program carries its house on its shoulders.

Static vs dynamic linking

There are a number of other benefits to dynamic linking in addition to reducing your program size:

  1. The required library can be updated without needing to recompile the executable depending on it.
  2. Multiple programs requiring that library can share a single copy across the system, reducing the total memory footprint of all those running processes.
  3. It is often possible to use individual functions from a dynamic library without the entire library being loaded into physical memory, since the operating system pages it in on demand.

While dynamic libraries present a number of advantages, there are several drawbacks associated with their use:

  1. If the target system can’t be guaranteed to have a copy of the required dynamic library, your code might not run.
  2. A backward-incompatible change to a library your code relies on could break your program.

The first of these drawbacks is frequently encountered in embedded systems or where users don’t have permission to install or run certain types of code. The second drawback can be mitigated by specifying version numbers when compiling your code, but this is not foolproof, as symbolic links may be updated to point to newer versions of a library. With a well-designed and well-maintained dynamic library, however, most code is safe from this kind of problem.

How to make a dynamic library

Say you have developed your own machine learning framework and you want to create a dynamic library to call in your own work or to distribute to others. The primary way to do this on Unix-like systems, such as Linux, is to use gcc and compile your code collection with the -c flag to stop compilation before the linking stage. Then, again using gcc, link the group of .o files into a shared object with the -shared flag.

When initially compiling to object files, there is another important flag to use, -fPIC. PIC stands for position-independent code. What this means is that the machine code for each compiled function does not contain absolute memory addresses. Instead, code is addressed relative to the current instruction, and global data and external functions are reached through a global offset table, or GOT, which the dynamic loader fills in when the library is loaded. The primary reason for this is that a shared library may be used by multiple programs at once, and each process may map it at a different virtual address. Any absolute address fixed at compile time could therefore be wrong, or conflict with an address already in use by the process that loaded it.

A simple bash script to compile a folder of .c files into a dynamic library is shown below:

As described, the -c option stops compilation before linking and -fPIC produces code that can be loaded at any address within a process by routing references through a GOT. It is often a good idea to include -g in the compilation flags for your homemade dynamic libraries, as this produces debugging information that can make tracking down issues in your code much easier, especially when you’re only updating the dynamic library, but not the applications depending on it.

When you have a shared library (either created by you or from a third party) you might want to know what functions it implements. On Linux the simplest way to do this is to use the nm utility, which lists the symbols present in an object file. If your shared library is called libc.so you could use nm -D path/to/libc.so to list the dynamic symbols it contains. There are many options for formatting the output of this command, but the letter printed next to each symbol name indicates its type: T for the text (code) section, R for the read-only data section, and so on. U means that the symbol is undefined in this object, which usually indicates that it is defined in another shared library that the one you are interrogating depends on.

Output of executing `nm -D /lib/libcryptsetup.so.4`

Likewise, if you have an executable and you want to know what shared libraries it relies upon, the ldd utility lists the library dependencies for that program, the file each one resolves to, and the address at which it is loaded in memory.

Output of `ldd gm`. gm is an arbitrary executable, used here only as an example
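As a quick illustration on a program most Linux systems already have, ldd can be pointed at /bin/ls. That path is an assumption, chosen only because ls is usually present and dynamically linked.

```shell
# /bin/ls stands in for any dynamically linked executable on the system.
ldd /bin/ls
```

Each dependency is typically printed as the library name, the file it resolves to, and a load address in parentheses. The addresses usually change from run to run because of address space layout randomization.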

Dynamic libraries work differently on different operating systems

So far, I have only been addressing dynamic libraries on Linux. In this context, they are typically called shared object files and given the extension .so. This is distinct from static libraries, which receive the .a extension, for archive. Other operating systems use different extensions, such as .dylib (macOS) or .dll (Windows).

While I won’t go into detail about how these libraries work in those contexts, they are fairly similar in that the code must be position-independent to integrate properly with existing executables and should have the ability to be called by multiple programs at once.

On Linux, there are several standard locations where it is customary to store and search for shared libraries. First, /lib contains libraries needed to boot the system and to run the essential commands in /bin and /sbin. Second, /usr/lib contains mature shared libraries routinely used by the system. Third, /usr/local/lib contains user-installed and homemade shared libraries. These locations are typically searched by default when loading a shared library into memory, although whether /usr/local/lib is included depends on the distribution's loader configuration. Other locations can be added with the ldconfig utility, as described below.

How to use a shared library in a program

Once you have compiled and set up your shared library on your system, the process of using it is similar to using a static library. With gcc you must specify both where in the file system the library file is located and what it is called. The -L flag tells the linker where to search, and the -l flag immediately precedes the base name of the library; gcc prefixes whatever you write with lib and appends the .so extension. For example, if you make a library called libdyn.so, and it is located in the same directory as your source code, you would write gcc myfile.c -L. -ldyn. The library flags go after the source files that use them, because the linker resolves symbols in command-line order and some toolchains will otherwise report undefined references.

This is not the only way to use a dynamic library, however. In practice, to gain the benefit of code sharing across applications, your shared library should be easily discoverable by other programs. The way this is done under Linux is by maintaining a cache of all the known shared libraries on the system, along with the symbolic links that map library names to specific versions, to obviate the need to search through a list of directories every time a program that depends on a shared library is run. The command-line utility ldconfig is used to update this cache and these links every time a dynamic library is updated or a new one is added to the system.

Alternatively, if you only want to provide short term access to a new library, you can set the LD_LIBRARY_PATH environment variable to include the directory containing the .so file. Like the PATH variable, LD_LIBRARY_PATH is a colon-separated list of directories to search.

One other method is to bake the path to your dynamic library into the executable itself by passing the -rpath=<path/to/lib> option to the linker (when invoking the linker through gcc, this is written -Wl,-rpath=<path/to/lib>). This stores the library's directory in the executable, where the loader will find it at run time. This is not optimal, however, because moving the library will break the program.

Overall, dynamic linking of shared libraries can streamline your application development by separating the library code from the program code. It also reduces the size of executables by including only the logic of the program in the machine code. And it provides a mechanism for widely used code, such as the C standard library, to be shared among the many processes that need to use it.
