Well, that’s a mouthful… Anyway let’s start with some context. I am a French software developer working for Doctolib in our Berlin offices with a team of developers and product owners.
We build Zipper, a standalone program that stands between the Doctolib website in a browser and our partners’ software to bind everything together, all thanks to Native Messaging. These bridges help our users save time by removing the need for double entry of patient data, in Doctolib and in their own tools, with easy navigation between the two.
Sometimes we need to have Zipper hook into native libraries, some of which are proprietary. For this purpose node-ffi usually works fine, until we need to call asynchronously into thread-unsafe libraries. Node.js being inherently concurrent, mixing these causes trouble.
Since pkg runs with Node.js we can do everything a Node.js program can do, including loading and calling into DLLs (if you don’t know what this means, hang on; I will explain it later). Recently though, we needed to interact with some software by simulating user input, so we turned to AutoIt, a scripting language designed to interact with Windows GUI elements, whose functions are also available as a DLL. It turns out this library is not thread-safe, and by using node-ffi naively we would get into trouble (crashes, mostly) by issuing concurrent calls. But before going any further, let’s have a refresher on what DLLs are and how they are used, especially with Node.js.
Intro to node-ffi
Dynamic-link libraries (DLLs) contain code a running Windows program can load and execute. Some are provided by the operating system, some are provided by third parties and are installed either by the programs that need them or separately by the user. “Dynamic-link” means the libraries are loaded at runtime as opposed to being included directly into your executable, which has an interesting consequence: as long as the interface is the same, you can swap library versions and still have your program work fine with them without rebuilding it.
node-ffi is the de facto standard for loading and calling into DLLs (and their equivalents on other systems) from Node.js. It provides you with an object whose functions represent functions from the library, which you can call synchronously or asynchronously. Let’s see an example.
toto.dll is a library that was provided to us by a third party, along with toto.h, a C header file which contains the definitions of the functions from the library.
This simple library provides two functions:
- toto_foo has two integer parameters and returns an integer
- toto_bar accepts a single pointer argument and returns nothing
Using node-ffi we can load this library like this:
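A minimal sketch of such a declaration, assuming node-ffi’s `ffi.Library(name, signatures)` form, where each entry maps a function name to `[returnType, [argumentTypes]]`. The `loadToto` helper, and passing the `ffi` module in as a parameter, are hypothetical and only there to keep the snippet self-contained; the exact pointer type of `toto_bar` is also an assumption.

```javascript
// Signatures transcribed from toto.h, in node-ffi's
// [returnType, [argumentTypes...]] format.
const totoSignatures = {
  toto_foo: ['int', ['int', 'int']], // two integers in, one integer out
  toto_bar: ['void', ['pointer']],   // one pointer in, nothing out
};

// Hypothetical helper: takes the node-ffi module and returns an object whose
// properties are the functions declared above.
function loadToto(ffi) {
  return ffi.Library('toto', totoSignatures);
}

// Usage, assuming node-ffi is installed and toto.dll can be found:
//   const toto = loadToto(require('ffi'));
//   toto.toto_foo(1, 2);                                  // synchronous call
//   toto.toto_foo.async(1, 2, (err, result) => { /* ... */ }); // asynchronous call
```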
This works fine, until you find yourself with a thread-unsafe library.
What if the library is not thread safe?
Libraries have initialization code and deinitialization code, can allocate or deallocate memory, and have access to all the same memory as your main process. But most importantly, they can hold global state. Anyone who’s ever worked with concurrency most certainly knows that concurrency and global state cause much sadness and suffering when put together.
Suppose toto_foo is thread-unsafe. Maybe it uses some global state or does some I/O that is not properly synchronized. The following code will randomly crash or misbehave because node-ffi runs asynchronous calls on worker threads, so Node.js may have multiple threads calling into the library simultaneously, which the library does not expect.
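Here is a hedged sketch of the kind of code that runs into this. It assumes a `toto` object loaded with node-ffi and uses node-ffi’s `.async(...args, callback)` form; `callManyTimes` is a hypothetical helper name. Nothing waits for one call to finish before starting the next, so several calls can be inside `toto_foo` at once.

```javascript
// Hypothetical helper: fires `count` asynchronous calls to toto_foo without
// waiting for any of them to finish before starting the next one.
function callManyTimes(toto, count) {
  const pending = [];
  for (let i = 0; i < count; i++) {
    pending.push(new Promise((resolve, reject) => {
      // With node-ffi, .async() runs the native function on a worker thread.
      toto.toto_foo.async(i, i + 1, (err, result) => {
        if (err) reject(err);
        else resolve(result);
      });
    }));
  }
  return Promise.all(pending);
}

// Usage with a real, thread-unsafe library (may crash or return garbage):
//   callManyTimes(toto, 100).then(console.log);
```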
Using synchronous calls
The obvious solution in that case would be to use synchronous calls.
Sadly, this could not work for our purposes: we use many AutoIt functions which wait for specific events to happen, and calling them synchronously would block our process from performing any of its other tasks in the meantime.
Serializing asynchronous calls
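One way to keep asynchronous calls from overlapping is to chain them on a promise, so that each call only starts once the previous one has settled. The sketch below is an illustration under that assumption, not Zipper’s actual code: `createSerializer` is a hypothetical helper, and a task is assumed to be any function returning a promise, for example one wrapping a node-ffi `.async()` call.

```javascript
// Returns a function that runs promise-returning tasks strictly one at a time,
// in the order they were submitted.
function createSerializer() {
  let tail = Promise.resolve(); // settles when the last queued task is done

  return function serialize(task) {
    // Start the task only after everything queued before it has settled.
    const result = tail.then(() => task());
    // Keep the chain usable even if this task rejects.
    tail = result.catch(() => {});
    return result;
  };
}

// Usage, assuming `toto` was loaded with node-ffi:
//   const serialize = createSerializer();
//   const foo = (a, b) => serialize(() => new Promise((resolve, reject) =>
//     toto.toto_foo.async(a, b, (err, res) => (err ? reject(err) : resolve(res)))));
```

From the caller’s point of view the calls are still asynchronous, but only one of them is ever inside the library at a time.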
Fixing the library
In the case of a free and open-source library, or a library you built yourself, you can of course fix the library to make it thread-safe. There is no way I can cover this subject in a single blog post, or even many. For each library, adding support for multi-threading is a different problem that requires solid knowledge of concurrent programming and of the internals of the library being modified, plus lots of time, especially for larger libraries.
Wrapping the library
This is the solution we eventually went for. We actually had cases where we needed to wait on some event using AutoIt, while simultaneously issuing other calls that would cause this event to happen. However, the DLL’s implementation of the waiting function was blocking. node-ffi lets us run this blocking function asynchronously by running it in a separate thread.
However, serializing the calls would inevitably lead to a deadlock. Suppose we simultaneously run
- a call that waits for an event
- a call that contributes to producing said event
If the calls are serialized, the second call will never start until the first one returns, and the first one will never return (unless it times out, which isn’t what we want either).
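The situation can be reproduced without any DLL. In the sketch below everything is a stand-in: an `EventEmitter` plays the role of the GUI, `waitForEvent` plays the blocking wait (with a timeout so the demo terminates), and `triggerEvent` plays the call that makes the event happen. All names are hypothetical.

```javascript
const { EventEmitter } = require('events');

// Minimal one-at-a-time queue: each task starts after the previous one settles.
function createSerializer() {
  let tail = Promise.resolve();
  return (task) => {
    const result = tail.then(() => task());
    tail = result.catch(() => {});
    return result;
  };
}

// Stand-in for a blocking "wait for event" call: resolves to true if the event
// fires within timeoutMs, false otherwise.
function waitForEvent(emitter, timeoutMs) {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    emitter.once('ready', () => { clearTimeout(timer); resolve(true); });
  });
}

// Stand-in for a call whose side effect makes the event happen.
function triggerEvent(emitter) {
  emitter.emit('ready');
  return Promise.resolve();
}

// Issues both calls through `run` and reports whether the wait saw the event.
async function demo(run) {
  const emitter = new EventEmitter();
  const waiting = run(() => waitForEvent(emitter, 50));
  const triggering = run(() => triggerEvent(emitter));
  const [sawEvent] = await Promise.all([waiting, triggering]);
  return sawEvent;
}

// Serialized: the trigger is stuck behind the wait, which can only time out.
//   demo(createSerializer())   resolves to false
// Not serialized: both calls are in flight at once and the wait sees the event.
//   demo((task) => task())     resolves to true
```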
Because we do not have access to the sources of the AutoIt library we could not try to make it thread-safe, so we decided we would write a wrapper around the library which exposes an identical interface (making our wrapper a drop-in replacement for the real library). I will only give a high-level overview of this solution because it is quite a bit more complex than the previous ones I presented. If you are curious you can get the code to our wrapper on GitHub.
We were thinking: this library is a high-level wrapper for Windows system calls which are thread-safe, so the issue had to be in the library implementation, likely in the form of global state or the like. So we thought a possible solution would be to load the library multiple times, each time instantiating a duplicate of its internal state. And so as a proof-of-concept we built a wrapper with no internal state which for every call to the library would
- Load the library (with LoadLibrary).
- Get the function we’re calling.
- Call it.
- Unload the library (with FreeLibrary).
This did not work. It turns out calling LoadLibrary multiple times to load the same library always returns the same instance. The more flexible LoadLibraryEx does not have an option to override this either, so we decided to trick Windows into believing we were loading a different library. Thus our second proof-of-concept attempt was still a stateless wrapper, which would do the following at each call:
- Find the library.
- Copy it to a temporary file.
- Load the temporary file as a library.
- Get the function we want to call.
- Call it.
- Unload the library.
- Delete the temporary file.
It roughly looks like this:
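The steps above can be sketched as follows. This is an illustration in JavaScript rather than the wrapper’s actual code. `callInFreshCopy` is a hypothetical name, and `loadLibrary` is a hypothetical loader passed in by the caller: given a file path, it is assumed to return an object with `getFunction(name)` and `unload()`, which on Windows would sit on top of LoadLibrary, GetProcAddress and FreeLibrary.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Calls `functionName` from a private copy of the library at `libraryPath`.
function callInFreshCopy(loadLibrary, libraryPath, functionName, args) {
  // Copy the library to a uniquely named temporary file, so that the OS
  // treats it as a different library and gives it its own state.
  const ext = path.extname(libraryPath);
  const base = path.basename(libraryPath, ext);
  const suffix = crypto.randomBytes(8).toString('hex');
  const tempPath = path.join(os.tmpdir(), `${base}-${suffix}${ext}`);
  fs.copyFileSync(libraryPath, tempPath);

  let library;
  try {
    library = loadLibrary(tempPath);              // load the temporary file
    const fn = library.getFunction(functionName); // get the function
    return fn(...args);                           // call it
  } finally {
    if (library) library.unload();                // unload the library
    fs.unlinkSync(tempPath);                      // delete the temporary file
  }
}
```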
Of course, this is getting ridiculously inefficient because it copies, loads, and deletes a file for each and every call to the library, but it works! We could do many simultaneous calls and nothing broke (almost, more on that later). We later improved the performance by keeping instances of the library in a pool so that we don’t need to copy and load it for every call.
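The pooling idea can be sketched like this. It is a simplified illustration, not the wrapper’s implementation: `createPool` is a hypothetical helper, and `createInstance` is assumed to be a caller-supplied function that copies and loads the library once and returns an object representing that instance.

```javascript
// A small pool of library instances. Each instance is lent to one call at a
// time; a new instance is only created when every existing one is busy.
function createPool(createInstance) {
  const idle = []; // instances that are loaded but not currently in a call

  // Borrows an instance for the duration of `task`, then hands it back.
  return async function withInstance(task) {
    const instance = idle.length > 0 ? idle.pop() : createInstance();
    try {
      return await task(instance);
    } finally {
      idle.push(instance); // available again for later calls
    }
  };
}

// Usage, with a hypothetical instance exposing promise-returning functions:
//   const withInstance = createPool(loadPrivateCopyOfLibrary);
//   withInstance((lib) => lib.someFunction(arg));
```

In this sketch the pool only ever grows; a real implementation would probably also cap its size and unload idle instances at some point.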
While this strategy worked fine for our purposes, it is only a first working solution. It has allowed us to use AutoIt to interact with multiple GUI elements simultaneously, speeding up these interactions significantly! (One particular form used to take about a second and a half to fill, and is now complete in about 100 milliseconds.) There is much room for improvement: we could for example build a generic tool that would apply this technique to arbitrary libraries.
Are you a full-stack developer? Do you want to join an awesome team dedicated to easing access to healthcare in France and Germany? We are recruiting! Get in touch with us at https://about.doctolib.com/jobs and join us in our Paris or Berlin offices!