Run, config, run: How we speed up config deploy

Eugene Tupikov
Bumble Tech

--

Configuration files (configs) are an integral part of most applications, but practice shows that they are far from the most popular topic of discussion. Conversations about configs are mostly limited to working with them directly in the code: how to structure them, whether or not to use environment variables, where to store passwords, and so on.

In my view, there is another side to working with configs that deserves attention — deployment. Over the course of my career, I have seen quite a number of ways of deploying configs. I explore them in this article and hope that everyone will learn something new.

A few years ago I was working on a system which enabled us to speed up the config deploy process for 1,000+ servers from around a minute to just a few seconds.

If you are interested in knowing how the config deploy process is set up at Bumble Inc — the parent company operating Badoo and Bumble apps — and the tools we use, then scroll on.

Any tool is intended to solve a specific task. Before we go on to describe our config deploy tools, here is a little background.

Real-life situation(s)

We have over a hundred different services, and their number continues to grow. At the same time, each service can have anything from two or three to several hundred instances, which means that our code has to offer the option to promptly stop sending queries to a given instance. After all, as Murphy’s Law says, “What can go wrong, will go wrong.”

The simplest example is that one of the instances of a service starts to misbehave, and we need to remove traffic from it in order to avoid things snowballing (query run times increase, queries start to queue up, which in turn pushes run times up even further, and so on).

Another example is scheduled work, such as a service update. In this case, we know beforehand that the service is going to be unavailable for some time, so there is no point sending queries to it.

To switch off services, we use our own system called “disable hosts”. The principle it operates by is pretty simple:

  • we select the services to be switched off (or switched on) in the web interface
  • we click the Deploy button
  • the changes are saved in the database and are then delivered to all the machines on which PHP code is run.

To check host availability in the code, we use the following method:

if (\DownChecker\Host::isDisabled($host)) {
    $this->errcode = self::ERROR_CONNECT_FAILED;
    return false;
}

In the first version of “disable hosts”, to deliver changes to the servers we decided to use our main config deploy system, called mcode. It runs a given set of commands over SSH, with some additional orchestration around the process.

The deploy process breaks down into several steps:

  • packing the configs to a tar archive
  • copying the archive to a server via rsync or scp
  • unpacking the archive to a separate directory
  • switching the symlink to the new directory (the symlink switch is sketched below).
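The last step is what makes the swap effectively atomic for readers. As an illustration only (this is not mcode itself, whose internals are not shown here), a minimal PHP sketch of such a symlink switch might look like this, assuming the archive has already been unpacked into a new directory and the paths are made up for the example:

<?php
// Illustrative sketch of an atomic symlink switch, assuming the configs
// have already been unpacked into $newDir. Paths are invented for the example.
function switch_config_symlink(string $newDir, string $link = '/local/configs/current'): void
{
    $tmpLink = $link . '.tmp';
    if (is_link($tmpLink)) {
        unlink($tmpLink);
    }
    // Create the new symlink under a temporary name first...
    symlink($newDir, $tmpLink);
    // ...then rename() it over the old one: on POSIX systems rename() is atomic,
    // so readers always see either the old directory or the new one, never a mix.
    rename($tmpLink, $link);
}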

The distinctive thing about mcode is that it waits for the current step to be completed on each server before moving on to the next one. On the one hand, this provides greater control over each step and guarantees that for 99% of servers the changes will arrive at more or less the same moment in time (pseudo-atomicity). On the other hand, problems with one of the servers can cause the deploy process to freeze while it waits for the current command to time out. If a command fails on a given server several times in a row, the server is temporarily excluded from the deployment. This reduces the impact that problem servers have on the overall duration of a deploy.

Because of this, mcode provides no guarantee that changes will be delivered quickly. All you can do is predict the maximum duration, i.e. calculate the sum of the timeouts for all steps.

Now, let’s imagine that we are starting to experience problems with a particular service. We have added it to the list of disabled hosts but, in the process of the config deploy, one of the servers has become temporarily unavailable (for example, network problems have arisen). While the system waits for the current command to time out, we are losing precious time.

It became clear that our traditional way of deploying configs was unsuitable in this case, and an alternative was required. We need high-speed delivery of changes, and for this we need to deploy to each server independently of the others. At the same time, we accept that a proportion of the servers may temporarily keep an obsolete version of the config.

Looking for an alternative transport

At this point, it might be reasonable to ask: Why reinvent the wheel if you can use a classic approach with a database and caching?

Overall, this is an approach that works but, in my view, it is inferior to the file-based approach for a number of reasons:

  • Using a database and cache means additional dependency on an external service, and that means another potential point of failure
  • Reading from a local file is always faster than a query to the network, and it may sound a bit over the top, but every millisecond counts for us
  • Another advantage of reading from a file is that the whole config is essentially a static array, which means that after the first run it is cached in OPCache, delivering a slight performance boost.

However, the approach with a database may be slightly improved upon: instead of reading directly from the source (the database), a script running on each server with a set periodicity (for example, via cron) reads the data you need from the database and updates the local config. I think a lot of us have used this approach, for example, for “warming up” a cache.
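As a rough illustration of this variant (the paths and the fetch function are assumptions, not our actual code), such a cron script could look like this:

<?php
// Hypothetical cron script: re-generate a local PHP config from the database.
// db_fetch_disabled_hosts() stands in for the real data-access layer.
$disabledHosts = db_fetch_disabled_hosts(); // e.g. ['host1', 'host42', ...]

// The generated file returns a static array, so after the first include
// it is cached by OPCache and subsequent reads are practically free.
$code = '<?php return ' . var_export($disabledHosts, true) . ';';

// Write to a temporary file and rename() it so readers never see a
// half-written config.
$target = '/local/configs/disabled_hosts.php';
file_put_contents($target . '.tmp', $code);
rename($target . '.tmp', $target);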

However, with this approach we disliked the idea of constant polling. We have about 2,000 servers on which PHP code can run. If each performs just one query to the database/cache per second, this creates quite a large stream of queries (2k rps), while the data itself is not updated that frequently. There is a solution for this: an event-based model, for example the Publisher-Subscriber (PubSub) design pattern. Of the popular solutions you could use Redis but, in our case, it didn’t “catch on” (we use Memcache for caching, and we have our own separate system for queueing).

On the other hand, Consul, which has a mechanism for tracking changes (watches) based on blocking queries, did “catch on”. Overall, it is similar to PubSub and fits our approach of event-based config updating. We decided to build a prototype of the new transport on top of Consul, which over time evolved into a separate system called AutoConfig.

How AutoConfig works

This is how the system works in general terms:

  • The developer calls a special method which updates the value for the key in the database and adds the key to the deploy queue
  • A script running in the cloud pops a batch of keys from the queue and writes them to Consul; besides the keys themselves, a special index map with a list of all the keys is saved to Consul (more on this a bit later)
  • Consul watch is run on each server — this tracks changes to the index map and calls a special script handler when the index map is updated
  • The handler writes changes to a file.

As for the code, working with the system is quite simple:

Updating the key

$record = \AutoConfig\AutoConfigRecord::initByKeyData('myKey', 'Hello, Medium!', 'Eugene Tupikov');
$storage = new \AutoConfig\AutoConfigStorage();
$storage->deployRecord($record);

Reading the key

$reader = new \AutoConfig\AutoConfigReader();
$config = $reader->getFromCurrentSpace('myKey');

Removing the key

$storage = new \AutoConfig\AutoConfigStorage();
$storage->removeKey('example');

More about Consul watch

Consul watch is implemented using blocking queries in the HTTP API. How this works is best shown with the example of tracking changes to a single key.

Firstly, a new key has to be created. To do so, we open the terminal and run the following command:

curl -X PUT --data 'hello, Medium!' http://127.0.0.1:8500/v1/kv/medium-key

Then we send a query to read our key:

curl -v http://127.0.0.1:8500/v1/kv/medium-key

The API processes the query and returns a response containing the X-Consul-Index HTTP header, a unique identifier that corresponds to the current state of our key.

...
...
< X-Consul-Index: 266834870
< X-Consul-Knownleader: true
...
...
<
[
  {
    "LockIndex": 0,
    "Key": "medium-key",
    "Flags": 0,
    "Value": "dXBkYXRlZCAy",
    "CreateIndex": 266833109,
    "ModifyIndex": 266834870
  }
]

We send a new read query, this time passing the value from the X-Consul-Index header in the index query parameter.

curl http://127.0.0.1:8500/v1/kv/medium-key?index=266834870
...
...
Waiting for changes to the key...

When it sees the index parameter, the API starts waiting for changes to our key, either until the key changes or until the assigned timeout is exceeded.

Then we open a new terminal tab and send a query to update the key:

curl -X PUT --data 'updated value' http://127.0.0.1:8500/v1/kv/medium-key

We come back to the first tab and see that the read query has returned an updated value (the Value and ModifyIndex fields have changed):

[
  {
    "LockIndex": 0,
    "Key": "medium-key",
    "Flags": 0,
    "Value": "dXBkYXRlZA==",
    "CreateIndex": 266833109,
    "ModifyIndex": 266835734
  }
]

When we call the command

consul watch -type=key -key=medium-key <handler>

Consul watch automatically runs the sequence of calls described above and calls the handler whenever the value of the key changes.
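To make the mechanism more tangible, here is a rough PHP sketch of the same blocking-query loop (the agent address, the wait parameter and run_handler() are assumptions for the example, not part of Consul or our code):

<?php
// Simplified long-polling loop, roughly what `consul watch` does internally.
$index = 0;

while (true) {
    $url = 'http://127.0.0.1:8500/v1/kv/medium-key?index=' . $index . '&wait=5m';

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 330); // a bit longer than the wait parameter
    $response = curl_exec($ch);

    if ($response === false) {
        curl_close($ch);
        sleep(1); // network error: back off and retry
        continue;
    }

    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);

    $headers = substr($response, 0, $headerSize);
    $body    = substr($response, $headerSize);

    // A changed X-Consul-Index means the key has (potentially) been modified
    if (preg_match('/X-Consul-Index: (\d+)/i', $headers, $m) && (int)$m[1] !== $index) {
        $index = (int)$m[1];
        run_handler(json_decode($body, true)); // run_handler() is a hypothetical callback
    }
}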

Why we need an index map

Besides tracking changes to a specific key, you can subscribe to changes to keys with a given prefix. Let’s assume that all our keys share the prefix auto_config. Run the command

consul watch -type=keyprefix -prefix=auto_config/ <handler>

And there is a surprise in store: change any one of the keys, and the list of all the keys will be returned. It was clear that the number of keys would only grow over time, and with it the total amount of transmitted data, increasing the load on the network; this needed to be avoided.

On this point, there has been an issue open on GitHub for quite a while already and, judging by the comments, things are moving forward: the Consul developers have started work on improving the blocking query subsystem, which should resolve the issue referred to above.

To work around this limitation, we decided to use an index map: a list of all available keys together with a hash of each current value. Consul watch tracks changes to this index map.

This is the format of the index map:

return [
    'value' => [
        'version' => 437036,
        'keys' => [
            'my/awesome/key' => '80003ff43027c2cc5862385fdf608a45',
            ...
        ],
        'created_at' => 1612687434,
    ],
];

If the index map is updated, the handler:

  • reads the current state of the map from the disk
  • finds the keys that have changed (for this you need the hash of the value)
  • via HTTP API reads the current values of the keys that have changed, and updates the necessary files on the disk
  • saves the new index map to the disk (a simplified sketch of such a handler follows below).
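The sketch below is only meant to show the general flow; the file paths and layout are assumptions rather than the actual AutoConfig implementation, and keys are assumed to contain only URL-safe characters:

<?php
// Simplified handler: diff the new index map against the one on disk and
// refresh only the changed keys. Paths and layout are illustrative.
function handle_index_map_update(array $newMap): void
{
    $mapFile = '/local/autoconfig/index_map.php'; // assumed location
    $oldMap  = is_file($mapFile) ? include $mapFile : ['keys' => []];

    foreach ($newMap['keys'] as $key => $hash) {
        if (($oldMap['keys'][$key] ?? null) === $hash) {
            continue; // value unchanged, nothing to do
        }

        // Read the current value of the changed key via Consul's HTTP API
        $raw   = file_get_contents('http://127.0.0.1:8500/v1/kv/auto_config/' . $key);
        $data  = json_decode($raw, true);
        $value = base64_decode($data[0]['Value']); // Consul returns values base64-encoded

        // Atomically update the config file for this key
        $target = '/local/autoconfig/keys/' . md5($key) . '.php';
        file_put_contents($target . '.tmp', '<?php return ' . var_export($value, true) . ';');
        rename($target . '.tmp', $target);
    }

    // Persist the new index map so the next run can diff against it
    file_put_contents($mapFile . '.tmp', '<?php return ' . var_export($newMap, true) . ';');
    rename($mapFile . '.tmp', $mapFile);
}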

And a bit more about Consul

Consul is an excellent system, but over the years it has been used as the transport for AutoConfig we have discovered a range of limitations that we have had to adapt our system to.

Limits on the size of a key’s value (and not just that)

Consul is a distributed system, and to achieve consistency it uses the Raft protocol. For the protocol to operate stably, Consul sets the maximum size of the value for one key at 512 KB. This can be changed using a special option, but doing so is not recommended, since the change may cause the whole system to behave unpredictably.

We use transactions to write the changed keys and the index map atomically. An analogous limit of 512 KB applies to the size of a single transaction. Besides this, the number of operations permissible in a single transaction is limited to 64.

To bypass these limitations we have:

  • split the index map down into several parts (1,000 keys each) and only update the parts containing updated keys
  • limited the maximum size of a single AutoConfig key to 450 KB, in order to leave space for shards of the index map (values chosen by trial and error)
  • improved the script which processes the deploy queue so that it (see the sketch below):
    - first reads N keys from the queue and checks their cumulative size
    - if the size does not exceed the set limit, it deploys all the keys at once; otherwise it deploys the keys one by one.
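A minimal sketch of this batching logic, assuming the 450 KB limit from above; consul_txn_put() is a hypothetical stand-in for the real transaction call, not an actual Consul client function:

<?php
// Illustrative batching logic for the deploy queue; consul_txn_put() is a
// hypothetical helper wrapping Consul's transaction API.
const MAX_TXN_SIZE = 450 * 1024; // leave headroom below Consul's 512 KB limit

function deploy_batch(array $queuedKeys): void
{
    $totalSize = 0;
    foreach ($queuedKeys as $value) {
        $totalSize += strlen($value);
    }

    if ($totalSize <= MAX_TXN_SIZE) {
        // Everything fits into one transaction: deploy the whole batch at once
        consul_txn_put($queuedKeys);
    } else {
        // Too big for a single transaction: fall back to per-key deploys
        foreach ($queuedKeys as $key => $value) {
            consul_txn_put([$key => $value]);
        }
    }
}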

No built-in replication

Out of the box, Consul does not have replication between data centres. We have several data centres, so we are forced to write changes to each one sequentially. Periodically, situations occur where data has been written to one data centre but not to another, which leads to inconsistency. Where a temporary error has occurred we retry the query, but this does not solve the problem completely.

We have a separate script for fixing inconsistencies. It reads the current state of all keys from the database, compares them with what is stored in Consul, and updates the data if there is a discrepancy.
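In pseudo-PHP, the reconciliation boils down to something like this (every helper function here is hypothetical, named only for the sake of the sketch):

<?php
// Sketch of the consistency check between the database (source of truth)
// and one Consul datacentre; all helper functions here are hypothetical.
function fix_inconsistencies(string $dc): void
{
    $expected = db_fetch_all_autoconfig_keys(); // key => value, from the database
    $actual   = consul_fetch_all_keys($dc);     // key => value, from this datacentre

    foreach ($expected as $key => $value) {
        if (!array_key_exists($key, $actual) || $actual[$key] !== $value) {
            consul_put($dc, $key, $value); // re-deploy the missing or stale key
        }
    }

    // Remove keys that were deleted from the database but still exist in Consul
    foreach (array_diff_key($actual, $expected) as $key => $value) {
        consul_delete($dc, $key);
    }
}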

In lieu of a conclusion

Despite all the limitations described above, we have managed to create a system that delivers configs to most servers within 3–10 seconds.

Should a given server be temporarily unavailable, the system automatically restores the current state of the config as soon as the server returns to normal.

In recent years the system has become popular with our developers, and today it is the de facto standard for configs that do not require atomic distribution. For example, via AutoConfig we deploy the parameters for A/B tests and promo campaign settings; our Service Discovery functionality is implemented on top of it, and much more besides.

The total number of keys at present is of the order of 16,000, and their cumulative size is around 120 MB.

Thanks for reading!
