One year of continuous product improvements at ArangoDB

Santo Leto
11 min read · May 6, 2019


Introduction

You may have heard that ArangoDB v3.4 was declared GA and ready for production use on December 6, 2018.

Before the v3.4.0 GA release, several RC versions were published between September and November 2018, and the first public milestone of 3.4 was released in January 2018. In other words, the Engineering Team at ArangoDB invested about one year of effort and time in v3.4.

If you read the list of new features and over forty improvements delivered with v3.4, you can easily understand why it took some time to complete this version, and what made it kind of a “special” release. I believe the words of one of our users express this very well:

“I just read a book called ‘What’s new in 3.4’. Congrats on the fat and interesting update.”

Good feedback like the above is a great motivator for the entire ArangoDB family, who works daily (literally) to make the product better.

However (and here we come to the main topic of this post), one might think that all the effort of our Engineering Team was focused on the new 3.4 version. Actually, this is not exactly true.

While working on new major versions and delivering new features and improvements is important, it is also important to improve the experience of users and Customers who are already using ArangoDB in production — and they are many!

So what has happened during the last year to the previous GA version, 3.3, and what has been included in the several patch-releases that have been published?

The short answer is “indeed a lot”.

The ArangoDB Product and Engineering Teams, in my opinion, have done a great job of not only providing the needed bug fixes, but also listening to the feedback coming from our paying Customers and from the Technical Support Engineers who are passionately focused on keeping them happy and successful. The net result is that the 3.3 series has greatly improved over time.

I will go into some detail on a few of the improvements delivered in the 3.3 patch releases; other improvements are mentioned at the end of this post, and the full changelog for 3.3 can be found here. The improvements covered in detail are:

  1. Rolling Upgrades of ArangoDB Starter deployments (v3.3.8)
  2. Brand New Replication Tab in the ArangoDB Web Interface (v3.3.11)
  3. New arangorestore options to simplify change of shard structure in Clusters (v3.3.22)
  4. Fast Cluster Restore Procedure (any 3.3 version)

v3.3.8 — Rolling Upgrades of ArangoDB Starter deployments

The ArangoDB Starter tool greatly simplified the way one can deploy ArangoDB, especially Clusters, Multiple Data-Center Clusters, and Active Failover setups.

While upgrading these deployments was possible with some "manual" work, there was no integrated mechanism inside the Starter to perform a rolling upgrade, i.e. an upgrade that is done in a rolling fashion and does not require downtime of the service. This arrived in v3.3.8 (and was very welcome!) and was then further improved in v3.3.14 with the new Starter command:

arangodb upgrade --starter.endpoint=<endpoint-of-a-starter>

(from v3.3.8 to v3.3.13, the rolling upgrade was instead triggered by sending an HTTP POST request to the endpoints of all the Starters that are part of the deployment)

In v3.3.14, new Starter commands to retry or abort the upgrade (arangodb retry upgrade and arangodb abort upgrade, respectively) were also included.
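
To give an idea of how these commands fit together, here is a minimal sketch. The endpoint below is just the Starter's default local address, and I am assuming the retry/abort subcommands accept the same --starter.endpoint option; check the Starter documentation for your version:

# Trigger the rolling upgrade via any one of the Starters of the deployment
arangodb upgrade --starter.endpoint=http://localhost:8528

# If the upgrade plan gets stuck, it can be retried ...
arangodb retry upgrade --starter.endpoint=http://localhost:8528

# ... or aborted
arangodb abort upgrade --starter.endpoint=http://localhost:8528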

The work that was needed to allow such rolling upgrades, probably not visible to external users, was the new maintenance mode of the Agency's supervision job.

You can read the full details of the rolling upgrade procedure for Starter deployments in our Documentation (the procedure can be used to upgrade from 3.2 to 3.3 and from 3.3 to 3.4).

v3.3.11 — Brand New Replication Tab in the ArangoDB Web Interface

This section first requires a short introduction.

Version 3.3.0 included a new deployment mode of ArangoDB: the Active Failover.

In an Active Failover setup:

  • One ArangoDB Single-Server instance is readable and writable by clients, and is called the Leader
  • One or more ArangoDB Single-Server instances are passive and not writable; they are called Followers and asynchronously replicate data from the Leader
  • One Agency acts as a "witness" to determine which server becomes the Leader in a failure situation

Version 3.4 further improves Active Failover, allowing reads from Followers (which were not possible in 3.3).
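
As a side note, the Starter mentioned in the previous section is also the easiest way to bring up such a setup. The sketch below is only a hedged example: it assumes three hosts named A, B and C, and uses the activefailover Starter mode (older Starter releases may call this mode resilientsingle):

# Run this on each of the hosts A, B and C; the deployment then elects a
# Leader among the single-server instances, the others acting as Followers
arangodb --starter.mode=activefailover --starter.join=A,B,C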

With the introduction of Active Failover, which coexists with the traditional Master/Slave setup that was present in 3.2 (and is still available in 3.3 and 3.4), the possible "roles" a single instance can have increased. In fact, in addition to being a standard single instance, one server could be:

  • A Master of the legacy Master/Slave setup
  • A Slave of the legacy Master/Slave setup (which replicates specific databases, or all databases, in case the global replication applier is in use)
  • A Leader of the Active Failover mode
  • A Follower of the Active Failover mode (global applier is in use in this case: the Followers replicate all databases of the Leader)

It soon became evident that it was necessary — and important — to improve the Replication Tab of the ArangoDB Web Interface so that additional and more useful information could be provided to the users for each of the combinations described above.

This is exactly what happened in v3.3.11 (a clear example of how feedback coming from our Open Source Community, internally advocated by the ArangoDB Support Team, translated into a very specific, delivered product improvement).

v3.3.22 — New arangorestore options to simplify change of shard structure in Clusters

This is a recent improvement, delivered with v3.3.22 (as well as v3.4.2).

Having an optimal shard structure in place in a Cluster is crucial: query performance depends on it. (Having an optimized data model in place is obviously important as well and, actually, it is best to think about and discuss all these topics, i.e. data model, shard structure and AQL queries, together and well before going into production.)

Unlike the replication factor (another crucial parameter to keep in mind when using the Cluster), the shard structure cannot be changed dynamically at runtime, while the Cluster is up and running. On the contrary, changing it requires scheduling a maintenance window and, in most cases, a dump and restore. (Using DC2DC to replicate the data of the Cluster into another Cluster that has a different shard structure in place can also help, should you need to change or test a new shard structure while minimizing the operational impact on your production Cluster.)

ArangoDB uses hash sharding based on the shard key, so changing the number of shards requires a complete redistribution of the data; this cannot be done at runtime without major disturbance of operations. We will address this problem in future versions. Going into more technical detail on shard keys and shard structure is outside the scope of this post.
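
As a reminder, the number of shards (together with the shard keys) is fixed when a collection is created. The following is just a hypothetical example run against a Coordinator, with a made-up collection name, shard count and shard key:

# Create a collection with 9 shards, sharded by the "customerId" attribute
arangosh --server.endpoint tcp://<coordinator-host>:8529 --javascript.execute-string \
  'db._create("orders", { numberOfShards: 9, shardKeys: ["customerId"] });'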

But what has improved regarding this topic from v3.3.22 (and v3.4.2)?

Starting from these versions, the arangorestore tool has a new (and more powerful) option called --number-of-shards, which deprecates the old --default-number-of-shards.

Using this new option to change the shard structure is easy:

  • Assume you have a specific shard structure, let's call it ShardStructure1, and you have taken a dump of your Cluster, let's call it Dump1
  • To put a new shard structure in place, let's call it ShardStructure2, you restore your dump using the --number-of-shards option to set a new number of shards for all of your collections, or for some of them (the option is quite flexible in allowing you to handle many different cases), in a way that reflects ShardStructure2; see the sketch after this list
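
For illustration, a possible invocation could look like the sketch below (endpoint, collection name and shard counts are made up; as far as I know the option can be repeated, taking either a default value or a per-collection value):

# Restore Dump1, using 2 shards by default and 8 shards for the "orders" collection
arangorestore \
  --server.endpoint tcp://<coordinator-host>:8529 \
  --input-directory Dump1 \
  --number-of-shards 2 \
  --number-of-shards orders=8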

Before 3.3.22 (or 3.4.2) the following was needed instead:

  • Create the collection structure (before the restore), manually or by making use of some code
  • Restore the backup (only the data, so that the newly created collections, which already have the new shard structure in place, are not replaced)

From an operational point of view, not having to perform the extra collection structure creation step helps — as you can do everything with just a single arangorestore command.
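
For comparison, a rough sketch of the pre-3.3.22 flow could look like this (hedged example; it relies on the existing --create-collection option of arangorestore to avoid overwriting the manually created collections):

# Step 1: create the collections with the new shard structure in place,
#         e.g. via arangosh or the Web Interface (not shown here)

# Step 2: restore only the data into the pre-created collections
arangorestore \
  --server.endpoint tcp://<coordinator-host>:8529 \
  --input-directory Dump1 \
  --create-collection false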

The fact that a dump and restore is needed still makes this operation a bit inconvenient, but as discussed above this is a necessary trade-off. Still, things are much easier now thanks to the new --number-of-shards option.

Note that in the same versions a new --replication-factor option has been added as well, to easily change the replication factor when restoring, but that is another topic :)

Fast Cluster Restore Procedure

The in-the-making 3.4 version of ArangoDB was already addressing parallelization of restores (a --threads option in arangorestore was introduced starting from v3.4.0). This was indeed important and needed to speed up and optimize the restore of a backup.

However, built-in parallelization of the restore required some work that could not easily be backported into the already GA 3.3 series. It is always important, in fact, not to compromise the stability of a GA version, while still striving to improve it.

What to do, then, to handle this topic in 3.3? Something was needed to address speed issues when restoring huge backups, especially in a Cluster.

A brand new Fast Parallel Restore procedure was hence introduced soon after 3.3 went GA, and it can also be used in 3.4 (on top of the built-in --threads option mentioned above) to further parallelize the restore.

The idea behind the Fast Parallel Restore procedure is not complex. In the "standard" procedure you pass the endpoint of a single Coordinator to arangorestore, and the restore tool uses it to restore the full backup into the Cluster. Now, if we could leverage the other Coordinators to restore in parallel, the process would be much faster. This has to be done, however, in a way that different collections are restored from different Coordinators, and we cannot leave this required check to our users and Customers: we needed a way to automate the process, to keep the operations required from users and Customers as simple as possible.

This is why the parallelRestore script was developed: it reads the content of a backup directory and produces different scripts (which you can then copy to the different Coordinators) to restore, from each Coordinator, a different subset of the collections. As the commands included in the generated scripts are ready to be executed, you do not need to double-check whether all collections are included and which one will be restored from which Coordinator: you just have to execute the scripts.
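
Conceptually, what the generated scripts boil down to is something like the following hedged sketch (this is not the actual parallelRestore output; endpoints and collection names are made up, and the dump directory must of course be accessible on each Coordinator):

# Executed on (or against) Coordinator 1
arangorestore --server.endpoint tcp://<coordinator-1>:8529 \
  --input-directory Dump1 --collection coll1 --collection coll2

# Executed on (or against) Coordinator 2, with a disjoint set of collections
arangorestore --server.endpoint tcp://<coordinator-2>:8529 \
  --input-directory Dump1 --collection coll3 --collection coll4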

The speed improvements obtained in 3.3 with the Fast Parallel Restore procedure are impressive, especially when combined with a temporary reduction of the replication factor to 1, which reduces the number of network hops needed during the restore (the replication factor is then increased again once the restore has finished).

If you are on 3.4 and wondering whether this procedure can give you benefits there as well, the answer is in general yes, but it highly depends on your specific case. The reason it helps is that, in addition to parallelizing the restore of multiple collections on a single Coordinator (which you obtain with the --threads option), with this procedure you also parallelize over multiple Coordinators: a parallelization across Coordinators plus, on each Coordinator, an additional parallelization, as the restore can use different threads.

Other Improvements & Conclusions

Is this blog post enough to list all the work done by our Product and Engineering Teams during the last year to improve the already GA-ed 3.3 and make our users and Customers happier and more successful?

Probably not. However, I do hope I gave you useful details on a few specific improvements that were delivered in 3.3, and at the same time highlighted how we handled the need (and challenge!) of improving a production-ready version (3.3) while working on the new, feature-rich 3.4 release.

Other improvements introduced in 3.3 that are, in my opinion, worth mentioning are:

  • A new --ignore-missing option in arangoimp, from version 3.3.1
  • A new --force-same-database option in arangorestore, from version 3.3.3
  • Improved write performance on the RocksDB engine, with the introduction of the new --rocksdb.throttle option, from version 3.3.3
  • A new fallback rule for databases and collections for which an access level is not explicitly specified, from version 3.3.4
  • New “collect-in-cluster” and “restrict-to-single-shard” optimizer rules, from version 3.3.5
  • Improved supportability thanks to the new debugDump helper function in arangosh, from version 3.3.6
  • A new /_admin/status HTTP API, from version 3.3.6
  • A new --database.required-directory-state option, from version 3.3.8
  • Improved startup resilience in case there are datafile errors (MMFiles storage engine), from version 3.3.8
  • New scan-only and index-only optimizations for AQL queries, from version 3.3.9
  • New /_admin/server/availability HTTP API for monitoring purposes (see the example after this list)
  • New arangoinspect client tool, to help users and Customers easily collect information on any ArangoDB server setup and facilitate troubleshooting for the ArangoDB Support Team (let me add a yay, since I am part of the Support Team and I know how important that topic is :) ), from version 3.3.11 (and then improved from version 3.3.13)
  • A new --rocksdb.sync-interval option, from version 3.3.13
  • Improved cursor API (now with load balancer support), from version 3.3.13
  • A new --query.optimizer-max-plans option, from version 3.3.15
  • More detailed progress output in arangorestore, showing the percentage of how much data has been restored for bigger collections, plus a set of overview statistics after each processed collection, from version 3.3.15
  • Several advanced new options for configuring and debugging LDAP connections, from version 3.3.17
  • A new --rocksdb.enforce-block-cache-size-limit option, from version 3.3.20
  • Memory optimization (under Linux and Windows) with the new --rocksdb.total-write-buffer-size option, from version 3.3.20 (and then additional memory improvements from version 3.3.21)
  • Additional validation (in the cases where it was not done before) of the uniqueness of attribute names in AQL, from version 3.3.22
  • A new --server.jwt-secret-keyfile option, from version 3.3.22
  • A new --cleanup-duplicate-attribute option in arangorestore, from version 3.3.22
  • In general, a bunch of AQL query, index and performance optimizations
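
As a small illustration of the monitoring endpoint mentioned in the list above, a basic health check can be as simple as the following (host and port are just the usual defaults; adjust them to your deployment):

# Prints the HTTP status code: 200 when the server is available to serve requests
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8529/_admin/server/availability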

What about Kubernetes? Good question. Yep: a brand new ArangoDB Kubernetes Operator starting from 3.3.13 (and improved over time).

All the above improvements — and the Kubernetes Operator — are also available in v3.4.

A full list of the changes delivered in the 23 patch releases of the 3.3 series published so far can be found here (well, actually 22, as 3.3.18 was not publicly released).

If you have not had a chance to try our 3.4 version yet, I invite you to do so and to report your feedback via the usual channels (GitHub, Google Group, Slack, Stack Overflow).

If you are still using 3.3 and want to have more time before upgrading to 3.4, please consider upgrading at least to the latest available patch release of 3.3, so that you can benefit from all the bug fixes and improvements introduced in 3.3 over time.

The next ArangoDB version (currently called 3.5-devel, but the name might change) is currently being worked on.

Could we expect improvements to be delivered in 3.4 as well, while 3.5-devel is in the making?

My personal answer to this question, having had the chance to see what happened with 3.3 while 3.4 was being developed, and having had the privilege of working with the highly committed and great Product and Engineering Teams at ArangoDB, is: most probably yes. So stay tuned!

About the Author

Santo Leto is leading the Team that provides worldwide post-sales services to all ArangoDB Customers, including Development and 24x7 Production Technical Support, Customer Success (on-boardings and proactive customer engagements), and remote and on-site Consultancy and Training delivery. Previously, in addition to the post-sales services, Santo worked to build the QA and Documentation Teams, before they were mature enough to ‘spin off’ and become separate Teams. With 12+ years of work experience in engineering, customer-facing and managerial roles, at VC-backed start-ups, bootstrapped companies, as well as large public corporates, Santo started his career as Lead Developer of graphical user interfaces for RDBMS, before specializing in post-sales Customer Services, working first in Oracle's MySQL Team and then for other graph and multi-model database players.
