Consolidating Configurations: A Modern GraphQL API

Published in Xandr-Tech, Aug 21, 2020

Yield Analytics (YA) is one of hundreds of products offered by Xandr that add value for advertisers seeking to reach customers at scale. The YA product serves as an “oracle” for customers by using historical data from advertisers to make quantitative predictions about current and future campaigns. Technically speaking, YA has been around since 2007. While it isn’t sporting a near two-decade lifetime like Windows XP, it is a mature piece of software by any standard.

Over years of development cycles, YA has accumulated a number of configurable options reflected at the client level. Some options give control over core forecasting metrics, others address data pipelining parameters, and still others integrate third-party features. In total, these options leave each client with approximately 700–1000 configuration key-value pairs. Adding insult to injury is the spread of sources in which the key-values are found. Some exist in the source code, others are present in a multitude of XML files, and still others exist in a database. Conflicting keys can override one another at runtime, as defined by a pseudo-hierarchy.

Any developer or service tech looking to make changes to a client configuration must understand the following:

Out of the (on-average) 1000 configurations for a client, which key or group of keys do I need to change?
What source are those keys in? An XML file, a database, or perhaps some combination of the two?
Do any overrides exist in the same source or other sources which could conflict with the new value?
What is a reasonable value to use for that configuration?
If a change is made, will it be made in the correct source location?

The time needed to change even a single configuration value grows with each of these complications. Unsurprisingly, this has become a serious pain point for developers and service techs alike.

Designing a Solution

A good solution to the configuration problem eliminates most of the above complexities, and adds a few small features:

Clearly defines the location of configuration keys, reasonable values, and the override hierarchy
Provides a seamless way to add, update, and manipulate single keys and groups of keys
Logs configuration changes and the users who made them

With these requirements in mind, I made several attempts at designing a robust yet understandable data backend.

The first iteration represented groupings of configurations with a wrapper for the Commons Configuration object that I coined a Layer. The Layer objects were composed to complete a Configuration Stack — not unlike the Commons CompositeConfiguration object. Primarily, the Layer objects served as metadata containers for singleton configurations and groupings of configurations which naturally belonged together.
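To make the Layer idea concrete, here is a minimal sketch in plain Java. It is not the original implementation: it does not use Commons Configuration, and the class shapes, layer names, and configuration keys are all assumptions chosen for illustration. It only shows the general pattern of composing ordered layers so that an earlier layer overrides a later one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a Layer: a named group of configuration key-value pairs.
final class Layer {
    final String name;
    final Map<String, String> values = new LinkedHashMap<>();

    Layer(String name) {
        this.name = name;
    }

    // Fluent helper so a layer can be built inline.
    Layer with(String key, String value) {
        values.put(key, value);
        return this;
    }
}

// Hypothetical Configuration Stack: layers are consulted in the order they were pushed,
// so the first layer that defines a key overrides every layer pushed after it.
final class ConfigurationStack {
    private final List<Layer> layers = new ArrayList<>();

    ConfigurationStack push(Layer layer) {
        layers.add(layer);
        return this;
    }

    // Returns the layer that supplies the effective value for a key, if any.
    Optional<Layer> sourceOf(String key) {
        return layers.stream().filter(l -> l.values.containsKey(key)).findFirst();
    }

    // Returns the effective value for a key, if any layer defines it.
    Optional<String> resolve(String key) {
        return sourceOf(key).map(l -> l.values.get(key));
    }
}

class LayerStackDemo {
    public static void main(String[] args) {
        // Layer names and keys below are made up for the example.
        ConfigurationStack stack = new ConfigurationStack()
            .push(new Layer("client-overrides").with("forecast.horizon.days", "90"))
            .push(new Layer("defaults")
                .with("forecast.horizon.days", "30")
                .with("pipeline.batch.size", "500"));

        System.out.println(stack.resolve("forecast.horizon.days").orElse("<unset>"));
        System.out.println(stack.sourceOf("pipeline.batch.size").map(l -> l.name).orElse("<none>"));
    }
}
```

Because every client needs its own stack in this model, answering a question such as "which clients use this value?" means building and walking one stack per client.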

While the Layer system lent itself well to forming a hierarchy, the data views became cumbersome as each client required their own configuration stack to be queried. Additionally, placing configurations in pre-determined groups proved to be almost impossible as the naming schemes even within features were inconsistent.

The second iteration elected to store only the unique values found for each configuration key, as well as a default value. The unique values are then responsible for managing which clients point to them and for which grouping they are active. Storing only unique values and client names brought surprising simplification to the necessary data views and updates, enabling the system to store complete information about a single configuration in a set of maps.
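A rough sketch of the unique-value idea follows, again in plain Java. The class names ConfigurationObject and UniqueValue come from the API described later in this post, but the fields, method names, and the one-value-per-client-per-group rule shown here are assumptions made for illustration, not the actual schema.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// One distinct value for a key, plus the clients that point to it and the group it is active in.
final class UniqueValue {
    final String value;
    final String group; // the group parameter that stands in for the hierarchy
    final Set<String> clients = new TreeSet<>();

    UniqueValue(String value, String group) {
        this.value = value;
        this.group = group;
    }
}

// Everything known about a single configuration key: a default plus its unique values.
final class ConfigurationObject {
    final String key;
    final String defaultValue;
    // Keyed by (group, value) so each distinct pair is stored exactly once.
    private final Map<List<String>, UniqueValue> uniqueValues = new LinkedHashMap<>();

    ConfigurationObject(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }

    // Points a client at a value within a group, creating a UniqueValue only if the value is new.
    void assign(String client, String group, String value) {
        for (UniqueValue uv : uniqueValues.values()) {
            if (uv.group.equals(group)) {
                uv.clients.remove(client); // assumption: one value per client per group
            }
        }
        uniqueValues.values().removeIf(uv -> uv.clients.isEmpty()); // drop values nobody uses
        uniqueValues
            .computeIfAbsent(List.of(group, value), k -> new UniqueValue(value, group))
            .clients.add(client);
    }

    // Returns the client's value in a group, falling back to the default.
    String valueFor(String client, String group) {
        for (UniqueValue uv : uniqueValues.values()) {
            if (uv.group.equals(group) && uv.clients.contains(client)) {
                return uv.value;
            }
        }
        return defaultValue;
    }

    int uniqueValueCount() {
        return uniqueValues.size();
    }
}
```

In this sketch, two clients that share a value share one UniqueValue entry, and a client with no entry simply receives the default. That is the property that keeps the data views small compared with one stack per client.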

Despite the advantages this design offers regarding data retrieval and modification, the hierarchy is weakly represented by a group parameter in each unique value. In an attempt to address this design weakness, a graph relationship was proposed that could capture the required data and enable more dynamic searches. However, without creating a fully connected structure, the uniqueness requirements for locating a given node could not be fulfilled. Ultimately, the V2 unique-value structure was chosen as the best solution moving forward.

Enter Quarkus & GraphQL

To support the new configuration backend, an API was created which handles all data transactions related to configurations. The Quarkus framework is used for general convenience and its really sweet ability to compile to a native executable. A MicroProfile GraphQL implementation from SmallRye is made available within the Quarkus library, and is used extensively throughout the configuration API.

MicroProfile GraphQL is a fairly new specification which enables users to take advantage of automated schema generation and simplified annotations to quickly build efficient GraphQL endpoints. Phillip Krueger, a core developer on the SmallRye GraphQL project, has a fantastic post here which contrasts MicroProfile GraphQL with a more traditional JAX-RS implementation. It also gives a more general overview of GraphQL and why you might be interested in using it for your development.

Now that you’re all caught up on the latest and greatest in the GraphQL world, the details of the configuration API implementation should be a breeze. If you look back to the V2 architecture diagram, you’ll see that there are two primary entities we are going to represent with schemas: ConfigurationObject and UniqueValue.

Schema fields for a ConfigurationObject

Both the ConfigurationObject and the UniqueValue fields are described plainly by the GraphQL schema introspection. It should be noted that the clientValue field in the ConfigurationObject is actually a field that is only used on the query side to determine if a new UniqueValue needs to be created.

The configuration API maintains a master mapping of ConfigurationObjects both in memory and in a Postgres database accessed through the reactive client. All users are allowed to query the API to determine current configurations for clients.

Change Requests & Audit

If a mutation — new configuration or modification — needs to be made, system users can submit a request to the API containing the proposed changes. An administrator can then approve or deny the mutation request. Selective endpoint — or in this case function — security is made possible with annotations and roles via the Quarkus security library. The current API implementation only uses basic file permissions as a proof-of-concept, but the supporting authentication source can be changed without modifying annotations as long as the roles and users align.

The request system also doubles as an audit log. Any change request is timestamped and marked with the username of the request creator. In a similar manner to request creation, approval by the administrator is noted on the stored request object.
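The request-and-approve flow described above can be sketched in plain Java as follows. This is an illustration only: the real API enforces roles through Quarkus security annotations, whereas this sketch passes the reviewer's role as a plain string, and every class, field, and role name here is hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

enum RequestStatus { PENDING, APPROVED, DENIED }

// A proposed configuration change; the stored object doubles as the audit record.
final class ChangeRequest {
    final String key;
    final String proposedValue;
    final String createdBy;
    final Instant createdAt = Instant.now(); // timestamped at creation
    RequestStatus status = RequestStatus.PENDING;
    String reviewedBy;  // null until an administrator reviews the request
    Instant reviewedAt; // null until an administrator reviews the request

    ChangeRequest(String createdBy, String key, String proposedValue) {
        this.createdBy = createdBy;
        this.key = key;
        this.proposedValue = proposedValue;
    }
}

final class ChangeRequestLog {
    static final String ADMIN_ROLE = "admin"; // hypothetical role name

    private final List<ChangeRequest> requests = new ArrayList<>();

    // Any system user may submit a request; nothing is applied yet.
    ChangeRequest submit(String user, String key, String proposedValue) {
        ChangeRequest request = new ChangeRequest(user, key, proposedValue);
        requests.add(request);
        return request;
    }

    // Only an administrator may approve or deny, and the decision is noted on the request.
    void review(ChangeRequest request, String reviewer, String reviewerRole, boolean approve) {
        if (!ADMIN_ROLE.equals(reviewerRole)) {
            throw new SecurityException("only administrators may review requests");
        }
        if (request.status != RequestStatus.PENDING) {
            throw new IllegalStateException("request has already been reviewed");
        }
        request.status = approve ? RequestStatus.APPROVED : RequestStatus.DENIED;
        request.reviewedBy = reviewer;
        request.reviewedAt = Instant.now();
    }

    // Read-only snapshot of every request ever submitted, in submission order.
    List<ChangeRequest> history() {
        return List.copyOf(requests);
    }
}
```

Since requests are never deleted in this sketch, the history answers both "what changed?" and "who asked for it and who signed off?" without a separate audit table.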

Future Work

Virtually all of the remaining work (save deployment) regarding the configuration API can be attributed to the deficiency in the V2 design. Namely:

Despite the advantages this design offers regarding data retrieval and modification, the hierarchy is weakly represented by a group parameter in each unique value.

Ideally, the hierarchy between configuration groupings (formerly sources) can be represented using a property more intrinsic to the system. One example of such a property would be graph traversal order. If the end result relies on overrides from different groups, the traversal order of a graph can represent a pre-set ‘tier’ at each iteration. Unfortunately, the problem hasn’t proven to be that simple to solve. At each junction in a new design, it seems that a sacrifice must be made regarding data views, performance, or application footprint.

About the Author

I am a second-year master’s student at Colorado State University studying Computer Science. My work focuses primarily on technology at scale. I have interests in anything outdoors, tabletop games, and my parents’ dog, Fred.