In building out our new GraphQL API, we’ve conformed to the Relay Specification, which introduces a new way of identifying objects via global IDs. Because our new API service sits atop both new and legacy systems, and existing IDs have myriad existing uses and constraints, we faced some interesting challenges introducing global IDs while maintaining backward compatibility. In this post, we’ll walk through how we approached this problem.
Background: GraphQL, Relay, the Node interface, and global IDs
At Braintree, we constantly strive to stay up-to-date with the latest technologies and software development practices. We temper our desire to adopt new technologies with a healthy dose of skepticism and evaluate new technologies on their merits, not just their buzz. With a committed base of merchant customers and untold lines of code working in their service, we are obliged to provide a stable and predictable payments platform. When we adopt a new technology, we always consider how it will work with what we’ve already built.
One new technology we’re taking a big bet on is GraphQL. We are using it to build our next-generation payments API. We made this decision for a variety of reasons that you can read all about here.
One of the many things that we like about GraphQL is that it has standards. Building an API that adheres to a public specification allows all of our integrating merchants (and the clients that they rely on) to make some basic assumptions about how the API works.
We chose to build our API to be compliant with the Relay spec. We decided to follow the spec because it is a sensible, consistent public standard that allows popular GraphQL clients to work seamlessly with compliant APIs. Adhering to this standard means there are fewer questions a merchant has to ask about our API: a good deal of its behavior is already known, from how our mutations are structured to how pagination works.
A major component of the Relay spec is the Node interface. The Node interface gives objects a globally unique ID that can be used for looking up objects in the API. A globally unique ID must be unique, not only between instances of a single domain entity, but also across domain entities. So, a Transaction can’t have the same ID as a PaymentMethod or a Refund. It also means that, given a global ID, the API (or underlying service) must be able to fetch the right object, without knowing what type of object to fetch. So the ID must encode the type of object it refers to. We like the Node interface because it provides a consistent API for fetching objects of any type. It’s also handy for including mutations that can accept a variety of different objects. For instance, it allows us to charge single-use and multi-use payment methods with the same mutation.
Say you have the ID of an object that may be a Refund or Transaction. You can query for it and handle either outcome as follows:
Note that the query specifies a Transaction fragment in its payload, but the query itself takes only an ID. It does not require a type. This query will return the following data:
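As a minimal sketch of what such a Node query and its handling could look like from a client's side: the `status` field, the operation name `Lookup`, and the sample response below are hypothetical placeholders written by hand for illustration, not real API output. Only the `node(id:)` field and `__typename` come from the Relay and GraphQL specifications.

```python
import json

# A Node query takes only an ID. Inline fragments select fields per concrete type.
# The `status` field is a hypothetical placeholder, not a documented field.
NODE_QUERY = """
query Lookup($id: ID!) {
  node(id: $id) {
    __typename
    id
    ... on Transaction { status }
    ... on Refund { status }
  }
}
"""


def build_request(global_id: str) -> str:
    """Serialize the query and its single variable as a GraphQL request body."""
    return json.dumps({"query": NODE_QUERY, "variables": {"id": global_id}})


def describe(response: dict) -> str:
    """Branch on the concrete type the server reports for the node."""
    node = response["data"]["node"]
    if node is None:
        return "not found"
    kind = node["__typename"]
    if kind in ("Transaction", "Refund"):
        return f"{kind} {node['id']} is {node['status']}"
    return f"unhandled type {kind}"


# Hand-written sample response, for illustration only.
sample_response = {
    "data": {
        "node": {
            "__typename": "Transaction",
            "id": "dHJhbnNhY3Rpb25fNnNmOThmeA",
            "status": "SETTLED",
        }
    }
}
print(describe(sample_response))
```

The caller never states which type it expects; the type comes back in `__typename`, and the client decides what to do with it.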
The above requirements aren’t difficult to build into a brand new service, but our GraphQL API fronts both new and legacy services that already have their own IDs. Our legacy services, for the most part, do not use globally unique IDs, nor can we infer the type of object referenced by its ID. Supporting the Node interface was challenging for a few reasons:
- We couldn’t simply adopt a new ID format that fulfills the globally unique requirement in our legacy systems. This would break existing merchants’ integrations, since they likely have persisted their own data from our SDKs, including IDs that we expose. If we changed our ID format, they wouldn’t be able to re-fetch objects with those IDs.
- Object IDs also appear in URLs in our Control Panel. For instance, to access a Transaction, the URL looks something like: http://braintreepayments.com/merchant/:merchant_id/transactions/:transaction_id. We know that good URLs never change, so that’s another reason we couldn’t just wholesale re-format our IDs.
- In the case of Transactions, the ID not only identifies the transaction in our API and our web application; we also use it in our outgoing authorization and settlement messages to processors. Some of these processors have very restrictive allowable formats for IDs, which would severely limit our ability to cut over to global IDs. Even if we could make the formatting work, we would still need to support legacy IDs for existing transactions in case we ever need to chase them down with a processor.
To support our existing merchants, processors, and our own Control Panel, we needed to support legacy IDs alongside these new global IDs. To do so, we’d need a way to translate between the two.
Anatomy of our global IDs
We decided to format our global IDs based on the legacy IDs. A global ID is made up of the legacy ID, prefixed with the object type, then URL-safe base64-encoded without padding. For example:
- Take a Transaction legacy ID such as 6sf98fx
- Concatenate the type and legacy ID to get transaction_6sf98fx
- Apply URL-safe base64 encoding to get the global ID dHJhbnNhY3Rpb25fNnNmOThmeA
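The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production implementation: the function names are hypothetical, and the underscore separator is inferred from decoding the example global ID that appears later in this post (dHJhbnNhY3Rpb25fNnNmOThmeA decodes to transaction_6sf98fx).

```python
import base64
from typing import Tuple


def encode_global_id(type_name: str, legacy_id: str) -> str:
    """Prefix the legacy ID with its type, then URL-safe base64-encode without padding."""
    raw = f"{type_name}_{legacy_id}".encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_global_id(global_id: str) -> Tuple[str, str]:
    """Reverse the encoding: restore the padding, decode, and split type from legacy ID."""
    padded = global_id + "=" * (-len(global_id) % 4)
    raw = base64.urlsafe_b64decode(padded).decode("utf-8")
    # Assumption: the type name itself contains no underscore, so the first
    # underscore marks the boundary between type and legacy ID.
    type_name, sep, legacy_id = raw.partition("_")
    if not sep or not type_name or not legacy_id:
        raise ValueError(f"not a recognizable global ID: {global_id!r}")
    return type_name, legacy_id


print(encode_global_id("transaction", "6sf98fx"))      # dHJhbnNhY3Rpb25fNnNmOThmeA
print(decode_global_id("dHJhbnNhY3Rpb25fNnNmOThmeA"))  # ('transaction', '6sf98fx')
```

Because the legacy ID is embedded rather than replaced, the translation works in both directions without a lookup table.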
We apply this last step of base64 encoding because we want our IDs to be opaque. We prefer our merchants not to make assumptions about the format of our IDs for a couple of reasons:
- We should provide enough information about our objects, their data, and behaviors in our API and documentation, so the content of the ID should be superfluous.
- We want to reserve the flexibility to change our ID format in the future without worrying about it breaking merchant integrations.
(SO! Now that I’ve told you all about what’s inside our IDs, please disregard it muahahahaha!)
How global IDs fit into our architecture
Before getting into how we expose both legacy and global IDs in our API, here’s a little bit about our architecture. Our GraphQL API lives in a service we’ve internally named Atmosphere, which is deployed in AWS and sits in front of several backing services, written over several years, with different ID schemes, persistence layers, and programming languages.
The Atmosphere service provides a consistent interface into this diverse set of services. Our SDKs, however, are built on our XML API, served by another service called Gateway. To make global IDs work in our GraphQL API without breaking our SDKs’ integration with Gateway, we needed to fulfill the following requirements:
- Gateway must be able to understand global IDs passed via Atmosphere.
- Gateway must also be able to understand legacy IDs passed via the SDKs.
The first step toward supporting Node queries was to add global IDs to our Gateway XML responses. When a legacy service’s API renders a resource in a response, it inserts a new value called global-id. This field is mapped to the ID field in Atmosphere, so that when a response comes back from a GraphQL API request, the ID surfaced is the global ID. We also surface the legacy ID in our GraphQL API. For Transaction responses, it’s important to have access to this ID, since it is the ID our processors use to reference the transaction. In our SDKs, the global ID field is currently ignored, but we plan to surface it for merchants who want interoperability with our GraphQL API, or who just want help migrating from an SDK to a GraphQL integration.
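To make the egress mapping concrete, here is a rough sketch of turning a Gateway-style XML fragment into GraphQL-facing fields. The XML shape, the `legacyId` field name, and the `to_graphql_fields` helper are assumptions for illustration; only the `global-id` element name comes from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified Gateway response fragment.
GATEWAY_XML = """
<transaction>
  <id>6sf98fx</id>
  <global-id>dHJhbnNhY3Rpb25fNnNmOThmeA</global-id>
</transaction>
"""


def to_graphql_fields(xml_text: str) -> dict:
    """Expose global-id as the GraphQL `id`, and keep the legacy ID alongside it."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.findtext("global-id"),
        "legacyId": root.findtext("id"),  # assumed field name
    }


print(to_graphql_fields(GATEWAY_XML))
```

An XML client that ignores unknown elements keeps working unchanged, which matches the SDK behavior described above.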
We’ve covered the egress of IDs from our API, but what about ingress? We needed a way for Gateway to find objects referred to by their global ID. To make this work, we had a couple of choices. We could translate from global ID to legacy ID in Atmosphere, or we could pass the global ID all the way back to Gateway and teach it how to find objects by global ID, too. Ultimately, we decided to go with the latter, so that we leave open the possibility of our SDKs and our GraphQL API interoperating easily; if a merchant plugs a global ID into their SDK, they will still be able to get the thing they’re looking for. It also means that our Control Panel will serve up results from URLs with either legacy or global IDs.
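One way a service could accept either kind of ID on the way in is to try decoding the value as a global ID and fall back to treating it as a legacy ID. The sketch below is an assumption about how that might work, not a description of Gateway's code; `to_legacy_id` is a hypothetical helper, and it relies on the type_legacyid layout implied by the example IDs in this post.

```python
import base64


def to_legacy_id(incoming: str, expected_type: str) -> str:
    """Return the legacy ID whether the caller passed a global ID or a legacy ID."""
    try:
        padded = incoming + "=" * (-len(incoming) % 4)
        decoded = base64.urlsafe_b64decode(padded).decode("utf-8")
    except ValueError:
        # binascii.Error and UnicodeDecodeError are both ValueError subclasses:
        # the value is not decodable base64 text, so treat it as a legacy ID.
        return incoming
    type_name, sep, legacy_id = decoded.partition("_")
    if sep and type_name == expected_type and legacy_id:
        return legacy_id
    # Decodable, but without the expected type prefix: assume it is a legacy ID.
    return incoming


print(to_legacy_id("dHJhbnNhY3Rpb25fNnNmOThmeA", "transaction"))  # 6sf98fx
print(to_legacy_id("6sf98fx", "transaction"))                     # 6sf98fx
```

Checking for the expected type prefix matters because a short legacy ID can happen to be valid base64; without that check, such an ID could be misread as a global ID.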
Last but not least…
Let’s tackle one final consideration — one that is, unfortunately, overlooked by many software engineers: support!
At Braintree, we’re very proud of our white-glove service. We consider our world-class support to be an important differentiator in a crowded payments market. When transitioning from IDs that look like 6sf98fx to IDs that look like dHJhbnNhY3Rpb25fNnNmOThmeA, we add a bit more complexity to our application. It’s not uncommon to have merchants call in with questions about specific objects and read their IDs to our support reps over the phone. Reading our legacy ID is significantly easier than reading a big base64-encoded string, which also happens to be case-sensitive. Moreover, our GraphQL API is a brand-new product and the vast majority of our merchants are on our SDKs, with their legacy APIs. We want to make sure that our support reps are kept in the loop, so they know when a merchant contacts them with one of these new-fangled global IDs. That meant updating our internal documentation and writing new training materials for our support team, so they know how to handle questions about these new integrations.
New tech is super cool! But when you have a long-lived system with years and years of accumulated lines of code and dedicated users, you can’t drop everything for the shiny new toy. Keeping our products up-to-date and tech-forward, without leaving our existing systems in the lurch, requires a lot of thought, planning, and design. We’re excited about this new API and we think that adopting the full Relay standard makes it even better. Head over to http://graphql.braintreepayments.com and try it out for yourself!