GraphQL Federation Exclusively with Ballerina
This article was written using Ballerina Swan Lake Update 8.6 (2201.8.6)
GraphQL Duality: Streamlining Frontend and Untangling Backend Complexities
In the ever-evolving landscape of web development, GraphQL has emerged as a powerful query language for APIs, offering flexibility and efficiency over traditional REST APIs. Publicly released in 2015, GraphQL addresses a common challenge faced by frontend developers: over-fetching and under-fetching of data. However, this boon for frontend developers can quickly become a burden for backend developers in real-world GraphQL API implementations. As a GraphQL schema grows to include numerous types and fields, managing it becomes a daunting task. Apollo Federation addresses precisely this challenge.
How does Federation work?
Apollo Federation emerges as a revolutionary paradigm reminiscent of Microservices principles. Diverging from a conventional monolithic GraphQL API, Federation employs Separation of Concerns to break down a GraphQL API into smaller, more manageable components known as Subgraphs. What makes Federation particularly powerful is its seamless integration of these Subgraphs, crafting an API that clients interact with indistinguishably from its monolithic counterpart. Federation can be understood through its four main components: Supergraph schema, Subgraph(s), the Router, and the Schema Registry.
Federation Components
Subgraphs
Each federated subgraph functions independently as a standalone GraphQL API, with its focus typically centered around a specific domain within a service. These subgraphs are distinctly separated by their concerns. What’s interesting is that each subgraph can operate with its own datasource, programming language, and deployment strategies, often managed by a single team.
Supergraph
The Supergraph, a special type of GraphQL schema, is created by combining one or more subgraph schemas using the Apollo Federation specification. It defines the name and endpoint URL for each of your subgraphs, encompassing all types and fields defined by them. Moreover, it guides the router by specifying which of your subgraphs can resolve particular GraphQL fields.
Router
The router acts as the entry point to the subgraphs and provides a unified interface for clients to interact with the federated service. Clients see the API schema of the federated service and send operations to the router’s public endpoint instead of directly to each subgraph API. The router then uses the Supergraph schema to execute each incoming client operation across the appropriate combination of subgraphs and merges the subgraph responses into a single response for the client.
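As a rough illustration of the routing idea only, the Python sketch below groups the top-level fields of an operation by the subgraph that owns them and then merges the subgraph responses. It is a toy model, not how the Ballerina router or Apollo's query planner is implemented: the `FIELD_OWNERS` table, the `users` and `reviews` field names, and both function names are assumptions made for this example, and real query planning also has to resolve entities across subgraphs.

```python
from typing import Dict, List

# Hypothetical ownership table: which subgraph resolves each top-level Query field.
FIELD_OWNERS: Dict[str, str] = {
    "products": "products",
    "product": "products",
    "users": "users",      # assumed field name
    "reviews": "reviews",  # assumed field name
}


def plan(fields: List[str], owners: Dict[str, str]) -> Dict[str, List[str]]:
    """Group the requested top-level fields by the subgraph that owns them."""
    grouped: Dict[str, List[str]] = {}
    for field in fields:
        if field not in owners:
            raise KeyError(f"No subgraph resolves field '{field}'")
        grouped.setdefault(owners[field], []).append(field)
    return grouped


def merge(subgraph_responses: List[dict]) -> dict:
    """Combine the data of several subgraph responses into one client response."""
    merged: dict = {}
    for response in subgraph_responses:
        merged.update(response.get("data") or {})
    return {"data": merged}


print(plan(["products", "users"], FIELD_OWNERS))
```

Each key of the plan corresponds to one request the router would send to a subgraph; `merge` then stitches the answers back together.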
Schema Registry
A Federated GraphQL API is designed to evolve as we add new features to it. To facilitate this schema evolution, the Schema Registry is employed. At its core, it operates as a version control system for the schemas in a Federated service, storing the change history of schemas and tracking added, modified, or removed types and fields. Additionally, it is responsible for composing the Supergraph schema using the subgraphs and also ensures the integrity of the Supergraph schema.
We’ll see how these components interact with each other in the coming sections.
What we are going to build
In this article, we will take a different approach by deploying an existing product review system project using the Ballerina GraphQL Schema Registry. Instead of building the product review system from scratch, we’ll focus on the deployment process. If you’re interested in learning how to build the system using Ballerina, check out this article series for comprehensive guidance.
This service offers a range of functionalities, allowing users to retrieve product information, user details, and review information. Additionally, it provides the capability to add new reviews to the products.
In the upcoming sections, we’ll deploy each Subgraph individually to the Schema Registry. This will show how the Schema Registry manages versioning of the Supergraph and tracks changes made to it. We’ll also explore how the Schema Registry prevents Subgraphs from pushing breaking changes.
Looking ahead in this series, we’ll delve into the implementation of our custom datasource, offering even more insights into storing Supergraph and Subgraph schemas in the Schema Registry.
Prerequisites
- Ballerina
- Ballerina GraphQL Schema Registry Executable
- Ballerina GraphQL Gateway Mediator Executable
- Ballerina GraphQL Gateway (Generator) Executable
- A MongoDB Cluster
- Git
Why do we need a MongoDB Cluster?
The Schema Registry is designed to be datasource-agnostic, allowing users to plug in their own database implementations based on their specific use cases. It supports several implementations by default, including MongoDB, a file-based approach, and an in-memory database. In this post, we’ll use the existing MongoDB implementation. As mentioned earlier, in the coming articles we’ll explore how to create our own datasource from scratch and plug it into the Schema Registry.
Setting up
Set up MongoDB
Create a MongoDB database with a name of your choice (here we’ll use graphql-schema-registry), and then obtain the connection URL for the cluster. The connection URL should look something like this:
mongodb+srv://<username>:<password>@<cluster-id>.xxxxxx.mongodb.net
We’ll use this URL when setting up the Schema Registry.
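If the username or password contains reserved characters such as @, : or /, they need to be percent-encoded before they are placed in the connection URL. The following is a minimal Python sketch of that step; the function name, the sample credentials, and the host are made up for illustration.

```python
from urllib.parse import quote_plus


def build_mongodb_url(username: str, password: str, host: str) -> str:
    """Build a mongodb+srv URL with percent-encoded credentials."""
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}"


# Hypothetical credentials and host, for illustration only.
print(build_mongodb_url("registry_user", "p@ss/word", "cluster0.example.mongodb.net"))
```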
Starting up the Schema Registry
Download the .jar file of the Schema Registry using this link, or build it from source to obtain the .jar. Place the .jar file in a directory named registry.
Next, create a configuration file for the Schema Registry: a file named Config.toml in the registry directory, alongside the Schema Registry .jar file, with the following contents:
[mongoConfig]
connection.url = "<your_mongodb_connection_url>"
databaseName = "graphql-schema-registry"
Note: Remember to replace the placeholder <your_mongodb_connection_url> above with your MongoDB cluster connection URL.
At this point, our directory structure should look like this:
.
└── registry
    ├── Config.toml
    └── graphql_schema_registry.jar
Then let’s start the Schema Registry service by running the graphql_schema_registry.jar file from the registry directory using the following command:
bal run graphql_schema_registry.jar
This command starts the Schema Registry service on port 9090. To check whether the Schema Registry is up and running, send the following query to the Schema Registry endpoint, which is http://localhost:9090:
query Supergraph {
    supergraph {
        schema
    }
}
You should get the following response:
{
    "errors": [
        {
            "message": "No registered supergraph",
            "locations": [ { "line": 2, "column": 5 } ],
            "path": ["supergraph"]
        }
    ]
}
That means the Schema Registry is up and running; the error only says that no Supergraph has been registered yet.
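Any GraphQL client can send this check. As one possibility, here is a minimal Python sketch using only the standard library. The helper names are invented for this example, and it assumes the registry accepts a standard GraphQL-over-HTTP POST with a JSON body at the endpoint given above.

```python
import json
import urllib.error
import urllib.request

REGISTRY_URL = "http://localhost:9090"  # endpoint used in this article
HEALTH_QUERY = "query Supergraph { supergraph { schema } }"


def build_request(url: str, query: str) -> urllib.request.Request:
    """Wrap a GraphQL document in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def post_graphql(url: str, query: str) -> dict:
    """Send the query and return the decoded JSON response."""
    try:
        with urllib.request.urlopen(build_request(url, query), timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Some GraphQL servers report errors with a non-200 status but still send a JSON body.
        return json.loads(err.read().decode("utf-8"))


def registry_is_up(response: dict) -> bool:
    """True if the response carries a Supergraph or the 'No registered supergraph' error."""
    data = response.get("data") or {}
    if data.get("supergraph") is not None:
        return True
    messages = [error.get("message", "") for error in response.get("errors", [])]
    return "No registered supergraph" in messages


# Usage (needs a running registry):
# print(registry_is_up(post_graphql(REGISTRY_URL, HEALTH_QUERY)))
```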
Setting up the Router
Our router consists of two parts: the Gateway Service Generator and the Mediator. The Gateway Service Generator generates a Ballerina GraphQL Service for a given Supergraph schema. The Mediator polls the Schema Registry and provides the Supergraph schema to the Gateway Service Generator. Subsequently, the Gateway Service Generator generates a GraphQL Service, and the Mediator starts up that service, which users will interact with for our federated service.
First, let’s download the Gateway Service Generator executable from here. Then download the Mediator executable from here.
For convenience, let’s place both executables in the same directory, separate from the Schema Registry directory. Let’s call the new directory router.
Our directory structure should now look like this:
.
├── registry
│   ├── Config.toml
│   └── graphql_schema_registry.jar
└── router
    ├── federation_gateway_mediator.jar
    └── graphql_federation_gateway.jar
To start the router, run the federation_gateway_mediator.jar file from the router directory using the following command:
bal run .\federation_gateway_mediator.jar -CschemaRegistry="http://localhost:9090" -Cport=8000
Here, we pass the Schema Registry endpoint and a port to be used by the Router as arguments.
At this point, the Router will not start serving anything because no Supergraph is registered on the Schema Registry yet. As soon as one is registered, the Router will automatically pull it and begin serving.
Setting up the Subgraphs
Let’s create a directory named subgraphs to hold our subgraphs.
Open another console in that directory and clone the Products subgraph using the following command.
git clone https://github.com/ThisaruGuruge/ballerina-graphql-federation-products-subgraph
Our directory structure should look like this at this point:
.
├── registry
│   ├── Config.toml
│   └── graphql_schema_registry.jar
├── router
│   ├── federation_gateway_mediator.jar
│   ├── generated_gateway
│   ├── graphql_federation_gateway.jar
│   └── supergraphs
└── subgraphs
    └── ballerina-graphql-federation-products-subgraph
Note: The generated_gateway and supergraphs directories in the router directory are created automatically by the Mediator.
Next, navigate to the cloned directory, and start the Products subgraph service.
cd ballerina-graphql-federation-products-subgraph
bal run
If you see the message Running executable in the output, the subgraph is up and running on port 9091.
Note: You can find the port of the subgraph in the service.bal file of the cloned directory.
Follow the same approach to clone and start the remaining two subgraph services. The Users subgraph will run on port 9092 and the Reviews subgraph on port 9093.
Let’s get our hands dirty
Let’s publish our first subgraph
The Schema Registry itself operates as a GraphQL service; its schema is accessible here. As we can see there, we can use the publishSubgraph mutation to publish a Subgraph to the Schema Registry.
In that mutation, we have to provide our Subgraph schema as a String. So let’s generate the schema for the Products subgraph.
Open a new console and navigate to the Products subgraph directory:
cd ballerina-graphql-federation-products-subgraph
Now, execute the following command to generate the schema for the Products subgraph:
bal graphql -i service.bal
This generates the schema into a file named schema_service.graphql in the ballerina-graphql-federation-products-subgraph directory. Copy the contents of this file and paste them into the schema argument of the publishSubgraph mutation. The final mutation should look something like this:
mutation PublishSubgraph {
    publishSubgraph(
        schema: {
            name: "products"
            url: "http://localhost:9091"
            schema: """
                extend schema @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])
                type Query {
                    "Returns the list of products"
                    products: [Product!]!
                    "Returns the product for the given id. If the product does not exist, returns null"
                    product(
                        "ID of the product to be retrieved"
                        id: ID!
                    ): Product
                }
                "Represents a product in the system"
                type Product @key(fields: "id") {
                    "The unique ID of the product"
                    id: ID!
                    "The name of the product"
                    name: String!
                    "The description of the product"
                    description: String!
                    "The price of the product"
                    price: Float!
                }
            """
        }
    ) {
        schema
        version
    }
}
Since our Products subgraph is running on port 9091, we have provided the url argument as http://localhost:9091. The name argument is given as products. The Schema Registry uses this name to uniquely identify a Subgraph, so it is important to provide a meaningful name.
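The copy-and-paste step can also be scripted. The Python sketch below builds the same publishSubgraph mutation text from a schema string; the function name is invented for this example, and only the arguments and selection shown in the mutation above are used. One detail worth handling: if the generated schema itself contains a triple quote, it has to be escaped, because it would otherwise end the GraphQL block string early.

```python
import json


def build_publish_mutation(name: str, url: str, sdl: str) -> str:
    """Return a publishSubgraph mutation with the schema embedded as a block string."""
    if '"""' in sdl:
        # A raw triple quote would close the block string; the GraphQL spec escapes it as \""".
        sdl = sdl.replace('"""', '\\"""')
    return (
        "mutation PublishSubgraph {\n"
        "    publishSubgraph(\n"
        "        schema: {\n"
        f"            name: {json.dumps(name)}\n"
        f"            url: {json.dumps(url)}\n"
        f'            schema: """\n{sdl}\n"""\n'
        "        }\n"
        "    ) {\n"
        "        schema\n"
        "        version\n"
        "    }\n"
        "}\n"
    )


# Usage (after running `bal graphql -i service.bal`):
# from pathlib import Path
# sdl = Path("schema_service.graphql").read_text(encoding="utf-8")
# print(build_publish_mutation("products", "http://localhost:9091", sdl))
```

The resulting string can be sent as the query of an ordinary GraphQL request to the Schema Registry.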
Sending the above mutation to the Schema Registry should give the response as follows,
{
    "data": {
        "publishSubgraph": {
            "schema": "<supergraph_schema>",
            "version": "0.1.0"
        }
    }
}
Once you execute the mutation, you will see the Supergraph schema in place of the <supergraph_schema> placeholder.
Note: The Supergraph schema can be lengthy, so the placeholder <supergraph_schema> is used here.
In the response, note that the initial Supergraph version is set to 0.1.0. The Schema Registry monitors every change and versions the Supergraph automatically. The versioning strategy mirrors the principles of Semantic Versioning, which we’ll explore in greater detail later in this post.
At this point, our Schema Registry has successfully generated a Supergraph schema. The next question is: How will our users access this service? This is where the Router comes into play.
Running our first query
The Router polls the Schema Registry every 5 seconds to retrieve the latest Supergraph schema and serve it to users. So by now, the Router should have detected that the Schema Registry has a new Supergraph and begun serving it.
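The polling behaviour can be pictured with a short Python sketch. This is not the Mediator's actual implementation, only an illustration of the pattern: `fetch_version` stands in for whatever call asks the Schema Registry for the current Supergraph version, and all names here are assumptions.

```python
import time
from typing import Callable, Iterator, Optional


def watch_versions(
    fetch_version: Callable[[], Optional[str]],
    interval_seconds: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield each new Supergraph version the first time a poll observes it."""
    last_seen: Optional[str] = None
    polls = 0
    while max_polls is None or polls < max_polls:
        version = fetch_version()  # None means no Supergraph is registered yet
        if version is not None and version != last_seen:
            last_seen = version
            yield version  # a real mediator would regenerate and restart the gateway here
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)


# Simulated registry answers: nothing registered, then 0.1.0 twice, then 0.1.1.
answers = iter([None, "0.1.0", "0.1.0", "0.1.1"])
for new_version in watch_versions(lambda: next(answers), max_polls=4, sleep=lambda _: None):
    print("new supergraph version:", new_version)
```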
To verify that everything is functioning as expected, let’s send the following query to the Router endpoint (http://localhost:8000):
query Products {
    products {
        id
    }
}
Upon receiving the query, the Router will prepare a query plan by analyzing the query. As this query involves only the Products subgraph, the Router will send a query to the Products subgraph and generate a response for the user based on the Products subgraph’s response. You should receive that response as follows,
{
    "data": {
        "products": [
            { "id": "product_0" },
            { "id": "product_1" },
            { "id": "product_2" }
        ]
    }
}
Great! Everything is working as expected. 🎉🙌
If you wish to inspect the API schema of the generated Supergraph, send the following query to the Schema Registry.
query Supergraph {
    supergraph {
        apiSchema
    }
}
Let’s see how the Schema Registry versions the Supergraph
Follow the same approach to publish the other two subgraphs to the Schema Registry. Once you’ve completed that, send the following query to the Schema Registry.
query Supergraph {
    supergraph {
        subgraphs {
            name
        }
        version
    }
}
The response should be as follows,
{
    "data": {
        "supergraph": {
            "subgraphs": [
                { "name": "products" },
                { "name": "users" },
                { "name": "reviews" }
            ],
            "version": "0.1.2"
        }
    }
}
Notice that the version of the Supergraph schema has now been incremented to 0.1.2. Because we haven’t published any subgraphs containing breaking or dangerous changes, only the last component of the version has changed. You can verify this by examining the differences between the initial Supergraph and the current version using the following query,
query Diff {
    diff(newVersion: "0.1.2", oldVersion: "0.1.0") {
        severity
        action
        subject
    }
}
In the response, you’ll observe that every change between these two versions consists of additions to the schema, including type additions, field additions, and more. Importantly, all of these changes fall into the category of Safe changes.
Let’s update the Supergraph
Now, let’s explore the process of updating an existing Subgraph and reflecting the changes in the Schema Registry. To demonstrate this, let’s assume that we need to update the Products subgraph by removing the products query field from it.
Update the ballerina-graphql-federation-products-subgraph/service.bal file of the Products subgraph as below.
@subgraph:Subgraph
service graphql:Service on new graphql:Listener(9091) {
    # Returns the product for the given id. If the product does not exist, returns null
    # + id - ID of the product to be retrieved
    # + return - The `Product` for the given id
    resource function get product(@graphql:ID string id) returns Product? =>
        datasource:getProduct(id);
}
Subsequently, generate the schema for the updated product service by using the following command,
bal graphql -i service.bal
Before publishing the updated Products subgraph, let’s observe the impact on the Supergraph by using the dryRun query. It works like publishSubgraph, with one key difference: it doesn’t persist changes to the Supergraph in the Schema Registry. Instead, it returns the updated Supergraph without making permanent changes. This precautionary step lets us confirm that the modifications made to a subgraph have the expected impact on the Supergraph before committing to them. Let’s proceed with the dryRun query as below.
query DryRun {
    dryRun(
        schema: {
            name: "products"
            url: "http://localhost:9091"
            schema: """
                extend schema @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])
                type Query {
                    "Returns the product for the given id. If the product does not exist, returns null"
                    product(
                        "ID of the product to be retrieved"
                        id: ID!
                    ): Product
                }
                "Represents a product in the system"
                type Product @key(fields: "id") {
                    "The unique ID of the product"
                    id: ID!
                    "The name of the product"
                    name: String!
                    "The description of the product"
                    description: String!
                    "The price of the product"
                    price: Float!
                }
            """
        }
    ) {
        schema
        version
    }
}
The response for the dryRun query should be as follows.
{
    "errors": [
        {
            "message": "Operation causes breaking changes",
            "locations": [ { "line": 3, "column": 5 } ],
            "path": [ "dryRun" ],
            "extensions": {
                "errors": [
                    {
                        "message": "BREAKING: Removed 'products' from Query.",
                        "details": {
                            "diff": {
                                "severity": "BREAKING",
                                "action": "REMOVED",
                                "subject": "FIELD",
                                "location": [ "Query" ],
                                "value": "products",
                                "fromValue": null,
                                "toValue": null
                            }
                        }
                    }
                ]
            }
        }
    ],
    "data": { "dryRun": null }
}
The response contains an error. It indicates that the operation causes breaking changes, with further details specifying the removal of the products query field from the Products subgraph. This aligns with our expectations, as it reflects the change we made to the Products subgraph.
The reason is that existing clients might depend on the products query in our federated GraphQL service. Removing it could disrupt their operations, so the Schema Registry, as a safeguard, rejects breaking changes by default.
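In a deployment script, the same dryRun response could act as a gate before publishing. The following Python sketch extracts the breaking-change messages from a response shaped like the one shown above; the function name is made up, and the sample data is a trimmed copy of that response.

```python
from typing import List

# Trimmed copy of the dryRun error response shown above.
DRY_RUN_RESPONSE = {
    "errors": [
        {
            "message": "Operation causes breaking changes",
            "path": ["dryRun"],
            "extensions": {
                "errors": [
                    {
                        "message": "BREAKING: Removed 'products' from Query.",
                        "details": {
                            "diff": {
                                "severity": "BREAKING",
                                "action": "REMOVED",
                                "subject": "FIELD",
                                "location": ["Query"],
                                "value": "products",
                            }
                        },
                    }
                ]
            },
        }
    ],
    "data": {"dryRun": None},
}


def breaking_changes(response: dict) -> List[str]:
    """Collect the messages of all BREAKING diffs reported in a dryRun response."""
    messages: List[str] = []
    for error in response.get("errors", []):
        for detail in error.get("extensions", {}).get("errors", []):
            diff = detail.get("details", {}).get("diff", {})
            if diff.get("severity") == "BREAKING":
                messages.append(detail.get("message", ""))
    return messages


problems = breaking_changes(DRY_RUN_RESPONSE)
if problems:
    print("Refusing to publish:", problems)
```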
Our approach here is to proactively notify clients of our federated service about the impending removal of the products query. Following a suitable grace period, we will then proceed with the complete removal of the products query. Let’s explore how we can effectively manage this transition.
Let’s modify service.bal of our Products subgraph service by adding back the removed products resource function and then applying the @deprecated annotation to it (it surfaces as the @deprecated directive in the generated schema). You can add the reason for the deprecation using Ballerina Markdown syntax. After those changes, the service.bal of the Products subgraph should look as follows:
@subgraph:Subgraph
service graphql:Service on new graphql:Listener(9091) {
    # Returns the list of products
    # + return - List of products
    # # Deprecated
    # The `products` field is deprecated. Use `product` instead.
    @deprecated
    resource function get products() returns Product[] => datasource:getProducts();

    # Returns the product for the given id. If the product does not exist, returns null
    # + id - ID of the product to be retrieved
    # + return - The `Product` for the given id
    resource function get product(@graphql:ID string id) returns Product? =>
        datasource:getProduct(id);
}
Let’s proceed to generate the updated schema for that using the following command,
bal graphql -i service.bal
Then, prepare a publishSubgraph mutation with that updated schema. The final mutation should look as follows,
mutation PublishSubgraph {
    publishSubgraph(
        schema: {
            name: "products"
            url: "http://localhost:9091"
            schema: """
                extend schema @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])
                type Query {
                    "Returns the list of products"
                    products: [Product!]! @deprecated(reason: "The `products` field is deprecated. Use `product` instead.")
                    "Returns the product for the given id. If the product does not exist, returns null"
                    product(
                        "ID of the product to be retrieved"
                        id: ID!
                    ): Product
                }
                "Represents a product in the system"
                type Product @key(fields: "id") {
                    "The unique ID of the product"
                    id: ID!
                    "The name of the product"
                    name: String!
                    "The description of the product"
                    description: String!
                    "The price of the product"
                    price: Float!
                }
            """
        }
    ) {
        schema
        version
        diffs {
            severity
            action
            subject
            location
            value
            fromValue
            toValue
        }
    }
}
Upon submitting the aforementioned mutation to the Registry, you should expect to receive the following response,
{
    "data": {
        "publishSubgraph": {
            "schema": "<supergraph_schema>",
            "version": "0.1.3",
            "diffs": [
                {
                    "severity": "SAFE",
                    "action": "ADDED",
                    "subject": "FIELD_DEPRECATION",
                    "location": [
                        "Query",
                        "products"
                    ],
                    "value": null,
                    "fromValue": null,
                    "toValue": null
                }
            ]
        }
    }
}
As evident from the above response, we have successfully marked the products field as deprecated. Users will be able to observe this as illustrated in the image below.
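The deprecation can also be read programmatically from the diffs in the publishSubgraph response, for example to generate a notice for API consumers. A minimal Python sketch, assuming a response shaped like the one above (the function name is invented for this example):

```python
from typing import List

# Trimmed copy of the publishSubgraph response shown above.
PUBLISH_RESPONSE = {
    "data": {
        "publishSubgraph": {
            "schema": "<supergraph_schema>",
            "version": "0.1.3",
            "diffs": [
                {
                    "severity": "SAFE",
                    "action": "ADDED",
                    "subject": "FIELD_DEPRECATION",
                    "location": ["Query", "products"],
                    "value": None,
                }
            ],
        }
    }
}


def newly_deprecated_fields(response: dict) -> List[str]:
    """Return dotted paths (Type.field) of fields whose deprecation was just added."""
    diffs = response["data"]["publishSubgraph"]["diffs"]
    return [
        ".".join(diff["location"])
        for diff in diffs
        if diff["subject"] == "FIELD_DEPRECATION" and diff["action"] == "ADDED"
    ]


print(newly_deprecated_fields(PUBLISH_RESPONSE))
```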
After some time has elapsed, we can remove the products field completely by using the isForced argument, which is available in both the publishSubgraph and dryRun operations. Setting isForced: true overrides the default behavior of the Schema Registry, which prevents us from pushing breaking changes.
Now that users of our federated service have been informed of the deprecation of products, we can reasonably assume that most of them are no longer using it. Let’s see how to use this feature.
Let’s apply this with publishSubgraph. To do so, first modify our Products subgraph service by removing the products resource function and its documentation entirely from ballerina-graphql-federation-products-subgraph/service.bal, as before. Then, regenerate the schema using the bal graphql -i service.bal command. Using the newly obtained schema, prepare the publishSubgraph mutation as follows:
mutation PublishSubgraph {
    publishSubgraph(
        schema: {
            name: "products"
            url: "http://localhost:9091"
            schema: """
                extend schema @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])
                type Query {
                    "Returns the product for the given id. If the product does not exist, returns null"
                    product(
                        "ID of the product to be retrieved"
                        id: ID!
                    ): Product
                }
                "Represents a product in the system"
                type Product @key(fields: "id") {
                    "The unique ID of the product"
                    id: ID!
                    "The name of the product"
                    name: String!
                    "The description of the product"
                    description: String!
                    "The price of the product"
                    price: Float!
                }
            """
        }
        isForced: true
    ) {
        schema
        version
        diffs {
            severity
            action
            subject
            location
            value
            fromValue
            toValue
        }
    }
}
Upon sending the above mutation to the Schema Registry, you’ll get the following response.
{
    "data": {
        "publishSubgraph": {
            "schema": "<supergraph_schema>",
            "version": "1.0.0",
            "diffs": [
                {
                    "severity": "BREAKING",
                    "action": "REMOVED",
                    "subject": "FIELD",
                    "location": [
                        "Query"
                    ],
                    "value": "products",
                    "fromValue": null,
                    "toValue": null
                }
            ]
        }
    }
}
You’ll notice the retrieved Supergraph reflects the changes. Additionally, the Supergraph version has now been incremented to 1.0.0. This version bump is due to the breaking change that we introduced, in line with a versioning strategy similar to semantic versioning.
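The versions seen so far (0.1.0 after the first publish, 0.1.3 after the deprecation, 1.0.0 after the forced removal) are consistent with a simple rule: safe changes bump the last component and a breaking change bumps the first. The Python sketch below encodes that rule as a guess at the behaviour, not as the registry's actual code. In particular, the `DANGEROUS` severity name and its mapping to the middle component are assumptions, since the article never shows that case.

```python
from typing import Iterable


def next_version(current: str, severities: Iterable[str]) -> str:
    """Return the next Supergraph version for the severities of one publish (assumed rule)."""
    major, minor, patch = (int(part) for part in current.split("."))
    found = set(severities)
    if "BREAKING" in found:
        return f"{major + 1}.0.0"
    if "DANGEROUS" in found:  # assumption: not demonstrated in this article
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(next_version("0.1.2", ["SAFE"]))
print(next_version("0.1.3", ["BREAKING"]))
```

Note that the jump from 0.1.3 straight to 1.0.0 differs slightly from strict Semantic Versioning, where versions below 1.0.0 carry no compatibility guarantees.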
Let’s see what got changed
Let’s explore the version history of our Supergraph by sending the following query to the Schema Registry.
query SupergraphVersions {
    supergraphVersions
}
The response should be as follows,
{
    "data": {
        "supergraphVersions": [
            "0.1.0",
            "0.1.1",
            "0.1.2",
            "0.1.3",
            "1.0.0"
        ]
    }
}
The current Supergraph version is 1.0.0. To inspect the changes between the latest Supergraph and the preceding version, send the following query to the Schema Registry.
query Diff {
    diff(newVersion: "1.0.0", oldVersion: "0.1.3") {
        severity
        action
        subject
        location
        value
        fromValue
        toValue
    }
}
The response should be as follows,
{
    "data": {
        "diff": [
            {
                "severity": "BREAKING",
                "action": "REMOVED",
                "subject": "FIELD",
                "location": [ "Query" ],
                "value": "products",
                "fromValue": null,
                "toValue": null
            }
        ]
    }
}
In the response, observe that there is only one change in the Supergraph, aligning with our expectations. The removal of the products field from the Query type is identified as a breaking change.
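The two queries above can be chained: take the version list, pick the two most recent entries, and build the diff query from them. A small Python sketch of that step follows; the function names are made up, and versions are compared numerically rather than as strings so that, for example, 0.1.10 sorts after 0.1.9.

```python
from typing import List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '0.1.3' into (0, 1, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_pair(versions: List[str]) -> Tuple[str, str]:
    """Return (previous, latest) from an unordered list of versions."""
    if len(versions) < 2:
        raise ValueError("need at least two versions to compute a diff")
    ordered = sorted(versions, key=version_key)
    return ordered[-2], ordered[-1]


def build_diff_query(old: str, new: str) -> str:
    """Build the diff query used in this article for the given versions."""
    return (
        "query Diff {\n"
        f'    diff(newVersion: "{new}", oldVersion: "{old}") {{\n'
        "        severity\n"
        "        action\n"
        "        subject\n"
        "        location\n"
        "        value\n"
        "    }\n"
        "}\n"
    )


old, new = latest_pair(["0.1.0", "0.1.1", "0.1.2", "0.1.3", "1.0.0"])
print(build_diff_query(old, new))
```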
Conclusion
In this article, we explored the functionality of the Ballerina GraphQL Schema Registry for effectively managing our Federated GraphQL Service, eliminating the need for Apollo Studio. In the upcoming articles, we’ll delve into the process of integrating our own datasource into the Schema Registry using Ballerina. Stay tuned for further insights and hands-on guidance!