Merge 2 GraphQL APIs at The Same Level (AppSync/Github)
Tips for merging GraphQL APIs with UNION [Sea of GraphQL pt.2]
GraphQL is a new movement in API design. It has hidden power to help data scientists and data stewards handle remote data more easily. Its ecosystem is growing, and managed services make it easier to use. In a previous article, I showed how to create an application that accesses the managed GraphQL service AppSync. In another article, I described how to use GraphQL functionality to merge an AppSync GraphQL API with an existing GraphQL API (the Github API).
Now I have found a new pattern for merging GraphQL APIs! (In the last part, I will explain that it is not such a new thing in the data science world.) With this pattern, we can combine multiple GraphQL APIs at the same level. I named it the “union pattern”. Below I explain what the union pattern is for and how we can use it.
The union pattern makes it possible to process and search data spread across different locations and different types (comprehensive data processing), which the pattern in the previous article cannot do. We can try an example with Github API v4 and an AppSync API.
Use case of union pattern
We consider a case that uses the following two data source APIs.
1. A GraphQL API on AppSync that returns a list of private organizations in a corporation (e.g. “HumanResouceDept”).
2. The Github API, which returns public organizations such as “github”, “facebook” and so on.
In fact, the Github GraphQL API has a root field “organization” that returns information about public organizations.
The two APIs (1. and 2.) don’t share any organizations. But we can consider them to be at the same level because they both hold organization information.
Next, we can consider two concrete example cases of merging these APIs for comprehensive data processing.
Example case 1. Return the union of two datasets
The merged API listUnionedOrganizations returns a list of all elements (organization names) in the two API datasets (1. and 2.). This means the return value of the merged API is the union of the two datasets.
Example case 2. Search one element from the union of two datasets
The merged API getOneUnionedOrganization returns only one element (or none) from all elements in the two source API datasets (1. and 2.). The return value could be selected by searching on an ID field (getOneUnionedOrganization(id)) or by a string argument that contains an organization name (getOneUnionedOrganization(name)). So the return value may come from 1. or from 2., depending on the search condition.
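To make the two cases concrete, here is a minimal in-memory sketch. The sample records and their id values are hypothetical stand-ins for the two APIs; only the two query names come from the text above.

```javascript
// Hypothetical sample data standing in for the two source APIs.
const privateOrganizations = [
  { id: "1", name: "HumanResouceDept" },
  { id: "2", name: "ProcurementDept" },
];
const publicOrganizations = [
  { id: "gh-1", name: "github" },
  { id: "gh-2", name: "facebook" },
];

// Case 1: the union of both datasets, returned as a single list.
function listUnionedOrganizations() {
  return [...privateOrganizations, ...publicOrganizations];
}

// Case 2: at most one element from the union, looked up by name.
// Returns null when neither dataset contains a match.
function getOneUnionedOrganization(name) {
  return listUnionedOrganizations().find((org) => org.name === name) || null;
}

console.log(listUnionedOrganizations().map((org) => org.name));
console.log(getOneUnionedOrganization("github"));
```

Whether the matching element comes from the private or the public dataset is invisible to the caller, which is the point of the pattern.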
Architecture of union pattern
Architecture for the union pattern is like this figure.
The API service at the top of this figure runs on a simple Node.js server (such as AWS Lambda). It fetches from the AppSync API and the Github API, then merges both. (As in the previous post, we can use the GraphQL functionality “Schema Stitching”, which is described in this post.)
This time, we will use the type definition method union in addition to schema stitching.
A type definition method ‘union’
Every GraphQL API has data types for its return values. This rule applies to both data source APIs. A union definition on the API service then creates a new type that merges multiple data types.
For example, we have the following type definition.
On the first line, union defines a new type UnionedOrganization that can contain both types of data: MyOrganization and Organization.
On the second line, a query getUnionedOrganization, which returns this type UnionedOrganization, is defined. The actual type returned from getUnionedOrganization is then MyOrganization in some cases and Organization in others. This feature is useful for merging APIs at the same level.
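A definition of this kind might look like the sketch below, held in a JavaScript template string. It is an illustration, not the exact code of this project: the name argument on the query is an assumption. A union also needs a way to decide which concrete type each returned object has. In graphql-tools style resolver maps this is done with a __resolveType function, and the field check used here (only Github's Organization carries repositories) is likewise an assumption.

```javascript
// Sketch of the union definition as schema text (assumed shape).
const typeDefs = `
  union UnionedOrganization = MyOrganization | Organization

  extend type Query {
    # The "name" argument is an assumption made for this sketch.
    getUnionedOrganization(name: String): UnionedOrganization
  }
`;

// Decide which member of the union an object belongs to.
// Assumption: only Github's Organization has a "repositories" field.
function resolveUnionedOrganizationType(obj) {
  return "repositories" in obj ? "Organization" : "MyOrganization";
}

// Resolver map entry that tells the executor how to resolve the union.
const resolvers = {
  UnionedOrganization: {
    __resolveType: resolveUnionedOrganizationType,
  },
};

console.log(resolvers.UnionedOrganization.__resolveType({ id: "1", name: "ProcurementDept" }));
```

Checking for a distinguishing field is the simplest approach. Tagging each object with its source when it is fetched would be a sturdier alternative if the two types ever gain overlapping fields.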
Build an example API
Let’s see details about an example API service on AWS Lambda.
#1 Deploy AppSync API
The data model of the AppSync API is here. This data model includes a type MyOrganization for “private organizations” and two queries.
Feature
- MyOrganization has only two fields, for the ids and names of organizations.
- A query listMyOrganizations returns a list of private organizations (a little bit tricky: the return type is not [MyOrganization] but MyOrganizationConnection)
- A query getMyOrganization returns only one element from the private organizations.
Create the model in the AppSync console.
Put some data into the AppSync backend (DynamoDB) via the AppSync Queries view or by operating on DynamoDB directly. I added some organization data as follows.
Preparation for API access
Download aws-exports.js from the Summary view of the AppSync API console.
#2 Github API
In Github API v4, a field organization is available in the following format. The argument of the field is a string login, and the field returns data of type Organization.
Feature
- The Organization type includes the four fields id, name, location and repositories (more fields than MyOrganization has)
- There is no query for fetching a list. The only query is organization, which fetches a single organization
You can check the schema of the Github API on Github API Explorer.
Preparation for API access
We can get an access token from the Github settings page (settings -> developers settings -> personal access token).
Put the token into a .env file.
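As an illustration of how the token and the organization field fit together, the sketch below builds the HTTP request for the Github GraphQL endpoint. It is a sketch under assumptions: the environment variable name GITHUB_TOKEN, the helper name buildGithubRequest and the selection inside repositories are hypothetical choices, not taken from this project.

```javascript
// Endpoint of the Github GraphQL API v4.
const GITHUB_GRAPHQL_URL = "https://api.github.com/graphql";

// Query a single organization by login, selecting the four fields discussed above.
const ORGANIZATION_QUERY = `
  query ($login: String!) {
    organization(login: $login) {
      id
      name
      location
      repositories(first: 3) {
        nodes { name }
      }
    }
  }
`;

// Build the options for an HTTP POST to the Github GraphQL endpoint.
// "token" is the personal access token loaded from the .env file.
function buildGithubRequest(login, token) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `bearer ${token}`,
    },
    body: JSON.stringify({ query: ORGANIZATION_QUERY, variables: { login } }),
  };
}

// GITHUB_TOKEN is a hypothetical variable name; use whatever the .env file defines.
const githubRequest = buildGithubRequest("github", process.env.GITHUB_TOKEN || "<token>");
console.log(GITHUB_GRAPHQL_URL, JSON.parse(githubRequest.body).variables);
```

The returned object can be passed as the second argument of a fetch-compatible HTTP client together with GITHUB_GRAPHQL_URL.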
#3 Spec and implementation of API service on AWS Lambda
The following Node.js application implements an operation of merging AppSync API results with Github API results.
The format of the new union type and the new queries is here.
Feature
- Type UnionedOrganization is a union of MyOrganization and Organization
- Query listUnionedOrganizations returns a list of UnionedOrganization, and query getOneUnionedOrganization returns a single UnionedOrganization
Return values of the new API are generated in the following manner.
- 1) Copy values from the AppSync API results (listMyOrganizations, getMyOrganization)
- 2) Copy values from the Github results (organization)
- 3) The new queries merge the values of 1) and 2) as the resolvers implement
Implementation of Node.js server on AWS Lambda is as follows.
Step by step description of resolvers
The next line wraps the process of fetching from the Github and AppSync APIs in a function.
In listUnionedOrganizations,
Because the Github API has no query that returns a list, I created a list by taking the return value of organization (a single element) and wrapping it in an array [a].
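The wrapping step can be sketched as follows. The helper name toOrganizationList is hypothetical, and the input is assumed to be the data part of a Github response. The sketch also returns an empty list when no organization was found, which is an addition for illustration.

```javascript
// Wrap the single "organization" result in an array so it can be treated as a list.
// "githubData" is assumed to be the "data" part of a Github GraphQL response.
function toOrganizationList(githubData) {
  const org = githubData && githubData.organization;
  // A missing organization becomes an empty list instead of [null].
  return org ? [org] : [];
}

console.log(toOrganizationList({ organization: { id: "gh-1", name: "github" } }));
console.log(toOrganizationList({ organization: null }));
```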
Because the AppSync API returns not a list but a MyOrganizationConnection, delegate cannot process the return value correctly. (A trick described in the following issues might make this work, but I haven’t used that solution yet. Future work.)
My simple solution is to use the low-level graphql() function directly.
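Once the raw execution result is available, the plain list has to be taken out of the connection object. The sketch below assumes the connection exposes its elements under items, as AppSync-generated connection types usually do, and that the result has the usual data and errors shape. The helper name unwrapConnection and the sample record are hypothetical.

```javascript
// Extract the plain list from a raw execution result ({ data, errors }).
// Assumption: MyOrganizationConnection exposes its elements under "items".
function unwrapConnection(result) {
  if (result.errors && result.errors.length > 0) {
    throw new Error(result.errors.map((e) => e.message).join("; "));
  }
  const connection = result.data && result.data.listMyOrganizations;
  return connection && connection.items ? connection.items : [];
}

// Hypothetical sample of what the AppSync query might resolve to.
const sampleResult = {
  data: {
    listMyOrganizations: {
      items: [{ id: "1", name: "ProcurementDept" }],
      nextToken: null,
    },
  },
};

console.log(unwrapConnection(sampleResult));
```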
Finally, merge the two datasets. arr contains a 2-dimensional array [[<github results>], [<AppSync result>]]. We can use flat() to convert it into a 1-dimensional array.
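The merge step can be sketched like this. Collecting both results with Promise.all is an assumption about how arr is built, and fetchGithub and fetchAppSync are hypothetical functions that each resolve to an array of organizations.

```javascript
// Flatten [[<github results>], [<AppSync results>]] into one list.
// flat() without arguments removes exactly one level of nesting.
function mergeResults(arr) {
  return arr.flat();
}

// Run both fetchers concurrently and merge their lists.
// "fetchGithub" and "fetchAppSync" are hypothetical stand-ins for the real fetch functions.
async function listUnionedOrganizations(fetchGithub, fetchAppSync) {
  const arr = await Promise.all([fetchGithub(), fetchAppSync()]);
  return mergeResults(arr);
}

// Usage with stubbed fetchers.
listUnionedOrganizations(
  async () => [{ name: "github" }],
  async () => [{ name: "HumanResouceDept" }, { name: "ProcurementDept" }]
).then((orgs) => console.log(orgs.map((org) => org.name)));
```

Because flat() only removes one level, nested fields inside each organization (such as repositories) are left untouched.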
Deploy the API service
I added a serverless.yml file to use serverless for AWS Lambda deployment.
I added a webpack.config.js file to configure webpack.
Deploy the new API server to AWS Lambda with the following command.
Result
Evaluation method: the following two queries.
Parameters: name is a value that matches an element in the AppSync API. appsync is a switching flag (optional, for convenience).
The result is here.
listUnionedOrganizations returned a merged result that includes a single Github API record and 4 rows of AppSync data.
getOneUnionedOrganization returned a single organization, addressed by the argument $name: ProcurementDept.
The result in an offline environment.
Summary
I found a new pattern for merging APIs, the ‘union pattern’. I showed how to implement an API service that follows this pattern with an AppSync API and the existing Github API at the same level. As a result, the API service gathered data from multiple GraphQL APIs, merged them, and returned either a list or a single element. The implementation of the resolvers is a little bit tricky, but a simple solution is available. Once we implement resolvers for the union pattern, we can create a single API for searching multiple GraphQL APIs.
Extra consideration
Comparison between ‘join pattern’ and ‘union pattern’
I will add some thoughts about merge patterns.
Here is a comparison of the two different merge patterns.
One is the join pattern described in the previous article.
The other is the union pattern in this article.
The difference between the two patterns is as depicted here.
The join pattern and the union pattern are like JOIN and UNION in SQL.
In the SQL world, the two keywords have the following effects
- JOIN: Merge tables to the left and right and increase columns
- UNION: Append tables up and down and increase rows
In the GraphQL world,
- Join pattern: Merge multiple APIs to the left and right and increase depth
- Union pattern: Append multiple APIs up and down (at the same level) and increase elements
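The contrast can be sketched with plain arrays. All sample records and both helper names below are hypothetical; the point is only that one operation adds depth while the other adds elements.

```javascript
// Hypothetical sample data.
const organizations = [{ login: "github", name: "GitHub" }];
const repositoriesByLogin = { github: [{ name: "example-repo" }] };
const myOrganizations = [{ id: "1", name: "ProcurementDept" }];

// Join pattern: attach related data to each element, which increases depth.
function joinPattern(orgs, reposByLogin) {
  return orgs.map((org) => ({
    ...org,
    repositories: reposByLogin[org.login] || [],
  }));
}

// Union pattern: append elements at the same level, which increases the element count.
function unionPattern(a, b) {
  return [...a, ...b];
}

console.log(JSON.stringify(joinPattern(organizations, repositoriesByLogin)));
console.log(JSON.stringify(unionPattern(myOrganizations, organizations)));
```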
… Yes.
Union pattern is not such a new thing. This is a famous pattern for data scientists.
Like we use JOIN and UNION in SQL, we can use these patterns. This means data scientists can make the best of GraphQL merging techniques to generate any data as they need.