Powering Streak’s Gmail Add-on: Part 1 — Backing into GraphQL

At Streak, we recently partnered with Google to release our Gmail Add-on as part of the Add-ons launch. The Add-on is our first major foray into integrating with your existing mobile apps: it lets you better organize your work email in the mobile Gmail app, and makes it easier to collaborate with your team to make sure nothing slips through the cracks. The reception has been great. Our Add-on is top rated in the Marketplace and tens of thousands of users interact with it regularly. But the backend of the Add-on is also new: it’s 100% powered by GraphQL, albeit in an unusual way, and I wanted to share the engineering story of how we got here.

The Streak Gmail Add-on on Android

A Performance Problem

Our GraphQL journey started with us staring at a loading spinner. The fundamental purpose of our Gmail Add-on is to give the user more context about their emails. Useful context comes in many forms. If you’ve told Streak that an email is part of a sales deal, then we should add information about that deal when you view the email. If the customer you’re talking to is also talking to somebody else on your support team about an issue, you probably want to know that. And if the customer’s trial is going to run out today and might need to be extended, that’s important, too.

Surfacing all this context is the core of our user experience, but the information lives in a lot of different places, so displaying it requires querying a bunch of different API endpoints: pipelines, contacts, deals, other emails, and organizations just to start. Our main JavaScript client pipelines these queries using parallel, asynchronous requests against our REST API, so that contacts, organizations, and deal information are all being fetched at the same time.

Some of the many parallel requests fetching one box using the web client

This was where we hit our first Add-ons roadblock. Google Apps Script, which powers Add-ons, is a hosted JavaScript sandbox environment with some non-standard APIs. Instead of the traditional XHR interface, requests to backends are required to use a custom URLFetch service. And URLFetch only allows blocking requests, so each request has to finish before the next one can start. Requesting all the data serially could take tens of seconds. Ain’t nobody got time for that.
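To make the cost of blocking requests concrete, here is a minimal sketch of serial fetching. The endpoint paths and latencies are invented for illustration (they are not measurements), and `blockingFetch` is a stand-in for Apps Script's `UrlFetchApp.fetch` that uses a simulated clock so the sketch can run outside the sandbox.

```javascript
// Hypothetical endpoint paths and latencies, for illustration only.
const PATHS = ["/pipelines", "/contacts", "/deals", "/emails", "/organizations"];
const LATENCY_MS = {
  "/pipelines": 400,
  "/contacts": 600,
  "/deals": 800,
  "/emails": 500,
  "/organizations": 700,
};

// Builds a fake blocking fetch. Each call advances a simulated clock by the
// latency of the requested path, mimicking a call that returns only after
// the response has arrived.
function makeFakeFetch(latencyMs) {
  let elapsedMs = 0;
  const blockingFetch = (path) => {
    elapsedMs += latencyMs[path];
    return { path, finishedAtMs: elapsedMs };
  };
  blockingFetch.elapsedMs = () => elapsedMs;
  return blockingFetch;
}

// With a blocking API, the next request cannot start until the previous
// one has returned, so the latencies add up.
function fetchSerially(blockingFetch, paths) {
  const results = {};
  for (const path of paths) {
    results[path] = blockingFetch(path);
  }
  return results;
}

const fakeFetch = makeFakeFetch(LATENCY_MS);
const results = fetchSerially(fakeFetch, PATHS);

const serialMs = fakeFetch.elapsedMs(); // sum of all latencies
const parallelMs = Math.max(...Object.values(LATENCY_MS)); // best case if all ran concurrently

console.log(serialMs, parallelMs); // 3000 800
```

In this toy model the serial total is the sum of every latency, while concurrent requests would be bounded by the slowest one. That gap is what pushed us toward combining requests.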

We realized that to get acceptable performance, we needed to combine the requests. Instead of getting “/email/15fb6cb3b627304f” followed by getting “/contacts/15fb6cb3b627304f”, we needed one endpoint: “/emails+contacts/15fb6cb3b627304f”. We spent a while manually bundling handlers together, but it turns out nobody on our team wanted to pick up artisanally hand-crafting batch endpoints as a long-term hobby.

While we were griping about it at lunch one day, one of our frontend developers suggested GraphQL. GraphQL packages a lot of functionality that’s useful for building flexible APIs into a query language and a simple HTTP protocol. And relevant to our interests, GraphQL lets you request multiple objects in one go.

A single GraphQL query fetching multiple objects
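As a rough illustration of what such a query can look like, here is a sketch that asks for an email and its contacts in one request. The schema is hypothetical: the field names (`email`, `box`, `contacts`, `organization`) and the `threadId` argument are assumptions made for this example, not Streak's actual schema. The request options use the `method`, `contentType`, and `payload` parameters that `UrlFetchApp.fetch` accepts.

```javascript
// Hypothetical schema: these field and argument names are illustrative only.
const query = `
  query EmailContext($threadId: ID!) {
    email(id: $threadId) {
      subject
      box {
        name
        stage
      }
    }
    contacts(threadId: $threadId) {
      name
      organization {
        name
      }
    }
  }
`;

// GraphQL servers conventionally accept a JSON body with a `query` string
// and a `variables` object.
function buildGraphQLRequest(queryText, variables) {
  return {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify({ query: queryText, variables }),
  };
}

const request = buildGraphQLRequest(query, { threadId: "15fb6cb3b627304f" });

// In Apps Script, this object could be passed as the second argument to
// UrlFetchApp.fetch(url, request), where url is the GraphQL endpoint.
console.log(request.payload.length > 0);
```

Both top-level fields travel in a single blocking request, and the nested selections name only the fields the client wants back.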

It also provides functionality for introspecting response schemas, fetching only portions of a response (to save bandwidth), and eagerly requesting related objects (e.g. following foreign keys to the objects they reference). This is very powerful, but in the context of a user-facing API, also a little scary. Some of our teams have hundreds of thousands of deals. What if somebody requested all of them and their related information in one go?

So we decided to take a measured approach to GraphQL, using it only for communication between our Add-on running in the Apps Script sandbox and our backend to start.

GraphQL Implementation Strategies

There are two main models for retrofitting GraphQL onto a RESTful API backend. The first adds a proxy layer in front of the existing server, typically in Node, since the reference GraphQL server implementation runs on that runtime. The proxy translates the GraphQL query into traditional API requests, sends them in parallel, and constructs the GraphQL response.
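A minimal sketch of the proxy idea follows. It skips query parsing entirely (a real proxy would use a GraphQL library for that) and takes the list of requested top-level fields directly. The `REST_ROUTES` mapping, the `resolveViaRest` function, and the `restGet` callback are hypothetical names introduced for this example.

```javascript
// Hypothetical mapping from top-level GraphQL fields to existing REST paths.
const REST_ROUTES = {
  email: (id) => `/email/${id}`,
  contacts: (id) => `/contacts/${id}`,
};

// restGet is any function that takes a path and returns a Promise of the
// parsed response body. It is passed in so the sketch stays self-contained.
async function resolveViaRest(fields, id, restGet) {
  // Start every REST request before awaiting any of them, so they are all
  // in flight at the same time.
  const pending = fields.map((field) => restGet(REST_ROUTES[field](id)));
  const bodies = await Promise.all(pending);

  // Assemble the results under their field names. GraphQL responses wrap
  // results in a top-level "data" key.
  const data = {};
  fields.forEach((field, i) => {
    data[field] = bodies[i];
  });
  return { data };
}

// Usage with a fake REST client that echoes the path it was asked for.
const fakeRestGet = async (path) => ({ servedFrom: path });

resolveViaRest(["email", "contacts"], "15fb6cb3b627304f", fakeRestGet)
  .then((response) => console.log(JSON.stringify(response)));
```

The existing endpoints stay untouched in this model, which is its main appeal. The cost is an extra network hop and one REST round trip per field.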

The second model adds a GraphQL endpoint directly on our backend server, using the graphql-java library. This endpoint either programmatically requests information from the existing endpoints, or does the same work they do to fetch the data from our backend datastore.

The main argument in favor of the proxy approach was that it would be completely separate from our existing stack. Our existing endpoints were well-tested, we were confident that we knew how to monitor them, and we knew that they correctly enforced permissions.

The main arguments for the integrated approach center around development and runtime efficiency:

  • On the development side, our existing API uses Gson to serialize data model objects into responses. By reusing the annotation and type information from Gson, we don’t have to duplicate our data model schema when creating our GraphQL endpoints.
  • At runtime, many of our existing API endpoints have to fetch multiple objects in order to provide their results. For instance, when fetching a contact, we also have to fetch the team that contact belongs to in order to make sure the current user has permission to view the contact. If a GraphQL query wants both the contact and its team, we shouldn’t need to fetch the team once for the GraphQL query and then again to check permissions for the contact.
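The runtime argument can be sketched with a request-scoped cache, so that the permission check and the query result share a single team fetch. This is an illustration of the idea only, written in JavaScript for consistency with the other examples even though our backend is Java. `RequestCache`, `resolveContactWithTeam`, and the in-memory `TEAMS` and `CONTACTS` data are all hypothetical.

```javascript
// A cache that lives for one incoming request. The first get() for a key
// loads from the datastore; later get() calls for that key reuse the result.
class RequestCache {
  constructor(loadFromStore) {
    this.loadFromStore = loadFromStore; // hypothetical datastore accessor
    this.cache = new Map();
    this.loads = 0; // counts real datastore reads, for demonstration
  }

  get(key) {
    if (!this.cache.has(key)) {
      this.loads += 1;
      this.cache.set(key, this.loadFromStore(key));
    }
    return this.cache.get(key);
  }
}

// Hypothetical in-memory data standing in for the datastore.
const TEAMS = { t1: { id: "t1", name: "Sales", members: ["alice"] } };
const CONTACTS = { c1: { id: "c1", name: "Bob", teamId: "t1" } };

function resolveContactWithTeam(contactId, user, teams) {
  const contact = CONTACTS[contactId];

  // The permission check needs the contact's team.
  if (!teams.get(contact.teamId).members.includes(user)) {
    throw new Error("forbidden");
  }

  // The query result needs the same team; this get() is a cache hit.
  return { contact, team: teams.get(contact.teamId) };
}

const teams = new RequestCache((id) => TEAMS[id]);
const result = resolveContactWithTeam("c1", "alice", teams);

console.log(result.team.name, teams.loads); // Sales 1
```

A proxy sitting in front of separate REST endpoints has no easy way to share this work, because each endpoint does its own permission fetch independently.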

Initial Results

Since our whole purpose in adopting GraphQL was to make our Add-on more efficient, we decided to pursue the integrated option. After some fun with Maven and getting Java 8 deployed on App Engine, we got a proof of concept working.

The good news: requests were batched, and the batched request took about half as long as the previous serial requests.

The bad news: half as long as the previous serial requests was still roughly six seconds longer than we were looking for.

Hopeful but with a ways to go, we dove into the exciting world of GraphQL layering and instrumentation to track down the lingering performance issues.

We’ll talk more about our GraphQL experience next week, and in future blog posts delve into the build infrastructure we developed to build the Add-on itself and how we used Google Cloud Spanner to enhance our backend performance.