At Geospark Analytics we built a SaaS platform for global risk called Hyperion. The Hyperion platform was built using an API-first strategy: as we build our web-based user interface (Vue, if you are interested) and our native iOS and Android mobile applications, our API evolves with them.
Our API is microservices based and primarily runs on AWS Lambda (mostly Node.js and Python), with some Google Cloud Functions. Early in our development, API Gateway, which is the AWS way of exposing Lambda functions to the web, was very nascent and not easy to use. Tooling such as SAM and the Serverless Framework hadn’t even hit the ecosystem to make deployment easier. Our initial version of Hyperion Web also relied on many 3rd party APIs, so we needed a central way to secure them. These APIs all had different ways to authenticate (Basic Auth, tokens, API keys, 2-way SSL!, etc.), so we built an API proxy for each of them using Apigee, then fronted each of these proxies with Auth0-validated tokens. This allowed the end user to log in once, receive a JWT, and then authenticate to our API and the 3rd party APIs with the same token.
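To make the "log in once, reuse the same token" idea concrete, here is a minimal sketch of what one of those proxies does conceptually. It is not our Apigee configuration: the function names, the `validate_user_token` callback and the credential fields are all hypothetical, and the `x-api-key` header name is only a common convention.

```python
import base64


def upstream_auth_headers(scheme, credentials):
    """Build the auth headers one upstream (3rd party) API expects.

    `scheme` and the keys of `credentials` are hypothetical names
    chosen for this sketch.
    """
    if scheme == "basic":
        raw = f"{credentials['username']}:{credentials['password']}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if scheme == "token":
        return {"Authorization": f"Bearer {credentials['token']}"}
    if scheme == "api-key":
        # The header name varies by provider; "x-api-key" is an assumption.
        return {credentials.get("header", "x-api-key"): credentials["key"]}
    raise ValueError(f"unsupported auth scheme: {scheme}")


def proxy_headers(user_token, validate_user_token, scheme, credentials):
    """Check the caller's token first, then swap in the upstream credentials.

    `validate_user_token` stands in for whatever verifies the JWT
    (signature, expiry, audience); it is passed in so the sketch stays
    self-contained.
    """
    if not validate_user_token(user_token):
        raise PermissionError("invalid or expired user token")
    return upstream_auth_headers(scheme, credentials)
```

The point of the design is that the browser only ever holds the user's JWT; the per-provider secrets stay inside the proxy layer.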
We implemented our Apigee API as a basic wrapper, built using Express and Swagger, that proxies to our microservices. This API takes the REST API call and routes it to the proper AWS Lambda, ECS service or GCP function with the correct payload for the event. The details of this are not important for this article, but by building the application this way we basically had two primary URLs for accessing our application:
- https://hyperion2.geospark.io <- Main Vue APP
- https://api.geospark.io/api-endpoint <- REST APIs
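Because the app and the API live on different hostnames, the browser treats every API call as a cross-origin request, so the API has to answer with CORS headers. For those not familiar with Apigee, this basically just sets the response headers sent to the browser for all CORS requests. Something similar can be accomplished with NGINX.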
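As a rough illustration of what "setting the response headers" amounts to, whichever gateway does it, here is a small sketch. The allowed methods, allowed headers and max-age values are assumptions for the example, not the values from our policy.

```python
# Origins allowed to call the API from a browser. The single entry is the
# web app's origin from this article; the rest of the values are assumed.
ALLOWED_ORIGINS = {"https://hyperion2.geospark.io"}


def cors_headers(request_origin, allowed_origins=ALLOWED_ORIGINS):
    """Return the CORS response headers for a request's Origin header.

    An origin that is not allowed gets no CORS headers, so the browser
    refuses to expose the response to the calling page.
    """
    if request_origin not in allowed_origins:
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE, OPTIONS",
        "Access-Control-Allow-Headers": "Authorization, Content-Type",
        # How long (in seconds) the browser may reuse a preflight result.
        "Access-Control-Max-Age": "600",
        # The response differs per Origin, so caches must key on it.
        "Vary": "Origin",
    }
```

A preflight OPTIONS response carries these headers with an empty body; the actual GET or POST response needs at least `Access-Control-Allow-Origin` as well.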
The implication here is that every API call results in two calls: the preflight OPTIONS call and the actual request (a GET, for example). This matters a great deal when you have an API-heavy application, because browsers enforce a maximum number of parallel connections to a single host. That limit is why you will sometimes see hosts named hostname-a.domain.com, hostname-b.domain.com, and so on: the browser treats them as different hostnames, which raises the number of parallel connections it will open, while in reality each of the a/b/c domains is just a CNAME to the same actual host (or, most likely, load balancer). Due to this limit, the OPTIONS calls can block some of the API calls being made to your backend. Even though each OPTIONS call only takes a fraction of a second to complete, you can see the blocking happen in the waterfall view of Chrome’s network tools, and those fractions add up.
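Whether the browser sends that preflight depends on the request itself. The sketch below is a simplified version of the browser's rules (the real ones in the Fetch standard have more cases, such as limits on header values), and it assumes the API is called with an `Authorization` header carrying the JWT, which is enough on its own to trigger a preflight.

```python
from urllib.parse import urlsplit

# Simplified "simple request" rules; the Fetch standard has more detail.
SAFE_METHODS = {"GET", "HEAD", "POST"}
SAFE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SAFE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """An origin is the (scheme, host, port) triple of a URL."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def needs_preflight(page_url, request_url, method, headers):
    """Approximate whether a browser would send an OPTIONS preflight."""
    if origin_of(page_url) == origin_of(request_url):
        return False  # same origin: CORS is not involved at all
    if method.upper() not in SAFE_METHODS:
        return True
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SAFE_HEADERS:
            return True  # e.g. Authorization
        if lowered == "content-type":
            mime = value.split(";", 1)[0].strip().lower()
            if mime not in SAFE_CONTENT_TYPES:
                return True  # e.g. application/json
    return False
```

With the two hostnames above, a call from the web app to `https://api.geospark.io/api-endpoint` with a bearer token returns `True`, while the same call made to a path on the web app's own hostname returns `False`.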
As you can see in the graphic above, each GET/POST request is preceded by an OPTIONS request. Couple that with the saving of the user’s preferences and that’s a lot of extra calls hitting the same API server, sometimes queuing up behind the limit on parallel connections.
One way to eliminate this preflight request is to not trigger Cross-Origin Resource Sharing in the browser during your API request AT ALL! The way to do that is to ensure your resources are served from the SAME origin! When you do that you get a network tab that looks like this.
We achieved this by utilizing a tool within AWS that we were already using: CloudFront. At its most basic level, CloudFront is Amazon’s Content Delivery Network (CDN), primarily used for delivering content around the globe quickly from an S3 origin, cached at hundreds of edge locations closer to your end users. The Vue web app I mentioned earlier is a completely static application served from S3 and delivered using CloudFront. In addition to an S3 origin, you can also serve from custom origins, which can be any HTTP location, including web APIs.
By configuring your API as a custom origin, you can then configure a behavior for that origin. In this behavior you specify your caching strategy, but also the path pattern that selects which requests go to that origin. So before, our application would access resources using two main URLs:
- https://api.geospark.io <- Dynamic JSON data
Now with our new behavior we only use 1 main hostname,
What happened to our API? It now lives at,
Our behavior specifies that any request to /hyperionapi/** gets forwarded to a custom origin, https://api.geospark.io.
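Conceptually, the distribution now acts as a small router in front of both origins. The sketch below imitates that decision in plain Python; it is not CloudFront configuration. The `s3-static-site` label is a made-up name for the S3 origin, the pattern is written in its single-wildcard form, and `fnmatchcase` only approximates CloudFront's path-pattern matching (it also understands `[...]` character sets).

```python
from fnmatch import fnmatchcase

# Ordered list of (path pattern, origin). The first match wins; anything
# that matches nothing falls through to the default behavior.
BEHAVIORS = [
    ("/hyperionapi/*", "https://api.geospark.io"),
]
DEFAULT_ORIGIN = "s3-static-site"  # hypothetical label for the S3 origin


def pick_origin(path, behaviors=BEHAVIORS, default=DEFAULT_ORIGIN):
    """Return the origin that would serve a given request path."""
    for pattern, origin in behaviors:
        if fnmatchcase(path, pattern):  # case-sensitive match
            return origin
    return default
```

From the browser's point of view, `/index.html` and `/hyperionapi/api-endpoint` both live on the same hostname, so the API request is same-origin and no preflight is sent.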
As a side benefit, we also get the caching built in to CloudFront! One thing to get right if you implement this is security. If you use API keys or authentication headers unique to individual users, you will need to make them part of the cache key (Whitelist Headers or Query String whitelist) within the CloudFront behavior, or you could leak one user’s cached data to other users.
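The leak is easiest to see by modelling the cache key directly. This is a toy model, not CloudFront's implementation; the function and parameter names are invented for the example.

```python
def cache_key(path, query, headers, key_headers=(), key_query_params=()):
    """Build a cache key from the path plus only the whitelisted parts.

    Anything not whitelisted is ignored, so two requests that differ
    only in a non-whitelisted header share one cached response.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    header_part = tuple(
        (name.lower(), lowered.get(name.lower()))
        for name in sorted(key_headers, key=str.lower)
    )
    query_part = tuple((name, query.get(name)) for name in sorted(key_query_params))
    return (path, header_part, query_part)


alice = {"Authorization": "Bearer alice-token"}
bob = {"Authorization": "Bearer bob-token"}
path = "/hyperionapi/preferences"

# Without the header in the key, both users map to the same cache entry.
shared = cache_key(path, {}, alice) == cache_key(path, {}, bob)

# With Authorization whitelisted, each user gets a separate entry.
separate = (
    cache_key(path, {}, alice, key_headers=["Authorization"])
    != cache_key(path, {}, bob, key_headers=["Authorization"])
)
```

Here `shared` and `separate` are both `True`: the first case is the leak, the second is the fix. The trade-off is that per-user keys lower the cache hit rate, so it is worth whitelisting only on the paths that return user-specific data.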
So that’s how we reduced our API calls by 50%! We eliminated all CORS preflight requests. All those 3rd party APIs we use also have their own origin and behavior in our CloudFront distribution, which helps us with API quotas because some of those calls get served from the CloudFront cache.
Does working on problems like the one we wrote about interest you? If so: We are hiring! Send an email to firstname.lastname@example.org and tell us why you’d be a great fit for our team.