APIs on Kubernetes with Kong - Part II — Serverless with AWS Fargate

Reza Shafii
Published in The Startup
Jan 20, 2020

In the introductory Kong for Kubernetes (K8s) blog post, we took a whirlwind tour of Kong for K8s, Kong Inc.'s Kubernetes Ingress Controller with advanced service lifecycle management capabilities. In this post, I want to continue from where we left off and walk through how one can leverage Kong for K8s to enable Serverless workloads with Amazon Web Services' (AWS) Fargate.

Fargate allows the deployment of containerized applications without having to worry about the underlying servers: that is, the server nodes within a Fargate cluster can automatically expand and contract to optimally suit application workload needs. Let's give Fargate a test with Kong for K8s to see how the two combine to enable a Serverless workload.

Wait… What do you mean by Serverless?

The word Serverless has taken on a life of its own, and its meaning, mechanics, and benefits can be highly context sensitive. For the sake of this blog entry, what I mean by Serverless is that we, as owners of the Kubernetes environment, do not have to worry about two important things: the lifecycle of the nodes (i.e. servers) underlying our cluster in order to satisfy our workload needs, and the horizontal scaling of the backend applications that those nodes serve based on their APIs' traffic.

That is, if our servers "magically" appear and disappear and our application instances automatically scale up and down on these magic servers, both in a way that properly and optimally serves the incoming HTTP application request traffic, then as far as this blog's definition of Serverless is concerned, we have achieved it. I am highlighting the phrase "properly and optimally serve the incoming HTTP application request traffic" in that last sentence because, as we will see later, that is where Kong for K8s plays a critical role. Our target topology, expanding on the topology we landed on in the previous blog, will look something like this:

Intended target topology: Kong for K8s for moderating traffic, K8s for scaling pods, and Fargate for scaling nodes

Setting up Fargate

For the first "Serverless" benefit described above, that is, the automatic provisioning and deprovisioning of our cluster's server nodes to address traffic needs, we will leverage the AWS Fargate service. As a side note, given Fargate's potential power and benefits (not to mention the AWS fanfare), I found the service's documentation and tool ecosystem surprisingly light. That may be because the service started on ECS and was only recently introduced for AWS's Kubernetes service, EKS. That being said, to set up a Fargate cluster, we need to have the following prerequisites satisfied:

  1. Install the AWS CLI and configure it with one of the regions in which Fargate on EKS is supported (i.e. US East (N. Virginia), US East (Ohio), Europe (Ireland), and Asia Pacific (Tokyo)).
  2. Install eksctl. This is the tool we will use to create a Fargate-enabled EKS cluster, and it can also be used to manage EKS in general. Note that this is not an official AWS tool; it is created and maintained by Weaveworks.
  3. Finally, clone this repo and go to the "blog2" directory.

With the above setup, it's time to create the actual Fargate-enabled EKS cluster. To do this, we must first decide the namespaces for which we want to have a Fargate profile. Each namespace with a Fargate profile will effectively be "serverless": that is, for applications deployed to that namespace, EKS's underlying implementation will ensure the provisioning and deprovisioning of nodes to serve those applications' workloads.

In this example, we will Fargate-enable the default namespace, to which we deploy the backend application itself, as illustrated by the following diagram. We begin with an eksctl command to create the EKS cluster with the appropriate Fargate profiles, as follows:
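A minimal sketch of such an eksctl configuration, where the cluster name, region, instance type, and profile names are assumptions; the shape matches what we observe later in the post (one EC2 node for the Kong Gateway, plus Fargate profiles covering the default and kube-system namespaces):

    # cluster.yaml: a hedged sketch; names, region, and instance type
    # are assumptions, not the post's exact values.
    cat <<EOF > cluster.yaml
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: kong-fargate       # assumed cluster name
      region: us-east-1        # must be a Fargate-on-EKS region
    nodeGroups:
      - name: kong-ng          # the single EC2 node that will run Kong
        instanceType: m5.large
        desiredCapacity: 1
    fargateProfiles:
      - name: fp-default       # makes these namespaces "serverless"
        selectors:
          - namespace: default      # the backend application
          - namespace: kube-system  # so CoreDNS can run on Fargate
    EOF
    eksctl create cluster -f cluster.yaml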

This command will take a good 10+ minutes to complete, but once it does, you will have an EKS cluster with the appropriate Fargate profiles. Note that other than the single node we create for running the Kong Gateway itself, the cluster doesn't actually have any nodes. This is pretty cool, and it's because we will be relying on Fargate to take care of provisioning the nodes for the backend applications. Let's now proceed to deploy Kong for K8s, exactly as we did in the previous blog post:
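A minimal sketch of that deployment, assuming the same all-in-one DB-less manifest used in the previous post (the path reflects the Kong repository layout at the time of writing):

    # Deploy Kong for K8s (ingress controller + gateway) in one shot
    kubectl apply -f https://raw.githubusercontent.com/Kong/kubernetes-ingress-controller/master/deploy/single/all-in-one-dbless.yaml

    # Watch the proxy service until AWS assigns it an external address
    kubectl get service kong-proxy -n kong -w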

This time, we get an AWS external address for the Kong for K8s proxy gateway service, since we are deploying on AWS rather than on our own Minikube as in the previous blog post.

Let's now create the backing sample Hello World application (and its API), as well as the Kong proxy configuration, exactly as we did in the previous blog post:
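The condensed sketch below stands in for those manifests; the container image, route path, and names are assumptions, and the CPU request is included because the Horizontal Pod Autoscaler we add later scales on CPU utilization relative to requests. We finish by listing the cluster's nodes:

    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hello-api
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: hello-api
      template:
        metadata:
          labels:
            app: hello-api
        spec:
          containers:
          - name: hello-api
            image: hashicorp/http-echo    # stand-in image; an assumption
            args: ["-text=Hello World", "-listen=:8080"]
            ports:
            - containerPort: 8080
            resources:
              requests:
                cpu: 100m                 # required for CPU-based autoscaling
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: hello-api
    spec:
      selector:
        app: hello-api
      ports:
      - port: 80
        targetPort: 8080
    ---
    apiVersion: extensions/v1beta1        # the Ingress API version of that era
    kind: Ingress
    metadata:
      name: hello-api
    spec:
      rules:
      - http:
          paths:
          - path: /hello
            backend:
              serviceName: hello-api
              servicePort: 80
    EOF

    # List the nodes the cluster now has
    kubectl get nodes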

The node list response here is interesting. If we run 'kubectl describe node' on each of the nodes returned, we will see that they are running the following non-kube-system pods:

  • ip-192-168-28-137.ec2.internal — Runs the Kong Gateway with the Kong for K8s controller, on a standard EC2 server node.
  • fargate-ip-192-168-112-187.ec2.internal — Runs the hello-api backend app on a Fargate (i.e. elastic) node.
  • fargate-ip-192-168-118-71.ec2.internal and fargate-ip-192-168-95-242.ec2.internal — Run CoreDNS. EKS apparently needs to have CoreDNS as the Kubernetes DNS server in order for Fargate profiles to function, and here the CoreDNS processes are running on Fargate (i.e. elastic) nodes.

In the first two nodes above, we see that the Kong API gateway deployment is up and running on a standard EC2 node, while the backend application is running on a Fargate node. We should now be able to start hitting the API endpoint on the proxy, using the same consumer credentials as in the previous blog post, but with the AWS external address of the kong-proxy service, as follows.
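A sketch of that request, assuming the /hello route and a key-auth credential carried over from the previous post ('apikey' is the key-auth plugin's default header name; the key value itself is an assumption):

    # Resolve the kong-proxy service's external (ELB) address
    PROXY=$(kubectl get service kong-proxy -n kong \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

    # Call the API through Kong with the consumer's API key
    curl -i http://$PROXY/hello -H 'apikey: my-api-key'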

Great, the API is working as in our previous blog, except that we had to go through many more steps to get there… So why is this worth it? The answer is that we can now leverage the Fargate scheduler to scale the backend application to the *optimal* amount of backend compute resources as requests to the application come in. This works because for each new pod of the hello-api backend, Fargate spins up a new virtual node, so a pod is mapped 1:1 to a Fargate node (for a great description of how Fargate works, see this article). To make this work well, then, we need to make sure that we scale out pods to serve traffic needs, and we can do that with Kubernetes' Horizontal Pod Autoscaler (HPA) capabilities, as follows:
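A minimal sketch using 'kubectl autoscale'; the 80% CPU target matches the description below, while the replica bounds are assumptions:

    # Scale hello-api between 1 and 10 pods, targeting 80% CPU utilization
    kubectl autoscale deployment hello-api --cpu-percent=80 --min=1 --max=10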

The command above tells Kubernetes to create an extra pod whenever the average CPU utilization of the hello-api pods (relative to their requested CPU) goes above 80%. On a Fargate cluster, this means that a new node is created to run that pod, and the node is automatically deprovisioned when the CPU load drops back below the 80% threshold we specified.
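To see this behavior, one can watch the autoscaler and the node list while sending traffic to the API, for example:

    # In one terminal: watch the autoscaler's observed load and replica count
    kubectl get hpa hello-api -w

    # In another: watch Fargate nodes appear and disappear with the pods
    kubectl get nodes -w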

So the combination of an HPA and Fargate allows us to have an environment where we do not have to worry about the lifecycle of the nodes underlying our cluster in order to satisfy our workload needs, nor about the horizontal scaling of the backend applications that those nodes serve based on their APIs' traffic. That is, both the nodes and the application instances expand and contract to serve our traffic needs, thus satisfying the definition of "serverless" that we set at the onset of this blog. Pretty cool!

Let's get back to that key phrase we used at the onset: "properly and optimally serve the incoming HTTP application request traffic". We are able to do this thanks to the Kong for K8s gateway, because without it, our application's API would be exposed for use by any party at any traffic rate. With Kong in front, we are able to restrict that traffic to specific client applications (through the KongConsumer CRD) and apply appropriate access policies (through the KongPlugin CRD and the rate-limiting and AuthN plugins we configured in the previous blog). This means that our backend application serves known client traffic only, and only according to its allowed traffic quotas, thanks to Kong for K8s, while leveraging just-in-time provisioned servers thanks to Fargate and Kubernetes!
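For reference, here is a minimal sketch of those two CRDs, assuming the key-auth and rate-limiting setup of the previous post; all names and the quota value are assumptions:

    cat <<EOF | kubectl apply -f -
    apiVersion: configuration.konghq.com/v1
    kind: KongPlugin
    metadata:
      name: rate-limit
    plugin: rate-limiting
    config:
      minute: 5              # assumed per-consumer quota
      policy: local
    ---
    apiVersion: configuration.konghq.com/v1
    kind: KongConsumer
    metadata:
      name: demo-consumer    # assumed consumer name
    username: demo-consumer
    EOF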

Bonus Exercise

Just as we deployed the hello-api backend app to a Fargate profile, it should be possible to do the same with Kong for K8s itself, and thus have both "serverless" backend apps and a "serverless" gateway. If anyone tries that out, let me know how it goes!

PS — The headline image for this blog entry is a picture of the nacreous clouds that I recently had the chance to experience in southeastern Iceland.
