Cost-effective Kubernetes on Google Cloud

June Rhodes
Oct 25, 2016


Here at Redpoint, we’re building out the infrastructure we need to support Minute of Mayhem, a multiplayer online first person shooter.

It’s extremely important to us to get our infrastructure right; we don’t want to be woken up at 3am because our service provider is down and players can’t access our game. We also don’t want to put ourselves in a position where a successful game launch exponentially increases our service costs beyond what we can bear.

Our current cluster configuration in Google Cloud

To this end, we’ve been building out an online multiplayer-as-a-service solution we call Hive, built on Google Cloud. We want to provide not only ourselves, but others, with a cost-effective, highly scalable solution for building multiplayer games. We’ll be announcing more details about Hive as it approaches a public alpha.

As we’ve been building out these services, we’ve found that there are some steps you can take to drastically reduce the cost of running microservices on Google Container Engine, particularly around the use of load balancers, SSL and ingress rules.

The first iteration

When we started building out Hive, we only had a few services, namely:

  • The temporary session service, which offers temporary sessions for testing purposes
  • The NAT punchthrough service, which allows clients to perform NAT punchthrough and retrieve the external addresses of other players in a game, and
  • The game lobbies service, which manages the creation and membership of lobbies for your game

When you have so few services, you can afford to use Kubernetes a little inefficiently; after all, the pricing of Google Cloud Load Balancers means you pay the same for load balancing until you exceed 5 global forwarding rules. Beyond this, however, things start to get expensive.

Our initial infrastructure on Google Cloud looked a little like this, with each microservice having its own inbound load balancer IP address:

Our initial solution for routing traffic to containers on Container Engine

As you can see, this already has an economic scalability problem: each microservice has its own load balancer, and beyond 5 load balancers, Google charges $7.20 a month for each inbound forwarding rule. If you’re planning on having lots of microservices, this cost adds up quickly, and it doesn’t make a whole lot of sense to end up in a position where your load balancing costs exceed the cost of your Compute Engine instances.

We started with this initial architecture primarily because Kubernetes makes it so easy to end up here, through the use of --load-balancer-ip, an argument you can pass when creating or updating a service on Kubernetes.
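To illustrate, a service exposed this way looks something like the sketch below. The service name and IP address are hypothetical, not our actual manifests; each Service of type LoadBalancer provisions its own forwarding rule and external IP.

```yaml
# Hypothetical example: exposing a single microservice with its own
# load balancer. Every Service of type LoadBalancer like this one gets
# a separate forwarding rule (and external IP) in Google Cloud.
apiVersion: v1
kind: Service
metadata:
  name: game-lobbies
spec:
  type: LoadBalancer
  # Equivalent to passing --load-balancer-ip when exposing the service.
  loadBalancerIP: 203.0.113.10
  selector:
    app: game-lobbies
  ports:
    - port: 443
      targetPort: 8443
```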

An alternative to --load-balancer-ip

Kubernetes offers an alternative to this argument, called ingresses. Ingresses allow you to specify configuration for an inbound load balancer separately from the services that your cluster is offering. This allows you to route all of the inbound traffic to your cluster through a single Google Cloud Load Balancer, drastically reducing costs.
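As a rough sketch of what this looks like (using the extensions/v1beta1 Ingress API that was current at the time; the host names and service names below are hypothetical), a single ingress can fan traffic out to several services by host:

```yaml
# Sketch: one ingress routing several microservices through a single
# Google Cloud load balancer, keyed on host name.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: hive-services
spec:
  rules:
    - host: sessions.example.com
      http:
        paths:
          - backend:
              serviceName: temporary-sessions
              servicePort: 80
    - host: nat.example.com
      http:
        paths:
          - backend:
              serviceName: nat-punchthrough
              servicePort: 80
    - host: lobbies.example.com
      http:
        paths:
          - backend:
              serviceName: game-lobbies
              servicePort: 80
```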

If you’re offering your services over SSL (and you should be), you’ll quickly notice that Google Cloud Load Balancers don’t support SNI. This means you can’t serve a different SSL certificate for each microservice through a single load balancer. If wildcard certificates are out of your price range, you’ll need to find another solution, as we did.

Fortunately for us, someone else has already built a solution, called Kube Lego. This is a pod you can deploy to your Kubernetes cluster; it monitors the ingresses in your cluster and automatically configures the inbound load balancer for Let’s Encrypt’s verification, issues certificates, and enables SSL with the new certificate. It also automatically handles renewal with Let’s Encrypt.
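To opt an ingress in to this, you annotate it with kubernetes.io/tls-acme and give it a tls section naming the hosts and the secret the certificate should be stored in. A minimal sketch, again with hypothetical hosts and names:

```yaml
# Sketch: an ingress that kube-lego will pick up. The annotation tells
# kube-lego to request a Let's Encrypt certificate for the listed hosts
# and store it in the named secret, which it creates for you.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: hive-services
  annotations:
    kubernetes.io/tls-acme: "true"
spec:
  tls:
    - hosts:
        - sessions.example.com
        - lobbies.example.com
      secretName: hive-services-tls
  rules:
    - host: sessions.example.com
      http:
        paths:
          - backend:
              serviceName: temporary-sessions
              servicePort: 80
```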

The deployment YAML we use for Kube Lego looks roughly like the example below.
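This sketch follows the kube-lego project’s documented configuration rather than being a verbatim copy of our file; the email address, namespace and image tag are placeholders you would substitute with your own values.

```yaml
# Representative kube-lego deployment. The email address, namespace and
# image tag are placeholders. LEGO_URL points at the Let's Encrypt
# production ACME endpoint.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-lego
  namespace: kube-lego
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kube-lego
    spec:
      containers:
        - name: kube-lego
          image: jetstack/kube-lego:0.1.5
          ports:
            - containerPort: 8080
          env:
            - name: LEGO_EMAIL
              value: "admin@example.com"
            - name: LEGO_URL
              value: "https://acme-v01.api.letsencrypt.org/directory"
            - name: LEGO_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: LEGO_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
```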

So with Kube Lego we were able to vastly simplify our inbound routing to the following model:

After deploying Kube Lego to our Kubernetes clusters

Now we’ve only got two load balancers, and we can route as many microservices as we want through the same load balancer. Done, right? Well, not quite, due to an oddity in the way Google prices load balancers.

One project to rule them all

Remember when we said that Google charges a flat rate for the first 5 load balancers?

Source: https://cloud.google.com/compute/pricing#lb

Well, if you’re like me, you read this and assumed that these “first 5 forwarding rules” are counted per billing account. That is, the fact that we have a load balancer in each project doesn’t matter, and we should be charged $18 for both of them.

But this is not correct. Google’s minimum service charge for the first 5 load balancers applies per project. So the fact that we have an entirely separate staging environment with its own load balancer means we get billed twice; $36 instead of our intended $18.

This puts us in a bit of an awkward position. We’d really like to keep these two environments fully separate, but our instance costs in the development/staging environment are just $7.70, so it doesn’t make sense to pay $18 a month for load balancing there, more than twice what the instances themselves cost (we use preemptible instances in staging, at the cost of reduced uptime for that environment). We can’t move just the load balancer either, since Kubernetes ingress routing will always configure the load balancer in the same project as the cluster.

So to reduce costs, we need to rearrange our model again:

One project for all compute resources, and a separate project for staging data storage

This isn’t ideal for obvious reasons; we now have our development cluster residing in the same project as production, but unless Google changes their pricing model around load balancers, this is how it has to be.

Summary

We’re currently running 10 front-facing microservices in production, with more planned as we expand the services that Hive offers. By taking the steps I’ve outlined above, we’ve managed to reduce our load balancing costs on Google Cloud by 83%, down from the $108 a month we would otherwise be paying to just $18.

So to recap, when you’re setting up microservices using Kubernetes on Google Cloud, keep the following things in mind:

  • Use a single ingress configuration and avoid --load-balancer-ip.
  • Use an SSL certificate with multiple domains to work around the lack of SNI support. Kube Lego is a great tool to help you do this, and it automatically manages certificate renewal for you as well.
  • Combine your compute resources into a single project to avoid being billed the load balancer minimum service charge multiple times.

June Rhodes is the technical director at Redpoint Games. She is currently working on building out Hive to support the development of Minute of Mayhem. You can follow Redpoint Games on Twitter and Facebook for future updates.
