Over the last 4–5 years we have all learned a lot about containers. Technologists have embraced their portability, their inherent security benefits, and, best of all, the ability to spin up, tear down, and scale that container orchestration brings. What could possibly be better than this?
Before I answer this question, let me provide some background. My pandemic hobby was building a web platform for restaurants to create QR codes. I very much wanted to help with the transition to contactless menus, since the restaurant industry’s inherent problems with scaling have always made me anxious. I’d almost always rather order food online and pay by phone than wait for someone to serve me. For now, let’s examine this app:
The design consisted of a React front end, a Node.js middleware API server, and MongoDB/Firestore backend databases. The architecture I started out with was Kubernetes-based. I wanted an architecture that would let me embrace open source technologies while providing the means to scale easily if the product took off. I built my CI/CD workflow to automatically publish new containers to my cluster, generate SSL certificates, and even host my own MongoDB. Latency was low, automation was high, things worked great!
However, one problem remained: I had no customers. The demo system cost me roughly $100 a month to run, which, while not much money, was a bit annoying when I had zero users. This caused me to rethink how my setup was architected, and if you’re skimming this article, here’s where it gets good. My app really only needed to load quickly in very specific time windows. It was designed for restaurants, which have very common hours of operation and predictable peak traffic periods. There was little reason for my app to have low latency at, say, 4 AM. This led me to a new technology that I strongly believe the world will be using in the near future to align application delivery with user demand cycles under a predictable cost structure. This technology is called Cloud Run, and it is based on Knative, the open source framework for serverless applications that Google has pioneered.
Cloud Run allowed me to wake up my container when a request arrived at my site’s URL. The container then had a configurable lifespan: it would stay up for a set period without being used before scaling back down to 0. No longer did I need to have something always running. The first request to my site after a period of inactivity might experience a few seconds of latency, but subsequent requests would be answered normally by the same container. If traffic grew instead, the service would scale horizontally, just like a regular Kubernetes deployment with horizontal autoscaling enabled. Using the hosted version of Cloud Run reduced my operating costs by ~90%, which was perfect for the bootstrap mode I was in.
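To make the wake-up and scale-down cycle concrete, here is a minimal sketch in Python. It is a toy model of the behavior described above, not Cloud Run's actual autoscaler or billing logic. The function name, the `Summary` type, the request times, and the 300-second idle window are all assumptions made for illustration, and the model ignores cold-start duration and horizontal scale-out.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    cold_starts: int        # requests that had to wake the container from zero
    warm_requests: int      # requests served by an already-running container
    uptime_seconds: float   # total time the container spent running


def simulate_scale_to_zero(request_times, idle_timeout):
    """Toy model of one container that starts on demand and stops after
    `idle_timeout` seconds without traffic.

    `request_times` are seconds since midnight, sorted ascending (assumed).
    """
    cold = warm = 0
    uptime = 0.0
    started_at = None      # when the current container run began
    last_request = None    # time of the most recent request
    for t in request_times:
        if last_request is None or t - last_request > idle_timeout:
            # The container was at zero: close out the previous run, if any,
            # and start a new one. This request pays the cold-start penalty.
            if started_at is not None:
                uptime += (last_request + idle_timeout) - started_at
            started_at = t
            cold += 1
        else:
            warm += 1
        last_request = t
    if started_at is not None:
        # The final run ends once the idle window after the last request expires.
        uptime += (last_request + idle_timeout) - started_at
    return Summary(cold, warm, uptime)


# Hypothetical day: a short lunch burst, then a short dinner burst.
day = simulate_scale_to_zero([0, 60, 120, 20000, 20030], idle_timeout=300)
print(day)  # Summary(cold_starts=2, warm_requests=3, uptime_seconds=750.0)
print(f"share of the day running: {day.uptime_seconds / 86400:.1%}")
```

In this made-up day only the first request of each burst is a cold start, and the container runs for 750 seconds instead of the full 86,400. Real savings depend on your traffic pattern and on how the platform bills, so treat the numbers as illustrative only.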
It Gets Even More Interesting in Enterprise Environments
As a Product Manager in the Kubernetes space, and previously a Sales Engineer in that same space for many years, I know that Lambda (or Cloud/Azure Functions) sprawl is a problem. Aside from the obvious CI/CD challenges a Lambda-style pipeline creates, there is the issue of unpredictable cost. Companies rarely know ahead of time how much their serverless functions are going to cost, and depending on the service, serverless may not even be cheaper than the alternative. The solution to these problems is to deploy serverless applications, or containers, in Kubernetes clusters you are already paying for. This is the future of application delivery.
Cloud Run can run in your GKE Kubernetes cluster. This means you can bound the size of the cluster to fix your costs while using the extra compute capacity you are already paying for to host serverless applications, or functions, built as containers in Cloud Run. Your deployment specs will be similar to the Kubernetes specs you already use, with the added benefit that inactive pods scale to 0. You can wake these pods up via a URL webhook or a message bus trigger, with more methods under development.
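As a rough illustration of how close such a spec is to a regular Deployment, here is a sketch that builds a Knative Serving `Service` manifest as a Python dict and prints it as JSON. The service name, image path, and scale limits are hypothetical. The annotation keys are the Knative autoscaling annotations as I understand them, so verify them against the version you run.

```python
import json


def knative_service(name, image, min_scale=0, max_scale=5):
    """Build a Knative Serving Service manifest as a plain dict.

    min_scale=0 lets the service scale to zero when idle; max_scale caps
    horizontal scale-out. Both values are illustrative defaults.
    """
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Annotation values must be strings.
                        "autoscaling.knative.dev/minScale": str(min_scale),
                        "autoscaling.knative.dev/maxScale": str(max_scale),
                    }
                },
                # Same container spec shape as in a Deployment pod template.
                "spec": {"containers": [{"image": image}]},
            }
        },
    }


# Hypothetical service name and image path.
manifest = knative_service("menu-api", "gcr.io/example-project/menu-api:latest")
print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, the printed output could be applied directly to a cluster with Knative installed. Calling the function with `min_scale=1` would keep one pod warm at all times, trading the cost savings for the removal of cold starts.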
Still Not Convinced?
Here are 4 takeaways:
- Cloud Run allows containers to scale to 0, and because it runs ordinary containers you can use your existing container-based CI/CD pipelines and DevOps/DevSecOps toolsets to get started.
- If you use Cloud Run in your own K8s cluster, you can take advantage of existing capacity and governance, not to mention chargeback models.
- Cloud Run can be deployed in GKE On-Prem, GCP, AWS, or Azure via Anthos, giving you serverless capabilities in and out of the cloud.
- You can shift to a non-serverless paradigm in the same Kubernetes cluster without rearchitecting your app should your requirements change.
We are in a race to reduce the marginal cost of service delivery. Everything we have been working towards is meant to optimize and automate. The concept of turning off applications and waking them up when they are needed WITH a predictable and governed cost model is, I feel, the next phase of application delivery. Don’t let your delivery decision be your next legacy bottleneck.
Cloud Run is available as a managed service, as part of your GKE cluster on GCP, or on AWS and Azure in Q4 2021. If you are standardizing on Kubernetes, it’s time to standardize on a serverless application deployment framework.