Cloud infrastructure and hosting solutions are evolving at a rapid pace. While it is still common for small applications to live on a single isolated server (physical or virtual), that’s no longer the case for applications requiring high performance, reliability or scalability.
This article walks you through the various solutions you’ll see in the wild, starting at the bottom and working up. To make the options easy to compare, I’ve rated them all according to the legend below:
⚠️= Single point of failure
🏗= Requires infrastructure configuration knowledge
🐧= Requires operating system knowledge
🐛= Requires application runtime knowledge
🐢= Scales slowly (manual involvement)
💰= Costs at scale (1 to 3, with 3 being the worst)
🚀= Performance at scale (1 to 3, with 3 being the best)
The simplest setup is a single instance: one physical server or virtual machine that houses the application code, the filesystem, the database, and any other required services.
Single instances are great for testing, and it’s how we generally develop applications locally (all dependencies loaded on a developer’s laptop). But in production, they can fall over quite quickly. Scaling involves throwing more physical or virtual hardware at the instance, and you typically end up paying for the worst-case scenario.
The biggest weakness with this setup is that all your eggs are in one basket. Should any of your services fail, so will your entire application.
When your beloved application is in production, the last thing you want is a single service failing and taking down everything with it. To avoid this problem of a single point of failure, we need redundancy.
By having more than one instance set up, we can split traffic between them using a load balancer. If one of the instances fails, traffic can continue flowing to the remaining instance(s), hopefully giving your team enough time to bring the failed instance(s) back online.
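To make the idea concrete, here is a minimal sketch of round-robin load balancing that skips failed instances. It is an illustration only, not how any particular load balancer is implemented; the class name, the instance names (`app-1` and so on) and the `mark_down` / `mark_up` methods are all hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out healthy instances in rotation (illustrative only)."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._rotation = cycle(self.instances)

    def mark_down(self, instance):
        # Called when a health check fails (hypothetical hook).
        self.healthy.discard(instance)

    def mark_up(self, instance):
        # Called once the team has brought the instance back online.
        if instance in self.instances:
            self.healthy.add(instance)

    def next_instance(self):
        # Try each instance at most once per call, skipping failed ones.
        for _ in range(len(self.instances)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
balancer.mark_down("app-2")  # simulate one instance failing
served_by = [balancer.next_instance() for _ in range(4)]
print(served_by)  # ['app-1', 'app-3', 'app-1', 'app-3']
```

Traffic keeps flowing to the two remaining instances while `app-2` is down. If every instance fails, the sketch raises an error, which is the single-instance problem all over again.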
Generally, the more instances you have behind a load balancer, the more traffic your application can handle reliably. However, you still need to plan for the worst-case scenario: if a sudden influx of traffic occurs and your instances don’t have the resources to handle it, you’re no better off than with the single instance setup.
A key point to remember with this setup is that your application now lives on multiple separate instances. Typically, we host the database and any other shared services on their own instance so that every application instance connects to the same ones. We then extend the idea of load balancing to our database and service instances to avoid the single point of failure problem with those as well.
Here’s where we start seeing green trees and hearing waterfalls and chirping birds. Auto-scaling instances follow the same general principles as load-balanced instances, with the primary difference being the lack of manual involvement required when it comes to scaling.
With auto-scaling instances, we generally set up a single instance as a template with enough resources to reliably handle a low volume of traffic. This template is then used by the underlying infrastructure as a cookie-cutter to stamp out new instances when they’re required.
For example, if our application uses a fair bit of memory (RAM), and we know it generally slows down significantly or fails when there’s less than 15% left, we’ll monitor that metric and make sure the infrastructure spins up new instances before we hit this threshold.
The reverse is true as well: when we’ve got an abundance of resources that aren’t necessary for the current level of traffic or workload, we remove instances. This helps to save on costs, as you’re typically only paying for the resources you use, not the worst-case scenario.
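The scale-up and scale-down rule described above can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not any provider’s actual policy engine: the function name, the 25% and 60% trigger values, and the instance limits are all hypothetical. The scale-up trigger sits above the 15% danger point so that new instances arrive before the application starts to struggle.

```python
def desired_instance_count(
    current,
    free_memory_pct,
    *,
    scale_up_below=25.0,    # assumed trigger, safely above the 15% danger point
    scale_down_above=60.0,  # assumed "abundance of resources" level
    minimum=1,
    maximum=10,
):
    """Return how many instances we want, given average free memory (%)."""
    if free_memory_pct < scale_up_below:
        # Running low on memory: add an instance, up to the allowed maximum.
        return min(current + 1, maximum)
    if free_memory_pct > scale_down_above:
        # Plenty of headroom: remove an instance, but never go below the minimum.
        return max(current - 1, minimum)
    # Comfortable range: leave the fleet as it is.
    return current


print(desired_instance_count(2, 20.0))  # 3 (scale up)
print(desired_instance_count(2, 40.0))  # 2 (no change)
print(desired_instance_count(2, 80.0))  # 1 (scale down)
```

Leaving a gap between the two triggers avoids the fleet flapping up and down when the metric hovers around a single threshold.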
An added benefit of auto-scaling is that your system also becomes self-healing. Should any of your instances fail, the infrastructure will remove these and spin up new ones in their place.
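Self-healing can be pictured as a loop that compares the fleet we have with the fleet we want. The sketch below is a toy model, assuming a hypothetical `launch_from_template` function that stands in for the infrastructure stamping out a new instance from the template; real platforms do this for you.

```python
from itertools import count

_next_id = count(4)  # hypothetical naming scheme: app-4, app-5, ...


def launch_from_template():
    """Stand-in for the infrastructure creating an instance from the template."""
    return f"app-{next(_next_id)}"


def reconcile(instances, desired_count, launch):
    """Drop failed instances and launch replacements until the fleet is full.

    `instances` maps an instance name to True (healthy) or False (failed).
    """
    fleet = [name for name, healthy in instances.items() if healthy]
    while len(fleet) < desired_count:
        fleet.append(launch())
    return fleet


fleet = reconcile(
    {"app-1": True, "app-2": False, "app-3": True},
    desired_count=3,
    launch=launch_from_template,
)
print(fleet)  # ['app-1', 'app-3', 'app-4']
```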
While auto-scaling is great, there’s still a level of operating system (OS) and application runtime knowledge that’s required to ensure your instance templates are set up correctly. It also requires regular OS, runtime and performance configuration updates.
Managed auto-scaling isn’t a silver bullet, as you’re still required to intimately understand your application’s behaviour and resource consumption. Without carefully dialling in which metrics to monitor and when to trigger scale-up and scale-down events, your applications may still fail or become unresponsive during unpredictable traffic or workload spikes.
Serverless infrastructure automatically takes care of the OS, the runtime AND the scaling aspect! All you provide is your application source code, some basic configuration and it’ll handle the rest. Serverless scales up and down incredibly quickly and you typically only pay for the resources you use (down to the second in most cases).
At a very high level, when a request comes in, your app is spun up if it isn’t already running (for example, because it’s been a little while since the last request). It then handles the request and returns a response. This setup can handle an immense amount of traffic because you’re never funnelling it all to a single instance or a fixed group of instances, but rather to a very large shared pool of resources and processing power managed by the provider.
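The request flow above can be sketched with a plain function. This is a simplified model rather than any provider’s real API: the `handler(event)` signature, the `_boot_application` helper and the `cold_start` field are assumptions made for illustration. The key idea is that expensive start-up work happens only when the app has to be spun up, and is reused while the worker stays warm.

```python
import time

_app = None  # module-level state survives between requests on a warm worker


def _boot_application():
    """Stand-in for expensive start-up work (loading config, frameworks, etc.)."""
    return {"booted_at": time.monotonic()}


def handler(event):
    """Handle one request; boot the app first if this worker is cold."""
    global _app
    cold_start = _app is None
    if cold_start:
        _app = _boot_application()
    name = event.get("name", "world")
    return {"status": 200, "body": f"Hello, {name}", "cold_start": cold_start}


first = handler({"name": "Ada"})  # first request: the app is spun up
second = handler({})              # next request reuses the warm app
print(first["cold_start"], second["cold_start"])  # True False
```

In a real serverless platform, many copies of this function can run side by side, which is what lets the provider absorb spikes without you managing instances.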
There are a few considerations to be aware of with serverless technology. First of all, the programming paradigm is a little different to traditional setups, and currently not all major frameworks support serverless out of the box. There is a bit of infrastructure/DevOps knowledge required to do the initial setup, but not much else is required in terms of maintenance and upkeep from an infrastructure perspective, which is great.
If we were to imagine a somewhat perfect world, managed serverless solutions would be the fabric it is made of. Providers such as Zeit Now or Laravel Vapor, for example, completely take care of the infrastructure setup and deployment of serverless applications; all you provide is the application source code. In terms of infrastructure management, it doesn’t get any better than this.
What we have with managed serverless is an ecosystem where developers can focus solely on what they’re good at: building applications. There’s no need for developers to have DevOps knowledge, or even to have humans involved in the process of spinning up infrastructure at all. While these are still important skills when you’re required to work directly with such infrastructure, they’re no longer a requirement for teams looking to build performant, highly scalable and cost-effective applications.
The last word
While it was once commonplace for web applications to live on single instances, we quickly learnt why these don’t quite hold up under today’s production workloads. Applications that need high performance, scalability and cost-effectiveness require something more. Auto-scaling solutions are the go-to, allowing you to remove manual intervention from scaling operations, introduce self-healing infrastructure and manage high-traffic workloads.
However, there’s a new kid on the block called Serverless, which removes the need to worry about servers, scaling metrics and all the nitty-gritty details of OS and runtime configuration. What you get with Serverless is an experience and workflow where teams can focus on building great applications, while the underlying infrastructure supports the high-traffic workloads you’d expect from world-class digital experiences.
So next time you’re looking to spin up a new project, ask about serverless solutions 👍