Let’s talk Cloud
1. What is Cloud technology?
Cloud technology abstracts away infrastructure management. What does that mean? It means that, as a business, you no longer have to deal with bare metal (think computer hardware) or anything related to it.
Before the advent of the cloud, you had to go to or call a hardware vendor to figure out which hardware brand to adopt, which model to buy, and which deal was best for you in terms of price, memory, CPU, etc.; lower-level stuff, in a nutshell. You also had to estimate your maximum expected load and plan for it before making any purchase. That meant buying a beefy machine and leaving most of its resources unused most of the time. And if your machines ran substantial workloads, you had to think about cooling, or your processors wouldn’t work properly. Furthermore, you had to think about hardware maintenance and facility security: you may have the best software security, but if your machines are not physically well secured, you are still at risk. How much time, money, and effort do you put into making sure your infrastructure meets the demand and your network works well? How do you manage your data, backups, and so on? You were probably doing all of these things, yet they had nothing to do with your core business and, to some degree, they were a pain in the neck. With the advent of the cloud, you have many more optimized options and solutions.
So what cloud technology does is take care of all the compute, network, and storage management and offer them as services. Computing has become a public utility: it’s easy to get the type of service you want on demand, and you only pay for what you use. This idea of a “computing utility”, or what we know as cloud computing, was suggested by John McCarthy when he spoke at the MIT Centennial in 1961:
“If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility. Each subscriber needs to pay only for the capacity he actually uses. The computer utility could become the basis of a new and important industry”.
By providing computing, storage, and networking as services, cloud technology enables businesses to focus on their core business and use their resources to optimize for what’s truly important to them, be it customer satisfaction, employee productivity, growth, or profit. It’s clear that cloud technology has an impact on business operations, or at least on IT operations. To take advantage of the features that modern cloud technology has to offer, IT organizations and IT departments will have to think differently about how they build applications. For example, cloud providers have made it possible, and very simple, to scale applications in order to handle heavy client traffic. But not all applications are scalable, and not all parts of an application need to scale at the same time and to the same level. To take advantage of cloud technology, software developers therefore have to consider some architectural changes. In this specific case, they have to build applications in a way that not only takes scalability into account, but also ensures that different services of the application can scale independently when necessary.
With the increasing adoption of cloud services, businesses are outsourcing the jobs of system administrators and/or DevOps engineers to cloud providers. One may think that systems administration and operations jobs will eventually be found solely in data centers, since businesses using cloud services won’t have to deal with infrastructure management: all they will need is software engineers to build software and write deployment code. In that sense, DevOps engineers are working tirelessly to build cloud infrastructure so that there won’t be DevOps jobs in enterprises in the future. This doesn’t mean DevOps jobs will go away; they may just move, perhaps entirely, to data centers. According to the Stack Overflow Developer Survey 2019, DevOps specialists are among the highest paid and most satisfied with their jobs. So draw your own conclusions.
It is undeniable that cloud providers have made provisioning and managing compute resources, as well as deploying apps, very simple. In the next sections, I focus on GCP.
2. GCP in a nutshell
Google Cloud Platform (GCP) is Google’s cloud offering. It’s a suite of products and services ranging from basic compute, storage, and networking to specialized services such as IoT, analytics, and AI. Depending on your needs, you may choose to manage the resources you provision yourself, or use fully managed services. Managed services are taken care of by the platform on your behalf, and they are redundant, thus highly available. That means management overhead, scalability, availability, and other concerns are all handled by GCP. The whole of GCP is built on top of Google’s global infrastructure, taking advantage of years of research, engineering, and testing that have gone into building it.

Let’s geek out a bit over some GCP services (barely scratching the surface).
3. Some GCP products and services
Deployment Manager
I find the concept of Deployment Manager very interesting. You declare in a config file the resources you want to create, you hand the file to the Google Cloud Deployment Manager, and that’s it; your resources or services are deployed. You can bring your entire infrastructure to life using a couple of lines of declarative code, hence the term Infrastructure as Code. If your deployment infrastructure gets more complex, you can create a template for each resource. For example, you can create a template for a network, a template for the firewall rules, a template for a virtual machine (VM), etc. A template can import other templates. Then you import all the templates into a single configuration file that the Deployment Manager will use to create all the resources you have defined in the various templates.
One of the important concepts of Deployment Manager, and that’s true for virtually all Infrastructure as Code services, is that you don’t give instructions about how your infrastructure should be built; the Deployment Manager figures that out itself. You just tell it what you want your infrastructure to look like, and it makes sure the state of your infrastructure matches your desired state.
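To make this concrete, here is a minimal sketch of what a Deployment Manager workflow might look like. The config file, resource names, zone, and image are all hypothetical placeholders, not taken from the article:

```shell
# Write a minimal Deployment Manager config declaring one VM
# (my-first-vm, europe-west1-c, and the Debian image are hypothetical choices)
cat > vm-deployment.yaml <<'EOF'
resources:
- name: my-first-vm
  type: compute.v1.instance
  properties:
    zone: europe-west1-c
    machineType: zones/europe-west1-c/machineTypes/n1-standard-1
    disks:
    - deviceName: boot
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-9
    networkInterfaces:
    - network: global/networks/default
EOF

# Hand the config to Deployment Manager; it figures out how to reach this state
gcloud deployment-manager deployments create my-deployment \
  --config vm-deployment.yaml

# Later, after editing the config, converge the deployment to the new state
gcloud deployment-manager deployments update my-deployment \
  --config vm-deployment.yaml
```

Notice that the config only describes the desired end state; the `update` command illustrates the point above that the tool reconciles the running infrastructure with whatever the file declares.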
With the Infrastructure as Code approach, you can afford to experiment with different solutions and collect usage information to make better decisions. You can easily tweak your deployment infrastructure to best suit your requirements. As a consequence, when your infrastructure evolves, you just update your deployment code to meet the demand and tell the Deployment Manager to make your infrastructure look like what you have described in the code.
Networking
Deploying a network on GCP is simpler than you might expect. GCP has greatly simplified the creation of a Virtual Private Cloud (VPC) network, that is, a private network that operates on top of Google’s global network infrastructure. It’s the same network that Google uses to deliver its services (think Google Search, Gmail, Google Drive, etc.) to billions of users worldwide. VPC networks on GCP are global resources. That means you can provision virtual machines in Europe and Asia and have all of them on the same private network.
When you create a project on GCP, a default VPC network is automatically created, with subnets in each region. If the topology of the default VPC network doesn’t meet your requirements, you can modify it, or delete it completely and recreate a custom network where you manually define the subnets. With just a couple of clicks, or a single command, you have a globally operational VPC network:
gcloud compute networks create my-network --subnet-mode=custom

Using the GCP console, you can create the VPC network and its subnets at the same time. With the command line, it’s done separately. To create a subnet, you specify the geographic region (American, European, or Asian regions) where you want it and its IP address range. A single command does it:
gcloud compute networks subnets create my-network-subnet-eu \
--network=my-network --region=europe-west1 \
--range=10.128.0.0/20

You can create firewall rules to allow or block network traffic based on various protocols and ports, to or from certain IP address ranges. You can also create firewall rules that apply to specific machine instances by using network tags. And you can enable the logging of network traffic by just toggling a button.
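As a sketch, here is what such firewall rules might look like on the custom network created above; the rule names and the `web` tag are hypothetical:

```shell
# Allow inbound SSH from anywhere into my-network
# (rule names are hypothetical)
gcloud compute firewall-rules create my-network-allow-ssh \
  --network=my-network --direction=INGRESS --action=ALLOW \
  --rules=tcp:22 --source-ranges=0.0.0.0/0

# Allow HTTP, but only to instances carrying the "web" network tag
gcloud compute firewall-rules create my-network-allow-http \
  --network=my-network --direction=INGRESS --action=ALLOW \
  --rules=tcp:80 --source-ranges=0.0.0.0/0 --target-tags=web
```

The `--target-tags` flag is what ties a rule to specific instances: any VM created with the tag `web` picks up the second rule automatically.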
Custom machine type
The same simplicity is present when you are creating virtual machine (VM) instances. You choose the machine type that best suits your workload, whether it is memory-intensive, CPU-intensive, or standard. The thing with machine types is that they are predefined. That is, the number of vCPUs (virtual CPUs) and the amount of memory of a given machine type are fixed. For example, let’s say your workload is CPU-intensive and you need 20 vCPUs to accommodate it. Now guess what? There is no predefined machine type with 20 vCPUs, so you’d have to go with a predefined machine type that has 32 vCPUs. You’d be wasting money paying for 12 additional vCPUs that are not going to be used. That looks bad. Well, I should say it looked bad, because on GCP the problem is solved with custom machine types. If none of the predefined machine types properly suits your workload, you can configure your own machine type. While creating the VM, you just specify the number of vCPUs and the memory size that you want your VM to have. In the GCP console, you just input these numbers. The command to create a VM that has 20 vCPUs and 30 GB of memory is the following:
gcloud compute instances create my-custom-vm --custom-cpu 20 \
--custom-memory 30 --zone europe-west1-c

Other major cloud providers try to solve this problem by providing hundreds of machine types.
You can resize your VM whenever you want. You just stop the instance, specify the new memory size and number of vCPUs, then restart it. There are many other properties that you can specify when creating a VM, such as the boot disk type, boot disk size, boot disk image, startup script, and tags that can be used for firewall rules. Together they define the VM’s configuration.
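The stop/resize/start cycle might look like this, assuming the custom VM created earlier; the new machine shape (24 vCPUs, 36 GB) is an arbitrary example:

```shell
# Stop the instance before resizing it
gcloud compute instances stop my-custom-vm --zone europe-west1-c

# Set a new custom shape: custom-VCPUS-MEMORY_MB, here 24 vCPUs and 36 GB
gcloud compute instances set-machine-type my-custom-vm \
  --zone europe-west1-c --machine-type custom-24-36864

# Restart the instance with its new shape
gcloud compute instances start my-custom-vm --zone europe-west1-c
```

The memory in the custom machine-type string is expressed in MB (36 GB = 36864 MB).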
Managed Instance Group
Now let’s say that, in general, a single VM is able to handle all your workloads. But sometimes traffic spikes and additional VMs are needed to share the load. What should you do? Should you create more VMs and wait for the spike, thereby wasting money on idle instances? No. Should you wait until your single running VM reaches its capacity limit and gets knocked out by the traffic before spinning up another VM, thereby causing unnecessary disruption to your business? No. You should instead use managed instance groups (MIGs) with autoscaling and load balancing enabled.
Essentially, MIGs are groups of identical VM instances created from the same instance template and managed by GCP. MIGs can automatically scale up or down the number of instances in a group depending on the workload size and distribute incoming network traffic across the instances. Managed instance groups also offer autohealing. Based on health checks, new instances can be provisioned to replace unresponsive instances. If an instance goes down for some reason, another one is automatically created to replace it.
To create a managed instance group, you need an instance template. An instance template is a specification of a VM configuration. Since the VMs of an instance group have to be identical, they are created with the same properties, as specified in the instance template. Here comes the best part. When you create a managed instance group, GCP manages everything for you once you’ve specified how you want the group to behave. You do that by configuring an autoscaling policy and a target utilization level. An autoscaling policy is basically a load metric: it can be the average CPU utilization, a monitoring metric that you define, or the maximum requests per second per instance. The target utilization level is the load level at which you want to maintain the VM instances of your managed instance group. For example, let’s say your autoscaling policy is based on average CPU utilization and you set the target utilization level at 80%. That means instances will be added to or removed from the managed instance group to keep the average CPU utilization around 80%. The maximum and minimum numbers of instances in a MIG are specified at its creation.
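Putting the pieces together, a template, a group, and an autoscaling policy might be created like this; all resource names, sizes, and limits are hypothetical:

```shell
# An instance template captures the VM configuration the group will clone
gcloud compute instance-templates create my-template \
  --machine-type n1-standard-1 --tags web

# Create a managed instance group of identical VMs from the template
gcloud compute instance-groups managed create my-mig \
  --template my-template --size 2 --zone europe-west1-c

# Autoscale on average CPU utilization, targeting 80%,
# with the group size bounded between 2 and 10 instances
gcloud compute instance-groups managed set-autoscaling my-mig \
  --zone europe-west1-c --min-num-replicas 2 --max-num-replicas 10 \
  --target-cpu-utilization 0.8
```

Note that `--target-cpu-utilization` takes a fraction (0.8, not 80), and the min/max replica counts are the bounds mentioned above that must be set when autoscaling is configured.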
Load balancers
Managed instance groups consist of many VM instances with different IP addresses, which are generally ephemeral, so there is a need for a single, stable IP address that user requests can reach. We also need to distribute traffic across the instances of the managed instance group. That’s where load balancers come in. They provide not only a stable access point, a static public IP address that serves as a frontend for all VM instances in the instance group, but also a system to distribute network traffic among the instances.
GCP has many types of load balancing. Choosing the right load balancer is not as straightforward as creating a VM. You need to understand how each one works and, given your desired deployment architecture, which one will best meet your requirements. The Google Cloud documentation is very extensive and goes into detail about every load balancer, its different components, and what you need to know to make the right choice.
There are three main axes along which GCP’s load balancers can be divided: the source of the traffic (internal or external), the destination of the traffic (regional or global), and the traffic type (TCP, UDP, HTTP(S), SSL).
Based on the source of the traffic that is forwarded, GCP’s load balancers can be divided into internal and external load balancers. Internal load balancers distribute traffic originating from within a VPC network, whereas external load balancers distribute traffic coming from the public internet to a VPC network.
Internal load balancers are regional. That is to say, they distribute traffic to backends located in one region (though possibly spread across multiple zones). External load balancers, on the other hand, can be regional or global. Global load balancers distribute traffic among backends spread across multiple regions.
GCP’s load balancers can further be divided based on the supported traffic types. In other words, they operate at different layers of the OSI model.
Coming back to choosing the right load balancer, there is a flowchart for that.

You need three pieces of information to use the flowchart effectively:
- The source of the traffic that will be forwarded to the load balancer: is it coming from the internet (external load balancer) or from within your VPC network (internal load balancer)?
- The location of the backends that will receive the distributed traffic: are they in the same region (regional load balancer) or spread across multiple regions (global load balancer)?
- The type of traffic your load balancer must handle: depending on the first two answers, you may choose a TCP/UDP load balancer, an SSL or TCP proxy, or an HTTP(S) load balancer.
If you want to distribute traffic coming from within your VPC network, you use internal load balancers. And since internal load balancers are regional, your backends have to be in one region.
If you have to distribute HTTP(S) traffic coming from the internet, you use global HTTP(S) load balancing.
If it’s UDP traffic, the only option is to go with Network TCP/UDP load balancing, which is regional.
But if it’s TCP traffic, that’s where it gets interesting. TCP traffic may be encrypted using SSL/TLS. If that’s the case, you may offload the encryption/decryption work to the platform by using SSL proxy load balancing. If you don’t want to do that and still need to forward traffic to backends located in multiple regions, you may use TCP proxy load balancing. But if you want to preserve client IPs, you have to go with Network TCP/UDP load balancing, because all of GCP’s load balancers proxy client connections except for Network TCP/UDP and Internal TCP/UDP load balancing.
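For the global HTTP(S) case, here is a rough sketch of the moving parts being wired together in front of the managed instance group from earlier. All resource names are hypothetical, and a real HTTPS setup would need an additional certificate step:

```shell
# A health check tells the backend service which instances are serving
gcloud compute health-checks create http web-check --port 80

# The backend service groups the MIG's instances behind one logical backend
gcloud compute backend-services create web-backend \
  --protocol HTTP --health-checks web-check --global

gcloud compute backend-services add-backend web-backend --global \
  --instance-group my-mig --instance-group-zone europe-west1-c

# URL map and proxy route incoming HTTP requests to the backend service
gcloud compute url-maps create web-map --default-service web-backend
gcloud compute target-http-proxies create web-proxy --url-map web-map

# The forwarding rule's address is the single stable entry point for clients
gcloud compute forwarding-rules create web-rule --global \
  --target-http-proxy web-proxy --ports 80
```

The forwarding rule at the end is what provides the stable frontend IP discussed above; everything behind it can scale or heal without clients noticing.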
When using cloud products the devil is in the details; however, there is nothing devilish here. For example, with respect to networking on GCP, you have to choose between two Network Service Tiers, Standard Tier and Premium Tier. Premium Tier is the default. Essentially, with Premium Tier the traffic is routed within Google’s global private network as much as possible and exits it as close as possible to the client; it’s fast, more secure, and you pay a little more. With Standard Tier, the traffic exits Google’s network as soon as possible and is routed over the public internet to the client; it’s cheaper and … you know … it’s the internet. So, what do you need? Global availability, low latency, and high performance? Stay premium. A tight budget and a preference for optimizing cost? Go standard. You can combine the two if you have workloads with different requirements.

The choice of a Network Service Tier affects your GCP resources, including load balancers. Internal load balancers support only the Premium Tier. With external load balancers you have the choice; however, global load balancing requires the Premium Tier. In all other cases, opting for the Standard Tier restricts your load balancing to being regional. That’s the case for HTTP(S) load balancing and TCP/SSL proxies. In fact, if you think about it, it all makes sense. With internal load balancers, the traffic is already coming from the VPC network. With global load balancers, the backends are spread across multiple regions, so most of the traffic travels within Google’s network: a global load balancer sends user requests to the closest available backend, and if that backend is at capacity, the traffic is sent to the next closest backend (or region). So a global load balancer uses the Premium Tier by design.
I’ll wrap up with one final remark. Google doesn’t use hardware load balancers but a software-defined network load balancer called Maglev. GCP’s load balancers are all software. Next time you configure a load balancer, think about that.
