Intro into the Clouds — Part 2 — PaaS Compute

Published in Google for Developers EMEA, May 6, 2021

If you got to this page without reading Part 1, I recommend starting there to learn what this series is about, why I decided to write it, and so on.

Ok, last time we saw how the history of VMs logically resulted in the bookstore company creating the first public Cloud…

However, another company was there at the beginning of Public Clouds, and today it is one of the 3 major players: Google. From the very beginning, it took a very different approach to Clouds.

Like Amazon, Google started its Public Cloud (Google Cloud Platform) after realizing that the tools it used internally could be useful to other companies, so it began offering those tools to the public. However, Google’s internal tools were very different in many ways.

As you know, Google’s main businesses are Search and Ads. Both experienced a huge and constantly growing load from the very start, so Google had to build a platform to support them. But instead of going with pure VMs, they decided to build a scalable and secure platform that would take over all hosting and infrastructure problems and let developers focus on application code only. In Google’s mind, SCALE was the key to success.

Not only was Google building its own online services, it was also crawling almost every other site on the Internet, measuring their speed and quality, and eventually dictating what a “good online site” means. So, with their platform, they decided to support only what I guess they considered a “true modern 21st century web application”, which meant the following:

it should be able to scale almost unlimitedly
it should be able to process requests in parallel (and as a result, almost independently of each other)
no request processing should take longer than 1 min (because no User would wait that long)
it should be fault-tolerant

Note! There is an interesting story about the last point. Reportedly, the first iteration of Google production servers was built with inexpensive hardware and was designed to be very fault-tolerant. So, contrary to the common approach of buying expensive servers, Google decided to use almost PC-grade hardware, but make sure their applications could withstand this hardware going down. In some sense, this approach is similar to how RAID was invented as an alternative to IBM’s washing-machine-sized hard disk drives.

Google’s own services perfectly followed all the requirements. Decoupling of hardware/infrastructure from application code suited them perfectly.

And if you have a hammer, everything looks like a nail…

In 2008, Google announced App Engine — a platform for developing and hosting web applications in Google-managed data centers.

Note! Although announced in 2008, App Engine took a long path and officially became a fully supported product only in 2011. However, that did not stop many developers and companies from using it for almost 3 years, just without any official SLA.

As described above, Google App Engine had very strong restrictions:

Code could only be written in Python (with Java added in 2009)
The list of libraries that could be used was very limited
Apps had to be based on web-service calls
Each request had to be processed within 1 minute (and if longer, was interrupted by the platform). It was possible to run backend tasks, but they were also limited to 10 minutes only
Apps had to be fault-tolerant

Almost any application needs storage. Google App Engine had one built in. But, like compute, it had to be almost unlimitedly scalable, which at that time meant using a NoSQL DB called Datastore with eventual consistency (as opposed to strong consistency, which is standard for traditional SQL DBs).
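To make the difference between eventual and strong consistency more concrete, here is a toy in-memory sketch in Python. It is not the Datastore API: the class `EventuallyConsistentStore` and its methods are hypothetical, and the model assumes a single primary copy plus one read replica that catches up later.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on a primary and reach a read replica later."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes not yet copied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read_eventual(self, key):
        # Cheap read from the replica; it may return stale data.
        return self.replica.get(key)

    def read_strong(self, key):
        # Read from the primary; it always sees the latest write.
        return self.primary.get(key)

    def replicate(self):
        # A real system would do this asynchronously in the background.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("order-1", "paid")
stale = store.read_eventual("order-1")      # replica has not caught up yet
fresh = store.read_strong("order-1")        # primary already has the write
store.replicate()
caught_up = store.read_eventual("order-1")  # now the replica agrees
```

The application code has to cope with the window in which `stale` and `fresh` disagree, which is exactly the kind of thing a traditional SQL DB hides from you.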

All that meant that Google Cloud Platform (GCP) was not built to help any existing applications. It was built with only one purpose — to support writing new applications, as long as they fitted the above limitations.

And even if you were ready to build your application from scratch, complying with all the limitations (because you needed scalability), there was another problem.

When doing simple things, App Engine was very easy to use. But whenever you needed to perform something non-standard or complex, you also needed an A-level team, because things that are easy in traditional web development become much more complicated under the 1-minute compute limitation and with a NoSQL DB.
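As an illustration of why a request deadline complicates otherwise simple work, here is a sketch of processing a long list in deadline-sized slices and returning a cursor so that a follow-up request can continue. The names `process_batch` and `run_to_completion`, the `budget_seconds` parameter, and the integer cursor are assumptions made for this example, not App Engine APIs.

```python
import time


def process_batch(items, cursor=0, budget_seconds=50.0,
                  handle=lambda item: None, clock=time.monotonic):
    """Process items starting at `cursor` until done or the time budget runs out.

    Returns the next cursor, or None when everything has been processed.
    """
    started = clock()
    position = cursor
    while position < len(items):
        if clock() - started >= budget_seconds:
            return position  # out of time: the caller schedules another request
        handle(items[position])
        position += 1
    return None


def run_to_completion(items, **kwargs):
    """Simulate the platform invoking the handler again until the work is done.

    Assumes the budget is large enough to process at least one item per call.
    """
    cursor, invocations = 0, 0
    while cursor is not None:
        cursor = process_batch(items, cursor=cursor, **kwargs)
        invocations += 1
    return invocations
```

In a traditional web application this would be a single loop. Here the loop has to be resumable, and the cursor has to be stored or passed somewhere between requests.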

We can only guess, but I think Google was not so much interested in the traditional “hosting” business or in winning the hearts of Business and Enterprise. Rather, they were interested in helping new modern Startups (written from scratch) that needed enormous scalability and speed and had the best engineering minds on board. In short, Google assumed that in the 21st century all modern companies would be like Google itself…

This approach is called Platform-as-a-Service (PaaS). Google App Engine was not the first PaaS: Zimki launched their PaaS for JavaScript in 2005, but closed it in 2007.

But we went too deep into the history, while I promised that we would only use it as a means to logically understand what was shaping Clouds.

Note! If this were truly a history lesson, we would also have to mention Microsoft Azure, which was announced in 2008 and became commercially available in 2010.

What is important for our discussion is that we can see two, in some sense opposite, approaches to the Cloud:

AWS approach of IaaS
Google approach of PaaS

The first approach does not feel much different from using a VM in any other data center or even locally, which means anybody could’ve used it from the start. The second required special training, a different architectural approach as well as rewriting the application specifically for that platform.

So, why are both called Cloud? And if those two different concepts are both Cloud, what is the real Cloud then?

Modern Clouds offer a very wide set of services, but we can use the definition from the National Institute of Standards and Technology (NIST), which lists five essential characteristics:

On-demand self-service. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider.
Broad network access. Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops, and workstations).
Resource pooling. The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter). Examples of resources include storage, processing, memory, and network bandwidth.
Rapid elasticity. Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.
Measured service. Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.
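Rapid elasticity is easier to picture with numbers. The sketch below computes how many instances a platform might want for a given load, scaling outward and inward within fixed bounds. The function name `desired_instances`, its parameters, and the simple "load divided by per-instance capacity" rule are assumptions for illustration; real autoscalers use richer signals.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=1, max_instances=100):
    """Return how many instances the current load needs, within bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


peak = desired_instances(950, capacity_per_instance=100)   # scale outward
quiet = desired_instances(0, capacity_per_instance=100)    # scale inward
```

The upper bound is what makes capacity only appear unlimited to the consumer: the provider still has a finite pool behind it.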

Personally, I would also add this one:

Automatic restart/maintenance

The last one is very interesting because it may require a change of architecture in “designed for cloud” applications. Since resources are taken from a shared pool and the Cloud Provider takes full responsibility for managing them, the Cloud Provider reserves the right to restart any computational resource if needed. In some cases, like IaaS, the Cloud Provider tries to do it rarely and only during a maintenance window. In other cases, like most PaaS, the Cloud Provider reserves the right to do it at any moment (to optimize underlying resources), and the application has to be designed upfront to handle such restarts (which is in part what people mean when they say a cloud-native application).
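One common way to tolerate restarts is to record progress after each unit of work so that a fresh instance can pick up where the previous one stopped. The sketch below shows the idea with a local JSON file. The function names, the `next_index` field, and the `stop_after` parameter (which only simulates a restart) are assumptions for this example, and in a real Cloud the checkpoint would have to live in durable storage outside the instance.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no checkpoint exists)."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint atomically so a restart never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp_path, path)


def run(items, path, handle, stop_after=None):
    """Process items from the last checkpoint; `stop_after` simulates a restart.

    Returns True when all items are done, False if interrupted.
    """
    start = load_checkpoint(path)
    for index in range(start, len(items)):
        if stop_after is not None and index - start >= stop_after:
            return False  # "restarted" before finishing
        handle(items[index])
        save_checkpoint(path, index + 1)
    return True
```

If the instance dies after `handle` but before `save_checkpoint`, the item is handled again on the next run, so `handle` should be idempotent.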

Note! To be able to rapidly scale your resources based on self-service requests, Cloud Providers have to keep some resources “waiting”. In order not to waste them completely, most Cloud Providers offer you the ability to rent non-utilized VMs at a significant discount. But your applications need to be prepared for those resources to be taken away at any moment. In GCP, those are called Preemptible VMs, and in AWS, Spot Instances. Of course, in order to run properly on such VMs, your application should be designed in the cloud-native way…
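For VMs that can be taken away, a typical building block is a graceful shutdown: notice the termination request, stop picking up new work, and leave the rest for another instance. The sketch below assumes the shutdown reaches the process as a `SIGTERM` signal, which is common on Linux but depends on how the process is launched. The `GracefulShutdown` class and `drain` function are hypothetical names for this example.

```python
import signal


class GracefulShutdown:
    """Remembers that a termination signal arrived so loops can exit cleanly."""

    def __init__(self):
        self.requested = False

    def handler(self, signum, frame):
        self.requested = True

    def install(self):
        # Register for SIGTERM; must be called from the main thread.
        signal.signal(signal.SIGTERM, self.handler)


def drain(queue, shutdown, handle):
    """Handle queued items until the queue is empty or shutdown is requested.

    Returns the items that were left unprocessed.
    """
    while queue and not shutdown.requested:
        handle(queue.pop(0))
    return queue
```

A worker would call `install()` once at startup and then hand whatever `drain` returns back to a shared queue, so that another instance can finish the job.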

So, is AWS just an easy-to-use VM rental service, while GCP — only a PaaS solution? No, not anymore.

Google quickly realized that most businesses and their applications were not ready for the “cloud future”, and there was a huge demand for VMs (that can be used to run existing applications in the Cloud), and extended GCP with an IaaS offering, known as Google Compute Engine (GCE).

And while Amazon had (and still has) huge success with its IaaS solution (called EC2) and was able to capture and hold 1st place in the “cloud wars” (due to its ability to run existing applications, its good understanding of the Enterprise market, and the fact that it was first on that market), it also quickly realized that there was demand for PaaS solutions and added a PaaS offering called AWS Elastic Beanstalk.

So, do all the modern Clouds have basically two solutions — IaaS and PaaS? Yes and No.

Yes — all modern Clouds have IaaS and PaaS.

But no — they also have FaaS and CaaS.

And next we will talk about those…

Suggested labs (you can choose one based on your preferred language):

Part 3 (CaaS) is here…

Written by Artem Nikulchenko