So you want to move to the cloud? Write good code.

Caleb Crawford
Published in Cloud Everywhere
6 min read · Apr 8, 2021

This article is the first in the “So you want to move to the cloud?” series, a collection of articles that focuses on the “big three” public cloud vendors and covers new factors and perspectives to consider when including the public cloud in your broader IT architecture.

The tantalizing value proposition behind the public cloud’s “everything-as-a-service” model is that you will only pay for what you need. However, it’s probably safer to think of this as “you always pay for what you use, regardless of whether or not you actually need it.” One of the beauties of cloud economics is that it creates a financial incentive structure in which efficient, performant code is rewarded with tangible cost savings.

Let’s consider how this looks in three specific areas: serverless functions, data warehouse querying, and managed Kubernetes services.

“Serverless” functions
While the masses continue to learn, adopt, or adapt to containerization and orchestrating their containers, you can quickly have Google, Amazon, or Microsoft manage every single aspect of that orchestration through their managed “serverless” functions. The concept is simple: write some code, decide how much compute to allocate for that code to execute in the cloud, and then decide how and when your system should invoke that function (read: micro-services), only paying for the time it takes for that function to execute. Repeat ad infinitum.

In what is probably the most striking and measurable example of the benefits of good code (or, the cost of poor code), here are the actual costs from each major cloud provider for their respective serverless function offerings (there are different tiering options that affect pricing; the below comparison reflects standard rates without “free tier” factors or related data transfer charges).

*Note: GB-seconds are the number of seconds your function runs, multiplied by the amount of RAM allocated to it, in GB. The functions below are generally charged for both run time — at a per-millisecond rate — and the number of monthly invocations.

AWS Lambda:
$0.000017/GB-second for code execution time
$0.20 per million invocations

Azure Functions:
$0.000016/GB-second for code execution time
$0.20 per million invocations

Google Cloud Functions:
$0.0000025/GB-second for code execution time
$0.40 per million invocations

As you can see by the pricing model, these functions are intended for massive scale. Each unnecessary or inefficient line of code will be executed millions of times each month. To demonstrate the potential implications, we can use AWS Lambda as an example:

If a Lambda function with 256 MB of RAM is invoked 25,000,000 times per month and takes 500 milliseconds to execute, your monthly cost for that single function will be $57.08. Bump that to 750 milliseconds, and your monthly cost for this single function increases 45% to roughly $83.00. Multiply that excess by every serverless function in your cloud-first microservices architecture, and the technical-debt penalties can quickly add up.
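The arithmetic above can be reproduced with a small calculator. This is a minimal sketch using AWS’s published standard rate of $0.0000166667 per GB-second (which the table above rounds to $0.000017) plus the $0.20-per-million request charge; the function name and defaults are my own, not an AWS API:

```python
def lambda_monthly_cost(memory_gb, duration_s, invocations,
                        gb_second_rate=0.0000166667,
                        per_million_invocations=0.20):
    """Estimate monthly AWS Lambda cost: compute charge plus request charge.

    Compute charge = GB allocated x seconds per run x runs x rate.
    Request charge = $0.20 per million invocations.
    """
    compute = memory_gb * duration_s * invocations * gb_second_rate
    requests = invocations / 1_000_000 * per_million_invocations
    return compute + requests

# 256 MB (0.25 GB) function, 25,000,000 invocations per month:
print(f"{lambda_monthly_cost(0.25, 0.500, 25_000_000):.2f}")  # 57.08
print(f"{lambda_monthly_cost(0.25, 0.750, 25_000_000):.2f}")  # 83.13
```

Note that the extra 250 milliseconds only touch the compute charge; the $5.00 request charge is fixed by invocation count.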

Data Warehouse Queries
The architecture of your data ingest and storage deserves a dedicated post in and of itself (forthcoming!), but it’s worthwhile to actually consider how that data will be queried ahead of time. For now, we’ll assume you’ve followed best practices for storage tiering, torn down the walls between your organization’s data silos, and can easily manage access to all of your data in a well-architected data warehouse. Proprietary players like Snowflake and Databricks aside, here are the main public cloud data warehouse offerings — and how querying costs are calculated for each:

Amazon Redshift
$5 per TB of data processed (Redshift Spectrum, for S3 data only)
Or, $0.25 to $13.04 per hour per Redshift node

Azure Synapse Analytics
$5 per TB of data processed (serverless option, for Azure Data Lake data only)
Or, $1.20 to $360 per hour, based on the number of “Data Warehouse Units” — compute units priced by CPU, memory, and IOPS

Google BigQuery
$5 per TB of data processed

BigQuery is the only truly serverless offering among these cloud vendors; however, each vendor has a serverless option for directly querying data held in object storage, generally in the form of large, schema-based files in formats like CSV, Avro, or Parquet.

This is where the cost savings come in. Analysts trained to use SQL will be familiar with the following (almost universal) start to a query:

> SELECT * FROM table …

However, this will require scanning every row of every column in a table. When you’re being charged for every byte of data processed in your query, this has both a performance and financial penalty — possibly further exacerbated if your query causes you to scan data in storage tiers intended for infrequent data access. That inefficient query might not have a significant or even noticeable impact when using on-premises infrastructure, but in the cloud, that cost will be reflected in your budget immediately.

Fortunately, this problem can be solved by efficient querying.

For instance, you can simply change your SQL and exclude columns from the query:

> SELECT * EXCEPT (column1, column3, column4) FROM table …

Additionally, you can horizontally partition a table by a date or timestamp value when it is created, which lets you limit the rows scanned by a query as well.
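To put rough numbers on those savings at the $5-per-TB rate quoted above, here is a sketch with a hypothetical table: 2 TB on disk, 20 equally sized columns in a columnar format (so unselected columns are not scanned), and 365 daily partitions. All sizes and names are illustrative:

```python
TB = 1024 ** 4            # bytes per tebibyte
PRICE_PER_TB = 5.00       # on-demand rate quoted above

def scan_cost(bytes_scanned):
    """Cost of a single query at $5 per TB of data processed."""
    return bytes_scanned / TB * PRICE_PER_TB

full_table = 2 * TB                          # SELECT * over everything
one_column = full_table / 20                 # prune to 1 of 20 columns
one_week = one_column * 7 / 365              # ...and 7 of 365 partitions

print(f"SELECT * ................. ${scan_cost(full_table):.2f}")
print(f"one column ............... ${scan_cost(one_column):.2f}")
print(f"one column, one week ..... ${scan_cost(one_week):.2f}")
```

Under these assumptions, a careless `SELECT *` costs $10.00 every time it runs, while the pruned query costs about a cent — and analysts run queries all day.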

There are countless additional examples of where analysts can find these efficiencies in their queries. To assist in this process, Amazon and Google provide options to preview the amount of data to be processed before executing the query:

Amazon Redshift offers a Query Plan feature, which requires manually prefixing your query with the EXPLAIN command.

Google’s BigQuery does this by default and gives you a preview before every query:

This BigQuery screenshot shows that the query will process 3.3 GiB

Unfortunately, Azure Synapse Analytics currently keeps you guessing about how much data you’re querying, and you can’t actually view that information until after your query has processed. Nonetheless, they do provide some tips to control costs in their documentation.

Kubernetes Clusters
And of course, no good cloud conversation is complete without at least a brief mention of Kubernetes. The code that is executed in your Kubernetes clusters will ultimately impact the size of your clusters and the costs that you pay for them. Let me explain what I mean.

In the public cloud, it is most likely that you will be using a managed Kubernetes service. Rather than having your DevOps team individually administer every single virtual machine necessary to run your applications, managed Kubernetes services offer you a higher-level role in administering your clusters. You can spin up an entire cluster with a single line of code, and your cloud provider will take care of ensuring each VM is networked together in whatever subnet you specify, and it will ensure other default or manually pre-specified configurations for your cluster.

Here are the managed Kubernetes services for each cloud provider, along with the specific nomenclature for their respective clusters of virtual machines:

Google Cloud: Google Kubernetes Engine (GKE), orchestrates your Managed Instance Groups

AWS: Elastic Kubernetes Service (EKS), orchestrates your Auto-Scaling Groups

Azure: Azure Kubernetes Services (AKS), orchestrates your Virtual Machine Scale Sets

While Kubernetes itself has always handled scaling the number of pods running on a single virtual machine/node, managed Kubernetes services take that a step further and will autoscale the number of virtual machines in your cluster. This scaling is generally based on CPU utilization.

As you may have guessed, the primary cost associated with these clusters is the cost of the nodes themselves: the more nodes you run in a cluster, the more you pay. This is where efficient code comes in. The more efficiently your applications run, the less CPU each container consumes, and the less often your managed Kubernetes service will provision additional virtual machines in your cluster. While autoscaling does improve overall efficiency, nothing replaces writing scalable, efficient code in the applications themselves.

Summary
These considerations are just a small facet of the underlying mindset shift that organizations must adopt when moving to the cloud. As this series continues, please feel free to include your own questions and suggestions in the comments below, and keep an eye out for more posts soon.
