How to preserve your innovation speed and your budget with quotas APIs

guillaume blaquiere
Google Cloud - Community
8 min read · Mar 16, 2021

The cloud has many benefits, and one of them is innovation speed, with the motto “Fail fast, iterate faster”. Indeed, cloud providers offer tons of services to easily test and experiment with, when the same would be expensive, or even impossible, in an on-premises environment.

  • Create a Hadoop or Kubernetes cluster
  • Use graphics accelerators for AI training
  • Deploy a global application,…

The cloud platforms are wonderful sandboxes where you can spend hours experimenting and trying things out. However, resources aren’t free!

Specialized websites periodically report bad news about misuse (or abuse) that led to huge bills.

Even if the free tiers are generous on Google Cloud, some cases can lead to expensive bills. This situation frightens companies and top management, and for them the easiest solution is to restrict user authorizations to only a subset of the platform.
The downside is that you limit your teams’ freedom and capacity to innovate.

How can you continue to test, experiment and innovate while controlling your budget and avoiding abuse?

This question is especially important for the sandbox/test projects that you provide to your users and/or R&D teams.

Google Cloud Quotas API

Instead of limiting the set of usable services with roles and permissions, the other solution is to limit the expenses on each service. That is the purpose of the Quotas API: to set an upper bound on the resources usable by each service.

I would like to focus on 2 use cases:

  • BigQuery
  • Compute Engine

Automation of quotas

To enforce quota definitions automatically on your projects, especially the sandbox projects, you need to automate setting the quotas when you create or update the projects.

You can achieve this with API calls or with Terraform. I will present both to help you choose the best fit for your use cases and your existing project creation/update processes.

API test and security concern

Before going deeper: you will see that I use the gcurl command, which is an alias of curl with the authorization header populated automatically (for ease of use). You can find how to create it in the Quotas API documentation.

However, I dislike that solution because it requires creating a service account key file, with all the security issues that this can imply. And despite my request to the Google Cloud team, that dangerous piece of documentation is still there.

So I propose to use your user account like this (you need to be authenticated beforehand with gcloud init or gcloud auth login):

alias gcurl='curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json"'

Or even use a service account through impersonation (make sure that your currently authenticated user has the required roles to impersonate the service account):

alias gcurl='curl \
-H "Authorization: Bearer $(gcloud auth print-access-token \
--impersonate-service-account=totoyoyo@gdglyon-cloudrun.iam.gserviceaccount.com)" \
-H "Content-Type: application/json"'

The best way to protect a secret is to not have one!

So, when you can, don’t use service account key files (which contain a private RSA key), except when you reach the limits of the IAM service.

Quotas for BigQuery

BigQuery is an awesome toy: you can process petabytes of data in seconds!! Data scientists love it because they can now run queries that they were never able to run in their previous on-premises environment.

It’s wonderful and, because you pay for the volume of data scanned by each query, the cost at the end of the day can quickly skyrocket (I have already seen $100k+ in one afternoon!).

Education is the right path, but mistakes can still occur. And there is no room for mistakes: you pay for what you use. That’s why quotas are so important to prevent this kind of situation!

BigQuery quota limitation by API

Before setting quotas on BigQuery, it’s important to know which quotas exist. For this, you can list the quotas currently available through the Quotas API.

We will start slowly, by simply listing the consumer quotas of the BigQuery service. I have to admit that this API is not easy to understand at first, due to its generic nature and the need to accommodate any quota of any product.

gcurl https://serviceusage.googleapis.com/v1beta1/projects/<projectId>/services/bigquery.googleapis.com/consumerQuotaMetrics

You can see a lot of quotas that you can increase or decrease. For our use case, the Query usage quota is the right one.

{
  "name": "projects/751286965207/services/bigquery.googleapis.com/consumerQuotaMetrics/bigquery.googleapis.com%2Fquota%2Fquery%2Fusage",
  "displayName": "Query usage",
  "consumerQuotaLimits": [
    {
      "name": "projects/751286965207/services/bigquery.googleapis.com/consumerQuotaMetrics/bigquery.googleapis.com%2Fquota%2Fquery%2Fusage/limits/%2Fd%2Fproject",
      "unit": "1/d/{project}",
      "metric": "bigquery.googleapis.com/quota/query/usage",
      "quotaBuckets": [
        {
          "effectiveLimit": "9223372036854775807",
          "defaultLimit": "9223372036854775807"
        }
      ],
      "allowsQuotaIncreaseRequest": true
    },
    {
      "name": "projects/751286965207/services/bigquery.googleapis.com/consumerQuotaMetrics/bigquery.googleapis.com%2Fquota%2Fquery%2Fusage/limits/%2Fd%2Fproject%2Fuser",
      "unit": "1/d/{project}/{user}",
      "metric": "bigquery.googleapis.com/quota/query/usage",
      "quotaBuckets": [
        {
          "effectiveLimit": "9223372036854775807",
          "defaultLimit": "9223372036854775807"
        }
      ],
      "allowsQuotaIncreaseRequest": true
    }
  ]
}

A few interesting parts here:

  • The name, because we will reuse it later
  • The unit, which provides information on the purpose of the quota. Here we have the BigQuery query usage limits:
  1. Per day and per project
  2. Per day, per project and per user

The second one allows you to limit an individual requester on a project without locking out the others or the whole project itself. This is interesting when you have a shared project where some users take care of their spending and others don’t!

You can also have a look at the effective and default limits, expressed in bytes (9223372036854775807, the maximum int64 value, means unlimited).

We now have the full name of the quota limit, which will make it easier to use the consumerOverrides API to create a custom limit.

For this, you need to create this body that defines the new quota value to set:

{
  "overrideValue": 10000000000
}

Set the quota to 10 GB per day, per user and per project

Then POST this body to the Service Usage API, using the quota limit name picked from the previous part. Use the gcurl command for that:

gcurl -d <BODY> \
https://serviceusage.googleapis.com/v1beta1/<NAME>/consumerOverrides

If you have never set this quota on your project before, the call won’t work, because you are changing the quota by more than 10%. You need to add a force boolean set to true as a query parameter of your URL, like this:

gcurl -d <BODY> \
https://serviceusage.googleapis.com/v1beta1/<NAME>/consumerOverrides?force=true
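
For example, with the per-user limit name returned by the earlier listing, the full call looks like this (a sketch: adapt the project number and the value to your own case):

# Override: 10 GB of query usage per day, per user, on this project
gcurl -d '{"overrideValue":10000000000}' \
"https://serviceusage.googleapis.com/v1beta1/projects/751286965207/services/bigquery.googleapis.com/consumerQuotaMetrics/bigquery.googleapis.com%2Fquota%2Fquery%2Fusage/limits/%2Fd%2Fproject%2Fuser/consumerOverrides?force=true"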

Once the request is accepted, the ID of an operation is returned. Perform a GET query, as before, and see the change:

"quotaBuckets": [
{
"effectiveLimit": "10000000000",
"defaultLimit": "9223372036854775807",
"consumerOverride": {
"name": "projects/751286965207/services/bigquery.googleapis.com/consumerQuotaMetrics/bigquery.googleapis.com%2Fquota%2Fquery%2Fusage/limits/%2Fd%2Fproject%2Fuser/consumerOverrides/Cg1RdW90YU92ZXJyaWRl",
"overrideValue": "10000000000"
}
}
],

You can now see the new effective limit, and a name for the consumerOverride. You will need this full name to update (PATCH) or delete (DELETE) this override.
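
For instance, to remove this override later and fall back to the default limit, a DELETE on that full name should do it (a sketch based on the override name returned above; force is needed here too because the change exceeds 10%):

# Delete the per-user override created previously
gcurl -X DELETE \
"https://serviceusage.googleapis.com/v1beta1/projects/751286965207/services/bigquery.googleapis.com/consumerQuotaMetrics/bigquery.googleapis.com%2Fquota%2Fquery%2Fusage/limits/%2Fd%2Fproject%2Fuser/consumerOverrides/Cg1RdW90YU92ZXJyaWRl?force=true"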

BigQuery quota limitation with Terraform

Terraform is a common tool used by DevOps teams to automate project infrastructure (IaC: Infrastructure as Code). You can also create projects and configure environments with it.

Here is an example of the previous POST API call with the consumer quota override Terraform resource:

provider "google" {
project = "<PROJECT_ID>"
region = "us-central"
}
resource "google_service_usage_consumer_quota_override" "override" {
provider = google-beta
project = "<PROJECT_ID>"
service = "bigquery.googleapis.com"
metric = "bigquery.googleapis.com%2Fquota%2Fquery%2Fusage"
limit = "%2Fd%2Fproject%2Fuser"
override_value = "10000000000"
force = true
}

Because I use my own credentials, I don’t need a service account key file with Terraform either. On your workstation, run gcloud auth application-default login to set your credentials in your runtime context.

As you can see, you need to know the metric and limit names. You can’t guess them, and a discovery phase with the GET API is required to be sure of the quota’s fully qualified name.
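
If you script this discovery phase, a small jq filter can extract the limit names and units from the listing (a sketch, assuming jq is installed and that the list response wraps the entries in a metrics array):

gcurl https://serviceusage.googleapis.com/v1beta1/projects/<projectId>/services/bigquery.googleapis.com/consumerQuotaMetrics \
| jq -r '.metrics[].consumerQuotaLimits[] | .unit + "  " + .name'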

Quotas for Compute Engine

Compute Engine isn’t as brutal as BigQuery in terms of cost. It’s more progressive, and costs accumulate over time. However, I have already seen bad actors create dozens of large Compute Engine instances to mine Bitcoin.

The bad actors got hold of a poorly protected service account key file from a development project and used it to create the instances. That’s why I strongly recommend not generating these sensitive files when you can avoid it!

Fortunately, Google Cloud has automatic monitoring and alerting to contact the project owner in case of suspicious activity. The cost impact was low, but it could have been worse!

If you take a closer look at the default Compute Engine CPU quotas, a corporate account can use up to 2400 vCPUs (N1 family) in several regions! For testing purposes, a few CPUs in a single region are enough. You can set a quota to enforce that.

At the same time, this quota will reduce the impact of (potential) attacks, the misuse of services, and the cost induced by neglectful users who forget to stop their resources…

Compute Engine quota limitation by API

As with BigQuery, to start with quotas on Compute Engine, a GET query to view and discover the consumer quotas is the right starting point.

gcurl https://serviceusage.googleapis.com/v1beta1/projects/<projectId>/services/compute.googleapis.com/consumerQuotaMetrics

There are tons of quotas: by region, by CPU family, by network, by GPU family,… This way you can, for example, forbid the use of certain CPU families or GPUs on test projects.

I will simply focus on the N1 CPU limit. You should be able to see this:

"name": "projects/751286965207/services/compute.googleapis.com/consumerQuotaMetrics/compute.googleapis.com%2Fcpus",
"displayName": "CPUs",
"consumerQuotaLimits": [
{
"name": "projects/553150410541/services/compute.googleapis.com/consumerQuotaMetrics/compute.googleapis.com%2Fcpus/limits/%2Fproject%2Fzone",
"unit": "1/{project}/{zone}",
"isPrecise": true,
"metric": "compute.googleapis.com/cpus",
"quotaBuckets": [
{
"effectiveLimit": "-1",
"defaultLimit": "-1"
}
]
},
{
"name": "projects/751286965207/services/compute.googleapis.com/consumerQuotaMetrics/compute.googleapis.com%2Fcpus/limits/%2Fproject%2Fregion",
"unit": "1/{project}/{region}",
"isPrecise": true,
"metric": "compute.googleapis.com/cpus",
"quotaBuckets": [
{
"effectiveLimit": "24",
"defaultLimit": "24"
},
{
"effectiveLimit": "2400",
"defaultLimit": "2400",
"dimensions": {
"region": "asia-northeast1"
}
},
{
"effectiveLimit": "2400",
"defaultLimit": "2400",
"dimensions": {
"region": "us-east1"
}
},
......

As you can see, there is no override defined, and thus the default limits are enforced per region (look at the unit field): up to 2400 N1 vCPUs in many regions!

Let’s reduce that! Grab a copy of the name field and define this new override body, this time with a dimension: the region.

{
  "overrideValue": "5",
  "dimensions": {
    "region": "us-east1"
  }
}

Set the max CPU to 5 in the us-east1 region

And POST it to the Quotas API:

gcurl -d <BODY> \
https://serviceusage.googleapis.com/v1beta1/<NAME>/consumerOverrides

Same pattern as the BigQuery part, only the name of the quota changed.

After a successful call (don’t forget the force query parameter), you can perform another GET on the API to check the update. Only the us-east1 region has been changed:

{
  "effectiveLimit": "5",
  "defaultLimit": "2400",
  "consumerOverride": {
    "name": "projects/751286965207/services/compute.googleapis.com/consumerQuotaMetrics/compute.googleapis.com%2Fcpus/limits/%2Fproject%2Fregion/consumerOverrides/Cg1RdW90YU92ZXJyaWRlGhIKBnJlZ2lvbhIIdXMtZWFzdDE=",
    "overrideValue": "5"
  },
  "dimensions": {
    "region": "us-east1"
  }
},

You can also enforce the quota in all regions. For that, remove the dimensions from the body to set a uniform quota across all regions. It’s very handy when you want to allow the use of only one region, as shown in the sketch after this list:

  1. Set 0 for all regions
  2. Set a different quota in a specific region to allow only that one to be used.
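
Here is a sketch of this two-step lockdown with gcurl, reusing the per-region CPU limit name from the listing above (the project number, value and region are just examples):

# 1. Override with no dimensions: 0 CPUs in every region
gcurl -d '{"overrideValue":"0"}' \
"https://serviceusage.googleapis.com/v1beta1/projects/751286965207/services/compute.googleapis.com/consumerQuotaMetrics/compute.googleapis.com%2Fcpus/limits/%2Fproject%2Fregion/consumerOverrides?force=true"

# 2. Override with a region dimension: allow 5 CPUs in us-east1 only
gcurl -d '{"overrideValue":"5","dimensions":{"region":"us-east1"}}' \
"https://serviceusage.googleapis.com/v1beta1/projects/751286965207/services/compute.googleapis.com/consumerQuotaMetrics/compute.googleapis.com%2Fcpus/limits/%2Fproject%2Fregion/consumerOverrides?force=true"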

Compute Engine quota limitation with Terraform

You can achieve the same thing with Terraform.
Here again, you need to know exactly the service, metric and limit names that you want to override. The initial discovery step with a GET API request isn’t optional!

provider "google" {
project = "<PROJECT_ID>"
region = "us-central"
}
resource "google_service_usage_consumer_quota_override" "override" {
provider = google-beta
project = "<PROJECT_ID>"
service = "compute.googleapis.com"
metric = "compute.googleapis.com%2Fcpus"
limit = "%2Fproject%2Fregion"
override_value = "5"
dimensions = {
region = "us-east1"
}
force = true
}
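
As with any Terraform configuration, you then apply it with the usual workflow, run from the directory containing the file (terraform init downloads the google-beta provider):

terraform init
terraform plan
terraform apply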

No longer fear innovation, embrace it!

With the Quotas API, you can now limit the amount of resources usable per service and per project, without restricting authorizations to only a few services because you fear their potential cost!
If your data scientists need GPUs, no problem: they will only be able to get a few of them, in a single region, not several dozen per region!

You can even automate these quota limits per type of environment, per folder, or however you want, with the Terraform resource.

You no longer have a reason to fear innovation from your teams; you just have to encourage and embrace it!
