Quota Monitoring and Management Options on Google Cloud

Vipul Raja
Google Cloud - Community
7 min read · Mar 6, 2024

Introduction

This article is most relevant to enterprises that use Google Cloud at scale across many services and push those services beyond their default quota levels. If your pattern is to share projects among multiple teams, which further raises how heavily a GCP service is used within a single project, you are in the right place. This sets up the problem statement: the need to request quota increases frequently to avoid service interruptions and a negative user experience.

In this article we will summarize some of the options available for monitoring and managing quotas and their usage. The options range from simple to deploy and manage to complex, with additional management overhead.

I recommend evaluating all the options and following the “less is more” approach. Start with the OOB (Out Of Box) Quotas page and the associated APIs. If you are an enterprise GCP customer, you likely also have a Google Cloud Technical Account Manager (TAM) working with you; they are a magic wand available to you for help with Quota Increase Requests (QIRs) and capacity management in general. If those no longer work for your use case, then evaluate the Quota Monitoring Solution.

Quotas Page OOB

This section describes the OOB (Out Of Box) capabilities of the “all quotas” page within a project. This is of course the least implementation-heavy option and can easily fulfill basic use cases. To reach the page, type “all quotas” in the top search bar within a project. This will bring you to the “Quotas and system limits” page.

Some useful features that are available within the page are described below.

  1. OOB Service usage Dashboard: The page has a dashboard that is pre-populated with all of the Google Cloud services and their respective usage percentages. You can order the dashboard by descending order of usage percentage among other things.
  2. Alerting policies: The page gives you two ways to create service usage alerts.
     a) Per service: hover over any service on the service usage dashboard and configure an alerting policy with a threshold and a notification channel of your choice.
     b) Across multiple services at once: click “manage alert policies” to open the alerting setup, where you can build alerts with the “builder” and predefined metrics, or write your own alerting query in MQL or PromQL.
     For example, the sample policy below alerts when the usage of any service within a project rises above 80% of its quota. To tweak the threshold, modify the “condition gt(ratio, 0.8 '1')” clause at the end of the query.

{
  "displayName": "Quota Monitoring Alerting",
  "documentation": {
    "subject": "Quota usage threshold exceeded 80%, proactive action recommended"
  },
  "userLabels": {},
  "conditions": [
    {
      "displayName": "Quota Monitoring Chart",
      "conditionMonitoringQueryLanguage": {
        "duration": "3600s",
        "trigger": {
          "count": 1
        },
        "query": "fetch consumer_quota\n| filter resource.service =~ '.*'\n| { t_0:\n metric 'serviceruntime.googleapis.com/quota/allocation/usage'\n | align next_older(1d)\n | group_by [resource.project_id, metric.quota_metric, resource.location],\n [value_usage_max: max(value.usage)]\n ; t_1:\n metric 'serviceruntime.googleapis.com/quota/limit'\n | align next_older(1d)\n | group_by [resource.project_id, metric.quota_metric, resource.location],\n [value_limit_min: min(value.limit)] }\n| ratio\n| every 1m\n| condition gt(ratio, 0.8 '1')"
      }
    }
  ],
  "alertStrategy": {
    "autoClose": "604800s"
  },
  "combiner": "OR",
  "enabled": true,
  "notificationChannels": [
    "projects/$project-name$/notificationChannels/$notification-channel$"
  ],
  "severity": "SEVERITY_UNSPECIFIED"
}
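If you need the same policy at several thresholds, generating the JSON can be less error-prone than hand-editing the query string. The Python sketch below rebuilds the sample policy with the threshold as a parameter. The helper name build_quota_alert_policy and the example channel path are my own placeholders, not part of any Google Cloud library, and the script only prints JSON; it does not call any API.

```python
import json

# The query from the sample policy above, minus its final "condition" line.
QUERY_PREFIX = (
    "fetch consumer_quota\n"
    "| filter resource.service =~ '.*'\n"
    "| { t_0:\n"
    " metric 'serviceruntime.googleapis.com/quota/allocation/usage'\n"
    " | align next_older(1d)\n"
    " | group_by [resource.project_id, metric.quota_metric, resource.location],\n"
    " [value_usage_max: max(value.usage)]\n"
    " ; t_1:\n"
    " metric 'serviceruntime.googleapis.com/quota/limit'\n"
    " | align next_older(1d)\n"
    " | group_by [resource.project_id, metric.quota_metric, resource.location],\n"
    " [value_limit_min: min(value.limit)] }\n"
    "| ratio\n"
    "| every 1m\n"
)


def build_quota_alert_policy(threshold: float, notification_channel: str) -> dict:
    """Return the sample alert policy with a custom threshold (hypothetical helper)."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be a fraction between 0 and 1")
    return {
        "displayName": "Quota Monitoring Alerting",
        "documentation": {
            "subject": f"Quota usage threshold exceeded {threshold:.0%}, "
                       "proactive action recommended"
        },
        "userLabels": {},
        "conditions": [{
            "displayName": "Quota Monitoring Chart",
            "conditionMonitoringQueryLanguage": {
                "duration": "3600s",
                "trigger": {"count": 1},
                # Only this last clause changes with the threshold.
                "query": QUERY_PREFIX + f"| condition gt(ratio, {threshold} '1')",
            },
        }],
        "alertStrategy": {"autoClose": "604800s"},
        "combiner": "OR",
        "enabled": True,
        "notificationChannels": [notification_channel],
        "severity": "SEVERITY_UNSPECIFIED",
    }


if __name__ == "__main__":
    # Placeholder channel path; substitute your own project and channel ID.
    policy = build_quota_alert_policy(0.9, "projects/my-project/notificationChannels/123")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and used wherever you would otherwise use the hand-written policy.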

Terraform Implementation

A super neat capability that I personally love is the ability to set all of this up via code (IaC). There is no need to manage this as an outlier from within the Google Cloud UI; you can make it live with the rest of your Terraform configuration and manage it centrally! Refer to the sample Terraform configuration below.

# Notification Channel
resource "google_monitoring_notification_channel" "default" {
  display_name = "Quota Monitoring Alert"
  type         = "email"
  project      = var.project_id

  labels = {
    email_address = "example@yourenterprise.com"
  }
}

# Alert
resource "google_monitoring_alert_policy" "default" {
  display_name = "[${upper(var.environment)}] ${title("Quota Monitoring")}"
  enabled      = true
  project      = var.project_id
  combiner     = "OR"

  conditions {
    display_name = "main-condition"

    condition_monitoring_query_language {
      query    = "fetch consumer_quota | filter resource.service =~ '.*' | { t_0: metric 'serviceruntime.googleapis.com/quota/allocation/usage' | align next_older(1d) | group_by [resource.project_id, metric.quota_metric, resource.location], [value_usage_max: max(value.usage)] ; t_1: metric 'serviceruntime.googleapis.com/quota/limit' | align next_older(1d) | group_by [resource.project_id, metric.quota_metric, resource.location], [value_limit_min: min(value.limit)] } | ratio | every 1m | condition gt(ratio, 0.8 '1')"
      duration = "60s"

      trigger {
        count = 1
      }
    }
  }

  alert_strategy {
    auto_close = "86400s"
  }

  notification_channels = [
    google_monitoring_notification_channel.default.name
  ]
}

For more details, see the Google Cloud documentation: Monitor and alert with quota metrics.
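Both the JSON policy and the Terraform configuration rely on the same query, so it helps to be clear about what it computes. The Python sketch below mimics the logic on plain tuples: take the maximum usage and the minimum limit per (project, quota metric, location), divide one by the other, and keep the combinations above the threshold. It is an illustration of the idea only; the function name and the tuple layout are my assumptions, and Cloud Monitoring does the real work over time series.

```python
def quotas_over_threshold(usage_points, limit_points, threshold=0.8):
    """Return {(project, quota_metric, location): ratio} for ratios above threshold.

    Both inputs are iterables of (project_id, quota_metric, location, value),
    a simplified stand-in for the time series the query reads.
    """
    # group_by ... max(value.usage)
    max_usage = {}
    for project, metric, location, value in usage_points:
        key = (project, metric, location)
        max_usage[key] = max(max_usage.get(key, value), value)

    # group_by ... min(value.limit)
    min_limit = {}
    for project, metric, location, value in limit_points:
        key = (project, metric, location)
        min_limit[key] = min(min_limit.get(key, value), value)

    # ratio, then condition gt(ratio, threshold)
    breaches = {}
    for key, usage in max_usage.items():
        limit = min_limit.get(key)
        if limit and usage / limit > threshold:  # skip missing or zero limits
            breaches[key] = usage / limit
    return breaches


if __name__ == "__main__":
    cpus = ("my-project", "compute.googleapis.com/cpus", "us-central1")
    disks = ("my-project", "compute.googleapis.com/disks_total_storage", "us-central1")
    usage = [cpus + (70,), cpus + (90,), disks + (10,)]
    limits = [cpus + (100,), disks + (100,)]
    print(quotas_over_threshold(usage, limits))
```

Using the maximum usage against the minimum limit is the conservative choice: it reports the worst ratio seen in the window.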

  3. Quota Increase Requests: Another natively supported capability on the page is submitting and managing Quota Increase Requests (QIRs) for a project. You can switch to the “Increase requests” tab to create new requests and manage previous ones.

Quota Monitoring Solution (QMS)

If you are looking for an automated and comprehensive way to manage quotas over a large number of projects, we have an internally built Quota Monitoring Solution. It has been adopted by more than 100 customers and is supported by an implementation and management team at Google Cloud.

This solution is more complex and comes with additional overhead. Because it aggregates quota data across projects, it is best managed by a team that is responsible for multiple folders and projects, and likely an entire organization, within GCP.

Some key features:

  • A dashboard for real-time quota visibility.
  • Aggregated folder / organization quotas.
  • Alerting via emails or other tools.
  • Easy and scalable deployment.

Read more about Quota Monitoring Solution.

Check out the code to deploy it here.

Solution roadmap.

Quota-related APIs

Service Limit / Quota Recommender

The service limit recommender analyzes usage of service quotas by projects in your organization and provides recommendations that help you identify resources that may be getting close to their quota limits. Service limit recommender analyzes your quota utilization and provides you with the following features to help you catch potential bottlenecks before they become an issue:

  • Recommendations to review quotas with high utilization
  • Usage insights for each quota with high utilization

The service limit recommender analyzes usage over rate, allocation, and concurrent quotas over the last 30 days. If at any point during those 30 days your utilization has hit 80% of your current limit, a recommendation will be generated.
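To make the 30-day rule concrete, here is a small Python sketch of the check as described above: look at the last 30 days of utilization and flag the quota if any day reached 80% of the current limit. The function and its inputs are hypothetical and only restate the rule; this is not how the recommender is implemented internally.

```python
def should_recommend_review(daily_peak_usage, limit, threshold=0.8, window_days=30):
    """Return True if any peak in the last `window_days` reached `threshold` of `limit`.

    daily_peak_usage: list of daily peak usage values, oldest first (assumed input shape).
    """
    if limit <= 0:
        return False
    recent = daily_peak_usage[-window_days:]  # only the trailing window counts
    return any(peak / limit >= threshold for peak in recent)


if __name__ == "__main__":
    quiet_month = [40] * 30
    one_spike = [40] * 29 + [80]
    print(should_recommend_review(quiet_month, limit=100))
    print(should_recommend_review(one_spike, limit=100))
```

Note that a single spike is enough to trigger a recommendation, and a spike older than the window no longer counts.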

Before you can view the insights and recommendations, you must do the following:

  • You must enable the Recommender API. You only need to enable the API on a single project. You can then use this same project to examine recommendations and insights for other projects by using the --billing-project functionality of gcloud/API.
  • Make sure that you have one of these required roles assigned:
    View recommendations: recommender.serviceLimitViewer
    View and update recommendations: recommender.serviceLimitAdmin

Learn more about the Service limit (quota) recommender.

Quota Adjuster

Note that Quota Adjuster is Pre-GA (at the time of writing) and is subject to the “Pre-GA Offerings terms”.

Once you enable quota adjuster on a project, it begins monitoring all applicable quotas and applies the following logic:

  • Quota adjuster checks if the peak usage has approached the quota value during a specified duration.
  • If so, quota adjuster attempts to increase the quota value (typically around 10–20%).

If it’s possible to increase the quota value, the increase is approved and the value adjusted. It is worth calling out that QIRs made by quota adjuster can be rejected, just like the ones you make manually. You can still manually request increases to quota values at any time, whether or not quota adjuster is enabled. The history of all Quota Increase Requests made by quota adjuster is available on the Quotas page of the Google Cloud console. You can also set alerts to monitor any changes initiated by quota adjuster.
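The adjuster logic described above can be sketched in a few lines of Python. The 90% “approach” trigger and the 15% bump are my assumptions for illustration; the only figure this article gives is an increase of typically around 10–20%. The result is a proposed value, which, as noted, can still be rejected.

```python
def propose_adjustment(peak_usage, quota_value, approach_percent=90, increase_percent=15):
    """Return a proposed new quota value, or None if peak usage is not near the quota.

    approach_percent and increase_percent are illustrative assumptions,
    not documented quota adjuster settings.
    """
    if quota_value <= 0:
        return None
    # Integer arithmetic avoids floating-point rounding surprises.
    if peak_usage * 100 < quota_value * approach_percent:
        return None
    increase = (quota_value * increase_percent + 99) // 100  # round the bump up
    return quota_value + increase


if __name__ == "__main__":
    print(propose_adjustment(peak_usage=95, quota_value=100))
    print(propose_adjustment(peak_usage=50, quota_value=100))
```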

During this preview, the following Compute Engine quotas are available for automated increases using quota adjuster:

  • CPUs
  • N2 CPUs
  • N2D CPUs
  • C2 CPUs
  • C2D CPUs
  • Persistent Disk Standard (GB)

To enable quota adjuster on your Google Cloud project:

  1. Navigate to the IAM & Admin > Quotas page.
  2. Click the Configurations tab.
  3. Click the Enable switch.

After the preview period, quota adjuster is planned to be a default configuration in all new projects, with an option to opt out.

Cloud Quotas API

The Cloud Quotas API lets you programmatically adjust quotas and automate quota adjustments in your Google Cloud projects, folders, or organizations. Adjustments can be made to increase or decrease quota values.

The Cloud Quotas API can be used to:

Automate quota adjustments

You can use the Cloud Quotas API to request quota increases when certain conditions are met. For example, to avoid quota exceeded errors, you can use the API to programmatically request a quota increase when Compute Engine resources reach 80% of the available quota.
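As a rough sketch of that 80% rule, the Python below decides whether to ask for more quota and, if so, builds a request body. The field names (service, quotaId, quotaConfig.preferredValue, dimensions, justification, contactEmail) reflect my understanding of the API's QuotaPreference resource, and the quota ID in the usage example is illustrative; verify both against the API reference before relying on them. The 25% growth figure and the helper name are assumptions, and sending the request is deliberately left out.

```python
def build_quota_preference(service, quota_id, region, usage, limit, contact_email,
                           trigger_percent=80, growth_percent=25):
    """Return a QuotaPreference-style body, or None when usage is below the trigger.

    Hypothetical helper: it only builds a dict and never calls the API.
    """
    if limit <= 0 or usage * 100 < limit * trigger_percent:
        return None
    # Ask for the current limit plus growth_percent, rounded up.
    preferred = limit + (limit * growth_percent + 99) // 100
    return {
        "service": service,
        "quotaId": quota_id,
        # Sent as a string here on the assumption that the field is a 64-bit integer.
        "quotaConfig": {"preferredValue": str(preferred)},
        "dimensions": {"region": region},
        "justification": f"Usage reached {usage} of {limit}; requesting headroom.",
        "contactEmail": contact_email,
    }


if __name__ == "__main__":
    body = build_quota_preference(
        service="compute.googleapis.com",
        quota_id="CPUS-per-project-region",  # illustrative quota ID
        region="us-central1",
        usage=85,
        limit=100,
        contact_email="example@yourenterprise.com",
    )
    print(body)
```

In practice you would feed usage and limit from the quota metrics used for alerting earlier, and submit the body with an authenticated client.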

Scale quota configurations across projects

The Cloud Quotas API can clone your quota configurations from project to project. If there is a known set of quotas that need to be increased for every new Google Cloud project, you can integrate the API into the creation logic of your project to automatically issue a QIR. All quota increases are subject to Google Cloud review and approval.
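One way to picture this is a baseline of quota values that every new project should have, compared against what the project starts with. The sketch below is plain Python with a hypothetical BASELINE and helper; the quota IDs and the projects/PROJECT/locations/global parent format are assumptions to check against the API reference. It only produces a plan of the requests you would then submit.

```python
# Hypothetical baseline: (service, quota ID, region) -> value each new project needs.
BASELINE = {
    ("compute.googleapis.com", "CPUS-per-project-region", "us-central1"): 200,
    ("compute.googleapis.com", "N2-CPUS-per-project-region", "us-central1"): 100,
}


def plan_quota_requests(project_id, baseline, current_values):
    """List the increase requests needed to bring a project up to the baseline."""
    parent = f"projects/{project_id}/locations/global"  # assumed parent format
    plan = []
    for key, wanted in sorted(baseline.items()):
        # Only request quotas that are currently below the baseline.
        if current_values.get(key, 0) < wanted:
            service, quota_id, region = key
            plan.append({
                "parent": parent,
                "service": service,
                "quotaId": quota_id,
                "region": region,
                "preferredValue": wanted,
            })
    return plan


if __name__ == "__main__":
    current = {
        ("compute.googleapis.com", "CPUS-per-project-region", "us-central1"): 250,
        ("compute.googleapis.com", "N2-CPUS-per-project-region", "us-central1"): 24,
    }
    for request in plan_quota_requests("new-project", BASELINE, current):
        print(request)
```

Calling this from your project creation pipeline keeps the baseline in one place instead of scattered across manual requests.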

Serve customer quota requests

If you are a SaaS provider integrated with Google Cloud, you may receive quota increase requests through a customer-facing portal other than Google Cloud console. These requests must be forwarded to Google Cloud for processing. The Cloud Quotas API can automatically forward such customer requests.

Enable client configuration version control

The Cloud Quotas API is declarative, so you can treat quota configurations as code and store them in your own version control system for history and rollback.
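A minimal sketch of what that looks like in practice: keep the desired quota values in a JSON file under version control and diff two revisions to see what a change, or a rollback, would request. The file layout (a flat map from "service/quota ID/region" to a value) and the helper name are my assumptions, not a format defined by the API.

```python
import json


def diff_quota_configs(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every entry that differs."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)  # None marks an added or removed entry
    return changes


if __name__ == "__main__":
    # Two revisions of a hypothetical quotas.json, inlined for the example.
    v1 = json.loads('{"compute.googleapis.com/CPUS-per-project-region/us-central1": 100}')
    v2 = json.loads(
        '{"compute.googleapis.com/CPUS-per-project-region/us-central1": 150,'
        ' "compute.googleapis.com/N2-CPUS-per-project-region/us-central1": 64}'
    )
    print(diff_quota_configs(v1, v2))  # what rolling forward would change
    print(diff_quota_configs(v2, v1))  # what a rollback would change
```

Reviewing this diff in a pull request gives you an audit trail before any request reaches Google Cloud.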

Learn more about the Cloud Quotas API and managing quotas using API.

In conclusion, there are quite a few quota monitoring and management options, as described above. If you are just starting to solve this problem, begin with the basics: the All quotas page, the Quotas API and your TAM (Technical Account Manager). Then work your way up, as and when needed, to the Quota Monitoring Solution, which your Google Cloud TAM can help with as well.

Vipul Raja
Google Cloud - Community

Technical Solutions Consultant @Google, focused on DevOps and ML Infra