Right way to alert on aggregated logs in Google Cloud

minherz
Google Cloud - Community
May 6, 2023

Alerting in Google Cloud differs from alerting on other platforms. Alert policies define conditions that trigger incidents, which are, confusingly, also referred to as alerts. Incidents are tracked in the Cloud Console. In addition, an alert policy can be configured to send notifications through one or more notification channels. Using notification channels, you can send a message to an email address, a phone or Slack. Notification channels also let you create actionable alerts.

Alert policies are usually defined to trigger when the values of one or more monitored metrics meet an (un)desired condition over a defined period of time. In some use cases, however, an alert should be triggered based on logs. Google Cloud documentation refers to this type of alert as a log-based alert. Log-based alerts can be created from the Logs Explorer window in the Google Cloud Console.

Using this UI, you can describe the alert condition with a log filter. The UI will notify you that the scope of logs to which the filter is applied is limited to the current project (see the post about log scopes to understand what log scopes mean).

This can be a problem when you want to define an alert on a large collection of logs from multiple projects. Many companies follow compliance requirements and Google Cloud recommendations to aggregate logs in a single location for further investigative analytics and audit purposes. They configure log sinks to route all logs in the organization to one or more log buckets in a single project designated for log retention.

In the past, this was implemented by creating a log sink with its destination set to a log bucket. Unfortunately, it is impossible to set alerts on logs that are routed this way, because these logs aren’t shown in the project scope of the destination project (where the log bucket is hosted). The only available option was to use log-based metrics and define a metric-based alert on those metrics. This workaround creates undesired overhead for DevOps teams.

The solution is a new sink service called Other project, released in April 2023. Instead of selecting Logging bucket as the sink service, you can select Other project and provide the designated project’s ID in place of [PROJECT_ID] in the following pattern: logging.googleapis.com/projects/[PROJECT_ID].

Using the Other project sink service lets you define a log-based alert in the sink's destination project, because the scope of all routed logs is set to the destination project.
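To make the difference between the two kinds of sink destination concrete, here is a minimal Python sketch that builds both destination strings. The project pattern is the one quoted above; the bucket pattern follows the usual Cloud Logging bucket resource path, and the names my-log-project, global and central-bucket are hypothetical placeholders, not values from this post.

```python
def bucket_destination(project_id: str, location: str, bucket_id: str) -> str:
    """Destination for a sink that writes directly into a log bucket."""
    return (
        f"logging.googleapis.com/projects/{project_id}"
        f"/locations/{location}/buckets/{bucket_id}"
    )


def project_destination(project_id: str) -> str:
    """Destination for a sink that uses the Other project sink service."""
    return f"logging.googleapis.com/projects/{project_id}"


# Hypothetical names, used for illustration only.
print(bucket_destination("my-log-project", "global", "central-bucket"))
print(project_destination("my-log-project"))
```

Only the second form puts the routed logs into the destination project's scope, which is what log-based alerts need.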

Note of caution when routing audit logs

All logs routed using the Other project sink are ingested into the _Default log bucket in the destination project. As such, they are subject to the _Default sink filters (see default bucket doc). The _Default sink has a predefined exclusion filter:

NOT LOG_ID("cloudaudit.googleapis.com/activity") AND NOT \
LOG_ID("externalaudit.googleapis.com/activity") AND NOT \
LOG_ID("cloudaudit.googleapis.com/system_event") AND NOT \
LOG_ID("externalaudit.googleapis.com/system_event") AND NOT \
LOG_ID("cloudaudit.googleapis.com/access_transparency") AND NOT \
LOG_ID("externalaudit.googleapis.com/access_transparency")

The conditions in the exclusion filter prevent the majority of audit logs from being ingested. If you plan to route audit logs using the Other project sink, remember to edit the exclusion filter of the _Default sink in the destination project to allow audit log ingestion. See the documentation on configuring default settings to apply the change globally.
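As a rough illustration of such an edit, the Python sketch below rebuilds the _Default sink filter with selected audit log IDs removed from the exclusion list, then prints a gcloud logging sinks update command that would apply it. The helper name build_default_filter is hypothetical, PROJECT_ID is a placeholder, and which log IDs to allow is an assumption that depends on your own requirements.

```python
from typing import Iterable

# Log IDs excluded by the predefined _Default sink filter shown above.
DEFAULT_EXCLUDED_LOG_IDS = [
    "cloudaudit.googleapis.com/activity",
    "externalaudit.googleapis.com/activity",
    "cloudaudit.googleapis.com/system_event",
    "externalaudit.googleapis.com/system_event",
    "cloudaudit.googleapis.com/access_transparency",
    "externalaudit.googleapis.com/access_transparency",
]


def build_default_filter(allow: Iterable[str]) -> str:
    """Return the _Default filter without the log IDs listed in `allow`."""
    allowed = set(allow)
    remaining = [i for i in DEFAULT_EXCLUDED_LOG_IDS if i not in allowed]
    return " AND ".join(f'NOT LOG_ID("{log_id}")' for log_id in remaining)


# Assumption: only Admin Activity audit logs need to pass through.
new_filter = build_default_filter(["cloudaudit.googleapis.com/activity"])
print(new_filter)

# Command to run in the destination project (PROJECT_ID is a placeholder).
print(
    "gcloud logging sinks update _Default "
    f"--project=PROJECT_ID --log-filter='{new_filter}'"
)
```

Review the printed filter before applying it: allowing every log ID produces an empty filter, which matches all log entries.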

Naive hands-on example

The following example shows how to define an alert that notifies users when a project in an organization has been created or deleted. The commands assume that the organization ID is ORGANIZATION_ID and that the ID of the project where the logs will be aggregated is PROJECT_ID.

Step 1 defines the Other project sink that aggregates audit logs about project creation and deletion from the whole organization into a single place:

gcloud logging sinks create sample-sink-1 \
  logging.googleapis.com/projects/PROJECT_ID \
  --organization=ORGANIZATION_ID \
  --include-children \
  --log-filter='logName:"cloudaudit.googleapis.com%2Factivity" AND protoPayload.methodName=("CreateProject" OR "DeleteProject")'
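To get a feel for which entries this sink filter selects, here is a simplified local approximation in Python. It is not the Logging query engine and does not reproduce every rule of the Logging query language (case handling, for example); it only mirrors the two conditions in the filter. The sample entries and the organization ID 123456789012 are made up for illustration.

```python
SINK_LOG_FRAGMENT = "cloudaudit.googleapis.com%2Factivity"
SINK_METHODS = {"CreateProject", "DeleteProject"}


def sink_would_route(entry: dict) -> bool:
    """Approximate the sink filter: logName 'has' the fragment AND the
    method name equals CreateProject or DeleteProject."""
    log_name = entry.get("logName", "")
    method = entry.get("protoPayload", {}).get("methodName")
    return SINK_LOG_FRAGMENT in log_name and method in SINK_METHODS


# Hypothetical log entries, trimmed to the fields the filter inspects.
sample_entries = [
    {
        "logName": "organizations/123456789012/logs/cloudaudit.googleapis.com%2Factivity",
        "protoPayload": {"methodName": "CreateProject"},
    },
    {
        "logName": "organizations/123456789012/logs/cloudaudit.googleapis.com%2Factivity",
        "protoPayload": {"methodName": "UpdateProject"},
    },
    {
        "logName": "organizations/123456789012/logs/cloudaudit.googleapis.com%2Fdata_access",
        "protoPayload": {"methodName": "DeleteProject"},
    },
]

for entry in sample_entries:
    print(entry["protoPayload"]["methodName"], sink_would_route(entry))
```

Only the first sample entry satisfies both conditions, so it is the only one the sink would route.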

Step 2 defines a log-based alert that is triggered each time a project is deleted. The documentation describes how to capture additional data about the log entries that trigger the alert.

gcloud alpha monitoring policies create --policy-from-file="alert-policy.json"

The content of alert-policy.json is as follows:

{
  "displayName": "Alert on project deletion",
  "alertStrategy": {
    "autoClose": "1800s",
    "notificationRateLimit": {
      "period": "300s"
    }
  },
  "combiner": "OR",
  "conditions": [
    {
      "conditionMatchedLog": {
        "filter": "logName:\"cloudaudit.googleapis.com%2Factivity\" AND protoPayload.methodName=\"DeleteProject\""
      },
      "displayName": "Deleted projects"
    }
  ],
  "enabled": true
}

The gcloud CLI command accepts a JSON file that describes the log-based alert policy. Its filter matches activity audit logs from any source whose protoPayload field methodName equals DeleteProject:

logName:"cloudaudit.googleapis.com%2Factivity"
protoPayload.methodName="DeleteProject"

The policy sets the minimum time between notifications to 5 minutes and automatically closes each incident after 30 minutes.
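If you prefer to generate the policy file rather than write it by hand, a minimal Python sketch could look like the following. The function name build_log_alert_policy and its parameters are hypothetical; the field names and values are the ones used in the policy above. Serializing with the json module guarantees that the result is syntactically valid JSON.

```python
import json


def build_log_alert_policy(
    method_name: str,
    display_name: str,
    auto_close_seconds: int = 1800,
    rate_limit_seconds: int = 300,
) -> dict:
    """Build a log-based alert policy for one audited method name."""
    log_filter = (
        'logName:"cloudaudit.googleapis.com%2Factivity" '
        f'AND protoPayload.methodName="{method_name}"'
    )
    return {
        "displayName": display_name,
        "alertStrategy": {
            # Incidents close automatically after this duration.
            "autoClose": f"{auto_close_seconds}s",
            # Minimum time between two notifications.
            "notificationRateLimit": {"period": f"{rate_limit_seconds}s"},
        },
        "combiner": "OR",
        "conditions": [
            {
                "conditionMatchedLog": {"filter": log_filter},
                "displayName": "Deleted projects",
            }
        ],
        "enabled": True,
    }


policy = build_log_alert_policy("DeleteProject", "Alert on project deletion")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Saving the printed output as alert-policy.json gives a file you can pass to the gcloud command from Step 2.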


minherz
Google Cloud - Community

DevRel Engineer at Google Cloud. The opinions posted here are my own, and not those of my company.