Downsampling and Exporting Google Cloud Monitoring Data

Published in Google Cloud - Community, Mar 27, 2019

Update: This post has been updated to include changes to the platform since this was first posted:

Stackdriver was rebranded to Google Cloud Monitoring.
There were breaking changes to the Cloud Monitoring Python API. See the Migration Guide for more details.

Google Cloud Monitoring contains a wealth of information about cloud resource usage, both for Google Cloud Platform (GCP) and other sources. This post explains how to use the Cloud Monitoring API to read, downsample, and export data to BigQuery. Pub/Sub metrics are used to demonstrate this. You can change the metric to a type more relevant to you or follow the [Pub/Sub Quickstart](https://cloud.google.com/pubsub/docs/quickstart-cli) to generate the metric type used in the example here.

You may want to export monitoring data for a number of reasons, including:

Ad-hoc queries of data, for example shifting and comparing time series or analyzing data in a dimension other than time. This can be especially useful for identifying waste, like virtual machines with low CPU usage, wasted disk space, or data that is never accessed. Or you might have made a change to your application and want to compare efficiency and performance.
To keep metrics data for longer than the standard retention period. The post also explains how to downsample the time series data, reducing the volume of older data from the default 1 minute intervals that Cloud Monitoring uses to 1 hour intervals. Downsampling reduces data storage costs. Data kept over the extended period can be used for long range forecasting or for analyzing long term trends. You may want to overlay external data, such as economic conditions or seasonal events.
To save data from a special event, like a performance test or a marketing launch.

Colab is a hosted variant of the IPython (Jupyter) notebook provided by Google. It is an ideal tool for our goal because it allows scripting with the Python client API to experiment with and explore data. Colab also has many client libraries for Google APIs built in and supports executing some shell commands. Once exporting your data works in Colab, you might want to move it to a regular job driven by cron or Google Cloud Scheduler.

To get started, import the monitoring_v3 APIs into Colab with the command

!pip install --upgrade google-cloud-monitoring

The command uses the Colab ! directive to execute a shell command.

Authenticate to GCP with the Python statements

from google.colab import auth
auth.authenticate_user()

This will open a new browser window for authentication.

Import the monitoring API and create a client object with the following statements

from google.cloud import monitoring_v3
client = monitoring_v3.MetricServiceClient()

Let’s set up some variables to hold input values:

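import datetime

target_project = '[Your project]'
project_resource = 'projects/{}'.format(target_project)  # API resource name
# Export interval, made explicit as UTC
start = datetime.datetime(2019, 3, 5, tzinfo=datetime.timezone.utc)
end = datetime.datetime(2019, 3, 6, tzinfo=datetime.timezone.utc)
topic_bytes = 'pubsub.googleapis.com/topic/byte_cost'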

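We are going to collect data from the target project for the one day period beginning on 2019-03-05 at 00:00 and ending on 2019-03-06 at 00:00 UTC, matching the start and end values above. If we have many projects, we can change the value of target_project and export the data to a home project. That will enable collection of metrics data from many projects into a central location. The upload and load steps later in this post also expect the variables bucket, home_project_id, and dataset to be set to your GCS bucket name, home project ID, and BigQuery dataset name.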
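As a quick check on the interval, the sketch below converts the start and end values from the code above to epoch seconds and counts the alignment periods they span. It assumes the datetimes are meant as UTC; the helper name `count_buckets` is made up for this illustration and is not part of any API.

```python
import datetime

def count_buckets(start, end, frequency):
    """Number of whole periods of `frequency` seconds between start and end."""
    span = int(end.timestamp()) - int(start.timestamp())
    return span // frequency

# Same one-day interval as in the post, with the time zone made explicit
start = datetime.datetime(2019, 3, 5, tzinfo=datetime.timezone.utc)
end = datetime.datetime(2019, 3, 6, tzinfo=datetime.timezone.utc)

print(int(start.timestamp()))            # epoch seconds at the start
print(count_buckets(start, end, 3600))   # one-hour buckets in the interval
print(count_buckets(start, end, 60))     # one-minute buckets in the interval
```

Comparing the last two numbers gives a feel for how much the row count shrinks when one-minute data is downsampled to one-hour data.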

One of the Pub/Sub metrics that we will export is topic/byte_cost. There are many more GCP Metrics that you could choose from, as well as other clouds and open source metrics.

We can find out about the metrics with the function below

def list_metric_descriptors(client, project_resource, metric):
  # The 2.x client takes a request dict (or request object) rather than
  # positional arguments
  descriptors = client.list_metric_descriptors(
      request={'name': project_resource,
               'filter': 'metric.type="{}"'.format(metric)})
  for descriptor in descriptors:
    print(descriptor)

This will give details on the metric kind (DELTA in this case), value kind (int64), description, and other details. If we look at the metric in a monitoring dashboard we will see something like the chart below for a low traffic application:

From this chart we can see that approximately 4.88 KB is sent over the Pub/Sub service every 5 minutes.

We can downsample the time series with the function below

from google.cloud.monitoring_v3.types.metric_service import ListTimeSeriesRequest
from google.protobuf import duration_pb2 as duration
from google.protobuf import timestamp_pb2 as timestamp

def to_csv_delta_metric(client, project_resource, filter, start, end,
                        frequency, colname):
  """Exports downsampled metrics to a string buffer in CSV format

  Exports the time series created between start and end with given
  frequency in seconds to a string buffer in comma separated
  variable format
  """
  end_time = timestamp.Timestamp(seconds=int(end.timestamp()))
  start_time = timestamp.Timestamp(seconds=int(start.timestamp()))
  interval = monitoring_v3.types.TimeInterval(end_time=end_time,
                                              start_time=start_time)
  # Align each series to `frequency`-second periods, then sum across series
  aggregation = monitoring_v3.types.Aggregation(
      alignment_period=duration.Duration(seconds=frequency),
      per_series_aligner=monitoring_v3.Aggregation.Aligner.ALIGN_DELTA,
      cross_series_reducer=monitoring_v3.Aggregation.Reducer.REDUCE_SUM
  )
  req = ListTimeSeriesRequest(
      name=project_resource,
      filter=filter,
      interval=interval,
      aggregation=aggregation,  # without this the raw series is returned
      view=monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL
  )
  results = client.list_time_series(request=req)
  csv = '{0},{1}\n'.format('time', colname)
  ts_array = list(results)
  if not ts_array:
    print('Did not get any results back')
    return csv
  # REDUCE_SUM with no group-by fields leaves a single combined series
  ts = ts_array[0]
  # Order points by time and label each with the end of its period
  for p in sorted(ts.points, key=lambda p: p.interval.end_time):
    t = int(p.interval.end_time.timestamp())
    v = p.value.int64_value
    csv += '{0},{1}\n'.format(t, v)
  return csv

A filter is applied to select the relevant metrics, so more than one time series could be retrieved. This downsampling function is appropriate for ‘delta’ type metrics. See Kinds of Metrics for details on delta and other metric kinds. Each series is aligned with the delta aligner (ALIGN_DELTA), and the aligned series are then combined with a sum reducer (REDUCE_SUM), since delta type metrics are typically summed. Both are set in the Aggregation object above. The results are put in a string buffer in comma separated form that we will load into Google Cloud Storage (GCS) below.
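To make the aggregation concrete, here is a minimal pure-Python sketch of what summing a delta metric into larger alignment periods amounts to. It does not call the Cloud Monitoring API; the `downsample_delta` helper and the sample values are hypothetical, and the real service handles alignment boundaries and missing points in ways this sketch ignores.

```python
from collections import OrderedDict

def downsample_delta(points, start, frequency):
    """Sum (epoch_seconds, value) delta samples into `frequency`-second buckets.

    Each bucket is labelled with its end time, the same convention used
    for the time column of the CSV built above.
    """
    buckets = OrderedDict()
    for t, v in sorted(points):
        # A sample is stamped with the end of its own interval, so a sample
        # at exactly start + frequency still belongs to the first bucket.
        index = (t - start - 1) // frequency
        bucket_end = start + (index + 1) * frequency
        buckets[bucket_end] = buckets.get(bucket_end, 0) + v
    return list(buckets.items())

# Hypothetical data: one 1000-byte sample per minute for two hours
start = 0
samples = [(start + 60 * i, 1000) for i in range(1, 121)]
hourly = downsample_delta(samples, start, 3600)
print(hourly)
```

Summing is only appropriate because each delta sample already represents the change over its own interval; a gauge metric would need a different aligner, such as a mean.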

The buffer can be uploaded to GCS with the function

def upload_gcs(bucket, buf, filename):
  fname = '/tmp/{}'.format(filename)
  with open(fname, 'w') as f:
    f.write(buf)
  print('head {}:'.format(fname))
  !gsutil cp {fname} gs://{bucket}/

These functions can be called with the code below

metric_names = ['topic/byte_cost',
    'subscription/byte_cost',
    'topic/send_request_count',
    'topic/message_sizes']
colnames = ['topic_bytes',
    'sub_bytes',
    'send_request_count',
    'message_sizes']
frequency = 3600 # 1 hour
for i in range(len(metric_names)):
  filter =  'metric.type="pubsub.googleapis.com/{0}"'.format(metric_names[i])
  filename = '{0}.csv'.format(colnames[i])
  csv_buffer = to_csv_delta_metric(client,
      project_resource,
      filter,
      start,
      end,
      frequency,
      colnames[i])
  upload_gcs(bucket, csv_buffer, filename)

This will export time series for the four Pub/Sub metrics listed, downsampled to one hour intervals. To find out more about extracting metrics using the monitoring_v3 API see Reading Metric Data.

The data can be loaded into BigQuery with the statements below

for i in range(len(colnames)):
  filename = '{0}.csv'.format(colnames[i])
  tablename = colnames[i]
  !bq --project_id={home_project_id} \
      --location=US load \
      --autodetect \
      --source_format=CSV {dataset}.{tablename} \
      gs://{bucket}/{filename}

Once the data is loaded into BigQuery we can query it. We could have run queries right after reading the time series from Cloud Monitoring above but, in general, we want to load the data into BigQuery on a regular basis and then come back some time later and do ad-hoc queries on it.

Let’s do a simple query to verify that the downsampling is correct. The data can be queried from BigQuery in Colab with the BigQuery client API.

from google.cloud import bigquery

bq_client = bigquery.Client(project=home_project_id)
df = bq_client.query('''
    SELECT
      time, topic_bytes
    FROM `{0}.topic_bytes`
    ORDER BY time'''.format(dataset)).to_dataframe()
print(df)

The data from the query is read into the Pandas Dataframe df.
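The time column holds epoch seconds, which are hard to read on a chart axis. Below is a small sketch, using only the standard library, of turning those values into ISO 8601 strings. The helper name `epoch_to_iso` is made up for this example, and it assumes the values are UTC seconds as written by the export function above.

```python
import datetime

def epoch_to_iso(seconds):
    """Render epoch seconds as an ISO 8601 UTC timestamp string."""
    return datetime.datetime.fromtimestamp(
        seconds, tz=datetime.timezone.utc).isoformat()

# 1551747600 is one hour after 2019-03-05 00:00:00 UTC
print(epoch_to_iso(1551747600))
```

With a pandas DataFrame, something like df['time'].map(epoch_to_iso) would apply the conversion to the whole column before plotting.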

The data can be viewed using a Python graphics utility, such as matplotlib. Pandas can also be used to create a simple chart with the statement below

df.plot.bar(y='topic_bytes',
            title='Topic Byte Cost',
            color=['darkgrey'])

This results in the chart

Downsampled data (x axis is time, y is bytes)

Let’s check our math. Notice that we downsampled at a frequency of one hour over a total interval of one day. There are 24 bars in the chart above, so the time adds up to the original period. Eyeballing the chart, the average hourly value is slightly below 60,000 bytes. From the monitoring dashboard chart earlier, we had 4.88 KB in a 5 minute interval. 12 * 4.88 = 58.56 KB per hour, which is close to the value in the chart (that is a relief).
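The same sanity check can be written as a few lines of Python. This is only a sketch: the `expected_per_hour` helper is hypothetical, and `observed_kb` is a made-up stand-in for a value read off the bar chart, not a number from the actual export.

```python
import math

def expected_per_hour(value_per_interval, interval_minutes):
    """Scale a per-interval total up to a per-hour total."""
    return value_per_interval * (60 / interval_minutes)

# 4.88 KB every 5 minutes, so 12 intervals per hour
expected_kb = expected_per_hour(4.88, 5)
print(round(expected_kb, 2))

# Hypothetical hourly value read off the bar chart, in KB
observed_kb = 59.0
# Treat the downsampling as plausible if the two agree to within 5%
print(math.isclose(observed_kb, expected_kb, rel_tol=0.05))
```

The 5% tolerance is an arbitrary choice for illustration; a tighter bound would make sense when comparing against exact values queried from BigQuery.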


Written by Alex Amies