Getting started with Google Cloud Monitoring APIs: Part 1
If you have your application deployed, you probably want to focus on shipping features as quickly as possible and not spend time checking whether your services are down, right? This is where monitoring comes into the picture.
Google Cloud’s operations suite, formerly known as Stackdriver, provides a built-in monitoring dashboard where you can check various metrics from your projects, whether they run on GCP or in hybrid environments.
Let’s say you want to build custom monitoring dashboards on GCP and automate the process (or you are simply a developer who is more comfortable in the terminal or editor than in the UI). You can use google-cloud-monitoring, a client library for the Monitoring APIs.
In this blog, I’m going to create a custom metric first and then use it to create our dashboard, without leaving our editor! ✨
Our focus will be on two methods specifically: create_metric_descriptor and create_time_series. You don’t necessarily have to know how APIs work, but some Python knowledge is good to have.
What is a metric anyway?
Metrics on GCP have the three elements listed below:
- A set of data points.
- Metric-type information, which tells you what the data points represent.
- Monitored-resource information, which tells you where the data points originated. This can be anything from your VMs to K8s clusters or any task that has been defined by you.
You can try out creating custom metrics in a Jupyter notebook on AI Platform in your GCP project, or follow the steps in “Setting up authentication” to use the Monitoring APIs from your own machine.
First things first! If you are a Python developer, you know the drill already :D
pip install google-cloud-monitoring
Creating custom metrics and time series
Your custom metric should have a string identifier that is unique among all the metric names in your Google Cloud project and is prefixed with custom.googleapis.com. So if your metric is called my-metric, the identifier will look like custom.googleapis.com/my-metric.
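The naming rule above can be sketched as a tiny helper. This is only an illustration: custom_metric_type is a hypothetical function name of mine, not part of the client library, and it only handles the prefixing, not any other naming rules the API may enforce.

```python
CUSTOM_PREFIX = "custom.googleapis.com"


def custom_metric_type(name: str) -> str:
    """Return the full identifier for a custom metric called `name`."""
    name = name.strip("/")
    if not name:
        raise ValueError("metric name must not be empty")
    if name.startswith(CUSTOM_PREFIX + "/"):
        # Already fully qualified, so return it unchanged.
        return name
    return f"{CUSTOM_PREFIX}/{name}"


print(custom_metric_type("my-metric"))  # custom.googleapis.com/my-metric
```

Building the identifier in one place makes it harder to end up with two slightly different spellings of the same metric across your scripts.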
Note: Because these metric identifiers have to be unique, trying to write data to an existing metric (i.e. one that already has a defined source) gives you an error. I was stuck at a 403 for some time! A tip that always works for me is to read the library’s source code to see what is happening and how. Or simply keep printing stuff out at every step 😛
Now, coming to the monitored resource: let’s say this custom metric of ours is being generated by a virtual machine, so your resource type is gce_instance. You can find the full list of resource types in the Cloud Monitoring documentation. You can add a ‘labels’ field to specify the labels that identify the resource, for example instance_id or zone. Remember that this is additional information about our data source.
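To make the shape of a monitored resource concrete, here is a minimal sketch that describes one as a plain Python dict. The helper name gce_instance_resource and all the values are placeholders I made up. As far as I know, gce_instance also takes a project_id label alongside instance_id and zone.

```python
# Labels that identify a single Compute Engine VM for the gce_instance type.
GCE_INSTANCE_LABELS = ("project_id", "instance_id", "zone")


def gce_instance_resource(project_id: str, instance_id: str, zone: str) -> dict:
    """Build a plain-dict description of a gce_instance monitored resource."""
    return {
        "type": "gce_instance",
        "labels": {
            "project_id": project_id,
            "instance_id": instance_id,
            "zone": zone,
        },
    }


# Placeholder values; swap in your own project, VM id and zone.
resource = gce_instance_resource("my-project", "1234567890123456789", "us-central1-f")
print(resource["type"], sorted(resource["labels"]))
```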
.create_metric_descriptor()
If you write metric data when a metric descriptor for that custom metric doesn’t yet exist, then a metric descriptor is created automatically. Auto-creation of metric descriptors has its limits, though, so prefer creating the descriptor yourself.
Of the two components of the descriptor, metric_kind and value_type, I was particularly confused by the metric kind. The kind of metric data tells you how to interpret the values relative to each other. A GAUGE represents the value of a metric at a particular point in time, whereas a DELTA measures the change in the value since it was last recorded. There is also a third kind, CUMULATIVE, where the value keeps accumulating from a fixed start time.
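If the difference still feels abstract, this small sketch shows the same quantity reported both ways. The readings are made-up numbers and to_deltas is a hypothetical helper, not something from the Monitoring API.

```python
# Hypothetical gauge readings: the number of queued jobs at each sample time.
gauge_readings = [12, 15, 15, 9, 20]


def to_deltas(readings):
    """Return the change between each reading and the one before it."""
    return [curr - prev for prev, curr in zip(readings, readings[1:])]


deltas = to_deltas(gauge_readings)
print(deltas)  # [3, 0, -6, 11]
```

A gauge point makes sense on its own ("there are 9 jobs right now"), while a delta point only makes sense relative to the previous one ("6 fewer than last time").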
descriptor = client.create_metric_descriptor(name=project_name, metric_descriptor=descriptor)
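For context, the call above needs a client, a project_name and a descriptor object. Here is a minimal sketch of how those pieces might fit together, assuming the 2.x version of google-cloud-monitoring and working credentials. The function names and the GAUGE/DOUBLE choice are my own assumptions for illustration.

```python
def project_path(project_id: str) -> str:
    """Return the resource name the API expects for a project."""
    return f"projects/{project_id}"


def create_gauge_descriptor(project_id: str, metric_type: str, description: str = ""):
    """Create a GAUGE metric descriptor with DOUBLE values and return it."""
    # Imported here so this file still loads on machines where the
    # google-cloud-monitoring package is not installed.
    from google.api import metric_pb2 as ga_metric
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    project_name = project_path(project_id)

    descriptor = ga_metric.MetricDescriptor()
    descriptor.type = metric_type  # e.g. custom.googleapis.com/my-metric
    descriptor.metric_kind = ga_metric.MetricDescriptor.MetricKind.GAUGE
    descriptor.value_type = ga_metric.MetricDescriptor.ValueType.DOUBLE
    descriptor.description = description

    # Same call as in the post; the response is the stored descriptor.
    return client.create_metric_descriptor(
        name=project_name, metric_descriptor=descriptor
    )
```

Calling create_gauge_descriptor("my-project-id", "custom.googleapis.com/my-metric") with your own project ID should register the metric. I left the call out of the snippet because it talks to the real API.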
Now that we have a custom metric, we can write our data points to display them on the monitoring dashboard.
You write data points by passing a list of TimeSeries objects to timeSeries.create. The maximum list size is 200. Each TimeSeries object must contain only a single Point object. If you want to write more than one point to the same time series, use a separate call to timeSeries.create for each point.
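If you ever have more than 200 time series to write, the list-size limit above means splitting them across several calls. A minimal sketch of that splitting, with plain integers standing in for TimeSeries objects (the batches helper is hypothetical, not part of the library):

```python
MAX_SERIES_PER_CALL = 200  # list-size limit quoted above


def batches(items, size=MAX_SERIES_PER_CALL):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-ins for 450 TimeSeries objects, each for a different time series.
pending = list(range(450))
print([len(chunk) for chunk in batches(pending)])  # [200, 200, 50]
```

Each chunk would then go into its own create_time_series call.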
.create_time_series()
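Here is a minimal sketch of writing a single point, again assuming google-cloud-monitoring 2.x and working credentials. The function names are mine, the resource is the gce_instance example from earlier, and every ID you pass in is your own value.

```python
import time


def now_as_seconds_and_nanos(now=None):
    """Split a Unix timestamp into whole seconds and leftover nanoseconds."""
    now = time.time() if now is None else now
    seconds = int(now)
    nanos = int((now - seconds) * 10**9)
    return seconds, nanos


def write_gauge_point(project_id, metric_type, instance_id, zone, value):
    """Write one double-valued point to a custom metric on a gce_instance."""
    # Imported here so this file still loads without the client library.
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    project_name = f"projects/{project_id}"

    series = monitoring_v3.TimeSeries()
    series.metric.type = metric_type
    series.resource.type = "gce_instance"
    series.resource.labels["instance_id"] = instance_id
    series.resource.labels["zone"] = zone

    # For a GAUGE metric only the end of the interval needs to be set.
    seconds, nanos = now_as_seconds_and_nanos()
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": seconds, "nanos": nanos}}
    )
    point = monitoring_v3.Point(
        {"interval": interval, "value": {"double_value": float(value)}}
    )
    series.points = [point]  # exactly one Point per TimeSeries

    client.create_time_series(name=project_name, time_series=[series])
```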
You can thus push values against an interval for a particular time series by changing the value for the key double_value and running the code a few times. This can be used to push data as soon as a metric value changes. You can create triggers using async tasks in Celery or make use of Cloud Pub/Sub. Don’t write data to a single time series faster than one point every 10 seconds.
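If your trigger can fire more often than that, one option is a small guard in front of the write call. This is just a sketch of the idea: SeriesThrottle is a hypothetical class of mine, and the demo uses a fake clock so it runs instantly.

```python
import time

MIN_INTERVAL_SECONDS = 10.0  # pacing limit quoted above


class SeriesThrottle:
    """Remembers the last accepted write per time series."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock
        self._last_write = {}

    def try_acquire(self, series_key):
        """Return True and record the write if enough time has passed."""
        now = self._clock()
        last = self._last_write.get(series_key)
        if last is not None and now - last < self._min_interval:
            return False
        self._last_write[series_key] = now
        return True


# Demo with a fake clock instead of real waiting.
fake_now = [0.0]
throttle = SeriesThrottle(clock=lambda: fake_now[0])
first = throttle.try_acquire("my-metric/vm-1")   # nothing written yet
fake_now[0] = 4.0
second = throttle.try_acquire("my-metric/vm-1")  # only 4 s have passed
fake_now[0] = 10.0
third = throttle.try_acquire("my-metric/vm-1")   # 10 s since the accepted write
print(first, second, third)  # True False True
```

When try_acquire returns False you can either drop the point or hold on to the latest value and send it once the window opens. Which one is right depends on your metric.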
Check the Monitoring dashboard under Metrics Explorer to watch your time series making waves! 🥳
For more information, refer to https://cloud.google.com/monitoring/custom-metrics/creating-metrics, or ping me on twitter at @arpana_naa!