Application Telemetry with Prometheus

In my last post, I explained the deployment pipeline I built for Continuous Integration and Continuous Delivery in ABAP. Another thing our team has built is application monitoring.

For our Java Spring Boot services, you can simply add the Prometheus module to the POM file, and the application is ready to be monitored with built-in metrics.

I managed to create a Prometheus client for ABAP. Below, I explain how you can set one up for your own system.

What is Prometheus?

Prometheus is a time series database. It stores streams of measurements and provides a web interface where you can query the data and visualize it, for example as a graph.

Imagine a temperature read from a thermometer at regular intervals. That is time series data. In IT operations, typical examples are CPU usage and allocated memory. In our case, we just want to monitor our application.

What I like about the concept is that the server polls the applications for data instead of the applications sending data to the server. Monitoring is therefore very unlikely to break your application: if the monitoring server is down, your applications keep running fine.
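To make the pull model concrete, here is a minimal Python sketch. It is not the ABAP client, and the names METRICS, MetricsHandler, scrape and demo are made up for illustration. The application only keeps numbers in memory and serves them when asked; the monitoring side decides when to fetch them.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory metrics kept by the application (illustrative name and value).
METRICS = {"hello_count": 3}


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current metric values; the app never contacts the monitor."""

    def do_GET(self):
        body = "".join(f"{name} {value}\n" for name, value in METRICS.items())
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def scrape(url):
    """What the monitoring server does on every poll: a plain HTTP GET."""
    # Ignore any system proxy settings, since this demo only talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.read().decode("utf-8")


def demo():
    """Start the app on a free local port, scrape it once, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return scrape(f"http://127.0.0.1:{server.server_port}/metrics")
    finally:
        server.shutdown()
        server.server_close()
```

If nobody calls scrape, the application does not notice: there is no outgoing connection that could time out or fail inside the request handling.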

Visit their website to learn more.

Why do we need to monitor our application?

Well, have you ever wondered how much the feature you built is actually used? How fast or slow it is? What if you want to experiment with which button color attracts customers better?

To answer these questions, you need some kind of monitoring, some kind of metric that you can measure.

Let’s start!

In this example, I will show you how to monitor the usage and response time of an ABAP HelloWorld RESTful API.

Cloning repository

If you have a problem cloning the Shared Memory Area and its class ZCL_SHR_PROMETHEUS_AREA, you can create it yourself, because this class is generated automatically by transaction SHMA. Make sure you configure it exactly as shown below.

Recording metrics in your application

Prepare a runtime timer for the duration metric.

Create a new instance attribute for the timer instance.

DATA runtime TYPE REF TO if_abap_runtime.

In the constructor, create the timer instance.

METHOD constructor.
  super->constructor( ).
  me->runtime = cl_abap_runtime=>create_hr_timer( ).
ENDMETHOD.

Create a new method in the REST resource class (ZCL_REST_RESOURCE).

METHODS record_metric
  IMPORTING
    i_method   TYPE string
    i_response TYPE i.

The method accepts the HTTP method and the response status code, which we will use as labels in the metrics.

METHOD record_metric.
  " A monitoring failure must never break the request, so all exceptions are swallowed.
  TRY.
      zcl_prometheus=>set_instance_from_request( me->mo_request ).
      " hello_count is incremented by 1; hello_duration is overwritten (default command).
      zcl_prometheus=>write_multiple( VALUE #(
        ( key = |hello_count\{method="{ i_method }",status="{ i_response }"\}| value = '1' command = zif_prometheus=>c_command-increment )
        ( key = |hello_duration\{method="{ i_method }",status="{ i_response }"\}| value = me->runtime->get_runtime( ) ) ) ).
    CATCH cx_root.
      " Intentionally ignored.
  ENDTRY.
ENDMETHOD.

The set_instance* methods select the Shared Memory Area instance, which you can see in transaction SHMM. This lets you use one instance per application, so the metrics of each application are kept and queried separately.

By default, set_instance_from_request derives the instance from the root endpoint (i.e. hello), unless it is specified explicitly via the instance attribute or a query parameter.

There are three fields to pass when you record metrics (via the write* methods):

  • key is the metric name and its labels, following the Prometheus data model and metric naming conventions.
  • value is the value to record.
  • command is optional and defaults to ‘overwrite’. Using increment adds value to the current value, so you don’t need to read and write it yourself.
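The difference between the two commands is easy to model outside ABAP. The following Python sketch is only an illustration of the semantics described above, not the implementation of zcl_prometheus; MetricStore and metric_key are hypothetical names, and starting a missing counter from zero is my assumption.

```python
OVERWRITE = "overwrite"
INCREMENT = "increment"


class MetricStore:
    """Toy in-memory stand-in for the shared memory area (illustration only)."""

    def __init__(self):
        self._values = {}

    def write(self, key, value, command=OVERWRITE):
        if command == INCREMENT:
            # Add to the current value; assumption: a missing key starts from zero.
            self._values[key] = self._values.get(key, 0) + value
        elif command == OVERWRITE:
            # Replace whatever was stored before.
            self._values[key] = value
        else:
            raise ValueError(f"unknown command: {command}")

    def write_multiple(self, records):
        """Apply several records, each a dict with key, value and optional command."""
        for record in records:
            self.write(**record)

    def read(self, key):
        return self._values[key]


def metric_key(name, **labels):
    """Build a key such as hello_count{method="GET",status="200"}."""
    rendered = ",".join(f'{label}="{value}"' for label, value in labels.items())
    return f"{name}{{{rendered}}}" if rendered else name
```

With this model, a counter such as hello_count is written with increment on every request, while a value such as hello_duration is simply overwritten with the latest measurement.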

Wrap the method body in TRY…CATCH so that this code will (almost) never break your application.

Then call this method at the end of each REST handler. Don’t forget to start the timer at the beginning.

For example:

METHOD if_rest_resource~get.
  me->runtime->get_runtime( ). " Start the timer
  .
  .
  .
  record_metric( i_method = mo_request->get_method( ) i_response = cl_rest_status_code=>gc_success_ok ).
ENDMETHOD.
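The same pattern (start the timer first, run the handler, record at the end, never let recording fail the request) can be sketched in Python. This is a language-neutral illustration, not part of the ABAP client; record_safely, timed_handler and the duration_us field are names I made up, and measuring in microseconds is just a choice for this sketch.

```python
import time


def record_safely(record, **fields):
    """Call the recorder but swallow any error, like TRY ... CATCH cx_root."""
    try:
        record(**fields)
    except Exception:
        pass  # monitoring must never break the request


def timed_handler(handler, record, method):
    """Wrap a handler: start a timer, run it, then record status and duration."""

    def wrapper(*args, **kwargs):
        started = time.perf_counter()  # start the timer at the beginning
        status = handler(*args, **kwargs)  # the handler returns its status code
        elapsed_us = int((time.perf_counter() - started) * 1_000_000)
        record_safely(record, method=method, status=status, duration_us=elapsed_us)
        return status

    return wrapper
```

Even if the recorder raises, the wrapped handler still returns its normal status code, which is the property the try/catch in record_metric is there for.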

Creating the metrics endpoint

Next, you need an endpoint that the Prometheus server can call to collect the metrics data.

In the REST handler class (ZCL_REST_HANDLER), add the following route in the method if_rest_application~get_root_handler.

lo_router->attach( iv_template = '/hello/metrics' iv_handler_class = zcl_prometheus_rest_resource=>c_class_name ).

Testing your endpoint

Open your /hello/metrics endpoint in a browser. You should see a blank page, because nothing has been recorded yet.

Try using your application so the metrics are recorded.

If you’re using Postman to test the APIs, you can import my Postman files from here. (Please note that Postman is a commercial product, but there’s a free version available.)

After one GET and one POST request, refresh the metrics page and you should start seeing the data.
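The page is plain text in the Prometheus exposition format: one sample per line, with the metric name, its labels in braces, and the value. Here is a small Python sketch that parses such a page. The sample values in EXAMPLE_PAGE are invented for illustration (your status codes and durations will differ), and the parser is deliberately simplified: it ignores optional timestamps and escaped quotes.

```python
import re

# name{label="value",...} value   (labels are optional)
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<value>[^"]*)"')

# Invented example of what the metrics page might contain.
EXAMPLE_PAGE = '''hello_count{method="GET",status="200"} 1
hello_count{method="POST",status="201"} 1
hello_duration{method="GET",status="200"} 1532
'''


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and # HELP / # TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            raise ValueError(f"not a valid sample line: {line!r}")
        labels = {m["key"]: m["value"] for m in LABEL.finditer(match["labels"] or "")}
        samples.append((match["name"], labels, float(match["value"])))
    return samples
```

Calling parse_metrics(EXAMPLE_PAGE) gives three samples, which is essentially what Prometheus stores together with the scrape time on every poll.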

Now, your application is ready to be monitored.

Setting up Prometheus Server

Download Prometheus from this page. Extract the package and you can run the executable without installing.

Note: If you want to install it as a Windows service, you can use NSSM.

Open a browser and go to http://localhost:9090. You should see its web UI.

Adding a new job to monitor your application

At this point, your Prometheus server does not know about your application yet, so you need to configure it first.

Edit the file prometheus.yml and add the following job under scrape_configs:

  - job_name: npl-test
    params:
      sap-client: ['001']
      sap-language: ['EN']
    metrics_path: /test/hello/metrics
    basic_auth:
      username: DEVELOPER
      password: yourpassword
    static_configs:
      - targets:
          - vhcalnplci.dummy.nodomain:8000
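It helps to know which URL such a job results in, because you can then open the same URL in a browser to debug. Prometheus combines the scheme (http unless configured otherwise), the target, the metrics_path and the params into one request URL; the basic_auth credentials travel in the Authorization header, not in the URL. A minimal Python sketch of that assembly follows, where scrape_url is a hypothetical helper of mine and not a Prometheus API:

```python
from urllib.parse import urlencode


def scrape_url(target, metrics_path="/metrics", params=None, scheme="http"):
    """Assemble the URL requested for one target of a scrape job."""
    # params maps each name to a list of values, as in prometheus.yml.
    query = urlencode(params or {}, doseq=True)
    url = f"{scheme}://{target}{metrics_path}"
    return f"{url}?{query}" if query else url


# The job from the configuration above, expressed as function arguments.
NPL_TEST_URL = scrape_url(
    "vhcalnplci.dummy.nodomain:8000",
    metrics_path="/test/hello/metrics",
    params={"sap-client": ["001"], "sap-language": ["EN"]},
)
```

If the job stays down, requesting NPL_TEST_URL manually with the same user is a quick way to see whether the problem is the path, the client, or the credentials.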

In a secured environment, you may want to use HTTPS like this:

  - job_name: npl-test
    scheme: https
    params:
      sap-client: ['001']
      sap-language: ['EN']
    metrics_path: '/test/hello/metrics'
    basic_auth:
      username: DEVELOPER
      password: yourpassword
    tls_config:
      insecure_skip_verify: true
    static_configs:
      - targets: ['vhcalnplci.dummy.nodomain:44300']

Please note that you should add one job for each application server. Don’t use the load balancer URL, because the metrics are bound to each application server. Go to transaction SM51 to see the list of all application servers on the system.

After saving, restart the service and open the web UI. Query for the up metric and click Execute. If your job is set up properly, you should see the value 1.
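The same check can be scripted against the Prometheus HTTP API, which answers instant queries at /api/v1/query with a JSON document. The Python sketch below only builds the URL and interprets a response; instant_query_url, targets_up and SAMPLE_RESPONSE are names I made up, and the timestamp in the sample is invented.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, expression):
    """Build the URL for an instant query such as `up`."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expression})}"


def targets_up(response_text):
    """Map each (job, instance) pair to True when its `up` value is 1."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    result = {}
    for series in payload["data"]["result"]:
        labels = series["metric"]
        _timestamp, value = series["value"]  # the value arrives as a string
        result[(labels.get("job"), labels.get("instance"))] = value == "1"
    return result


# Hand-written example of a successful response for the query `up`.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "up",
                    "instance": "vhcalnplci.dummy.nodomain:8000",
                    "job": "npl-test",
                },
                "value": [1700000000.0, "1"],
            }
        ],
    },
})
```

A value of "0" instead of "1" means Prometheus tried to scrape the target and failed, which usually points to the path, the credentials or the network.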

You can also try querying one of your application metrics, e.g. hello_count.

Setting up Grafana

Prometheus is good at collecting and querying time series data, but for better visualization you may want Grafana.

Grafana Sample Dashboard

What is Grafana?

Grafana is a tool for data visualization and monitoring with support for Graphite, InfluxDB, Prometheus, Elasticsearch and many more databases. In short, Grafana pulls the data from Prometheus and visualizes it on its dashboard web interface.

We set up a desktop PC with two monitor screens and put it where the team can easily see it.

Installation

Download from here. Extract and run it the same way you do for Prometheus.

Setting up data source

First, we need to set up Grafana so that it knows about the Prometheus server.

Go to http://localhost:3000 and log in with the default username and password (i.e. admin:admin). Select Data Source from the menu and click Add Data Source. Fill in the connection to your Prometheus server like below:

Setting up dashboard

We’re going to set up two graphs to monitor the API usage count and the response time.

Select Dashboards from the menu, click Home and click + New Dashboard.

Select Graph. Click the graph title and select Edit.

On the General tab, give your graph a title of your choice.

On the Metrics tab, you specify which metric data is displayed on this graph. Fill in hello_count and put {{method}} ({{status}}) in Legend format.

On the Axes tab, you can customize the graph axes.
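The legend format is a small template: each {{label}} placeholder is replaced with the value of that label on the series, so every method and status combination gets its own readable legend entry. Here is a rough Python sketch of that substitution. It is not Grafana code, render_legend is a hypothetical name, and rendering unknown labels as empty text is an assumption of this sketch.

```python
import re

# Matches {{label}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_legend(template, labels):
    """Replace each {{label}} with its value; unknown labels become empty."""
    return PLACEHOLDER.sub(lambda m: str(labels.get(m.group(1), "")), template)


# Two series of hello_count and the legend entries they would get.
LEGENDS = [
    render_legend("{{method}} ({{status}})", {"method": "GET", "status": "200"}),
    render_legend("{{method}} ({{status}})", {"method": "POST", "status": "201"}),
]
```

This is why it pays off to put the HTTP method and status into labels in record_metric rather than into the metric name.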

Once done customizing, click Back to dashboard on the top-right.

Click + ADD ROW to add a new row and create a new graph.

On the Metrics tab, fill in hello_duration and use the same Legend format.

Configure the Axes tab like this:

Once done, you should see your dashboard like this.

Don’t forget to set the time range and refresh rate so your monitor screen keeps refreshing with the latest data.

Try using your application and watch how the graphs react.


Like what you read? Give Chairat Onyaem a round of applause.
