<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Ingrid Jardillier on Medium]]></title>
        <description><![CDATA[Stories by Ingrid Jardillier on Medium]]></description>
        <link>https://medium.com/@ingrid.jardillier?source=rss-5fe6d48f202a------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*RBExOrRa1j1kBLwVUfeqfg.jpeg</url>
            <title>Stories by Ingrid Jardillier on Medium</title>
            <link>https://medium.com/@ingrid.jardillier?source=rss-5fe6d48f202a------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Fri, 15 May 2026 16:10:52 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@ingrid.jardillier/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Create a nice bar chart in Kibana Vega step by step from Elasticsearch data]]></title>
            <link>https://medium.zenika.com/create-a-nice-bar-chart-in-kibana-vega-step-by-step-from-elasticsearch-data-0e0a61c052f5?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/0e0a61c052f5</guid>
            <category><![CDATA[vegas]]></category>
            <category><![CDATA[elastic]]></category>
            <category><![CDATA[kibana]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Wed, 29 Jan 2025 07:50:14 GMT</pubDate>
            <atom:updated>2025-01-29T07:50:14.694Z</atom:updated>
            <content:encoded><![CDATA[<p>In a previous article (<a href="https://medium.com/@ingrid.jardillier/5118615f3415">Using transformations in Kibana Vega to adapt data from query DSL</a>), we saw <strong>how to retrieve data from Elasticsearch, enrich it with static data sources, and transform it for simpler exploitation</strong>. This data represented <strong>JVM memory by cluster (environment) and role (tier)</strong>, derived from monitoring data.<br><br>For the record, the main data source (named <strong>“jvm”</strong>) gave the following results after transformation:</p><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*NMybMQ1H75plKaC8Eb97fA.png" /></figure><p>In this article, we’ll see how to use our reworked data source to produce a bar chart visualization in Kibana Vega. The expected result is the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ebmdeWtatlufcLSLkcvPpg.png" /></figure><h3>Bar chart visualization</h3><h4>Definition</h4><p>The bar chart we want to set up will have:</p><ul><li>on the <strong>X-axis</strong>: the <strong>JVM memory</strong> in GB</li><li>on the <strong>Y-axis</strong>: the <strong>clusters</strong> (environments)</li></ul><p>Each environment will also have one <strong>series</strong> per <strong>role</strong> (tier).</p><h4>Set up in Vega</h4><p>As we have already defined the data for Vega in the previous article, we will focus only on creating the visualization, i.e., the <strong>scales</strong>, <strong>axes</strong>, <strong>marks</strong>, and <strong>legends</strong> in the Vega definition.</p><pre>{<br>  &quot;$schema&quot;: &quot;https://vega.github.io/schema/vega/v5.json&quot;,<br>  &quot;title&quot;: {<br>    &quot;text&quot;: &quot;JVM by cluster and role&quot;, <br>    &quot;color&quot;: &quot;black&quot;<br>  },<br>  &quot;description&quot;: &quot;Information about JVM on clusters&quot;,<br>  &quot;padding&quot;: 15,<br>  &quot;background&quot;: &quot;#FFFFFF&quot;,<br>  &quot;config&quot;: {<br>    &quot;title&quot;: { &quot;fontSize&quot;: 20 }<br>  },<br>  &quot;data&quot;: [<br>    // already done in previous article<br>  ],<br>  &quot;scales&quot;: [<br>    // TODO<br>  ],<br>  &quot;axes&quot;: [<br>    // TODO<br>  ],<br>  &quot;marks&quot;: [<br>    // TODO<br>  ],<br>  &quot;legends&quot;: [<br>    // TODO<br>  ]<br>}</pre><h3>Axes definition</h3><p>We will start by defining our 2 axes. These axes must be scaled relative to the values we wish to display.</p><h4>X-axis</h4><p>The X-axis will be displayed at the <strong>bottom</strong> of our graph (“<strong>orient</strong>”) and the value <strong>labels</strong> (“<strong>labelColor</strong>”) will be black. It will be based on a scale named “<strong>xscale</strong>”, which we will define just after. 
This gives:</p><pre>&quot;axes&quot;: [<br>  {<br>    &quot;orient&quot;: &quot;bottom&quot;, <br>    &quot;scale&quot;: &quot;xscale&quot;, <br>    &quot;labelColor&quot;: &quot;black&quot;<br>  }<br>]</pre><p>The scale to implement will therefore be of the “<strong>linear</strong>” type, as a function of the JVM memory value (data source “<strong>jvm</strong>”). It has to go from 0 to the “<strong>total_jvm</strong>” value available in the data source, and it occupies <strong>all the available width (“range”)</strong>.</p><pre>{<br>  &quot;name&quot;: &quot;xscale&quot;,<br>  &quot;type&quot;: &quot;linear&quot;,<br>  &quot;domain&quot;: {<br>    &quot;data&quot;: &quot;jvm&quot;, <br>    &quot;field&quot;: &quot;total_jvm&quot;<br>  },<br>  &quot;range&quot;: &quot;width&quot;<br>}</pre><p>This implementation of the X-axis gives:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*azLRGhYYW3Tg8IxkqY8cKg.png" /></figure><h4>Y-axis</h4><p>The Y-axis will be displayed on the <strong>left</strong> and will show the names of the clusters (environments).</p><pre>&quot;axes&quot;: [<br>  // ...<br>  {<br>    &quot;orient&quot;: &quot;left&quot;, <br>    &quot;scale&quot;: &quot;yscale&quot;, <br>    &quot;labelColor&quot;: &quot;black&quot;, <br>    &quot;tickSize&quot;: 0, <br>    &quot;labelPadding&quot;: 25<br>  }<br>]</pre><p>Setting “<strong>tickSize</strong>” to 0 makes the ticks on this axis disappear, and we add some padding after the labels with “<strong>labelPadding</strong>”.</p><p>The scale for this axis will this time be of type “<strong>band</strong>”, to allow us to group by cluster. The name of the cluster (“<strong>cluster_name</strong>”) will be displayed as the <strong>label</strong> on this axis, but the clusters will be <strong>sorted</strong> according to a predefined order contained in the “<strong>cluster_id</strong>” field. The axis will take all the <strong>space available in height</strong> (“<strong>range</strong>”).</p><pre>&quot;scales&quot;: [<br>    // ...<br>    {<br>      &quot;name&quot;: &quot;yscale&quot;,<br>      &quot;type&quot;: &quot;band&quot;,<br>      &quot;domain&quot;: {<br>        &quot;data&quot;: &quot;jvm&quot;, <br>        &quot;field&quot;: &quot;cluster_name&quot;, <br>        &quot;sort&quot;: {<br>          &quot;op&quot;: &quot;median&quot;, <br>          &quot;field&quot;: &quot;cluster_id&quot;, <br>          &quot;order&quot;: &quot;descending&quot;<br>        }<br>      },<br>      &quot;range&quot;: &quot;height&quot;<br>    }<br>]</pre><p>This implementation of the Y-axis gives:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/88/1*zEc-wl_VpABuTpJqiVVaAQ.png" /></figure><h3>Legends</h3><p>Now we’re going to look at how to add a legend. Why not do it last? To work through the difficulties of this article step by step 😉; in fact, it is possible to do it at the very end, or as needed.</p><h4>Legend values</h4><p>First, we will create the legend with only the values of the roles (the identifier used to order them). We want <strong>symbols</strong> (“<strong>type</strong>”) in the shape of a “<strong>square</strong>”. The title and labels will be in <strong>black</strong> (“<strong>titleColor</strong>” and “<strong>labelColor</strong>”). The legend will be placed at the <strong>bottom</strong> (“<strong>orient</strong>”), <strong>horizontally</strong> (“<strong>direction</strong>”). 
Setting “<strong>tickMinStep</strong>” to 1 ensures we only have integer values.</p><pre>&quot;legends&quot;: [<br>  {<br>    &quot;type&quot;: &quot;symbol&quot;,<br>    &quot;symbolType&quot;: &quot;square&quot;,<br>    &quot;fill&quot;: &quot;color&quot;,<br>    &quot;labelColor&quot;: &quot;black&quot;,<br>    &quot;title&quot;: &quot;Roles&quot;,<br>    &quot;titleColor&quot;: &quot;black&quot;,<br>    &quot;orient&quot;: &quot;bottom&quot;,<br>    &quot;direction&quot;: &quot;horizontal&quot;,<br>    &quot;tickMinStep&quot;: 1<br>  }<br>]</pre><p>To <strong>limit</strong> the legend values (to exclude an unused one, like data_cold), we could use the “<strong>values</strong>” attribute instead of “<strong>tickMinStep</strong>”.</p><pre>&quot;values&quot; : [ 1, 3, 4] // remove 2 (data_cold)</pre><p>The “<strong>fill</strong>” field is associated with a new scale (“<strong>color</strong>”), based on the “<strong>role_id</strong>” field, whose “<strong>range</strong>” lets us define the colors we want to use in our legend, one per value (we could also have used predefined color ranges).</p><pre>&quot;scales&quot;: [<br>  //...<br>  {<br>    &quot;name&quot;: &quot;color&quot;,<br>    &quot;type&quot;: &quot;linear&quot;,<br>    &quot;domain&quot;: {<br>      &quot;data&quot;: &quot;jvm&quot;, <br>      &quot;field&quot;: &quot;role_id&quot;, <br>      &quot;sort&quot;: {<br>        &quot;op&quot;: &quot;median&quot;, <br>        &quot;field&quot;: &quot;role_id&quot;, <br>        &quot;order&quot;: &quot;ascending&quot;<br>      }<br>    },<br>    &quot;range&quot;: [&quot;#c0392b&quot;, &quot;#f1c40f&quot;, &quot;#27ae60&quot;, &quot;#3498db&quot;]<br>  }<br>]</pre><h4>Legend labels</h4><p>Now that we have a good base for our legend, we will improve it by <strong>associating each sorted “role_id” with the corresponding role name as its label</strong>. For this, we will need another “<strong>scale</strong>”, which maps the “<strong>role_id</strong>” defined in the “<strong>domain</strong>” to our “<strong>role</strong>” field (“<strong>range</strong>”).</p><pre>&quot;scales&quot;: [<br>  //...<br>  {<br>    &quot;name&quot;: &quot;scale_legend_values&quot;,<br>    &quot;type&quot;: &quot;ordinal&quot;,<br>    &quot;domain&quot;: {&quot;data&quot;: &quot;jvm&quot;, &quot;field&quot;: &quot;role_id&quot;},<br>    &quot;range&quot;: {&quot;data&quot;: &quot;jvm&quot;, &quot;field&quot;: &quot;role&quot;}<br>  }<br>]</pre><p>We need to update our legend to indicate that we want to display the role label defined in the scale created previously, instead of the raw role value. 
To do this, we need to use a new property that we haven’t covered yet, namely “<strong>encode</strong>”, which allows us to customize some properties, such as the title, labels, symbols, etc.</p><p>In our case, we therefore want to update the text of the “<strong>labels</strong>”, using the lookup table defined in the “<strong>scale_legend_values</strong>” scale.</p><pre>&quot;legends&quot;: [<br>    {<br>      //...<br>      &quot;encode&quot;: {<br>        &quot;labels&quot;: {<br>          &quot;update&quot;: {<br>            &quot;text&quot;: {<br>              &quot;signal&quot;: &quot;scale(&#39;scale_legend_values&#39;, datum.value)&quot;<br>            }<br>          }<br>        }<br>      }<br>    }<br>  ]</pre><p>This implementation of the legends gives:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/234/1*Yq1p01BQEV9_JLIRbURnAw.png" /></figure><h3>Graphical Marks</h3><p>We will now cover the last part of this article, which discusses how to display the JVM memory value by cluster (environment) and role (tier). The goal is to use a bar chart visualization. In Vega, marks are a visualization&#39;s basic visual building blocks, providing basic shapes whose properties can be set.</p><h4>Faceting by cluster</h4><p>To create our bar chart, we first need to define a grouping, because we want to group our data by cluster (environment). This will be done by using the “<strong>group</strong>” mark type and creating a “<strong>facet</strong>” on the name of our clusters.</p><pre>&quot;marks&quot;: [<br>  {<br>    &quot;type&quot;: &quot;group&quot;,<br>    &quot;from&quot;: {<br>      &quot;facet&quot;: {<br>        &quot;data&quot;: &quot;jvm&quot;,<br>        &quot;name&quot;: &quot;facet&quot;,<br>        &quot;groupby&quot;: &quot;cluster_name&quot;<br>      }<br>    }<br>    // TODO<br>  }<br>]</pre><p>We also need each cluster (environment) to have its own range on the Y-axis, this time for the bar chart itself; hence a starting point for the “<strong>y</strong>” value, based on the same “<strong>yscale</strong>” defined in a previous section for the axis display.</p><pre>&quot;marks&quot;: [<br>  {<br>    &quot;type&quot;: &quot;group&quot;,<br>    &quot;from&quot;: {<br>      // ...<br>    },<br>    &quot;encode&quot;: {<br>      &quot;enter&quot;: {<br>        &quot;y&quot;: {&quot;scale&quot;: &quot;yscale&quot;, &quot;field&quot;: &quot;cluster_name&quot;}<br>      }<br>    }<br>    // TODO<br>  }<br>]</pre><p>The added lines of code allow us to properly display our clusters (environments) on the Y-axis (one band reserved per cluster).</p><h4>Scaling by role inside each cluster</h4><p>Inside each band reserved for our clusters, we must define a scale that correctly places the bar associated with each role (defined by the “<strong>role_id</strong>” field). This scale will be of type “<strong>band</strong>” and, as for the environments, we will use the “<strong>range</strong>” property set to “<strong>height</strong>” to have a band of fixed height per role. However, this time, we need a height that depends on the “<strong>yscale</strong>” scale and divides the height allocated to each environment by the number of roles. 
Therefore, we must use the “<strong>signals</strong>” functionality, i.e., dynamic variables that parameterize a visualization and can drive interactive behaviors.</p><pre>&quot;marks&quot;: [<br>  {<br>    &quot;type&quot;: &quot;group&quot;,<br>    &quot;from&quot;: {<br>      // ...<br>    },<br>    &quot;encode&quot;: {<br>      // ...<br>    },<br>    &quot;signals&quot;: [<br>      {<br>        &quot;name&quot;: &quot;height&quot;, <br>        &quot;update&quot;: &quot;bandwidth(&#39;yscale&#39;)&quot;<br>      }<br>    ],<br>    &quot;scales&quot;: [<br>      {<br>        &quot;name&quot;: &quot;role&quot;,<br>        &quot;type&quot;: &quot;band&quot;,<br>        &quot;range&quot;: &quot;height&quot;,<br>        &quot;domain&quot;: {&quot;data&quot;: &quot;facet&quot;, &quot;field&quot;: &quot;role_id&quot;, &quot;sort&quot;: true}<br>      }<br>    ]<br>    // TODO<br>  }<br>]</pre><h4>Displaying bars</h4><p>We can finally do what is necessary to display our bars representing JVM memory by cluster and role.</p><p>To do this, we will create a “<strong>rect</strong>”-type <strong>mark</strong> linked to the “facet” created previously. The following properties need to be specified in the “encode” section to customize them:</p><ul><li>“<strong>x</strong>” and “<strong>x2</strong>” are used to define the <strong>min</strong> (0) and <strong>max</strong> (based on the “<strong>total_jvm</strong>” field) of the rectangle&#39;s length on the X-axis.</li><li>“<strong>y</strong>” allows us to define where to place the rectangle on the Y-axis, relative to our “role” subscale, based on the “role_id” field.</li><li>“<strong>height</strong>” indicates that the height of the rectangle will be the height of “<strong>1</strong>” band allocated to a role.</li><li>“<strong>fill</strong>” is used to apply our “<strong>color</strong>” scale to each role.</li></ul><pre>&quot;marks&quot;: [<br>  {<br>    &quot;type&quot;: &quot;group&quot;,<br>    &quot;from&quot;: {<br>      // ...<br>    },<br>    &quot;encode&quot;: {<br>      // ...<br>    },<br>    &quot;signals&quot;: [<br>      // ...<br>    ],<br>    &quot;scales&quot;: [<br>      // ...<br>    ],<br>    &quot;marks&quot;: [<br>      {<br>        &quot;name&quot;: &quot;bars&quot;,<br>        &quot;from&quot;: {&quot;data&quot;: &quot;facet&quot;},<br>        &quot;type&quot;: &quot;rect&quot;,<br>        &quot;encode&quot;: {<br>          &quot;enter&quot;: {<br>            &quot;y&quot;: {&quot;scale&quot;: &quot;role&quot;, &quot;field&quot;: &quot;role_id&quot;},<br>            &quot;height&quot;: {&quot;scale&quot;: &quot;role&quot;, &quot;band&quot;: 1},  <br>            &quot;x2&quot;: {&quot;scale&quot;: &quot;xscale&quot;, &quot;field&quot;: &quot;total_jvm&quot;},<br>            &quot;x&quot;: {&quot;scale&quot;: &quot;xscale&quot;, &quot;value&quot;: 0},<br>            &quot;fill&quot;: {&quot;scale&quot;: &quot;color&quot;, &quot;field&quot;: &quot;role_id&quot;}<br>          }<br>        }<br>      }<br>      // TODO<br>    ]<br>  }<br>]</pre><p>This implementation of the “<strong>bars</strong>” mark gives:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*DBvJumJXnUvx39_aOe59AA.png" /></figure><p>The last thing to set up is to display the JVM memory value to the right of each rectangle. To do this, we will create another <strong>mark</strong>, of type “<strong>text</strong>”, based on our previous “<strong>bars</strong>” mark, to display our text relative to each bar, with the associated data.</p><p>This time, we will define the following properties (the sketch after this list puts them together):</p><ul><li>“<strong>x</strong>” places us at the “<strong>x2</strong>” of the “<strong>bars</strong>”, to position ourselves to the right of our bars.</li><li>“<strong>y</strong>” indicates that we start from the “<strong>y</strong>” of our “<strong>bars</strong>”, with an “<strong>offset</strong>” allowing us to center on the “<strong>height</strong>” of the bar.</li><li>“<strong>fill</strong>” to use “<strong>black</strong>” as the text color.</li><li>“<strong>align</strong>” and “<strong>baseline</strong>” (set to “<strong>middle</strong>”) to align the text.</li><li>“<strong>text</strong>” to set the text to the “<strong>total_jvm</strong>” value.</li></ul>
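<p>Putting these properties together, a minimal version of this “<strong>text</strong>” mark could look like the following sketch (the 5-pixel offset is an assumption, to be adjusted to your rendering):</p><pre>&quot;marks&quot;: [<br>  // ... inside the &quot;group&quot; mark, after the &quot;bars&quot; mark ...<br>  {<br>    &quot;type&quot;: &quot;text&quot;,<br>    &quot;from&quot;: {&quot;data&quot;: &quot;bars&quot;},<br>    &quot;encode&quot;: {<br>      &quot;enter&quot;: {<br>        &quot;x&quot;: {&quot;field&quot;: &quot;x2&quot;, &quot;offset&quot;: 5},<br>        &quot;y&quot;: {&quot;field&quot;: &quot;y&quot;, &quot;offset&quot;: {&quot;field&quot;: &quot;height&quot;, &quot;mult&quot;: 0.5}},<br>        &quot;fill&quot;: {&quot;value&quot;: &quot;black&quot;},<br>        &quot;align&quot;: {&quot;value&quot;: &quot;left&quot;},<br>        &quot;baseline&quot;: {&quot;value&quot;: &quot;middle&quot;},<br>        &quot;text&quot;: {&quot;field&quot;: &quot;datum.total_jvm&quot;}<br>      }<br>    }<br>  }<br>]</pre>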
<p>These last lines allow us to arrive at the final version of our visualization:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*EEZYfVOffzY3ZDu-W6AJfw.png" /></figure><p>We were thus able to <strong>create a visualization step by step</strong>, with <strong>Elasticsearch data coming from several data sources</strong>, <strong>fully customized to best meet our needs</strong>.</p><p>This <strong>requires a little practice</strong>, but the <strong>result is rather interesting</strong>. However, be careful: when you are looking for resources on the Internet, you will find some related to Vega and others related to Vega-Lite. Kibana supports both, but some features are missing or differ between the two.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=0e0a61c052f5" width="1" height="1" alt=""><hr><p><a href="https://medium.zenika.com/create-a-nice-bar-chart-in-kibana-vega-step-by-step-from-elasticsearch-data-0e0a61c052f5">Create a nice bar chart in Kibana Vega step by step from Elasticsearch data</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Using transformations in Kibana Vega to adapt data from query DSL]]></title>
            <link>https://medium.zenika.com/using-transformations-in-kibana-vega-to-adapt-data-from-query-dsl-5118615f3415?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/5118615f3415</guid>
            <category><![CDATA[elasticsearch]]></category>
            <category><![CDATA[transformation]]></category>
            <category><![CDATA[kibana]]></category>
            <category><![CDATA[vegas]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Wed, 15 Jan 2025 07:54:23 GMT</pubDate>
            <atom:updated>2025-01-15T07:54:23.946Z</atom:updated>
            <content:encoded><![CDATA[<p>When using Kibana to create visualizations, we often need to work on <strong>several indices/datastreams</strong> in order to retrieve all the data needed to build our visualization. However, this is <strong>not possible with traditional tools like Lens</strong> (the goal is not to create several layers with our different indices/datastreams but to aggregate them into a single source containing all the necessary information).</p><h3>Kibana custom visualization: Vega</h3><p>One solution is to use Kibana’s <strong>custom visualizations</strong>, such as <strong>Vega</strong>, which allow you to use multiple data sources built from static data or Query DSL queries. In the latter case, <strong>it can be difficult to exploit the result</strong>: the structure is not a simple table but a fairly complex JSON, especially when using aggregations.</p><p>In this article, we will see how to create <strong>several data sources</strong> in Vega, but above all how to <strong>transform</strong> them to produce something simple as output (a table) containing all the relevant information that we want to display.</p><h3>Example based on monitoring metrics</h3><p>To walk through the necessary steps, we will take an example based on data well known to Elasticsearch users, namely the data contained in <strong>.monitoring-es-*</strong>. These indices contain all the metrics needed to monitor the different clusters managed by a team/company.</p><p>We will start with a simple example: summing the total amount of JVM memory by tier, for each cluster (environment) that we manage. These clusters are managed in Elastic Cloud and are therefore identified by a UUID.</p><p>The <strong>tiers</strong> are:</p><ul><li>Hot</li><li>Warm (not used in our case)</li><li>Cold</li><li>Frozen</li></ul><p>The managed <strong>environments</strong> are:</p><ul><li>Production</li><li>PréProduction (Pre-Production)</li><li>Recette (Staging)</li><li>Intégration (Integration)</li><li>Monitoring</li></ul><h3>Creating the main query</h3><h4>Defining the main Query DSL</h4><p>Our main Query DSL will aim to retrieve the total JVM memory by cluster (cluster_uuid) and role, from the data in the node_stats dataset. 
A role can contain several nodes and the metric used to obtain the JVM is at the node level, so we must go down to the node level and then sum this JVM for all nodes in the same role, which gives:</p><pre>POST /.monitoring-es-*/_search<br>{<br>  &quot;size&quot;: 0,<br>  &quot;query&quot;: {<br>    &quot;bool&quot;: {<br>      &quot;filter&quot;: [<br>        {<br>          &quot;term&quot;: {<br>            &quot;event.dataset&quot;: &quot;elasticsearch.node.stats&quot;<br>          }<br>        }<br>      ]<br>    }<br>  }, <br>  &quot;aggs&quot;: {<br>    &quot;cluster&quot;: {<br>      &quot;terms&quot;: {<br>        &quot;field&quot;: &quot;cluster_uuid&quot;,<br>        &quot;size&quot;: 10<br>      },<br>      &quot;aggs&quot;: {<br>        &quot;role&quot;: {<br>          &quot;terms&quot;: { <br>            &quot;field&quot;: &quot;elasticsearch.node.roles&quot;,<br>            &quot;include&quot;: &quot;data_.*&quot;,<br>            &quot;exclude&quot;: &quot;data_content&quot;,<br>            &quot;min_doc_count&quot;: 0<br>          },<br>          &quot;aggs&quot;: {<br>            &quot;node&quot;: {<br>              &quot;terms&quot;: {<br>                &quot;field&quot;: &quot;elasticsearch.node.name&quot;,<br>                &quot;size&quot;: 10<br>              },<br>              &quot;aggs&quot;: {<br>                &quot;max_jvm&quot;: {<br>                  &quot;max&quot;: {<br>                    &quot;field&quot;: &quot;elasticsearch.node.stats.jvm.mem.heap.max.bytes&quot;<br>                  }<br>                }<br>              }<br>            },<br>            &quot;sum_jvm&quot;: {<br>              &quot;sum_bucket&quot;: {<br>                &quot;buckets_path&quot;: &quot;node&gt;max_jvm&quot; <br>              }<br>            }<br>          }<br>        }<br>      }<br>    }<br>  }<br>}</pre><p>To make it easier to read, we can filter the output by using the “<strong>filter_path</strong>” attribute:</p><pre>POST /.monitoring-es-*/_search?filter_path=aggregations,-**.doc_count,-**.doc_count_error_upper_bound,-**.sum_other_doc_count,-**.node</pre><p>This will only keep the data useful for the rest by removing any superfluous and intermediate fields in the calculation of the JVM by tier.</p><p>This query therefore gives in output:</p><pre>{<br>  &quot;aggregations&quot;: {<br>    &quot;cluster&quot;: {<br>      &quot;buckets&quot;: [<br>        {<br>          &quot;key&quot;: &quot;PDIJyZMaSQOtFB2LEz9kwA&quot;,<br>          &quot;role&quot;: {<br>            &quot;buckets&quot;: [<br>              {<br>                &quot;key&quot;: &quot;data_hot&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 47764733952<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_cold&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 31893487616<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_frozen&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 15921577984<br>                }<br>              }<br>            ]<br>          }<br>        },<br>        {<br>          &quot;key&quot;: &quot;fN5Y2U2HRsK1HKw1X_H77A&quot;,<br>          &quot;role&quot;: {<br>            &quot;buckets&quot;: [<br>              {<br>                &quot;key&quot;: &quot;data_hot&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  
&quot;value&quot;: 5876219904<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_cold&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 884998144<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_frozen&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 0<br>                }<br>              }<br>            ]<br>          }<br>        },<br>        {<br>          &quot;key&quot;: &quot;ARezM52EQhGoxHoFJrm1oA&quot;,<br>          &quot;role&quot;: {<br>            &quot;buckets&quot;: [<br>              {<br>                &quot;key&quot;: &quot;data_hot&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 5876219904<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_cold&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 0<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_frozen&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 0<br>                }<br>              }<br>            ]<br>          }<br>        },<br>        {<br>          &quot;key&quot;: &quot;OtrL3B00RsuFpcOVCAQ08Q&quot;,<br>          &quot;role&quot;: {<br>            &quot;buckets&quot;: [<br>              {<br>                &quot;key&quot;: &quot;data_hot&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 15728640000<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_cold&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 0<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_frozen&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 0<br>                }<br>              }<br>            ]<br>          }<br>        },<br>        {<br>          &quot;key&quot;: &quot;erIavvd3TG6naKWwGag2TQ&quot;,<br>          &quot;role&quot;: {<br>            &quot;buckets&quot;: [<br>              {<br>                &quot;key&quot;: &quot;data_hot&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 884998144<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_cold&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 884998144<br>                }<br>              },<br>              {<br>                &quot;key&quot;: &quot;data_frozen&quot;,<br>                &quot;sum_jvm&quot;: {<br>                  &quot;value&quot;: 0<br>                }<br>              }<br>            ]<br>          }<br>        }<br>      ]<br>    }<br>  }<br>}</pre><p>We realize that it is not easy to exploit the resulting JSON to create a visualization.</p><h4>Integrating the main query into a Vega datasource</h4><p>Let’s now integrate our Query DSL into a Vega datasource. 
We will therefore use the data section, respecting the Vega syntax:</p><ul><li>give a <strong>name</strong> to the datasource in order to be able to use it later when creating our visualization, or to debug it easily</li><li>in the <strong>url</strong> part, specify the query and the body with the associated parts of our Query DSL</li><li>set the <strong>format</strong> to make it easier to access fields in the resulting JSON</li><li>prepare the <strong>transform</strong> attribute for future transformations</li></ul><p>This gives:</p><pre>{<br>      &quot;name&quot;: &quot;jvm&quot;,<br>      &quot;url&quot;: {<br>        &quot;%context%&quot;: true,<br>        &quot;%timefield%&quot;: &quot;@timestamp&quot;,<br>        &quot;index&quot;: &quot;.monitoring-es-*&quot;,<br>        &quot;query&quot;: {<br>          &quot;bool&quot;: {<br>            // ...<br>          }<br>        }, <br>        &quot;body&quot;: {<br>          &quot;aggs&quot;: {<br>            // ...<br>          },<br>          &quot;size&quot;: 0<br>        }<br>      },<br>      &quot;format&quot;: {&quot;property&quot;: &quot;aggregations.cluster.buckets&quot;},<br>      &quot;transform&quot;: [<br><br>      ]<br>    }</pre><h4>Debugging the data source in Vega</h4><p>Kibana provides very handy <strong>inspection and debugging tools</strong> for Vega. Everything is done through the <strong>Inspect</strong> button in the toolbar.</p><p>The inspection pane provides 2 distinct views: <br>- <strong>View: Requests</strong> to visualize requests, responses, and statistics on DSL requests<br>- <strong>View: Vega debug</strong> to inspect datasources as they are implemented</p><p>It is the latter that interests us:</p><figure><img alt="Debugging Vega" src="https://cdn-images-1.medium.com/max/957/1*kQAJ6owgIT_Ci3yJdbzTDg.png" /></figure><p>As we have set up a format, we see that it has been taken into account, since we now see the data from the sub-attribute “<strong>aggregations</strong> / <strong>cluster</strong> / <strong>buckets</strong>”. The first column displayed (“<strong>key</strong>”) is none other than the key of our highest-level aggregation, “<strong>cluster</strong>”, i.e., the cluster_uuid. The second column indicates the number of documents used to calculate this aggregation, and the last one is a JSON representing the value of our aggregation, i.e. 
the buckets from the sub-aggregations.</p><h3>Transforming the main data source to make it usable</h3><h4>Contextualization of cluster information</h4><p>The <strong>cluster UUID</strong> is a good starting point to know which cluster it is, but for display purposes, we would rather have a <strong>cluster name</strong> providing the associated <strong>environment</strong>, and perhaps also an identifier allowing us to <strong>order our clusters</strong> by importance.</p><p>To do this, we will add a <strong>new static data source</strong> with the information allowing us to complete the missing information and make the link with our UUID:</p><pre>{<br>      &quot;name&quot;: &quot;clusters&quot;,<br>      &quot;values&quot;: [<br>        {&quot;uuid&quot;: &quot;erIavvd3TG6naKWwGag2TQ&quot;, &quot;id&quot;: 1, &quot;name&quot;: &quot;Monitoring&quot;}, <br>        {&quot;uuid&quot;: &quot;OtrL3B00RsuFpcOVCAQ08Q&quot;, &quot;id&quot;: 2, &quot;name&quot;: &quot;Recette&quot;}, <br>        {&quot;uuid&quot;: &quot;fN5Y2U2HRsK1HKw1X_H77A&quot;, &quot;id&quot;: 3, &quot;name&quot;: &quot;Intégration&quot;}, <br>        {&quot;uuid&quot;: &quot;ARezM52EQhGoxHoFJrm1oA&quot;, &quot;id&quot;: 4, &quot;name&quot;: &quot;PréProduction&quot;}, <br>        {&quot;uuid&quot;: &quot;PDIJyZMaSQOtFB2LEz9kwA&quot;, &quot;id&quot;: 5, &quot;name&quot;: &quot;Production&quot;}<br>      ]<br>}</pre><p>We will now integrate the fields that interest us directly into our initial data source (jvm), using a <strong>transformation</strong>. To do this, we will use the <strong>lookup</strong> transformation:</p><pre>{<br>    &quot;type&quot;: &quot;lookup&quot;,<br>    &quot;from&quot;: &quot;clusters&quot;,<br>    &quot;key&quot;: &quot;uuid&quot;,<br>    &quot;fields&quot;: [&quot;key&quot;],<br>    &quot;values&quot;: [&quot;id&quot;, &quot;name&quot;],<br>    &quot;as&quot;: [&quot;cluster_id&quot;, &quot;cluster_name&quot;]<br>}</pre><p>This transformation will use the “<strong>clusters</strong>” datasource, whose key is the <strong>“uuid”</strong> field, match it against the key of our current data source (“jvm”), which is the <strong>“key”</strong> field, and use this new data source to add the 2 fields named <strong>“id”</strong> and <strong>“name”</strong> to our current datasource, under the specified names <strong>“cluster_id”</strong> and <strong>“cluster_name”</strong>.</p><p>At this step, the jvm dataset becomes the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2dRzV6tg1C2CT-ccZnbubw.png" /></figure><h4>Flattening tier data (roles)</h4><p>We will now <strong>simplify the representation</strong> of the roles associated with each environment (cluster) by flattening the data, i.e., creating one line per output role (and this for each cluster). 
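<p>Schematically, a <strong>flatten</strong> transformation turns one row containing an array into one row per array element, copying the other fields along (a generic illustration with made-up values, not our exact data):</p><pre>// input row<br>{&quot;key&quot;: &quot;cluster-A&quot;, &quot;role&quot;: {&quot;buckets&quot;: [bucket1, bucket2]}}<br><br>// after {&quot;type&quot;: &quot;flatten&quot;, &quot;fields&quot;: [&quot;role.buckets&quot;], &quot;as&quot;: [&quot;role&quot;]}<br>{&quot;key&quot;: &quot;cluster-A&quot;, &quot;role&quot;: bucket1}<br>{&quot;key&quot;: &quot;cluster-A&quot;, &quot;role&quot;: bucket2}</pre>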
In our case, this can be done very easily using the following <strong>flatten</strong> transformation:</p><pre>{<br>  &quot;type&quot;: &quot;flatten&quot;, <br>  &quot;fields&quot;: [&quot;role.buckets&quot;],<br>  &quot;as&quot; : [&quot;role&quot;]<br>}</pre><p>We use the buckets field (resulting from the aggregation) of the role column to flatten the data, and we overwrite the previous value of the column since we keep the same name in the “<strong>as</strong>” parameter:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*LOQzvEcpWo27XVghMEAo5Q.png" /></figure><p>The role column now has this kind of value:</p><pre>{<br>  &quot;key&quot;: &quot;data_hot&quot;,<br>  &quot;doc_count&quot;: 25933,<br>  &quot;node&quot;: {<br>    &quot;doc_count_error_upper_bound&quot;: 0,<br>    &quot;sum_other_doc_count&quot;: 0,<br>    &quot;buckets&quot;: [<br>      {<br>        &quot;key&quot;: &quot;instance-0000000031&quot;,<br>        &quot;doc_count&quot;: 8649,<br>        &quot;max_jvm&quot;: {<br>          &quot;value&quot;: 15921577984<br>        }<br>      },<br>      {<br>        &quot;key&quot;: &quot;instance-0000000029&quot;,<br>        &quot;doc_count&quot;: 8646,<br>        &quot;max_jvm&quot;: {<br>          &quot;value&quot;: 15921577984<br>        }<br>      },<br>      {<br>        &quot;key&quot;: &quot;instance-0000000030&quot;,<br>        &quot;doc_count&quot;: 8638,<br>        &quot;max_jvm&quot;: {<br>          &quot;value&quot;: 15921577984<br>        }<br>      }<br>    ]<br>  },<br>  &quot;sum_jvm&quot;: {<br>    &quot;value&quot;: 47764733952<br>  }<br>}</pre><h4>Retrieving the JVM metric for each cluster / role</h4><p>Now that we have one line per cluster and role in the output, we can easily access the metric field that interests us, “<strong>sum_jvm</strong>” (at the end of each <strong>role</strong> value), and rework it to <strong>convert it from bytes to GB</strong>.</p><p>To do such a conversion, we will use the <strong>formula</strong> transformation:</p><pre>{<br>  &quot;type&quot;: &quot;formula&quot;, <br>  &quot;as&quot;: &quot;total_jvm&quot;, <br>  &quot;expr&quot;: &quot;ceil(datum.role.sum_jvm.value / 1024 / 1024 / 1024)&quot;<br>}</pre><p>We create a new field named “total_jvm” using the expression “expr”: for each row, we take the current (“datum”) value of the “role.sum_jvm.value” field, convert it to GB, and round it up. For example, 47764733952 / 1024 / 1024 / 1024 ≈ 44.5, which ceil rounds up to 45. This now gives:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*oNp8V8PVoBVBg4X1qfpCaQ.png" /></figure><h4>Keeping only the role name in the role column</h4><p>Since the remaining per-node buckets in the role column were only needed to calculate the sum of JVM memory by role, we won’t use them further, so we can replace the column content with just the name of the role, again using a <strong>formula</strong> transformation:</p><pre>{<br>  &quot;type&quot;: &quot;formula&quot;, <br>  &quot;as&quot;: &quot;role&quot;, <br>  &quot;expr&quot;: &quot;datum.role.key&quot;<br>}</pre><p>This formula takes each line of results and applies the expression, setting the role value to the current role.key field, i.e., the <strong>role name</strong>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*A5-gdjA9YAWJOrPD2wtGfw.png" /></figure><p>We now have a clean, usable output that can be processed into a nice and easy Vega visualization.</p><h4>Last improvement: ordering the roles</h4><p>In the same way that we added a data source for clusters to easily order them in our future visualization, we will do the same for roles, 
because we want to be able to order them in a <strong>logical order</strong>.</p><p>Let’s add a new data source for roles:</p><pre>{<br>  &quot;name&quot;: &quot;roles&quot;,<br>  &quot;values&quot;: [<br>    {&quot;id&quot;: 1, &quot;name&quot;: &quot;data_hot&quot;}, <br>    {&quot;id&quot;: 2, &quot;name&quot;: &quot;data_warm&quot;}, <br>    {&quot;id&quot;: 3, &quot;name&quot;: &quot;data_cold&quot;}, <br>    {&quot;id&quot;: 4, &quot;name&quot;: &quot;data_frozen&quot;}<br>  ]<br>}</pre><p>We then use the same kind of <strong>lookup</strong> transformation to add a new role_id field with an ordering value:</p><pre>{<br>  &quot;type&quot;: &quot;lookup&quot;,<br>  &quot;from&quot;: &quot;roles&quot;,<br>  &quot;key&quot;: &quot;name&quot;,<br>  &quot;fields&quot;: [&quot;role&quot;],<br>  &quot;values&quot;: [&quot;id&quot;],<br>  &quot;as&quot;: [&quot;role_id&quot;]<br>}</pre><p>The final result is the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NMybMQ1H75plKaC8Eb97fA.png" /></figure><p>A little simpler to exploit than our initial JSON, right?</p><h3>Conclusion about transformations in Kibana Vega</h3><p>When using Vega in a Kibana context, i.e., using Query DSL to retrieve Elasticsearch data, the resulting output is not always easy to manipulate. Moreover, we may need to query multiple indices or merge in information from other static data sources to gather all the information we need.</p><p><strong><em>In these cases, transformations are a good way to adapt the output to better exploit the data in visualizations.</em></strong></p><p>Keep in mind that transformations are applied by Kibana, so on the client side!</p><p>In a further article, we will create a Vega visualization based on the previously transformed data. Stay tuned!</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5118615f3415" width="1" height="1" alt=""><hr><p><a href="https://medium.zenika.com/using-transformations-in-kibana-vega-to-adapt-data-from-query-dsl-5118615f3415">Using transformations in Kibana Vega to adapt data from query DSL</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How to test a Ruby filter in Logstash]]></title>
            <link>https://medium.zenika.com/how-to-test-a-ruby-filter-in-logstash-804551b4b5e9?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/804551b4b5e9</guid>
            <category><![CDATA[ruby]]></category>
            <category><![CDATA[elastic-stack]]></category>
            <category><![CDATA[testing]]></category>
            <category><![CDATA[elk]]></category>
            <category><![CDATA[logstash]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Wed, 22 May 2024 07:51:44 GMT</pubDate>
            <atom:updated>2024-05-22T07:51:44.201Z</atom:updated>
            <content:encoded><![CDATA[<p>In a previous article, we saw how to share code in Logstash and create a module in a Ruby filter. In this one, we’ll show how to test our filter in order to verify that the resulting events are the expected ones.</p><h3>About our previous code</h3><p>As a reminder, the code was the following:</p><pre>require &#39;./script/denormalized_by_prizes_utils.rb&#39;<br><br># The value of `params` is the value of the hash passed to `script_params` in the logstash configuration.<br>def register(params)<br>    @keep_original_event = params[&quot;keep_original_event&quot;]<br>end<br><br># The filter method receives an event and must return a list of events.<br># Dropping an event means not including it in the return array.<br># Creating new ones only requires you to add a new instance of LogStash::Event to the returned array.<br>def filter(event)<br><br>    items = Array.new<br><br>    # Keep original event if asked<br>    originalEvent = LogStash::Util::DenormalizationByPrizesHelper::getOriginalEvent(event, @keep_original_event);<br>    if not originalEvent.nil?<br>        items.push originalEvent<br>    end<br><br>    # Get prizes items (to denormalize)<br>    prizes = LogStash::Util::DenormalizationByPrizesHelper::getPrizes(event);<br>    if prizes.nil?<br>        return items<br>    end<br>   <br>    # Create a clone base event<br>    eventBase = LogStash::Util::DenormalizationByPrizesHelper::getEventBase(event);<br><br>    # Create one event by prize item with needed modification<br>    prizes.each { |prize| <br>        items.push LogStash::Util::DenormalizationByPrizesHelper::createEventForPrize(eventBase, prize);<br>    }<br><br>    return items;<br>end</pre><p>And the whole code of denormalized_by_prizes_utils.rb:</p><pre>module LogStash::Util::DenormalizationByPrizesHelper<br>    include LogStash::Util::Loggable<br><br>    # Keep original event if asked<br>    def self.getOriginalEvent(event, keepOriginalEvent)<br>        logger.debug(&#39;keepOriginalEvent is :&#39; + keepOriginalEvent.to_s)<br>        if keepOriginalEvent.to_s == &#39;true&#39;<br>            event.set(&#39;[@metadata][_index]&#39;, &#39;prizes-original&#39;);<br>            return event;<br>        end<br>        return nil;<br>    end<br><br>    # Get prizes items (to denormalize)<br>    def self.getPrizes(event)<br>        prizes = event.get(&quot;prize&quot;);<br>        if prizes.nil?<br>            logger.warn(&quot;No prizes for event &quot; + event.to_s)<br>        end<br>        return prizes;<br>    end<br><br>    # Create a clone base event<br>    def self.getEventBase(event)<br>        eventBase = event.clone();<br>        eventBase.set(&#39;[@metadata][_index]&#39;, &#39;prizes-denormalized&#39;);<br>        eventBase.remove(&quot;prize&quot;);<br>        return eventBase;<br>    end<br><br>    # Create a clone event for current prize item with needed modification<br>    def self.createEventForPrize(eventBase, prize)<br>        eventPrize = eventBase.clone();<br>        # Copy each prize item value to prize object<br>        prize.each { |key,value|<br>            eventPrize.set(&quot;[prize][&quot; + key + &quot;]&quot;, value)<br>        }<br>        return eventPrize;<br>    end<br><br>end</pre><h3>Common syntax</h3><p>In this section, we will show how to write functional tests checking that the resulting events are the expected ones.</p><p>We can write one or more test cases and, for each test case, as many tests as needed. These tests should be written at the end of the Ruby filter file, i.e., our main file, containing the filter with the register / filter functions. 
These tests should be written at the end of the ruby filter file, ie, our main file, containing the filter with the register / filter functions.</p><p>A filter test should follow a specific syntax:</p><pre>test &quot;Test case name&quot; do<br><br>    parameters do<br>    { <br>        # The parameters to pass to the filter<br>    }<br>    end<br>    <br>    in_event { <br>        # The event arriving in the filter process<br>    }<br><br>    # The tests with expect methods<br><br>end</pre><h3>Implement tests on our Ruby filter</h3><p>In our example, we implemented denormalization, so in our tests, we will verify that we have well denormalized our original event, in different cases (keeping original event or not, one prize or two prizes in the prize list for example).</p><h4>Test cases</h4><p>So, we need the four test cases as presented below:</p><pre>test &quot;Case 1: one prize in event / don&#39;t keep original event&quot; do<br><br>    parameters do<br>    { <br>        &quot;keep_original_event&quot; =&gt; false<br>    }<br>    end<br><br>    in_event { <br>        { <br>            &quot;id&quot;        =&gt; 1, <br>            &quot;firstname&quot; =&gt; &quot;Pierre&quot;, <br>            &quot;surname&quot;   =&gt; &quot;Curie&quot;,<br>            &quot;gender&quot;    =&gt; &quot;male&quot;,<br>            &quot;prize&quot;     =&gt; [<br>                {<br>                    &quot;year&quot; =&gt; 1903,<br>                    &quot;category&quot; =&gt; &quot;physics&quot;<br>                }<br>            ]<br>        } <br>    }<br><br>    # The tests with expect methods<br><br>end<br><br>test &quot;Case 2: one prize in event / keep original event&quot; do<br><br>    parameters do<br>    { <br>        &quot;keep_original_event&quot; =&gt; true<br>    }<br>    end<br><br>    in_event { <br>        { <br>            &quot;id&quot;        =&gt; 1, <br>            &quot;firstname&quot; =&gt; &quot;Pierre&quot;, <br>            &quot;surname&quot;   =&gt; &quot;Curie&quot;,<br>            &quot;gender&quot;    =&gt; &quot;male&quot;,<br>            &quot;prize&quot;     =&gt; [<br>                {<br>                    &quot;year&quot; =&gt; 1903,<br>                    &quot;category&quot; =&gt; &quot;physics&quot;<br>                }<br>            ]<br>        } <br>    }<br><br>    # The tests with expect methods<br><br>end<br><br>test &quot;Case 3: two prizes in event / don&#39;t keep original event&quot; do<br><br>    parameters do<br>    { <br>        &quot;keep_original_event&quot; =&gt; false<br>    }<br>    end<br><br>    in_event { <br>        { <br>            &quot;id&quot;        =&gt; 2, <br>            &quot;firstname&quot; =&gt; &quot;Marie&quot;, <br>            &quot;surname&quot;   =&gt; &quot;Curie&quot;,<br>            &quot;gender&quot;    =&gt; &quot;female&quot;,<br>            &quot;prize&quot;     =&gt; [<br>                {<br>                    &quot;year&quot; =&gt; 1903,<br>                    &quot;category&quot; =&gt; &quot;physics&quot;<br>                },<br>                {<br>                    &quot;year&quot; =&gt; 1911,<br>                    &quot;category&quot; =&gt; &quot;chemistry&quot;<br>                }<br>            ]<br>        } <br>    }<br><br>    # The tests with expect methods<br><br>end<br><br>test &quot;Case 4: two prizes in event / keep original event&quot; do<br><br>    parameters do<br>    { <br>        &quot;keep_original_event&quot; =&gt; true<br>    }<br>    end<br><br>    in_event { <br>        { <br>    
        &quot;id&quot;        =&gt; 2, <br>            &quot;firstname&quot; =&gt; &quot;Marie&quot;, <br>            &quot;surname&quot;   =&gt; &quot;Curie&quot;,<br>            &quot;gender&quot;    =&gt; &quot;female&quot;,<br>            &quot;prize&quot;     =&gt; [<br>                {<br>                    &quot;year&quot; =&gt; 1903,<br>                    &quot;category&quot; =&gt; &quot;physics&quot;<br>                },<br>                {<br>                    &quot;year&quot; =&gt; 1911,<br>                    &quot;category&quot; =&gt; &quot;chemistry&quot;<br>                }<br>            ]<br>        } <br>    }<br><br>    # The tests with expect methods<br><br>end</pre><h4>Functional test implementation</h4><p>In this article, we will only implement the most complex test case (the last one). For the other ones, the principle is the same, but since the test cases differ, the expected results won’t be the same.</p>
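<p>For instance, the expect blocks for the first case (one prize, original event dropped), to be placed inside the corresponding test block, could look like this (a sketch; these exact assertions are not from the original code):</p><pre>expect(&quot;Count of events&quot;) do |events|<br>    events.length == 1<br>end<br><br>expect(&quot;Event is denormalized with its prize fields&quot;) do |events|<br>    result = true<br>    result &amp;= events[0].get(&quot;[@metadata][_index]&quot;) == &quot;prizes-denormalized&quot;<br>    result &amp;= events[0].get(&quot;[prize][year]&quot;) == 1903<br>    result &amp;= events[0].get(&quot;[prize][category]&quot;) == &quot;physics&quot;<br>    result<br>end</pre><p>So, for the last test case, we will check that:</p><ul><li>The original event is indeed in the output, without any changes</li><li>Each item in the “prize” array generates one document, so two items must generate two documents</li><li>Each generated item contains the expected common fields and the expected prize fields</li><li>So we will have 3 events in the output, each one in a dedicated index</li></ul><p>Our test can thus be written:</p><pre>test &quot;Case 4: two prizes in event / keep original event&quot; do<br><br>    parameters do<br>    { <br>        &quot;keep_original_event&quot; =&gt; true<br>    }<br>    end<br><br>    in_event { <br>        { <br>            &quot;id&quot;        =&gt; 2, <br>            &quot;firstname&quot; =&gt; &quot;Marie&quot;, <br>            &quot;surname&quot;   =&gt; &quot;Curie&quot;,<br>            &quot;gender&quot;    =&gt; &quot;female&quot;,<br>            &quot;prize&quot;     =&gt; [<br>                {<br>                    &quot;year&quot; =&gt; 1903,<br>                    &quot;category&quot; =&gt; &quot;physics&quot;<br>                },<br>                {<br>                    &quot;year&quot; =&gt; 1911,<br>                    &quot;category&quot; =&gt; &quot;chemistry&quot;<br>                }<br>            ]<br>        } <br>    }<br><br>    expect(&quot;Count of events&quot;) do |events|<br>        events.length == 3<br>    end<br><br>    expect(&quot;Each event has same shared fields&quot;) do |events|<br>        result = true<br>        events.each { |event|<br>            result &amp;= event.get(&quot;[id]&quot;) == 2<br>            result &amp;= event.get(&quot;[firstname]&quot;) == &quot;Marie&quot;<br>            result &amp;= event.get(&quot;[surname]&quot;) == &quot;Curie&quot;<br>            result &amp;= event.get(&quot;[gender]&quot;) == &quot;female&quot;<br>        }<br>        result<br>    end<br><br>    expect(&quot;Each event has good _index&quot;) do |events|  <br>        result = true<br>        result &amp;= events[0].get(&quot;[@metadata][_index]&quot;) == &quot;prizes-original&quot;<br>        result &amp;= events[1].get(&quot;[@metadata][_index]&quot;) == &quot;prizes-denormalized&quot;<br>        result &amp;= events[2].get(&quot;[@metadata][_index]&quot;) == &quot;prizes-denormalized&quot;<br>        result<br>    end<br><br>    expect(&quot;Each event has good prize fields&quot;) do |events| <br>        result = true <br>        result &amp;= events[0].get(&quot;[prize][0][year]&quot;) == 1903<br>        result &amp;= events[0].get(&quot;[prize][0][category]&quot;) == 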
&quot;physics&quot;<br>        result &amp;= events[0].get(&quot;[prize][1][year]&quot;) == 1911<br>        result &amp;= events[0].get(&quot;[prize][1][category]&quot;) == &quot;chemistry&quot;<br>        result &amp;= events[1].get(&quot;[prize][year]&quot;) == 1903<br>        result &amp;= events[1].get(&quot;[prize][category]&quot;) == &quot;physics&quot;<br>        result &amp;= events[2].get(&quot;[prize][year]&quot;) == 1911<br>        result &amp;= events[2].get(&quot;[prize][category]&quot;) == &quot;chemistry&quot;<br>        result<br>    end<br><br>end</pre><p><strong>Take care with the syntax when an expect block contains multiple assertions: you should use the &amp;&amp; or &amp;= operator to combine the assertion results.</strong></p><p>Our test case implementation is ready. Note that all test cases are run on Logstash startup, when the corresponding pipeline is created. Indeed, Logstash is able to discover all tests written in Ruby filters, and you will see all the test results in the Logstash logs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*lyls1iqvfPQNNLRUMgOjYg.png" /><figcaption>All tests passed</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_DfdZ1h4FnJOJIESRyGqDA.png" /><figcaption>One test failed</figcaption></figure><p>If a test fails, you will see it clearly, with the test case name and all the information needed (parameters, in_event, results). If at least one test fails, the associated pipeline won’t start.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=804551b4b5e9" width="1" height="1" alt=""><hr><p><a href="https://medium.zenika.com/how-to-test-a-ruby-filter-in-logstash-804551b4b5e9">How to test a Ruby filter in Logstash</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How to share Ruby code in Logstash]]></title>
            <link>https://medium.zenika.com/how-to-share-ruby-code-in-logstash-8d772ee42569?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/8d772ee42569</guid>
            <category><![CDATA[logstash]]></category>
            <category><![CDATA[ruby]]></category>
            <category><![CDATA[elastic-stack]]></category>
            <category><![CDATA[elk]]></category>
            <category><![CDATA[elasticsearch]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Wed, 22 May 2024 07:27:28 GMT</pubDate>
            <atom:updated>2024-05-22T07:27:28.896Z</atom:updated>
            <content:encoded><![CDATA[<p>In the <a href="https://medium.com/@ingrid.jardillier/logstash-denormalize-documents-part-3-4de6b99ca278">previous article</a>, we saw how to denormalize documents by writing a Ruby filter. In this one, we’ll show how to improve our code and potentially share it between filters.</p><h3>About our previous code</h3><p>As a reminder, the code was the following:</p><pre># The value of `params` is the value of the hash passed to `script_params` <br># in the logstash configuration.<br>def register(params)<br>    @keep_original_event = params[&quot;keep_original_event&quot;]<br>end<br><br># The filter method receives an event and must return a list of events.<br># Dropping an event means not including it in the return array.<br># Creating new ones only requires you to add a new instance of LogStash::Event to the returned array.<br>def filter(event)<br><br>    items = Array.new<br><br>    # Keep original event if asked<br>    logger.debug(&#39;keep_original_event is :&#39; + @keep_original_event.to_s)<br><br>    if @keep_original_event.to_s == &#39;true&#39;<br>        event.set(&#39;[@metadata][_index]&#39;, &#39;prizes-original&#39;);<br>        items.push event<br>    end<br><br>    # Get prizes items (to denormalize)<br>    prizes = event.get(&quot;prize&quot;);<br>    if prizes.nil?<br>        logger.warn(&quot;No prizes for event &quot; + event.to_s)<br>        return items<br>    end<br>   <br>    # Create a clone base event<br>    eventBase = event.clone();<br>    eventBase.set(&#39;[@metadata][_index]&#39;, &#39;prizes-denormalized&#39;);<br>    eventBase.remove(&quot;prize&quot;);<br><br>    # Create one event by prize item with needed modification<br>    prizes.each { |prize| <br>        eventPrize = eventBase.clone();<br><br>        # Copy each prize item value to prize object<br>        prize.each { |key,value|<br>            eventPrize.set(&quot;[prize][&quot; + key + &quot;]&quot;, value)<br>        }<br><br>        items.push eventPrize<br>    }<br><br>    return items<br>end</pre><p>As we can see, we only have 2 functions: the <strong>register</strong> one to read the parameters and the other to implement the <strong>filter</strong>’s behavior.</p>
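<p>For context, such a script is typically wired into a pipeline through the ruby filter’s <strong>path</strong> and <strong>script_params</strong> options. A minimal sketch (the file path here is just an assumption for this example):</p><pre>filter {<br>  ruby {<br>    path =&gt; &quot;/etc/logstash/script/denormalize_by_prizes.rb&quot;<br>    script_params =&gt; { &quot;keep_original_event&quot; =&gt; &quot;true&quot; }<br>  }<br>}</pre>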
<em>But implementing the whole feature in only one method is not the best choice, for many reasons: readability, maintainability, testability, …</em></p><h3>Sharing Ruby code</h3><p>A first way to share code is to externalize some functions in another Ruby file and call these functions in our Ruby filter.</p><p>For example, we can externalize some pieces of code into simple functions:</p><ul><li>One to get the original event (the current event, if we want to keep it)</li><li>One to get the prizes array from the event</li><li>One to construct the base event (which will be cloned for each prize)</li><li>One to create an event for each prize</li></ul><pre># Keep original event if asked<br>def getOriginalEvent(event)<br>    logger.debug(&#39;keep_original_event is :&#39; + @keep_original_event.to_s)<br>    if @keep_original_event.to_s == &#39;true&#39;<br>        event.set(&#39;[@metadata][_index]&#39;, &#39;prizes-original&#39;);<br>        return event;<br>    end<br>    return nil;<br>end<br><br># Get prizes items (to denormalize)<br>def getPrizes(event)<br>    prizes = event.get(&quot;prize&quot;);<br>    if prizes.nil?<br>        logger.warn(&quot;No prizes for event &quot; + event.to_s)<br>    end<br>    return prizes;<br>end<br><br># Create a clone base event<br>def getEventBase(event)<br>    eventBase = event.clone();<br>    eventBase.set(&#39;[@metadata][_index]&#39;, &#39;prizes-denormalized&#39;);<br>    eventBase.remove(&quot;prize&quot;);<br>    return eventBase;<br>end<br><br># Create a clone event for current prize item with needed modification<br>def createEventForPrize(eventBase, prize)<br>    eventPrize = eventBase.clone();<br>    # Copy each prize item value to prize object<br>    prize.each { |key,value|<br>        eventPrize.set(&quot;[prize][&quot; + key + &quot;]&quot;, value)<br>    }<br>    return eventPrize;<br>end</pre><p>The previous code is written in a file named<em> denormalized_by_prizes_utils.rb.</em></p><p>The main code of the filter will then be the following:</p><pre>require &#39;./script/denormalized_by_prizes_utils.rb&#39;<br><br># The value of `params` is the value of the hash passed to `script_params` in the logstash configuration.<br>def register(params)<br>    @keep_original_event = params[&quot;keep_original_event&quot;]<br>end<br><br># The filter method receives an event and must return a list of events.<br># Dropping an event means not including it in the return array.<br># Creating new ones only requires you to add a new instance of LogStash::Event to the returned array.<br>def filter(event)<br><br>    items = Array.new<br><br>    # Keep original event if asked<br>    originalEvent = getOriginalEvent(event);<br>    if not originalEvent.nil?<br>        items.push originalEvent<br>    end<br><br>    # Get prizes items (to denormalize)<br>    prizes = getPrizes(event);<br>    if prizes.nil?<br>        return items<br>    end<br>   <br>    # Create a clone base event<br>    eventBase = getEventBase(event);<br><br>    # Create one event per prize item with the needed modifications<br>    prizes.each { |prize| <br>        items.push createEventForPrize(eventBase, prize);<br>    }<br><br>    return items;<br>end</pre><p>The code is much easier to read than the previous version, and we directly see the different steps of the filter’s feature. Maintainability is also improved, with small, well-scoped, easy-to-understand functions.</p><p>But if you have multiple files sharing code and a filter requiring multiple files, you can get name collisions, which partially defeats this maintainability gain.</p>
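<p><em>To illustrate the collision risk, here is a minimal sketch (file and function names are ours, not from the real project): top-level functions defined in required Ruby files all land in the same namespace, so the last definition loaded silently wins.</em></p><pre># a_utils.rb (hypothetical)<br>def getEventBase(event)<br>    return event.clone();<br>end<br><br># b_utils.rb (hypothetical)<br>def getEventBase(event)<br>    # silently replaces the definition from a_utils.rb<br>    return nil;<br>end<br><br># in the filter script<br>require &#39;./script/a_utils.rb&#39;<br>require &#39;./script/b_utils.rb&#39;<br># from here, getEventBase refers to the b_utils.rb version</pre>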
<h3>Creating a module</h3><p>Another way to share code is to create a module. A module groups pieces of code belonging to the same functional scope. No collision is possible, since the module name must be given before each use of a shared function.</p><p>The previous shared functions become:</p><pre>module LogStash::Util::DenormalizationByPrizesHelper<br>    include LogStash::Util::Loggable<br><br>    # Keep original event if asked<br>    def self.getOriginalEvent(event, keepOriginalEvent)<br>        logger.debug(&#39;keepOriginalEvent is :&#39; + keepOriginalEvent.to_s)<br>        if keepOriginalEvent.to_s == &#39;true&#39;<br>            event.set(&#39;[@metadata][_index]&#39;, &#39;prizes-original&#39;);<br>            return event;<br>        end<br>        return nil;<br>    end<br><br>    # Get prizes items (to denormalize)<br>    def self.getPrizes(event)<br>        prizes = event.get(&quot;prize&quot;);<br>        if prizes.nil?<br>            logger.warn(&quot;No prizes for event &quot; + event.to_s)<br>        end<br>        return prizes;<br>    end<br><br>    # Create a clone base event<br>    def self.getEventBase(event)<br>        eventBase = event.clone();<br>        eventBase.set(&#39;[@metadata][_index]&#39;, &#39;prizes-denormalized&#39;);<br>        eventBase.remove(&quot;prize&quot;);<br>        return eventBase;<br>    end<br><br>    # Create a clone event for current prize item with needed modification<br>    def self.createEventForPrize(eventBase, prize)<br>        eventPrize = eventBase.clone();<br>        # Copy each prize item value to prize object<br>        prize.each { |key,value|<br>            eventPrize.set(&quot;[prize][&quot; + key + &quot;]&quot;, value)<br>        }<br>        return eventPrize;<br>    end<br><br>end</pre><p>We need to include the <strong>Loggable</strong> Util module to be able to use the <em>logger</em> instance.</p><p>The main code will then be:</p><pre>require &#39;./script/denormalized_by_prizes_utils.rb&#39;<br><br># The value of `params` is the value of the hash passed to `script_params` in the logstash configuration.<br>def register(params)<br>    @keep_original_event = params[&quot;keep_original_event&quot;]<br>end<br><br># The filter method receives an event and must return a list of events.<br># Dropping an event means not including it in the return array.<br># Creating new ones only requires you to add a new instance of LogStash::Event to the returned array.<br>def filter(event)<br><br>    items = Array.new<br><br>    # Keep original event if asked<br>    originalEvent = LogStash::Util::DenormalizationByPrizesHelper::getOriginalEvent(event, @keep_original_event);<br>    if not originalEvent.nil?<br>        items.push originalEvent<br>    end<br><br>    # Get prizes items (to denormalize)<br>    prizes = LogStash::Util::DenormalizationByPrizesHelper::getPrizes(event);<br>    if prizes.nil?<br>        return items<br>    end<br>   <br>    # Create a clone base event<br>    eventBase = LogStash::Util::DenormalizationByPrizesHelper::getEventBase(event);<br><br>    # Create one event per prize item with the needed modifications<br>    prizes.each { |prize| <br>        items.push LogStash::Util::DenormalizationByPrizesHelper::createEventForPrize(eventBase, prize);<br>    }<br><br>    return items;<br>end</pre><p>The main code does not need many modifications: we only prefix the function calls with the module name. So even if several modules integrated in your filter’s feature define a function named getEventBase (or any other name), there will be no collision, and readability is improved because you explicitly set the module to be used in each case.</p>
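<p><em>As a side note (our suggestion, not part of the original filter), if the fully qualified module name feels verbose, you can alias it to a shorter constant at the top of the script while keeping the calls explicit:</em></p><pre>require &#39;./script/denormalized_by_prizes_utils.rb&#39;<br><br># Hypothetical shorthand for the helper module<br>Helper = LogStash::Util::DenormalizationByPrizesHelper<br><br># Calls then become, for example:<br># prizes = Helper::getPrizes(event)</pre>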
<p>In a future article, we will talk about testing our filter’s code…</p><hr><p><a href="https://medium.zenika.com/how-to-share-ruby-code-in-logstash-8d772ee42569">How to share Ruby code in Logstash</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Logstash — Denormalize documents (Part 3)]]></title>
            <link>https://medium.zenika.com/logstash-denormalize-documents-part-3-4de6b99ca278?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/4de6b99ca278</guid>
            <category><![CDATA[elastic]]></category>
            <category><![CDATA[elastic-stack]]></category>
            <category><![CDATA[ruby]]></category>
            <category><![CDATA[denormalization]]></category>
            <category><![CDATA[logstash]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Thu, 02 May 2024 07:00:16 GMT</pubDate>
            <atom:updated>2025-01-09T08:32:33.581Z</atom:updated>
            <content:encoded><![CDATA[<h3>Logstash — Denormalizing documents — Implementation</h3><p>As we saw in a <a href="https://medium.com/@ingrid.jardillier/logstash-denormalize-documents-part-2-ab1068eb8228">previous article</a>, one of the solutions to improve the exploitation of documents with arrays is to use Logstash to denormalize documents. In this article, we will implement this denormalization for a simple example.</p><h3>Principle</h3><p>We spoke a lot about <strong>denormalization</strong>, but what does it mean in our case?</p><p>As we saw in previous articles, the default ingestion of our JSON objects results in the prize’s fields being arrays.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*RK5-EK0t_y6X0JAph4oG4g.png" /><figcaption>prizes-original index</figcaption></figure><p>The denormalization process will <strong>clone</strong> existing documents with multiple prizes and flatten the prize’s fields. So, for the document with id 2, it will create 2 documents: one for the 1903 physics prize and one for the 1911 chemistry prize.</p><p>The result will be the following:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qC9MNGNjNKuWyWZxqcLyRg.png" /><figcaption>prizes-denormalized index</figcaption></figure>
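<p><em>Concretely, here is a sketch of the two documents produced for the document with id 2 (values taken from the sample data of Part 1; the exact field set depends on your pipeline):</em></p><pre>{&quot;id&quot;: 2, &quot;firstname&quot;: &quot;Marie&quot;, &quot;surname&quot;: &quot;Curie&quot;, &quot;gender&quot;: &quot;female&quot;, &quot;prize&quot;: {&quot;year&quot;: 1903, &quot;category&quot;: &quot;physics&quot;}}<br>{&quot;id&quot;: 2, &quot;firstname&quot;: &quot;Marie&quot;, &quot;surname&quot;: &quot;Curie&quot;, &quot;gender&quot;: &quot;female&quot;, &quot;prize&quot;: {&quot;year&quot;: 1911, &quot;category&quot;: &quot;chemistry&quot;}}</pre>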
<h3><strong>Implementation</strong></h3><p>To implement our denormalization, we just have to change our logstash configuration to add a ruby filter, which will perform the denormalization.</p><p>As we want to keep the two types of documents (original and denormalized), we will set the index name in the @metadata object and use it in the elasticsearch output. And we’ll use the <em>keep_original_event</em> boolean parameter to indicate whether we want to keep the original document or not.</p><pre>input {<br>  file {<br>    id =&gt; &quot;prizes&quot;<br>    path =&gt; &quot;/usr/share/logstash/pipeline/file/prizes.json&quot;<br>    mode =&gt; &quot;read&quot;<br>    codec =&gt; &quot;json&quot;<br>    start_position =&gt; &quot;beginning&quot;<br>    sincedb_path =&gt; &quot;/dev/null&quot;<br>  }<br>}<br><br>filter {<br> json {<br>  source =&gt; message<br>  remove_field =&gt; message<br> }<br> mutate {<br>  remove_field =&gt; [&quot;@timestamp&quot;, &quot;@version&quot;, &quot;event&quot;, &quot;host&quot;, &quot;log&quot;]<br> }<br>}<br><br>filter {<br> ruby {<br>        id =&gt; &quot;denormalized-by-prizes&quot;<br>        path =&gt; &quot;/usr/share/logstash/pipeline/file/denormalized_by_prizes.rb&quot;<br>        script_params =&gt; {<br>          &quot;keep_original_event&quot; =&gt; true<br>        }<br> }<br> mutate {<br>  remove_field =&gt; [&quot;@timestamp&quot;, &quot;@version&quot;, &quot;event&quot;, &quot;host&quot;, &quot;log&quot;]<br> }<br>}<br><br>output {<br> stdout {<br>  codec =&gt; rubydebug { metadata =&gt; true }<br> }<br>}<br><br>output {<br> elasticsearch {<br>  index =&gt; &quot;%{[@metadata][_index]}&quot;<br>  hosts =&gt; [&quot;https://es01:9200&quot;,&quot;https://es02:9200&quot;,&quot;https://es03:9200&quot;]<br>  ssl_certificate_authorities =&gt; [&quot;/usr/share/logstash/certs/ca/ca.crt&quot;]<br>  user =&gt; &quot;elastic&quot;<br>  password =&gt; &quot;${ELASTIC_PASSWORD}&quot;<br> }<br>}</pre><p>The code of the ruby plugin is then:</p><pre># The value of `params` is the value of the hash passed to `script_params` <br># in the logstash configuration.<br>def register(params)<br>    @keep_original_event = params[&quot;keep_original_event&quot;]<br>end<br><br># The filter method receives an event and must return a list of events.<br># Dropping an event means not including it in the return array.<br># Creating new ones only requires you to add a new instance of LogStash::Event to the returned array.<br>def filter(event)<br><br>    items = Array.new<br><br>    # Keep original event if asked<br>    logger.debug(&#39;keep_original_event is :&#39; + @keep_original_event.to_s)<br><br>    if @keep_original_event.to_s == &#39;true&#39;<br>        event.set(&#39;[@metadata][_index]&#39;, &#39;prizes-original&#39;);<br>        items.push event<br>    end<br><br>    # Get prizes items (to denormalize)<br>    prizes = event.get(&quot;prize&quot;);<br>    if prizes.nil?<br>        logger.warn(&quot;No prizes for event &quot; + event.to_s)<br>        return items<br>    end<br>   <br>    # Create a clone base event<br>    eventBase = event.clone();<br>    eventBase.set(&#39;[@metadata][_index]&#39;, &#39;prizes-denormalized&#39;);<br>    eventBase.remove(&quot;prize&quot;);<br><br>    # Create one event per prize item with the needed modifications<br>    prizes.each { |prize| <br>        eventPrize = eventBase.clone();<br><br>        # Copy each prize item value to prize object<br>        prize.each { |key,value|<br>            eventPrize.set(&quot;[prize][&quot; + key + &quot;]&quot;, value)<br>        }<br><br>        items.push eventPrize<br>    }<br><br>    return items<br>end</pre><p>In this filter, the principle is the following:</p><ul><li>We create an <em>items</em> array that will contain all the documents we want in the output (the original one, if <em>keep_original_event</em> is set to true, and the denormalized ones).</li><li>We keep the prizes array of the current event in memory.</li><li>We create a base event to clone. This step is optional if events are light (everything can be done in the each loop), but it can be better for heavy events, for performance reasons.</li><li>We loop over the <em>prize</em> array, clone the base event, and set all the prize item’s fields in a <em>prize</em> object. We then push the cloned event.</li></ul><h3>Querying on this field</h3><p>Now, let’s add the same KQL filter as in the previous part, but this time on the <em>prizes-denormalized</em> index:</p><pre>prize.year : 1903 and prize.category : &quot;chemistry&quot; </pre><p>This doesn’t return any result, as expected!</p><p>We have to use a relevant filter to obtain results. For example:</p><pre>prize.year : 1903 and prize.category : &quot;physics&quot; </pre><p>will return:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*raO36wRtf-B3cZLBFTmD-Q.png" /></figure><p><strong>Warning</strong>:<strong><em> Be advised that cloning events can be an expensive process. You will have to add performance tests to check that event processing durations conform to your needs.</em></strong></p><p>In a future article, we will show how to improve our code’s readability and how to test our filter.</p><hr><p><a href="https://medium.zenika.com/logstash-denormalize-documents-part-3-4de6b99ca278">Logstash — Denormalize documents (Part 3)</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Logstash — Denormalize documents (Part 2)]]></title>
            <link>https://medium.zenika.com/logstash-denormalize-documents-part-2-ab1068eb8228?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/ab1068eb8228</guid>
            <category><![CDATA[elastic]]></category>
            <category><![CDATA[elastic-stack]]></category>
            <category><![CDATA[ruby]]></category>
            <category><![CDATA[logstash]]></category>
            <category><![CDATA[denormalization]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Thu, 02 May 2024 06:58:24 GMT</pubDate>
            <atom:updated>2025-01-09T08:35:37.681Z</atom:updated>
            <content:encoded><![CDATA[<h3>Logstash — Denormalizing documents — In which case to use it</h3><p>In this article, we will use the previous example (described in a <a href="https://medium.com/@ingrid.jardillier/logstash-denormalize-documents-part-1-aa674dab6c1d">previous article</a>) to expose the problem of not using denormalization.</p><h3>Problem</h3><p>If you create the <em>prizes-*</em> data view and go to the Discover App, you can have a look at our ingested data:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*JFC87sQZiFuNyuQmSeM6gA.png" /><figcaption>Documents in prizes-original</figcaption></figure><p>We can see that the JSON object with two prizes is rendered there with <em>prize.year</em> and <em>prize.category</em> as <strong>arrays</strong>.</p><p>If you want to look for prizes in 1903 for the “chemistry” category, you can add a query (using the KQL syntax):</p><pre>prize.year : 1903 and prize.category : &quot;chemistry&quot;</pre><p>This request returns a single result:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ouB2LZpaicpU1qCyChUmPA.png" /></figure><p>But if we check our original JSON file, we had this:</p><pre>[<br>  {<br>    &quot;year&quot; : 1903,<br>    &quot;category&quot; : &quot;physics&quot;<br>  },<br>  {<br>    &quot;year&quot; : 1911,<br>    &quot;category&quot;:&quot;chemistry&quot;<br>  }<br>]</pre><p>So the prize received in 1903 was for “physics” and the one received in 1911 for “chemistry”.</p><p><strong>So, when we query for year 1903 and category “chemistry”, we should not obtain any result!</strong></p><p><strong>But Elasticsearch doesn’t keep the link between the items at the same position in the different arrays.</strong></p><p>For Elasticsearch, the field <em>prize.year</em> contains 1903 and the field <em>prize.category</em> contains “chemistry”, so the document matches the query.</p><h3>Resolution</h3><p>One method to resolve this problem is to use the <strong>nested</strong> type, but it comes with some limitations: it can easily degrade <strong>performance</strong>, and it is <strong>not fully implemented in Kibana</strong>, so it is only interesting if you are using the Elasticsearch API to query your documents.</p><p>The second method is to <strong>denormalize</strong> documents in order to create one document per prize. This can be done with <strong>Logstash</strong>, and that is what we will describe in our next article.</p><hr><p><a href="https://medium.zenika.com/logstash-denormalize-documents-part-2-ab1068eb8228">Logstash — Denormalize documents (Part 2)</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Logstash — Denormalize documents (Part 1)]]></title>
            <link>https://medium.zenika.com/logstash-denormalize-documents-part-1-aa674dab6c1d?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/aa674dab6c1d</guid>
            <category><![CDATA[denormalization]]></category>
            <category><![CDATA[elastic]]></category>
            <category><![CDATA[elastic-stack]]></category>
            <category><![CDATA[logstash]]></category>
            <category><![CDATA[ruby]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Thu, 02 May 2024 06:55:36 GMT</pubDate>
            <atom:updated>2025-01-09T08:34:06.225Z</atom:updated>
            <content:encoded><![CDATA[<h3>Logstash — Denormalizing documents — The concept</h3><p>In this article, we will take a simple example which highlights the need for denormalization.</p><p>Feel free to use my ELK docker compose to reproduce this example: <a href="https://github.com/ijardillier/docker-elk">https://github.com/ijardillier/docker-elk</a></p><h3>Why denormalization?</h3><p>When we ingest data, we may need to transform it in order for it to be fully usable and relevant. <strong>Denormalization is a way of creating as many documents as there are items in an array field</strong>. In this way, we <strong>improve querying</strong> on this flattened field.</p><p>That’s what we will explain in the different parts of this article.</p><h3>Simple example</h3><h4>Index template</h4><p>This index template will be used to store prizes with a few fields, just to understand what happens without denormalization:</p><pre>POST _index_template/prizes<br>{<br>  &quot;index_patterns&quot;: [&quot;prizes-*&quot;],<br>  &quot;template&quot;: {<br>    &quot;mappings&quot;: {<br>      &quot;properties&quot;: {<br>        &quot;id&quot;: {<br>          &quot;type&quot;: &quot;long&quot;<br>        },<br>        &quot;firstname&quot;: {<br>          &quot;type&quot;: &quot;keyword&quot;,<br>          &quot;ignore_above&quot;: 256<br>        },<br>        &quot;surname&quot;: {<br>          &quot;type&quot;: &quot;keyword&quot;,<br>          &quot;ignore_above&quot;: 256<br>        },<br>        &quot;gender&quot;: {<br>          &quot;type&quot;: &quot;keyword&quot;,<br>          &quot;ignore_above&quot;: 256<br>        },<br>        &quot;prize&quot;: {<br>          &quot;properties&quot;: {<br>            &quot;category&quot;: {<br>              &quot;type&quot;: &quot;keyword&quot;,<br>              &quot;ignore_above&quot;: 256<br>            },<br>            &quot;year&quot;: {<br>              &quot;type&quot;: &quot;integer&quot;<br>            }<br>          }<br>        }<br>      }<br>    }<br>  }<br>}</pre><h4>Data</h4><p>We have the following prizes.json file containing all our prizes:</p><pre>{&quot;id&quot;:1,&quot;firstname&quot;:&quot;Pierre&quot;,&quot;surname&quot;:&quot;Curie&quot;,&quot;gender&quot;:&quot;male&quot;,&quot;prize&quot;:[{&quot;year&quot;:1903,&quot;category&quot;:&quot;physics&quot;}]}<br>{&quot;id&quot;:2,&quot;firstname&quot;:&quot;Marie&quot;,&quot;surname&quot;:&quot;Curie&quot;,&quot;gender&quot;:&quot;female&quot;,&quot;prize&quot;:[{&quot;year&quot;:1903,&quot;category&quot;:&quot;physics&quot;},{&quot;year&quot;:1911,&quot;category&quot;:&quot;chemistry&quot;}]}<br>{&quot;id&quot;:3,&quot;firstname&quot;:&quot;Frédéric&quot;,&quot;surname&quot;:&quot;Joliot&quot;,&quot;gender&quot;:&quot;male&quot;,&quot;prize&quot;:[{&quot;year&quot;:1935,&quot;category&quot;:&quot;chemistry&quot;}]}<br>{&quot;id&quot;:4,&quot;firstname&quot;:&quot;Irène&quot;,&quot;surname&quot;:&quot;Joliot-Curie&quot;,&quot;gender&quot;:&quot;female&quot;,&quot;prize&quot;:[{&quot;year&quot;:1935,&quot;category&quot;:&quot;chemistry&quot;}]}</pre><p>You can see that one of our JSON objects contains 2 prizes, in two different categories, and not for the same year.</p>
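<p><em>For illustration (a sketch of the flattened view, not an actual API response): once ingested with this mapping, Elasticsearch effectively sees the prize fields of document 2 as parallel arrays, which is what the next parts build on:</em></p><pre>{<br>  &quot;id&quot;: 2,<br>  &quot;firstname&quot;: &quot;Marie&quot;,<br>  &quot;surname&quot;: &quot;Curie&quot;,<br>  &quot;gender&quot;: &quot;female&quot;,<br>  &quot;prize.year&quot;: [1903, 1911],<br>  &quot;prize.category&quot;: [&quot;physics&quot;, &quot;chemistry&quot;]<br>}</pre>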
<h4>Logstash configuration</h4><p>This logstash configuration will just read this JSON file as JSON content and send it to Elasticsearch:</p><pre>input {<br>  file {<br>    id =&gt; &quot;prizes&quot;<br>    path =&gt; &quot;/usr/share/logstash/pipeline/file/prizes.json&quot;<br>    mode =&gt; &quot;read&quot;<br>    codec =&gt; &quot;json&quot;<br>    start_position =&gt; &quot;beginning&quot;<br>    sincedb_path =&gt; &quot;/dev/null&quot;<br>  }<br>}<br><br>filter {<br> json {<br>  source =&gt; message<br>  remove_field =&gt; message<br> }<br> mutate {<br>  remove_field =&gt; [&quot;@timestamp&quot;, &quot;@version&quot;, &quot;event&quot;, &quot;host&quot;, &quot;log&quot;]<br> }<br>}<br><br>output {<br> stdout {<br>  codec =&gt; rubydebug { metadata =&gt; true }<br> }<br>}<br><br>output {<br>  elasticsearch {<br>    index =&gt; &quot;prizes-original&quot;<br>    hosts =&gt; [&quot;https://es01:9200&quot;,&quot;https://es02:9200&quot;,&quot;https://es03:9200&quot;]<br>    ssl_certificate_authorities =&gt; [&quot;/usr/share/logstash/certs/ca/ca.crt&quot;]<br>    user =&gt; &quot;elastic&quot;<br>    password =&gt; &quot;${ELASTIC_PASSWORD}&quot;<br>  }<br>}</pre><p>In this configuration, we:</p><ul><li>read from the beginning and don’t use sincedb (each time you restart logstash, it will re-read the file)</li><li>parse the JSON content to extract fields</li><li>keep only useful fields, to concentrate on the important stuff</li><li>send documents to stdout and to elasticsearch, in a <em>prizes-original</em> index.</li></ul><p>The next <a href="https://medium.com/@ingrid.jardillier/logstash-denormalize-documents-part-2-ab1068eb8228">article</a> will highlight the need for denormalization.</p><hr><p><a href="https://medium.zenika.com/logstash-denormalize-documents-part-1-aa674dab6c1d">Logstash — Denormalize documents (Part 1)</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Using Elasticsearch searchable snapshots in cold / frozen tiers]]></title>
            <link>https://medium.zenika.com/using-elasticsearch-searchable-snapshots-in-cold-frozen-tiers-e01fa72ce8ee?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/e01fa72ce8ee</guid>
            <category><![CDATA[cold]]></category>
            <category><![CDATA[frozen]]></category>
            <category><![CDATA[ilm]]></category>
            <category><![CDATA[elasticsearch]]></category>
            <category><![CDATA[searchable-snapshots]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Fri, 26 Jan 2024 14:10:39 GMT</pubDate>
            <atom:updated>2024-01-26T14:10:39.378Z</atom:updated>
            <content:encoded><![CDATA[<p>In this article, I’ll present how to use Elasticsearch searchable snapshots in the cold and frozen tiers in order to reduce disk usage.</p><p><strong><em>Warning</em></strong>:<strong><em> Searchable snapshots are a paid feature. You must have an Enterprise license to use them.</em></strong></p><h3>Data tiers architecture</h3><p>In a complete Elasticsearch architecture, you may have the following tiers:</p><ul><li><strong>Hot nodes:</strong> handle the indexing load for time series data such as logs or metrics and hold your most recent, most-frequently-accessed data.</li><li><strong>Warm nodes:</strong> hold time series data that is accessed less frequently and rarely needs to be updated.</li><li><strong>Cold nodes:</strong> hold time series data that is accessed infrequently and not normally updated.</li><li><strong>Frozen nodes:</strong> hold time series data that is accessed rarely and never updated.</li></ul><h3>Warm vs cold tiers</h3><p>Warm and cold tiers share the same hardware resource needs. Indeed, when you create a new deployment on Elastic Cloud Service, you will notice that the same hardware profile is used for the warm and cold tiers.</p><p>The difference between these two tiers lies in the fact that the cold tier enables you to use searchable snapshots.</p><p>When you enable searchable snapshots on the cold tier using ILM (Index Lifecycle Management), as the indices move to the cold tier, they are saved as <strong>snapshots</strong> in the associated repository. The primary shards of the index are then restored with the <strong>“restored-” prefix</strong>. The shards of such indices are <strong>fully cached</strong> in the Elasticsearch cluster.</p>
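<p><em>For reference, here is a minimal sketch of an ILM policy enabling a searchable snapshot in the cold phase (the policy name, repository name, and timing are assumptions to adapt to your context):</em></p><pre>PUT _ilm/policy/my-policy<br>{<br>  &quot;policy&quot;: {<br>    &quot;phases&quot;: {<br>      &quot;cold&quot;: {<br>        &quot;min_age&quot;: &quot;30d&quot;,<br>        &quot;actions&quot;: {<br>          &quot;searchable_snapshot&quot;: {<br>            &quot;snapshot_repository&quot;: &quot;my-repository&quot;<br>          }<br>        }<br>      }<br>    }<br>  }<br>}</pre>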
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gxTJ8ODgt4Rgji_HaIisDg.png" /></figure><p>With such a mechanism, <strong>replicas are no longer needed on this tier for reliability</strong>. If a recovery is needed for an index, it is automatically done using the snapshots. Such indices are called <strong>fully mounted indices.</strong> Fully mounted indices are <strong>read-only</strong>. These indices, as they eliminate the need for replicas, <strong>reduce the required disk space by approximately 50% compared to regular indices</strong>.</p><p>These fully mounted indices contain settings not available in classic indices, as you can see in the following capture, where all the information describing the associated snapshot is set.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*v91_I_IH8BRXnSEwUXjGeg.png" /></figure><p><strong>Search performance is normally comparable to a regular index.</strong> While recovery is ongoing, search performance may be slower than with a regular index, because a search may need some data that has not yet been retrieved into the local cache. <strong>On-disk data is preserved across restarts</strong>, such that the node does not need to re-download data that is already stored on the node after a restart.</p><h3>Frozen tier</h3><p>The frozen tier <strong>requires a snapshot repository</strong>. You can’t use classic indices in this tier: using searchable snapshots is mandatory. This tier stores <strong>partially mounted indices</strong> of searchable snapshots <strong>exclusively</strong>.</p><p>When the indices move to the frozen tier, they are saved as <strong>snapshots</strong> in the associated repository. Indices with the <strong>prefix “partial-”</strong> are created, but with a “0b” storage size. Only the <strong>recently searched parts</strong> of the snapshotted index’s data are stored in a <strong>local cache</strong>. This cache has a fixed size and is shared across shards of partially mounted indices allocated on the same data node.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*xh2CjRUX_5B-fd8xp0uvLQ.png" /></figure><p>This mechanism <strong>extends the storage capacity even further</strong> (by up to 20 times compared to the warm tier).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_89bMth7OKfAFMfSGejsVQ.png" /></figure><p><strong>Searches</strong> on the frozen tier are <strong>slower</strong> than on the cold tier, because Elasticsearch must sometimes fetch frozen data from the snapshot repository.</p><h3>Conclusion</h3><p>In an operating environment which is compatible with searchable snapshots, <strong>searchable snapshots reduce the costs</strong> of running a cluster by removing the need for replica shards and for shard data to be copied between nodes.</p><p>Storage offered by all major public cloud providers typically provides <strong>very good protection against data loss or corruption</strong>. If you manage your own repository storage, then you are responsible for its reliability.</p><hr><p><a href="https://medium.zenika.com/using-elasticsearch-searchable-snapshots-in-cold-frozen-tiers-e01fa72ce8ee">Using Elasticsearch searchable snapshots in cold / frozen tiers</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Send .Net application traces to Elasticsearch using Elastic APM / RUM agent]]></title>
            <link>https://medium.zenika.com/send-net-application-traces-to-elasticsearch-using-elastic-apm-rum-agent-d7ff111b1ef?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/d7ff111b1ef</guid>
            <category><![CDATA[kibana]]></category>
            <category><![CDATA[elastic]]></category>
            <category><![CDATA[elasticsearch]]></category>
            <category><![CDATA[dotnet]]></category>
            <category><![CDATA[apm]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Tue, 25 Apr 2023 12:28:40 GMT</pubDate>
            <atom:updated>2023-04-25T12:28:40.319Z</atom:updated>
            <content:encoded><![CDATA[<h3>Send .Net application traces to Elasticsearch using Elastic APM / RUM agent</h3><p>Good practices to send traces and add log correlation ids to Elasticsearch, using the Elastic APM / RUM agents.</p><h3><strong>What is Elastic APM agent?</strong></h3><p>The Elastic APM .NET Agent automatically measures the performance of your application and tracks errors. It has built-in support for the most popular frameworks, as well as a simple API which allows you to instrument any application.</p><p>The agent auto-instruments supported technologies and records interesting events, like HTTP requests and database queries. To do this, it uses built-in capabilities of the instrumented frameworks like Diagnostic Source, an HTTP module for IIS, or IDbCommandInterceptor for Entity Framework. This means that for the supported technologies, there are no code changes required beyond enabling auto-instrumentation.</p><p>Source : <a href="https://www.elastic.co/guide/en/apm/agent/dotnet/current/intro.html">APM .Net Agent</a></p><p>Real User Monitoring captures user interaction with clients such as web browsers. The <a href="https://www.elastic.co/guide/en/apm/agent/rum-js/5.x">JavaScript Agent</a> is Elastic’s RUM Agent.</p><p>Unlike Elastic APM backend agents which monitor requests and responses, the RUM JavaScript agent monitors the real user experience and interaction within your client-side application. The RUM JavaScript agent is also framework-agnostic, which means it can be used with any front-end JavaScript application.</p><p>You will be able to measure metrics such as “Time to First Byte”, domInteractive, and domComplete, which helps you discover performance issues within your client-side application as well as issues that relate to the latency of your server-side application.</p><p>Source : <a href="https://www.elastic.co/guide/en/apm/guide/current/apm-rum.html">Real User Monitoring</a></p><h3><strong>Supported technologies for APM agent</strong></h3><p>For the APM agent, choosing between Profiler auto instrumentation and NuGet use will depend on your needs and supported technologies.</p><p>See this page for more information: <a href="https://www.elastic.co/guide/en/apm/agent/dotnet/current/supported-technologies.html">Supported technologies</a></p><p>For the RUM agent, Elastic provides a JavaScript agent and adds framework-specific integrations for React, Angular and Vue.</p><p>See this page for more information: <a href="https://www.elastic.co/guide/en/apm/agent/rum-js/5.x/supported-technologies.html">Supported technologies</a></p><h3><strong>Elastic APM agent implementation</strong></h3><p><strong>Profiler auto instrumentation</strong></p><p>In our case, as we use Docker, it would be easy to add Profiler auto instrumentation: we just have to add these lines in our Dockerfile:</p><pre>ARG AGENT_VERSION=1.20.0<br><br>FROM mcr.microsoft.com/dotnet/aspnet:6.0-alpine3.16 AS base<br>    <br># ...<br><br>FROM mcr.microsoft.com/dotnet/sdk:6.0-alpine3.16 AS build<br>ARG AGENT_VERSION<br><br># install zip and wget<br>RUN apk update &amp;&amp; apk add zip wget<br><br># pull down the zip file based on ${AGENT_VERSION} ARG and unzip<br>RUN wget -q https://github.com/elastic/apm-agent-dotnet/releases/download/v${AGENT_VERSION}/elastic_apm_profiler_${AGENT_VERSION}-linux-x64.zip &amp;&amp; \<br>    unzip elastic_apm_profiler_${AGENT_VERSION}-linux-x64.zip -d /elastic_apm_profiler<br><br># ...<br><br>FROM build AS publish<br><br># ...<br><br>FROM base AS final<br><br>WORKDIR /elastic_apm_profiler<br>
COPY --from=publish /elastic_apm_profiler .<br><br># ...<br><br># Configures whether profiling is enabled for the currently running process.<br>ENV CORECLR_ENABLE_PROFILING=1<br># Specifies the GUID of the profiler to load into the currently running process.<br>ENV CORECLR_PROFILER={FA65FE15-F085-4681-9B20-95E04F6C03CC}<br># Specifies the path to the profiler DLL to load into the currently running process (32-bit or 64-bit).<br>ENV CORECLR_PROFILER_PATH=/elastic_apm_profiler/libelastic_apm_profiler.so<br><br># Specifies the home directory of the profiler auto instrumentation. <br>ENV ELASTIC_APM_PROFILER_HOME=/elastic_apm_profiler<br># Specifies the path to the integrations.yml file that determines which methods to target for auto instrumentation.<br>ENV ELASTIC_APM_PROFILER_INTEGRATIONS=/elastic_apm_profiler/integrations.yml<br># Specifies the log level at which the profiler should log. <br>ENV ELASTIC_APM_PROFILER_LOG=warn<br><br># Core configuration options / Specifies the service name (ElasticApm:ServiceName).<br>ENV ELASTIC_APM_SERVICE_NAME=NetApi-Elastic<br># Core configuration options / Specifies the environment (ElasticApm:Environment)<br>ENV ELASTIC_APM_ENVIRONMENT=Development<br># Core configuration options / Specifies the sample rate (ElasticApm:TransactionSampleRate).<br># 1.0 : Dev purpose only, should be lowered in Production to reduce overhead.<br>ENV ELASTIC_APM_TRANSACTION_SAMPLE_RATE=1.0 <br><br># Reporter configuration options / Specifies the URL for your APM Server (ElasticApm:ServerUrl).<br>ENV ELASTIC_APM_SERVER_URL=https://host.docker.internal:8200<br># Reporter configuration options / Specifies if the agent should verify the SSL certificate if using HTTPS connection to the APM server (ElasticApm:VerifyServerCert).<br># Testing purpose only.<br>ENV ELASTIC_APM_VERIFY_SERVER_CERT=false<br># Reporter configuration options / Specifies the path to a PEM-encoded certificate used for SSL/TLS by APM server (ElasticApm:ServerCert).<br># ENV ELASTIC_APM_SERVER_CERT=<br><br># Supportability configuration options / Sets the logging level for the agent (ElasticApm:LogLevel).<br>ENV ELASTIC_APM_LOG_LEVEL=Debug<br><br># ...</pre><p>You can find all the documentation here: <a href="https://www.elastic.co/guide/en/apm/agent/dotnet/current/setup-auto-instrumentation.html">Profiler Auto instrumentation</a></p><p>But, in our case, we don’t need any feature provided by the Profiler auto instrumentation, so this code is only shown as an example.</p><p><strong>NuGet — Zero code change setup</strong></p><p>As we use .Net 6, we can also use the &quot;zero code change&quot; setup to integrate the NuGet agent and be able to use its features without changing any code. 
This is available when using .Net Core and .Net 5+.</p><p>To do this, just add the following environment variables in the Dockerfile:</p><pre>ARG AGENT_VERSION=1.20.0<br><br>FROM mcr.microsoft.com/dotnet/aspnet:6.0-alpine3.16 AS base<br><br># ...<br><br>FROM mcr.microsoft.com/dotnet/sdk:6.0-alpine3.16 AS build<br>ARG AGENT_VERSION<br><br># install zip and wget<br>RUN apk update &amp;&amp; apk add zip wget<br><br># pull down the zip file based on ${AGENT_VERSION} ARG and unzip<br>RUN wget -q https://github.com/elastic/apm-agent-dotnet/releases/download/v${AGENT_VERSION}/ElasticApmAgent_${AGENT_VERSION}.zip &amp;&amp; \<br>    unzip ElasticApmAgent_${AGENT_VERSION}.zip -d /ElasticApmAgent<br><br># ...<br><br>FROM build AS publish<br><br># ...<br><br>FROM base AS final<br><br>WORKDIR /ElasticApmAgent<br>COPY --from=publish /ElasticApmAgent .<br><br># ...<br><br># Inject the APM agent at startup<br>ENV DOTNET_STARTUP_HOOKS=/ElasticApmAgent/ElasticApmAgentStartupHook.dll<br># If the startup hook integration throws an exception, additional detail can be obtained by setting the Startup Hooks Logging variable.<br>ENV ELASTIC_APM_STARTUP_HOOKS_LOGGING=1<br><br># Core configuration options / Specifies the service name (ElasticApm:ServiceName).<br>ENV ELASTIC_APM_SERVICE_NAME=NetApi-Elastic<br># Core configuration options / Specifies the environment (ElasticApm:Environment)<br>ENV ELASTIC_APM_ENVIRONMENT=Development<br># Core configuration options / Specifies the sample rate (ElasticApm:TransactionSampleRate).<br># 1.0 : Dev purpose only, should be lowered in Production to reduce overhead.<br>ENV ELASTIC_APM_TRANSACTION_SAMPLE_RATE=1.0 <br><br># Reporter configuration options / Specifies the URL for your APM Server (ElasticApm:ServerUrl).<br>ENV ELASTIC_APM_SERVER_URL=https://host.docker.internal:8200<br># Reporter configuration options / Specifies if the agent should verify the SSL certificate if using HTTPS connection to the APM server (ElasticApm:VerifyServerCert).<br># Testing purpose only.<br>ENV ELASTIC_APM_VERIFY_SERVER_CERT=false<br># Reporter configuration options / Specifies the path to a PEM-encoded certificate used for SSL/TLS by APM server (ElasticApm:ServerCert).<br># ENV ELASTIC_APM_SERVER_CERT=<br><br># Supportability configuration options / Sets the logging level for the agent (ElasticApm:LogLevel).<br>ENV ELASTIC_APM_LOG_LEVEL=Debug<br><br># ...</pre><p>But, with this implementation, we won’t be able to correlate with logs by adding the transaction id and trace id.</p><p><strong>NuGet — .Net Core setup</strong></p><p>So, we will prefer using the NuGet integration, which adds log correlation and lets us choose the features to integrate.</p><p>The following Elastic for .Net NuGet packages are used:</p><ul><li><a href="https://github.com/elastic/apm-agent-dotnet">Elastic.Apm.NetCoreAll</a></li><li><a href="https://github.com/elastic/ecs-dotnet/tree/main/src/Elastic.Apm.SerilogEnricher">Elastic.Apm.SerilogEnricher</a></li></ul><p>But if you prefer to choose the features you want to integrate, you can pick only the packages you are interested in instead of <em>Elastic.Apm.NetCoreAll</em>. 
The documentation is provided <a href="https://github.com/elastic/apm-agent-dotnet#installation">here</a>.</p><p>To enable Elastic APM, you just have one line to add in your Configure method:</p><pre>public void Configure(IApplicationBuilder app)<br>{<br>    app.UseAllElasticApm(Configuration);            <br>}</pre><p>If you only want to activate some modules, you can use the UseElasticApm method instead, after adding the needed packages:</p><pre>app.UseElasticApm(Configuration,<br>      new HttpDiagnosticsSubscriber(),  /* Enable tracing of outgoing HTTP requests */<br>      new EfCoreDiagnosticsSubscriber()); /* Enable tracing of database calls through EF Core */</pre><p>To define the APM server to communicate with, add the following configuration in the appsettings.json file:</p><pre>{<br>    &quot;AllowedHosts&quot;: &quot;*&quot;,<br>    &quot;ElasticApm&quot;: <br>    {<br>        &quot;ServerUrl&quot;:  &quot;https://host.docker.internal:8200&quot;,<br>        &quot;LogLevel&quot;:  &quot;Information&quot;,<br>        &quot;VerifyServerCert&quot;: false /* Testing purpose */ <br>    }<br>}</pre><p>See <a href="https://www.elastic.co/guide/en/apm/agent/dotnet/current/config-all-options-summary.html">this page</a> for all available options.</p><p>To add the transaction id and trace id to every Serilog log message (see the previous article about logs) that is created during a transaction, you just have to update your configuration in the appsettings.json file:</p><pre>{<br>  &quot;Serilog&quot;: {<br>      &quot;Using&quot;: [&quot;Elastic.Apm.SerilogEnricher&quot;],<br>      /* ... */<br>      &quot;Enrich&quot;: [/* ... */, &quot;WithElasticApmCorrelationInfo&quot;],<br>      /* ... */<br>  }<br>}</pre>
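<p><em>As a side note (a minimal sketch with names of our own, not from the sample project), the agent also exposes a public API to capture custom transactions and spans when auto-instrumentation is not enough:</em></p><pre>using Elastic.Apm;<br>using Elastic.Apm.Api;<br><br>// Inside an async method: capture a custom transaction around a unit of work (hypothetical names)<br>await Agent.Tracer.CaptureTransaction(&quot;ImportPrizes&quot;, ApiConstants.TypeRequest, async transaction =&gt;<br>{<br>    // Capture a child span for a specific step<br>    await transaction.CaptureSpan(&quot;LoadFile&quot;, ApiConstants.TypeExternal, async () =&gt;<br>    {<br>        await Task.Delay(10); // placeholder for the real work<br>    });<br>});</pre>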
<h3><strong>Elastic RUM agent implementation</strong></h3><p>For a common JavaScript application, the implementation only takes a few lines (asynchronous / non-blocking pattern):</p><pre>&lt;script&gt;<br>    ;(function(d, s, c) {<br>        var j = d.createElement(s),<br>        t = d.getElementsByTagName(s)[0]<br><br>        j.src = &#39;js/elastic-apm-rum.umd.min.js&#39;<br>        j.onload = function() {elasticApm.init(c)}<br>        t.parentNode.insertBefore(j, t)<br>    })(document, &#39;script&#39;, {serviceName: &#39;NetClient_Elastic_Front&#39;, serverUrl: &#39;https://localhost:8200&#39;, environment: &#39;Production&#39;})<br>&lt;/script&gt;</pre><p>You can find this implementation in the sample source code, in the _Layout.cshtml of the Blazor App.</p><p>For frameworks like React, Angular and Vue, you can refer to <a href="https://www.elastic.co/guide/en/apm/agent/rum-js/5.x/framework-integrations.html">this page</a>.</p><h3><strong>Sending traces to Elasticsearch</strong></h3><p>The configuration has already been seen in the previous section.</p><p>You just have to ensure you have an APM server available (this is now done with an elastic-agent with an APM integration in Fleet).</p><h3><strong>Analyse traces in Kibana</strong></h3><p>The first thing we can check is the correlation ids in the logs.</p><figure><img alt="Logs correlation ids on Discover" src="https://cdn-images-1.medium.com/max/1024/1*lyv7DHjcnBA3PdfCqA700Q.png" /><figcaption>Logs correlation ids on Discover</figcaption></figure><p>Then, in the APM App in Kibana, we have a lot of information thanks to our traces.</p><ul><li>APM Inventory, which lists all services that send traces:</li></ul><figure><img alt="APM Inventory" src="https://cdn-images-1.medium.com/max/1024/1*g1KgIBwjrCV_TGUijb-shQ.png" /><figcaption>APM Inventory</figcaption></figure><ul><li>APM Service Map, which displays a map of our services (in the case of a complex architecture, it is easy to see the dependencies between services):</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*MowdY4JCrhVLaGXfQocZ9g.png" /><figcaption>APM Service map</figcaption></figure><ul><li>APM Overview, which gives an overview of all the information about traces:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*idShvQqhZz1O6YogIndFwA.png" /><figcaption>APM Overview</figcaption></figure><ul><li>APM Transactions, which gives information about all transactions coming from our services:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*zl_G_DXqO7b8DtXw5QWdaA.png" /><figcaption>APM Transactions</figcaption></figure><ul><li>APM Dependencies, which lists all dependencies of the current service:</li></ul><figure><img alt="APM Dependencies" src="https://cdn-images-1.medium.com/max/1024/1*e2r7Gq-qRdgKpCsPLBSgUA.png" /><figcaption>APM Dependencies</figcaption></figure><ul><li>APM Errors, which lists all errors not caught by our service:</li></ul><figure><img alt="APM Errors" src="https://cdn-images-1.medium.com/max/1024/1*FwV3WSjhdXTAWLao-Ilotg.png" /><figcaption>APM Errors</figcaption></figure><ul><li>APM Logs, which lists all logs for the current service:</li></ul><figure><img alt="APM Logs" src="https://cdn-images-1.medium.com/max/1024/1*kllksHr5fkV7O1XGCy66rw.png" /><figcaption>APM Logs</figcaption></figure><ul><li>An interesting view is the trace sample on the Transactions view. You can view the detailed trace for a transaction:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-iCBJuQl-mnkQ0JOIYNsfA.png" /><figcaption>APM Transactions — Trace timeline</figcaption></figure><ul><li>The same view for a JavaScript service (RUM):</li></ul><figure><img alt="APM Transactions — Trace timeline — JavaScript service" src="https://cdn-images-1.medium.com/max/1024/1*ysJcjMsNJthG9kQIdNrYWA.png" /><figcaption>APM Transactions — Trace timeline — JavaScript service</figcaption></figure><ul><li>And you can also see the related logs:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dGSIOlpG5aJkscqUS1gkjg.png" /><figcaption>APM Transactions — Trace logs</figcaption></figure><ul><li>A full dashboard for User Experience (RUM):</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Vb29T9_U3IY1DkXQ0_zu5Q.png" /><figcaption>APM User Experience Dashboard</figcaption></figure><h3>Conclusion</h3><p>In this article, we have seen how to use the Elastic APM / RUM agents to send traces to Elasticsearch and add log correlation ids.</p><p>A complete sample, with 2 projects (.Net API and .Net client with Blazor UI), is available on <a href="https://github.com/ijardillier/netclient-elastic">Github</a>.</p><hr><p><a href="https://medium.zenika.com/send-net-application-traces-to-elasticsearch-using-elastic-apm-rum-agent-d7ff111b1ef">Send .Net application traces to Elasticsearch using Elastic APM / RUM agent</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Write and send .Net application metrics to Elasticsearch using Prometheus]]></title>
            <link>https://medium.zenika.com/write-and-send-net-application-metrics-to-elasticsearch-using-prometheus-31f5c21ba54c?source=rss-5fe6d48f202a------2</link>
            <guid isPermaLink="false">https://medium.com/p/31f5c21ba54c</guid>
            <category><![CDATA[prometheus]]></category>
            <category><![CDATA[elastic]]></category>
            <category><![CDATA[elasticsearch]]></category>
            <category><![CDATA[kibana]]></category>
            <category><![CDATA[dotnet]]></category>
            <dc:creator><![CDATA[Ingrid Jardillier]]></dc:creator>
            <pubDate>Thu, 20 Apr 2023 07:34:34 GMT</pubDate>
            <atom:updated>2023-04-26T13:52:08.487Z</atom:updated>
            <content:encoded><![CDATA[<h3>Write and send .Net application metrics to Elasticsearch using Prometheus</h3><p>Good practices to properly write and send metrics to Elasticsearch, using Prometheus.</p><h3><strong>What is Prometheus?</strong></h3><p>Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.</p><p>Source : <a href="https://prometheus.io/">Prometheus</a></p><h3><strong>NuGet packages</strong></h3><p>The following Prometheus for .Net NuGet packages are used:</p><ul><li><a href="https://github.com/prometheus-net/prometheus-net">prometheus-net</a></li><li><a href="https://github.com/prometheus-net/prometheus-net#aspnet-core-exporter-middleware">prometheus-net.AspNetCore</a></li><li><a href="https://github.com/prometheus-net/prometheus-net#aspnet-core-health-check-status-metrics">prometheus-net.AspNetCore.HealthChecks</a></li></ul><p>These are .NET libraries for instrumenting your applications and exporting metrics to Prometheus.</p><h3><strong>Prometheus implementation</strong></h3><p>First, you have to add the following packages in your csproj file (you can update the version to the latest available for your .Net version):</p><pre>&lt;PackageReference Include=&quot;prometheus-net&quot; Version=&quot;8.0.0&quot; /&gt;<br>&lt;PackageReference Include=&quot;prometheus-net.AspNetCore&quot; Version=&quot;8.0.0&quot; /&gt;<br>&lt;PackageReference Include=&quot;prometheus-net.AspNetCore.HealthChecks&quot; Version=&quot;8.0.0&quot; /&gt;</pre><p>By default, the Prometheus .Net library adds some application metrics about .Net (memory, CPU, garbage collection, …). As we plan to use the APM agent, we don’t want it to add these metrics, so we can suppress them. We will also add some static labels to each metric, in order to add contextual information from our application, as we did for logs:</p><pre>public virtual void ConfigureServices(IServiceCollection services)<br>{<br>    // ...<br><br>    Metrics.SuppressDefaultMetrics();<br><br>    Metrics.DefaultRegistry.SetStaticLabels(new Dictionary&lt;string, string&gt;<br>    {<br>        { &quot;domain&quot;, &quot;NetClient&quot; },<br>        { &quot;domain_context&quot;, &quot;NetClient.Elastic&quot; }<br>    });<br><br>    // ...     <br>}</pre>
<p>We also have to map endpoints for metrics:</p><pre>public void Configure(IApplicationBuilder app)<br>{<br>    // ...<br><br>    app.UseEndpoints(endpoints =&gt;<br>    {<br>        // ...<br><br>        endpoints.MapMetrics();<br>    });<br>}</pre><p>This mapping exposes the /metrics endpoint with the Prometheus format.</p><p>If you need the OpenMetrics format, you can easily access it with /metrics?accept=application/openmetrics-text</p><p>The result is shown below:</p><pre># HELP aspnetcore_healthcheck_status ASP.NET Core health check status (0 == Unhealthy, 0.5 == Degraded, 1 == Healthy)<br># TYPE aspnetcore_healthcheck_status gauge<br>aspnetcore_healthcheck_status{name=&quot;self&quot;,domain=&quot;NetClient&quot;,domain_context=&quot;NetClient.Elastic&quot;} 1<br># HELP myapp_gauge1 A simple gauge 1<br># TYPE myapp_gauge1 gauge<br>myapp_gauge1{service=&quot;service1&quot;,domain=&quot;NetClient&quot;,domain_context=&quot;NetClient.Elastic&quot;} 1028<br># HELP myapp_gauge2 A simple gauge 2<br># TYPE myapp_gauge2 gauge<br>myapp_gauge2{service=&quot;service1&quot;,domain=&quot;NetClient&quot;,domain_context=&quot;NetClient.Elastic&quot;} 2403<br># HELP myapp_gauge3 A simple gauge 3<br># TYPE myapp_gauge3 gauge<br>myapp_gauge3{service=&quot;service1&quot;,domain=&quot;NetClient&quot;,domain_context=&quot;NetClient.Elastic&quot;} 3872<br>...</pre><h3><strong>Forward Health checks to Prometheus</strong></h3><p>We can easily forward our health checks (described in the previous article) to the Prometheus endpoint, which avoids using the http module from Metricbeat: we retrieve all metrics, including health checks, from the Metricbeat Prometheus module. This way, we also benefit from our static labels if defined.</p><p>This is done here in our custom extension, which is used in the ConfigureServices of the Startup file:</p><pre>public virtual void ConfigureServices(IServiceCollection services)<br>{<br>    // ...<br>    services.AddCustomHealthCheck(Configuration)<br>    // ...     <br>}<br><br>public static IServiceCollection AddCustomHealthCheck(this IServiceCollection services, IConfiguration configuration)<br>{<br>    IHealthChecksBuilder hcBuilder = services.AddHealthChecks();<br>    hcBuilder.AddCheck(&quot;self&quot;, () =&gt; HealthCheckResult.Healthy());<br><br>    hcBuilder.ForwardToPrometheus();<br><br>    return services;<br>}</pre><h3><strong>Business metrics</strong></h3><p>The Prometheus .Net library offers an easy way to add business metrics.</p><p>To create a new metric, you just have to instantiate a new counter, gauge, …:</p><pre>private readonly Gauge Gauge1 = Metrics.CreateGauge(&quot;myapp_gauge1&quot;, &quot;A simple gauge 1&quot;);</pre><p>If you need to attach labels, you have to add a configuration:</p><pre>private static readonly GaugeConfiguration configuration = new GaugeConfiguration { LabelNames = new[] { &quot;service&quot; }};<br>private readonly Gauge Gauge2 = Metrics.CreateGauge(&quot;myapp_gauge2&quot;, &quot;A simple gauge 2&quot;, configuration);</pre><p>To apply a label and a value to such a metric, use this kind of code:</p><pre>Gauge2.WithLabels(&quot;service1&quot;).Set(_random.Next(1000, 2000));</pre>
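<p><em>Counters and histograms follow the same pattern (a short sketch with metric names of our own, not from the sample project):</em></p><pre>// A counter only goes up, here with an attached label (hypothetical names)<br>private static readonly Counter RequestsTotal = Metrics.CreateCounter(<br>    &quot;myapp_requests_total&quot;, &quot;Total number of processed requests&quot;,<br>    new CounterConfiguration { LabelNames = new[] { &quot;operation&quot; } });<br><br>// A histogram observes a distribution, for example durations in seconds<br>private static readonly Histogram RequestDuration = Metrics.CreateHistogram(<br>    &quot;myapp_request_duration_seconds&quot;, &quot;Duration of processed requests&quot;);<br><br>// Usage:<br>// RequestsTotal.WithLabels(&quot;import&quot;).Inc();<br>// RequestDuration.Observe(0.042);</pre>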
<h3><strong>Sending metrics to Elasticsearch</strong></h3><p>All the metrics are available on the /metrics endpoint.</p><p>In our example, we don’t have any Prometheus server, so metricbeat will directly access the metrics from the application metrics endpoint. But if you have a Prometheus server, you can add a new target in your scrape configuration.</p><p>So, to send the metrics to Elasticsearch, you will have to configure a metricbeat agent with the prometheus module:</p><pre>    metricbeat.modules:<br>    - module: prometheus<br>      period: 10s<br>      metricsets: [&quot;collector&quot;]<br>      hosts: [&quot;host.docker.internal:8080&quot;]<br>      metrics_path: /metrics</pre><p>For more information about this metricbeat configuration, you can have a look at: <a href="https://github.com/ijardillier/docker-elk/blob/master/extensions/beats/metricbeat/config/metricbeat.yml">https://github.com/ijardillier/docker-elk/blob/master/extensions/beats/metricbeat/config/metricbeat.yml</a></p><h3><strong>Analyse metrics in Kibana</strong></h3><p>You can check how metrics are ingested in the Discover module:</p><figure><img alt="Metrics on Discover" src="https://cdn-images-1.medium.com/max/1024/1*shqGBMY0m4WQQXt_YC5z6g.png" /><figcaption>Metrics on Discover</figcaption></figure><p>You can see how metrics are displayed in the Metrics Explorer App:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gTalgNyy_Zdl0a6q-Lt7mg.png" /></figure><h3>Conclusion</h3><p>In this article, we have seen how to use Prometheus to write and send metrics to Elasticsearch.</p><p>A complete sample, with 2 projects (.Net API and .Net client with Blazor UI), is available on <a href="https://github.com/ijardillier/netclient-elastic">Github</a>.</p><p>In the next article, we will focus on traces with the Elastic APM agent.</p><hr><p><a href="https://medium.zenika.com/write-and-send-net-application-metrics-to-elasticsearch-using-prometheus-31f5c21ba54c">Write and send .Net application metrics to Elasticsearch using Prometheus</a> was originally published in <a href="https://medium.zenika.com">Zenika</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>