Jaeger, Fluentd, and OpenSearch with Verrazzano

Sherwood Zern
Published in Verrazzano
9 min read · Mar 1, 2023

The Verrazzano platform includes several cloud native solutions to improve an enterprise’s day-2 operations. A subset of these focuses on observability:

  • Monitoring: Prometheus and Grafana
  • Logging: Fluentd and OpenSearch
  • Distributed Tracing: Jaeger

The intent of this article is to show how to leverage Fluentd, Jaeger, and OpenSearch to assist with your observability efforts. Let’s briefly discuss each of these tools:

· Fluentd is an open-source data collector that unifies data collection and consumption so that data is easier to use and understand. It decouples data sources from backend systems by providing a unified logging layer in between. In Kubernetes, Fluentd is normally deployed as a DaemonSet, i.e., it runs on every node in the cluster.

· Jaeger is a distributed tracing system. It is used for monitoring and troubleshooting microservices-based distributed systems. Jaeger provides distributed context propagation, distributed transaction monitoring, root cause analysis, service dependency analysis, and performance / latency optimization.

· OpenSearch is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications. Here, OpenSearch stores both the logs and the Jaeger traces.

Application

The application is implemented with Helidon 3.1. (It is not necessary to use Helidon or Java; in a subsequent article I will demonstrate more Verrazzano features using Golang to implement the application.) The application is a slight modification of hello world. There are two very lightweight services: a proxy service, and a backend service that identifies the cloud provider and says hello from that cloud provider. The application logic is not the focus; the point is to demonstrate observability in the Verrazzano platform.

The diagram below provides a high-level overview of the application integrating with the observability components of Verrazzano. Verrazzano installs Fluentd, OpenSearch, and Jaeger for us and configures them so that all application logs and traces are sent to OpenSearch. You do not have to configure these observability tools yourself; Verrazzano handles this for you. Look, Mom, no hands!

The application code will have the following features:

  1. Helidon creates a Netty web server and the Jaeger tracer.
  2. Jaeger spans are added to the trace throughout the code.
  3. Span tags are added for querying capability in the Jaeger console.
  4. Span events are generated so that logs are captured and written with each span in the trace.
  5. A unique correlation Id is generated for every request to assist in correlating logs and traces from OpenSearch to Jaeger.
  6. The trace will be across all services.
  7. All logs and Jaeger traces are stored in the OpenSearch backend.

Follow the Verrazzano installation instructions to deploy the observability stack. With Verrazzano (V8O) installed, you can configure your application to take advantage of the V8O stack.

Let’s start with the proxy service discussed previously. Its configuration file (application.yaml) provides the elements that set up the Netty server and the client.

Proxy Service Configuration:

tracing:
  service: "greetings-proxy"
  host: "jaeger-operator-jaeger-collector.verrazzano-monitoring"
  sampler-param: 1
  port: 14250
  components:
    web-server:
      spans:
        - name: "HTTP Request"
          enabled: false
        - name: "content-read"
          enabled: false
        - name: "content-write"
          enabled: false

client:
  follow-redirects: true
  max-redirects: 5
  services:
    tracing:

In application.yaml, the key points are the elements under “tracing:”, specifically the service, host, and port. You are required to provide a name for the service. It can be anything you want, but make it meaningful so you can identify the service in the Jaeger console.

The host is the endpoint to which the Jaeger traces are sent. The endpoint for the collector is the Jaeger service name followed by the namespace. The Jaeger collector’s service name is jaeger-operator-jaeger-collector and it is deployed in the verrazzano-monitoring namespace. Therefore, the host endpoint is “jaeger-operator-jaeger-collector.verrazzano-monitoring”. The port, if not specified, defaults to 14250, so you can either specify it explicitly or let it default.
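
To make the naming rule concrete, here is a minimal sketch in plain Java that builds the collector address from the service name and the namespace, and falls back to port 14250 when no port is given. The class and method names (CollectorEndpoint, host, hostAndPort) are my own illustration and are not part of Helidon or Verrazzano; Helidon reads host and port separately from application.yaml.

```java
import java.util.OptionalInt;

/** Illustrative helper only; not part of Helidon or Verrazzano. */
final class CollectorEndpoint {
    // Default collector port mentioned in the article.
    static final int DEFAULT_PORT = 14250;

    /** Builds "<service>.<namespace>", the in-cluster name of the collector service. */
    static String host(String service, String namespace) {
        if (service.isBlank() || namespace.isBlank()) {
            throw new IllegalArgumentException("service and namespace are required");
        }
        return service + "." + namespace;
    }

    /** Appends the port, using the default when none is specified. */
    static String hostAndPort(String service, String namespace, OptionalInt port) {
        return host(service, namespace) + ":" + port.orElse(DEFAULT_PORT);
    }

    public static void main(String[] args) {
        // No port given, so the default is used.
        System.out.println(hostAndPort(
                "jaeger-operator-jaeger-collector",
                "verrazzano-monitoring",
                OptionalInt.empty()));
    }
}
```

If your Jaeger collector runs under a different service name or namespace, only the two arguments change.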

Important Note:
The Verrazzano documentation references the use of the Jaeger sidecar.
For your application to communicate with the Jaeger sidecar, you must use the
UDP protocol. However, OpenTelemetry does not support the UDP protocol;
therefore, your application code must connect directly to the Jaeger collector.

With the configuration for the Netty server and, more importantly, the tracing in place, do the following in your application code:

// By default this will pick up the application.yaml from the classpath
Config config = Config.create();
Config tracingConfig = config.get("tracing");

WebServer server = WebServer.builder(createRouting(config))
        .config(config.get("server"))
        .addMediaSupport(JsonpSupport.create())
        .tracer(TracerBuilder.create(tracingConfig))
        .build();

Single<WebServer> webserver = server.start();

Helidon will create the Jaeger tracer for you. To gain access to the tracer, all you need to do is ask the ServerRequest for the tracer, such as “request.tracer().”

The code shown above is from the proxy service but you can use it for the backend service too. In fact, even the application configuration (application.yaml) for the backend service is the same. The only difference is the backend service has no need to specify a client as it will not be invoking another service.

Since the proxy service invokes the backend service, you need to specify a client. The client adds a tracing service by virtue of client.services.tracing. Specifying the tracing service indicates to Helidon that you want traces across all the downstream invocations. In our case, we only have one — the backend service. (Each service that acts as a client to another service must include this configuration if you are hoping to have all traces captured.)

From the code, it looks like the following:

webClient = WebClient.builder()
        .config(config.get("client"))
        .baseUri(host)
        .addMediaSupport(JsonpSupport.create())
        .build();

Because tracing is enabled for the client in the configuration, there is no need to explicitly register a tracing service in the application code. The Jaeger tracer itself was already set up when the Netty server was created.
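
It can help to picture what “traces across downstream invocations” means on the wire: the client passes the trace context to the next service in an HTTP header, and the next service continues the same trace. The sketch below is a toy model in plain Java, assuming the Jaeger “uber-trace-id” header format (trace-id:span-id:parent-span-id:flags). Which propagation format your services actually use depends on the tracer configuration, and the UberTraceId class is my own illustration, not a Helidon type.

```java
/** Toy model of the Jaeger "uber-trace-id" header; not a Helidon class. */
final class UberTraceId {
    final String traceId;
    final String spanId;
    final String parentSpanId;
    final int flags;

    UberTraceId(String traceId, String spanId, String parentSpanId, int flags) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
        this.flags = flags;
    }

    /** Parses "traceId:spanId:parentSpanId:flags" (flags are hexadecimal). */
    static UberTraceId parse(String header) {
        String[] parts = header.split(":");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + header);
        }
        return new UberTraceId(parts[0], parts[1], parts[2], Integer.parseInt(parts[3], 16));
    }

    /** The lowest flag bit marks the trace as sampled. */
    boolean sampled() {
        return (flags & 1) == 1;
    }

    /** Context for a downstream call: same trace, new span, current span as parent. */
    UberTraceId child(String newSpanId) {
        return new UberTraceId(traceId, newSpanId, spanId, flags);
    }

    String toHeader() {
        return traceId + ":" + spanId + ":" + parentSpanId + ":" + Integer.toHexString(flags);
    }

    public static void main(String[] args) {
        // Hypothetical IDs, for illustration only.
        UberTraceId incoming = parse("4bf92f3577b34da6:00f067aa0ba902b7:0:1");
        System.out.println(incoming.child("a1b2c3d4e5f60718").toHeader());
    }
}
```

The point to take away is that the trace ID stays constant from the proxy service to the backend service, which is why the Jaeger console can show both services in one trace.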

Important Note: If you specify client.services.tracing in the application.yaml
file, you must add the following to the pom.xml file:
<dependency>
    <groupId>io.helidon.tracing</groupId>
    <artifactId>helidon-tracing-jaeger</artifactId>
</dependency>

Geez, that was a lot of explanation just to set the foundation for adding trace functionality. Overall, it is a very simple setup, but I didn’t want to throw code out there with no explanation of how it works.

We’re now ready to deploy and execute the application. But before I get to that and show the Jaeger and OpenSearch dashboards, I want to point out some interesting features you can use with your traces. I will briefly discuss a span, a tag, and an event. These elements can make your traces far more valuable when it comes to correlating your logs and traces together with OpenSearch.

Span: A span represents a logical unit of work that has an operation name, the start time of the operation, and the duration. Spans may be nested and ordered to model causal relationships. You can add tags and events to a span.

Tag: These are attributes that you can add to traces. Tags will allow you to query your traces to filter the results and help with your collaboration and debugging efforts.

Event: An event is another attribute you can add to traces. The events will show up as logs associated with the span that added the event.

Let’s have a peek at some simple code to add a tag and an event.

// Create a trace span
Span.Builder<?> sb = request.tracer().spanBuilder(spanName);

// If there is already a span context then set it as the parent of the
// just created span
request.spanContext().ifPresent(sb::parent);

// Start the span
return sb.start();
. . . . . .
// Add an event to the current span. The event value is a String and will be
// shown as logs in the Jaeger dashboard
strb.append("Starting the Proxy Service: Trace Id: ")
    .append(span.context().traceId());
span.addEvent(strb.toString());

// Add tags to the span. When adding tags I can then do a query search based
// upon the key name and a value. I’m adding a simple key-value pair
span.tag("RequestId", request.requestId());
span.tag("TracerId", span.context().traceId());

// The correlation Id is a random set of 32 characters and numbers
span.tag("Correlation Id", correlationId);
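
The snippet above uses a correlationId without showing where it comes from. One possible way to produce a random 32-character identifier of letters and digits is sketched below in plain Java. This is my assumption of a reasonable approach, not necessarily how the sample application generates it; the CorrelationIds class name is hypothetical.

```java
import java.security.SecureRandom;

/** Illustrative generator for a 32-character alphanumeric correlation Id. */
final class CorrelationIds {
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();
    static final int LENGTH = 32;

    /** Returns a new random Id; call once per incoming request. */
    static String next() {
        StringBuilder sb = new StringBuilder(LENGTH);
        for (int i = 0; i < LENGTH; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(next());
    }
}
```

Whatever generator you use, the important part is that the same value is written both to the application log and to the span tag, so that it can be searched for in OpenSearch and in Jaeger.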

The application is ready to be deployed and exercised. After exercising the application several traces should have been created. You will now want to see those traces with their associated spans.

The Jaeger console will be accessible via the Verrazzano dashboard:

Verrazzano Dashboard

Select “Jaeger Console” from the Verrazzano Dashboard. The Jaeger console shows the traces and the associated spans.

Jaeger Console

The initial screen is where you specify the traces you want to evaluate. In the example above I specified the greetings-proxy service. The service name comes from the configuration in the application.yaml file, shown previously. To narrow the results, I also added a tag to the search. In this case the tag was TracerId; however, you can use any of the tags.
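
Conceptually, a tag search keeps only the spans whose tags contain every requested key and value. The following toy model in plain Java illustrates that idea with a simplified key=value query syntax. It is not Jaeger’s implementation, the TagFilter class is hypothetical, and the simplified parser does not handle quoting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy in-memory tag filter; illustrates the idea only, not Jaeger's code. */
final class TagFilter {
    /** Parses a whitespace-separated list of key=value pairs. */
    static Map<String, String> parseQuery(String query) {
        Map<String, String> wanted = new LinkedHashMap<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty query
            }
            int eq = token.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value: " + token);
            }
            wanted.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return wanted;
    }

    /** A span matches when it carries every wanted tag with the same value. */
    static boolean matches(Map<String, String> spanTags, Map<String, String> wanted) {
        for (Map.Entry<String, String> e : wanted.entrySet()) {
            if (!Objects.equals(spanTags.get(e.getKey()), e.getValue())) {
                return false;
            }
        }
        return true;
    }

    /** Returns the spans (represented by their tag maps) that match the query. */
    static List<Map<String, String>> filter(List<Map<String, String>> spans, String query) {
        Map<String, String> wanted = parseQuery(query);
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> tags : spans) {
            if (matches(tags, wanted)) {
                result.add(tags);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical tag values, for illustration only.
        List<Map<String, String>> spans = List.of(
                Map.of("TracerId", "abc123", "RequestId", "1"),
                Map.of("TracerId", "def456", "RequestId", "2"));
        System.out.println(filter(spans, "TracerId=abc123").size());
    }
}
```

Note that a key containing a space would not fit this simplified key=value form, so tag keys without spaces are easier to search on in a sketch like this one.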

After a successful search, the console below is shown.

Jaeger Trace and Spans

The Jaeger console provides all the spans that make up the trace. Each span will contain all the tags, events, and timings for that span. Each span can be expanded to see the values of the tags, events, and the start time, end time, and the overall execution time of the span.

All the Jaeger traces are stored in OpenSearch. Verrazzano pre-configures this integration for you.

In addition to the Jaeger traces, all the logs are captured and stored in OpenSearch. Verrazzano preconfigures this as well; in the case of logs, the work is done by Fluentd.

Each Fluentd instance pulls logs from the node’s /var/log/containers directory and writes them to the target OpenSearch data stream. As part of the installation, Verrazzano sets up several indexes in OpenSearch. The indexes created are as follows:

  • verrazzano-system: Verrazzano system applications receive special handling and write their logs to this data stream.
  • verrazzano-application-<application-namespace>: Application logs are exported to a data stream based on the application’s namespace.
  • verrazzano-jaeger-service: All the Jaeger services, such as “greetings-proxy”.
  • verrazzano-jaeger-span: All the Jaeger spans

From the “Discover” menu option you can see all the indexes that have automatically been created for you. There are two applications deployed in two different namespaces, “hello-helidon” and “jaeger-apps”. The application discussed in this article has been deployed to the “jaeger-apps” namespace.

OpenSearch Discover Page

Selecting “verrazzano-application-jaeger-apps” shows all the logs captured for that namespace. To limit the number of search results, you can enter a Dashboards Query Language (DQL) query in the search bar.

In the example below I chose to restrict the results to a single correlation identifier. The application generates a correlation identifier on every request, and the trace carries a tag with the same correlation identifier. In this manner, you can correlate the logs in OpenSearch with the trace in Jaeger. You can also choose other elements for the correlation.
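
A DQL query for this kind of lookup has the form field:"value", and terms can be combined with “and”. The sketch below, in plain Java, builds such a query string from a correlation identifier. The field names used in the example (log and kubernetes.namespace_name) are assumptions; check the fields of your own index in the Discover page. The DqlQuery class is hypothetical and only does string formatting.

```java
/** Illustrative builder for simple DQL query strings. */
final class DqlQuery {
    /** Builds field:"value", escaping backslashes and double quotes in the value. */
    static String phrase(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    /** Combines two terms so that both must match. */
    static String and(String left, String right) {
        return left + " and " + right;
    }

    public static void main(String[] args) {
        // Hypothetical field names and correlation Id.
        String query = and(
                phrase("kubernetes.namespace_name", "jaeger-apps"),
                phrase("log", "Zx81kQ0mP4aa7LrT2vN9bC3dE5fG6hJ8"));
        System.out.println(query);
    }
}
```

Paste the resulting string into the search bar, then use the same correlation identifier as a tag value in the Jaeger console to find the matching trace.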

OpenSearch Logs

The key thing to remember is that you can bring your logs, traces and metrics together using these tools. Bringing these tools together means you can turn all this captured data into useful and actionable information.

Conclusion

There are several open-source observability tools in the community. You can choose to install and configure each tool individually and correlate the results yourself. However, a better option is to do one installation of Verrazzano and allow the platform to install, configure, and correlate the results for you.

Verrazzano is a solid platform that continues to grow and will bring these individual components into a cohesive unit that helps you with your day-2 operations.

Stay tuned for forthcoming information and articles about this ever-growing platform. If you’re interested in investigating further, visit https://verrazzano.io/latest.

Thanks to Julian OI and Tomas Langer for their help in putting together this information.
