T-Trace: agent-like instrumentation for GraalVM

Pavol Loffay
JaegerTracing
Published in
5 min readFeb 24, 2020

In this article we will look at the agent-like instrumentation tool T-Trace. The tool provides non-intrusive instrumentation capabilities for applications running on GraalVM. We will instrument a simple NodeJS application by using T-Trace and OpenTracing API with Jaeger NodeJS tracer.

So first things first, GraalVM is a polyglot virtual machine. It can run any JVM based language but also other mainstream languages like NodeJS, Python and Ruby… It also supports LLVM languages like C and C++. The sweet spot is that the languages can be combined together in a single application. Another very interesting feature is that GraalVM can produce a native executable for JVM languages. This feature is used by modern cloud-native Java runtimes like Quarkus and Micronaut.

Styles of instrumentation

Before we deep dive into the demo application let’s have a look at different instrumentation techniques. There are several instrumentation styles that can be applied to code fragments or the whole application. At a high level we can talk about black-box and white-box instrumentation. The black-box technique does not require any modifications to the application. The monitoring system, in this case, uses only events produced by the application. These events can be logs or any data that the application produces.

In this article we are going to focus on the white-box technique. This technique assumes that the code of the application is known and can be inspected. There are several methods of how the instrumentation points can be added to the application:

  • Non-intrusive agent-like instrumentation: in this case the instrumentation is added to the application dynamically. In some languages (Python, Ruby, JavaScript) this technique is called monkey-patching where modules or classes are dynamically replaced at runtime. In other languages less dynamic techniques are allowed, for instance -javaagent in Java.
  • Explicit instrumentation: this instrumentation is applied directly to the code. However it does not mean that the application developer has to instrument all the code. The instrumentation code can be embedded directly into (RPC) frameworks or external integration. In some languages and runtime frameworks it is easier to initialize the instrumentation like in other languages. For instance in Java it’s enough just to drop an instrumentation JAR into the classpath, whereas in statically linked languages like Golang the initialization has to be done explicitly.

The application

The application is a simple NodeJS server with one handler. The full demo code with instructions can be found in pavolloffay/graalvm-t-trace. Here is the server.js file:

const port = 3000;

const http = require("http");
const srv = http.createServer((req, res) => {
console.log(`server: obtained request ${res.id}`);
setTimeout(() => {
res.write(`OK# ${res.id}`);
console.log(`server: replied to request ${res.id}`);
res.end();
}, 5);
});
srv.listen(port);

As you can see there is no tracing related in the application. The application prints request ids at the beginning and at the end of the request. The res.id is set in the agent script — this is just for the demo purpose to demonstrate that the agent script can pass objects to the main application.

Now let’s have a look at the instrumentation script jaegernode.js. There are two functions: initializeJaeger and initializeAgent. The first function creates Jaeger tracer instance with HTTP sender to the collector port 14268 and calls the second one at the end:

let initializeJaeger = function (require) {
let http = require("http"); console.log(`${typeof
http.createServer} http.createServer is available to the agent`);
var initTracer = require('jaeger-client').initTracer;
console.log('server: Jaeger tracer obtained');
// See schema https://github.com/jaegertracing/jaeger-client-node/blob/master/src/configuration.js#L37
var config = {
serviceName: 't-trace-demo',
reporter: {
// Provide the traces endpoint; this forces the client to connect directly to the Collector and send
// spans over HTTP
collectorEndpoint: 'http://localhost:14268/api/traces',
// logSpans: true
},
sampler: {
type : 'const',
param : 1
}
};
var options = {
tags: {
't-trace-demo.version': '1.1.2',
},
// metrics: metrics,
logger: console,
sampler: {
type : 'const',
param : 1
}
};

var tracer = initTracer(config, options);
initializeAgent(tracer);
};

let waitForRequire = function (event) {
if (typeof process === 'object' && process.mainModule &&
process.mainModule.require) {
agent.off('source', waitForRequire);
initializeJaeger(
process.mainModule.require.bind(process.mainModule));
}
};

agent.on('source', waitForRequire, { roots: true });

The second function is initializeAgent which is instrumenting the application code:

let initializeAgent = function(tracer) {
agent.on('enter', function(ctx, frame) {
const args = frame.args;
if ('request' !== frame.type || args.length !== 2 || typeof args[0] !== 'object' || typeof args[1] !== 'object') {
return;
}
const req = args[0];
const res = args[1];
const span = tracer.startSpan(req.method);
span.setTag("span.kind", "server");
span.setTag("http.url", req.url);
span.setTag("http.method", req.method);
res.id = span.context().spanIdStr;
res.span = span;
console.log(`agent: handling #${res.id} request for ${req.url}`);
}, {
roots: true,
rootNameFilter: name => name === 'emit',
sourceFilter: src => src.name === 'events.js'
});

agent.on('return', function(ctx, frame) {
var res = frame['this'];
if (res.span) {
res.span.setTag("http.status_code", res.statusCode);
if (res.statusCode >= 400) {
res.span.setTag("error", "true");
}
res.span.finish();
console.log(`agent: finished #${res.id} request`);
} else {
// OK, caused for example by Tracer itself connecting to Jaeger server
}
}, {
roots: true,
rootNameFilter: name => name === 'end',
sourceFilter: src => src.name === '_http_outgoing.js'
});
console.log('agent: ready');
};

The instrumentation is done via agent.on('enter', fn) and agent.on('return', fn). The first instrumentation point is invoked when any function from the application is invoked and the second when a function returns execution the caller. The agent.on function has access to the frame variables and also to the method arguments. The arguments are used in a condition to check whether the function is HTTP handler. You can also note that the span object is injected into the response.

Now let’s run the application and Jaeger server:

docker run --rm -it --net=host jaegertracing/all-in-one:1.16.0
$GRAALVM_HOME/bin/npm install jaeger-client@3.17.2
$GRAALVM_HOME/bin/node --experimental-options --js.print=true --agentscript=jaegernode.js server.js
curl http://localhost:3000
Screenshot from Jaeger showing trace from t-trace-demo application.

Conclusion

We have seen a NodeJS hello-world example of T-Trace with Jaeger and OpenTracing. The demo shows how agent-like instrumentation can be applied to the NodeJS application without monkey-patching.

There are many improvements that can be made to the code. For instance we could extend it by not instrumenting the code in the agent script but rather reusing OpenTracing instrumentation for NodeJS and just install it in the agent script. Another possible improvement we could do is to support dynamic loading of the script. In this case the application will be started with the agent script that exposes REST API for loading and disabling scripts. This is a very powerful feature that enables changing the granularity of the instrumentation on the fly without recompiling and redeploying the application.

T-Trace is also able to mix the instrumentation language with the main language. For example, the agent script can be written in JavaScript to trace Ruby or C++ application.

References

--

--

Pavol Loffay
JaegerTracing

Software engineer working in observability space. Working on Hypertrace, OpenTelemetry, Jaeger, OpenTracing, MicroProfile projects.