T-Trace: agent-like instrumentation for GraalVM
In this article we will look at the agent-like instrumentation tool T-Trace. The tool provides non-intrusive instrumentation capabilities for applications running on GraalVM. We will instrument a simple NodeJS application by using T-Trace and OpenTracing API with Jaeger NodeJS tracer.
So first things first, GraalVM is a polyglot virtual machine. It can run any JVM based language but also other mainstream languages like NodeJS, Python and Ruby… It also supports LLVM languages like C and C++. The sweet spot is that the languages can be combined together in a single application. Another very interesting feature is that GraalVM can produce a native executable for JVM languages. This feature is used by modern cloud-native Java runtimes like Quarkus and Micronaut.
Styles of instrumentation
Before we deep dive into the demo application let’s have a look at different instrumentation techniques. There are several instrumentation styles that can be applied to code fragments or the whole application. At a high level we can talk about black-box and white-box instrumentation. The black-box technique does not require any modifications to the application. The monitoring system, in this case, uses only events produced by the application. These events can be logs or any data that the application produces.
In this article we are going to focus on the white-box technique. This technique assumes that the code of the application is known and can be inspected. There are several methods of how the instrumentation points can be added to the application:
- Non-intrusive agent-like instrumentation: in this case the instrumentation is added to the application dynamically. In some languages (Python, Ruby, JavaScript) this technique is called monkey-patching where modules or classes are dynamically replaced at runtime. In other languages less dynamic techniques are allowed, for instance
-javaagent
in Java. - Explicit instrumentation: this instrumentation is applied directly to the code. However it does not mean that the application developer has to instrument all the code. The instrumentation code can be embedded directly into (RPC) frameworks or external integration. In some languages and runtime frameworks it is easier to initialize the instrumentation like in other languages. For instance in Java it’s enough just to drop an instrumentation JAR into the classpath, whereas in statically linked languages like Golang the initialization has to be done explicitly.
The application
The application is a simple NodeJS server with one handler. The full demo code with instructions can be found in pavolloffay/graalvm-t-trace. Here is the server.js
file:
const port = 3000;
const http = require("http");
const srv = http.createServer((req, res) => {
console.log(`server: obtained request ${res.id}`);
setTimeout(() => {
res.write(`OK# ${res.id}`);
console.log(`server: replied to request ${res.id}`);
res.end();
}, 5);
});
srv.listen(port);
As you can see there is no tracing related in the application. The application prints request ids at the beginning and at the end of the request. The res.id
is set in the agent script — this is just for the demo purpose to demonstrate that the agent script can pass objects to the main application.
Now let’s have a look at the instrumentation script jaegernode.js
. There are two functions: initializeJaeger
and initializeAgent
. The first function creates Jaeger tracer instance with HTTP sender to the collector port 14268
and calls the second one at the end:
let initializeJaeger = function (require) {
let http = require("http"); console.log(`${typeof
http.createServer} http.createServer is available to the agent`);
var initTracer = require('jaeger-client').initTracer;
console.log('server: Jaeger tracer obtained');// See schema https://github.com/jaegertracing/jaeger-client-node/blob/master/src/configuration.js#L37
var config = {
serviceName: 't-trace-demo',
reporter: {
// Provide the traces endpoint; this forces the client to connect directly to the Collector and send
// spans over HTTP
collectorEndpoint: 'http://localhost:14268/api/traces',
// logSpans: true
},
sampler: {
type : 'const',
param : 1
}
};
var options = {
tags: {
't-trace-demo.version': '1.1.2',
},
// metrics: metrics,
logger: console,
sampler: {
type : 'const',
param : 1
}
};
var tracer = initTracer(config, options);
initializeAgent(tracer);
};
let waitForRequire = function (event) {
if (typeof process === 'object' && process.mainModule &&
process.mainModule.require) {
agent.off('source', waitForRequire);
initializeJaeger(
process.mainModule.require.bind(process.mainModule));
}
};
agent.on('source', waitForRequire, { roots: true });
The second function is initializeAgent
which is instrumenting the application code:
let initializeAgent = function(tracer) {
agent.on('enter', function(ctx, frame) {
const args = frame.args;
if ('request' !== frame.type || args.length !== 2 || typeof args[0] !== 'object' || typeof args[1] !== 'object') {
return;
}
const req = args[0];
const res = args[1];
const span = tracer.startSpan(req.method);
span.setTag("span.kind", "server");
span.setTag("http.url", req.url);
span.setTag("http.method", req.method);
res.id = span.context().spanIdStr;
res.span = span;
console.log(`agent: handling #${res.id} request for ${req.url}`);
}, {
roots: true,
rootNameFilter: name => name === 'emit',
sourceFilter: src => src.name === 'events.js'
});
agent.on('return', function(ctx, frame) {
var res = frame['this'];
if (res.span) {
res.span.setTag("http.status_code", res.statusCode);
if (res.statusCode >= 400) {
res.span.setTag("error", "true");
}
res.span.finish();
console.log(`agent: finished #${res.id} request`);
} else {
// OK, caused for example by Tracer itself connecting to Jaeger server
}
}, {
roots: true,
rootNameFilter: name => name === 'end',
sourceFilter: src => src.name === '_http_outgoing.js'
});
console.log('agent: ready');
};
The instrumentation is done via agent.on('enter', fn)
and agent.on('return', fn)
. The first instrumentation point is invoked when any function from the application is invoked and the second when a function returns execution the caller. The agent.on
function has access to the frame variables and also to the method arguments. The arguments are used in a condition to check whether the function is HTTP handler. You can also note that the span object is injected into the response.
Now let’s run the application and Jaeger server:
docker run --rm -it --net=host jaegertracing/all-in-one:1.16.0
$GRAALVM_HOME/bin/npm install jaeger-client@3.17.2
$GRAALVM_HOME/bin/node --experimental-options --js.print=true --agentscript=jaegernode.js server.js
curl http://localhost:3000
Conclusion
We have seen a NodeJS hello-world example of T-Trace with Jaeger and OpenTracing. The demo shows how agent-like instrumentation can be applied to the NodeJS application without monkey-patching.
There are many improvements that can be made to the code. For instance we could extend it by not instrumenting the code in the agent script but rather reusing OpenTracing instrumentation for NodeJS and just install it in the agent script. Another possible improvement we could do is to support dynamic loading of the script. In this case the application will be started with the agent script that exposes REST API for loading and disabling scripts. This is a very powerful feature that enables changing the granularity of the instrumentation on the fly without recompiling and redeploying the application.
T-Trace is also able to mix the instrumentation language with the main language. For example, the agent script can be written in JavaScript to trace Ruby or C++ application.
References
- Demo code: https://github.com/pavolloffay/graalvm-t-trace
- T-Trace: https://github.com/oracle/graal/blob/master/tools/docs/T-Trace.md
- T-Trace, polyglot tracing: https://github.com/oracle/graal/blob/master/tools/docs/T-Trace-Manual.md#trully-polyglot---t-trace-any-language
- T-Trace embedded tracing: https://github.com/oracle/graal/blob/master/tools/docs/T-Trace-Embedding.md