Serverless observability made easy with Lambda Powertools for Java — Metrics & Logging

Metrics

dependencies {
    aspect 'software.amazon.lambda:powertools-metrics:1.12.1'
}
  • We retrieve the MetricsLogger from Lambda Powertools instead of instantiating one:
MetricsLogger metricsLogger = MetricsUtils.metricsLogger();
  • While it’s still possible, we don’t need to specify the namespace everywhere in the code. We can set it using an environment variable:
env.put("POWERTOOLS_METRICS_NAMESPACE", "DeliveryApi");
  • Adding the @Metrics annotation wraps the handler method and lets us remove the call to flush at the end of the method. Not only that, but Lambda Powertools also ensures that metrics are flushed on every invocation, even when an exception is thrown within the function. This greatly simplifies the error handling (finally blocks) in our code. The annotation also provides the captureColdStart option, which we can set to automatically capture the number of cold starts as an additional metric.
@Tracing
@Metrics(captureColdStart = true)
public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent input, Context context) {
    // ...
}
  • Apart from that, we can add dimensions and collect metrics exactly the same way as before:
metricsLogger.putDimensions(DimensionSet.of("FunctionName", "CreateSlots"));
metricsLogger.putMetric("SlotsCreated", rowsUpdated, Unit.COUNT);
  • As a result, we have almost the same logs in CloudWatch:
X-Ray trace id in the metrics
  • The main difference in the logs is that we now have a function_request_id and an xray_trace_id. These two identifiers, and especially the trace id, are really helpful when investigating an issue: they let us correlate the collected metrics with the corresponding trace in X-Ray. For example, it would be normal for the CreateSlots function to take more time when multiple slots have to be created, so a slow trace can be checked against the SlotsCreated metric recorded for the same request.
  • In addition, we also get the number of cold starts, which is an important metric to track if we want to provide the best experience to our users, for example when deciding whether to enable provisioned concurrency.
Cold starts automatically measured by powertools-metrics
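To make concrete what the @Metrics annotation spares us, here is a rough sketch of the try/finally boilerplate we would otherwise write by hand. It is not the Powertools implementation: BufferingMetrics is a hypothetical stand-in for the real MetricsLogger, and handle is a made-up handler body used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushOnExitSketch {

    /** Hypothetical stand-in for a metrics logger: buffers metrics until flush() is called. */
    static class BufferingMetrics {
        final List<String> buffered = new ArrayList<>();
        final List<String> flushed = new ArrayList<>();

        void putMetric(String name, double value) {
            buffered.add(name + "=" + value);
        }

        void flush() {
            flushed.addAll(buffered);
            buffered.clear();
        }
    }

    /** Made-up handler body: records a metric, then optionally fails. */
    static String handle(BufferingMetrics metrics, boolean fail) {
        try {
            metrics.putMetric("SlotsCreated", 3);
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            return "ok";
        } finally {
            // The boilerplate the annotation makes unnecessary:
            // the flush runs whether the handler returns or throws.
            metrics.flush();
        }
    }

    public static void main(String[] args) {
        BufferingMetrics metrics = new BufferingMetrics();
        try {
            handle(metrics, true);
        } catch (IllegalStateException e) {
            System.out.println("handler failed: " + e.getMessage());
        }
        // The metric recorded before the failure has still been flushed.
        System.out.println("flushed: " + metrics.flushed);
    }
}
```

With the annotation in place, the finally block disappears from every handler, which is where most of the simplification comes from.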

Logging

  1. First, we had a bunch of debug logs, … which are obviously disabled in production. In order to get some debug logs in prod, we’d need to modify our log4j2.xml config file and redeploy the functions. Not really ideal during a crisis.
  2. Second, the few logs we had in CloudWatch were raw messages. While these are human-readable, it is really hard to search them and extract valuable information without being a grep expert or using more advanced/expensive tools (Elasticsearch, Splunk, Datadog, …).
  3. Lastly, the logs were missing a lot of context information: a user id for example, the farm id, … a request id or the famous trace id I mentioned earlier. Due to this lack of information, they were really hard to filter and to correlate with any event.
dependencies {
    aspect 'software.amazon.lambda:powertools-logging:1.12.1'
}

Log level
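The first pain point above was having to edit log4j2.xml and redeploy to get debug logs. One way around this is to drive the level from the function configuration instead, and Powertools documents a POWERTOOLS_LOG_LEVEL environment variable for that purpose. The sketch below only illustrates the idea and is not the library's code: the Level enum, the resolveLevel helper and the INFO fallback are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.Map;

public class LogLevelFromEnvSketch {

    /** Illustrative subset of log levels; a real logging framework defines its own. */
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR }

    /**
     * Resolves the effective level from the given environment map.
     * Falls back to INFO when the variable is missing, blank, or not a known level.
     */
    static Level resolveLevel(Map<String, String> env) {
        String raw = env.get("POWERTOOLS_LOG_LEVEL");
        if (raw == null || raw.trim().isEmpty()) {
            return Level.INFO;
        }
        try {
            return Level.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Unknown value: keep the safe default rather than failing the invocation.
            return Level.INFO;
        }
    }

    public static void main(String[] args) {
        // In Lambda, System.getenv() carries the values set in the function configuration.
        System.out.println(resolveLevel(System.getenv()));
    }
}
```

Changing an environment variable is a configuration update rather than a new build, which is much easier to do in the middle of an incident.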

Structured logging and additional context

Well-Architected Framework — Serverless Lens — Operational Excellence
  • To get the structured logging, we need to configure our log4j2.xml to use the JsonTemplateLayout and a layout provided in the Lambda Powertools library:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
    <Appenders>
        <Console name="JsonAppender" target="SYSTEM_OUT">
            <JsonTemplateLayout eventTemplateUri="classpath:LambdaJsonLayout.json" />
        </Console>
    </Appenders>
    <Loggers>
        <Logger name="JsonLogger" level="INFO" additivity="false">
            <AppenderRef ref="JsonAppender"/>
        </Logger>
        <Root level="info">
            <AppenderRef ref="JsonAppender"/>
        </Root>
    </Loggers>
</Configuration>
  • To get the additional context, we simply add the @Logging annotation to the handleRequest method. We can complement this with additional information using custom keys, in our case the farmId:
@Logging
@Tracing
@Metrics
public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent input, Context context) {
    // ...
    LoggingUtils.appendKey("farmId", farmId);
    // ...
}
  • Optionally we can log the event that triggered the function using the logEvent option: @Logging(logEvent = true).
Structured logs (JSON)
  • Information about the function itself (arn, name, version).
  • Whether the invocation happened during a cold start.
  • The “business” context (farm id, user id, slot id).
  • And last but not least, the X-Ray trace id, which lets us correlate this entry with other logs and jump into X-Ray for further analysis of the request.
CloudWatch Logs Insights (filter & query on the left, discovered fields on the right)
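To illustrate why structured entries are so much easier to query, here is a small sketch of how context keys and the trace id can end up as separate JSON fields. It is a simplification, not what the Powertools layout actually does: extractTraceId, quote and toJson are hypothetical helpers, the farm-42 value is made up, and the sketch assumes the trace header is exposed through the _X_AMZN_TRACE_ID environment variable in the form Root=...;Parent=...;Sampled=....

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StructuredLogSketch {

    /** Extracts the Root segment from an X-Ray trace header, or null if absent. */
    static String extractTraceId(String traceHeader) {
        if (traceHeader == null) {
            return null;
        }
        for (String part : traceHeader.split(";")) {
            String trimmed = part.trim();
            if (trimmed.startsWith("Root=")) {
                return trimmed.substring("Root=".length());
            }
        }
        return null;
    }

    /** Minimal JSON string quoting for backslashes and double quotes (enough for this sketch). */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders the message plus all context keys as a single-line JSON object. */
    static String toJson(String message, Map<String, String> context) {
        StringBuilder json = new StringBuilder("{");
        json.append(quote("message")).append(":").append(quote(message));
        for (Map.Entry<String, String> entry : context.entrySet()) {
            json.append(",")
                .append(quote(entry.getKey()))
                .append(":")
                .append(quote(entry.getValue()));
        }
        return json.append("}").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the keys in insertion order in the rendered entry.
        Map<String, String> context = new LinkedHashMap<>();
        context.put("farmId", "farm-42"); // made-up business key

        // Outside Lambda the variable is usually unset, so the field is simply omitted.
        String traceId = extractTraceId(System.getenv("_X_AMZN_TRACE_ID"));
        if (traceId != null) {
            context.put("xray_trace_id", traceId);
        }
        System.out.println(toJson("Slots created", context));
    }
}
```

Because every key becomes its own field, a tool like CloudWatch Logs Insights can filter on farmId or xray_trace_id directly instead of pattern-matching on raw message text.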

Side Notes on performance

Percentiles (duration & initDuration) for the BookingSlot Lambda function without Lambda Powertools
Percentiles (duration & initDuration) for the BookingSlot Lambda function with Lambda Powertools (Logging/Metrics/Tracing)

Conclusion

Jérôme Van Der Linden


Senior Solution Architect @AWS - software craftsman, agile and devops enthusiastic, cloud advocate. Opinions are my own.