Introducing Azkarra Streams 0.7

Florian Hussonnois
StreamThoughts
Published May 12, 2020
Azkarra Streams — Release 0.7

I am pleased to announce the release of Azkarra Streams 0.7. This new release packs several major new features.

For readers who are just discovering the framework, the Azkarra Streams project is dedicated to making the development of streaming microservices based on Apache Kafka Streams simple and fast.

With this new release, the Azkarra framework enhances Kafka Streams applications and makes them easier to operate in production.

This blog post summarizes the most important improvements.

Monitoring Kafka Streams Consumer Lag

A fundamental indicator to monitor on an Apache Kafka data streaming platform is the “consumer group lag”. The consumer group lag tells us how far our consumer applications are behind the producers, i.e., whether our applications are up to date in terms of record processing. Generally, the offset lag of a consumer is computed as (last_produced_record_offset - last_consumed_offset) for each topic/partition. So, the smaller the offset lag, the closer to real time the data consumption is.
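To make the formula concrete, here is a minimal Java sketch of the per-partition computation. It assumes, consistently with the sample response shown later in this post, that the log end offset is the offset the next produced record will receive, so the last produced record sits at log_end_offset - 1. The class and method names are illustrative only and are not part of Azkarra.

```java
public final class OffsetLag {

    private OffsetLag() {}

    /**
     * Computes the offset lag of a single topic/partition.
     *
     * @param logEndOffset   offset that the next produced record will receive
     * @param consumedOffset offset of the last record read by the consumer
     * @return the number of records produced but not yet consumed (never negative)
     */
    public static long lag(long logEndOffset, long consumedOffset) {
        // The last record actually written sits just before the log end offset.
        long lastProducedOffset = logEndOffset - 1;
        return Math.max(0L, lastProducedOffset - consumedOffset);
    }

    public static void main(String[] args) {
        // Values from the sample response: log_end_offset = 6, consumed_offset = 5.
        System.out.println(lag(6, 5));
        // A consumer that has read up to offset 89 of a log ending at 100.
        System.out.println(lag(100, 89));
    }
}
```

With the values from the sample response the sketch yields a lag of 0, and the second call yields 10 records still to be consumed.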

Azkarra Streams 0.7 introduces new capabilities to track the consumption of your Kafka Streams instances. A new REST endpoint (/api/v1/streams/:id/offset) exposes the last consumed and committed offsets for each internal consumer of your Kafka Streams instance.

The following example shows the JSON response returned from the new endpoint:

{
  "group": "word-count-topology",
  "consumers": [
    {
      "client_id": "word-count-topology-babe9079-fc6e-4b9e-a518-3067c899e692-StreamThread-1-consumer",
      "stream_thread": "word-count-topology-babe9079-fc6e-4b9e-a518-3067c899e692-StreamThread-1",
      "positions": [
        {
          "topic": "streams-plaintext-input",
          "partition": 0,
          "consumed_offset": 5,
          "consumed_timestamp": 1589230351903,
          "committed_offset": 6,
          "committed_timestamp": 1589230354458,
          "log_end_offset": 6,
          "log_start_offset": 0,
          "lag": 0
        },
        {
          "topic": "word-count-topology-count-repartition",
          "partition": 0,
          "consumed_offset": 11,
          "consumed_timestamp": 1589230351903,
          "committed_offset": 12,
          "committed_timestamp": 1589230354525,
          "log_end_offset": 12,
          "log_start_offset": 12,
          "lag": 0
        }
      ]
    }
  ]
}
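As a rough illustration, the endpoint could be queried from Java with the JDK's built-in HTTP client (Java 11+). The base URL (http://localhost:8080) and the streams instance id used below are assumptions for the sake of the example; replace them with the values of your own deployment. The class and method names are hypothetical helpers, not part of Azkarra.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public final class OffsetsClient {

    private OffsetsClient() {}

    /** Builds the URI of the offsets endpoint for the given streams instance id. */
    public static URI offsetsUri(String baseUrl, String streamsId) {
        return URI.create(baseUrl + "/api/v1/streams/" + streamsId + "/offset");
    }

    /** Sends a GET request to the offsets endpoint and returns the raw JSON body. */
    public static String fetchOffsets(String baseUrl, String streamsId) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(offsetsUri(baseUrl, streamsId))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Assumed host, port, and instance id: adjust them to your environment.
        // Only the URI is printed here; call fetchOffsets(...) against a running instance.
        System.out.println(offsetsUri("http://localhost:8080", "word-count-topology"));
    }
}
```

The returned body can then be parsed with any JSON library to extract the lag field of each position.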

Additionally, the Azkarra WebUI has been updated to give you quick access to these new indicators.

Azkarra Streams — Consumer Offsets Lags

Exporting Kafka Streams States to Kafka

The Kafka Streams API provides a few methods to get access to both the current state of a streams instance (i.e., KafkaStreams#state()) and the runtime metadata of the internal threads that execute your topology (i.e., KafkaStreams#localThreadsMetadata()).

In our work building event streaming architectures based on Apache Kafka, we have observed a common practice of publishing the data returned by these methods to a dedicated Kafka topic. When you run many Kafka Streams microservices, that approach provides a centralized way to discover both the running instances and their current states. So, we thought it would be a nice idea to offer this as a built-in feature in Azkarra.

Azkarra Streams 0.7 introduces a new StreamsLifecycleInterceptor, called MonitoringStreamsInterceptor, that periodically publishes the streams state in the form of events that adhere to the CloudEvents specification.

The CloudEvents specification is developed under the Cloud Native Computing Foundation and aims to provide a standardized, protocol-agnostic way of describing the structure and metadata of events.
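To give a feel for what such an envelope looks like, the sketch below assembles a map carrying the four context attributes that CloudEvents 1.0 requires (specversion, id, source, type) plus a few optional ones. The source and type values, the shape of the data payload, and the class name are hypothetical: they are not the attributes actually emitted by MonitoringStreamsInterceptor.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

public final class StreamsStateEvent {

    private StreamsStateEvent() {}

    /** Builds a CloudEvents-style envelope describing the state of a streams instance. */
    public static Map<String, Object> of(String applicationId, String state) {
        Map<String, Object> event = new LinkedHashMap<>();
        // Required CloudEvents 1.0 context attributes.
        event.put("specversion", "1.0");
        event.put("id", UUID.randomUUID().toString());
        event.put("source", "/example/kafkastreams/" + applicationId); // hypothetical value
        event.put("type", "com.example.kafkastreams.state");           // hypothetical value
        // Optional context attributes.
        event.put("time", Instant.now().toString());
        event.put("datacontenttype", "application/json");
        // Event payload (hypothetical shape).
        event.put("data", Map.of("state", state));
        return event;
    }

    public static void main(String[] args) {
        System.out.println(of("word-count-topology", "RUNNING"));
    }
}
```

Because the metadata lives in well-known attributes, a consumer of the topic can route or filter events on type and source without having to parse the payload.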

Azkarra Streams — Streams State Cloud Event

Built-in integration with Micrometer

Another new monitoring-related feature in 0.7 is the integration between Azkarra Streams and Micrometer, which provides more advanced capabilities for collecting metrics and exposing them to external monitoring systems.

This integration is provided by a new module called Azkarra Metrics:

<dependency>
    <groupId>io.streamthoughts</groupId>
    <artifactId>azkarra-metrics</artifactId>
    <version>0.7.0</version>
</dependency>

For example, Azkarra Metrics allows you to easily collect JVM metrics (class loaders, memory, GC, threads) and Kafka Streams metrics, and to expose them from a single HTTP endpoint (e.g., /prometheus).

For more information about Azkarra Metrics, please read the documentation.

REST Extensions

Since its first public release, Azkarra Streams has provided several REST endpoints that you can use to monitor and manage the lifecycle of your Kafka Streams instances. But sometimes, you may want to expose additional REST endpoints to extend Azkarra's capabilities.

This new release introduces a new mechanism, called REST extensions, that allows you to register JAX-RS resources that will be loaded and initialized by the embedded web server.

This new feature is inspired by the same capability that Kafka Connect offers. REST Extensions are already used by the Azkarra Streams project itself with the new Azkarra Metrics module.

Custom Jackson Serialization

One of the main benefits offered by Azkarra Streams is its native support for interactive queries. Azkarra allows you to easily retrieve data from the state stores of Kafka Streams instances using the REST API.

Internally, Azkarra uses the Jackson library for JSON serialization and provides several serializers, for example, to handle Avro records. In most cases, these serializers are sufficient. But sometimes, you may need to configure your own custom serializers in order to use interactive queries.

Azkarra Streams now lets you register custom Jackson modules that will be configured automatically.

Example:

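import com.fasterxml.jackson.databind.Module;
import com.fasterxml.jackson.databind.module.SimpleModule;
// @Factory and @Component are Azkarra annotations; their imports are omitted here.
// MyCustomType and MyCustomSerializer stand for your own classes.

@Factory
public class JacksonModuleFactory {

    // Azkarra picks up this component and configures the module automatically.
    @Component
    public Module customModule() {
        var module = new SimpleModule();
        module.addSerializer(MyCustomType.class, new MyCustomSerializer());
        return module;
    }
}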

UncaughtExceptionHandler Strategies

In previous versions, when an internal StreamThread of a Kafka Streams instance terminated abruptly due to an uncaught exception, Azkarra would immediately stop the corresponding KafkaStreams instance.

Azkarra 0.7 introduces the new Java interface StreamThreadExceptionHandler, which allows you to implement a custom strategy for handling uncaught exceptions. In addition, Azkarra provides three built-in implementations:

  • CloseKafkaStreamsOnThreadException
  • LogAndSkipOnThreadException
  • RestartKafkaStreamsOnThreadException

The strategy to use can be configured with a new configuration property, default.stream.thread.exception.handler.
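For readers less familiar with the underlying mechanism: a JVM thread that dies from an uncaught exception hands that exception to its Thread.UncaughtExceptionHandler, and this is the point where a strategy such as closing, skipping, or restarting can be applied. The sketch below demonstrates only this plain JDK mechanism with a simulated failure. It does not use Azkarra's StreamThreadExceptionHandler interface, whose exact signature is not shown in this post, and the class name and thread name are made up for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

public final class UncaughtHandlerDemo {

    private UncaughtHandlerDemo() {}

    /**
     * Starts a thread that fails immediately and returns what the
     * uncaught exception handler observed.
     */
    public static String runFailingThread() {
        AtomicReference<String> observed = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated processing failure");
        }, "demo-StreamThread-1");
        // The handler runs on the dying thread, just before it terminates.
        // A real strategy would decide here whether to close, skip, or restart.
        worker.setUncaughtExceptionHandler((thread, error) ->
                observed.set(thread.getName() + ": " + error.getMessage()));
        worker.start();
        try {
            worker.join(); // wait until the failed thread has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(runFailingThread());
    }
}
```

Keeping the decision in one pluggable handler is what makes it possible to switch strategies through configuration rather than through code changes.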

Enhanced components registration

Azkarra provides a very simple mechanism to implement the Dependency Injection pattern. Like every new release of Azkarra, this one offers new capabilities to help you declare your components.
Among the new features, Azkarra 0.7 adds support for the following annotations, which you can now use on any class or method directly or indirectly annotated with @Component:

  • @Primary : Indicates that a component must be selected in the case of multiple possible implementations.
  • @Secondary : Indicates that a component must be de-prioritized in the case of multiple possible implementations.
  • @ConditionOn: Indicates one or more conditions that need to be fulfilled for a component to be eligible for use in the application.
  • @Eager : Indicates that a component should be eagerly initialized during application startup itself.

Support for Apache Kafka 2.5

Finally, Azkarra Streams is always built on the most recent version of Kafka. Therefore, this new release adds support for version 2.5.

Join the Azkarra Streams community on Slack

The Azkarra project now has a dedicated Slack workspace where you can get help and talk with the community. Join us!

About Us:

StreamThoughts is an open source technology consulting company. Our mission is to help organizations create systems and applications that reflect how their business actually works, by helping them get easy access to their data in real time.

We deliver high-quality professional services and training in France, covering data engineering, event streaming technologies, the Apache Kafka ecosystem, and the Confluent Inc. streaming platform.

Florian Hussonnois
StreamThoughts

Lead Software Engineer @kestra-io | Co-founder @Streamthoughts | Apache Kafka | Open Source Enthusiast | Confluent Kafka Community Catalyst.