Docker-free Kafka integration tests

Published in Data rocks, Feb 8, 2021

Source: https://appsoft.pro/wp-content/uploads/2020/01/CKZjY_HXAAIxjfZ.png

Are you tired of waiting for Docker containers to fire up while testing your Kafka solution? Are you tired of waiting for an ephemeral environment to be available before running your tests? Are you looking for a way to reduce the time you lose waiting for an Apache Kafka cluster and Confluent Schema Registry to start while testing?

Source: https://media.giphy.com/media/LRiazihWvlztC/giphy.gif

schemaregistry-junit is the answer you were looking for!

By pairing schemaregistry-junit together with kafka-junit, you can speed up your tests and shorten the feedback loop by reducing the time lost waiting for Docker containers or ephemeral environments to start up. schemaregistry-junit allows you to run a fully working Confluent Schema Registry completely in-memory. The server lifecycle is fully automated via JUnit extensions. The library provides a fluent DSL to configure the Confluent Schema Registry as part of your JUnit test definition.

A bit of context

While testing my code using Kafka Streams or even a basic Kafka Consumer or Producer, I spent a lot of time waiting for a temporary test environment to be ready before running my tests. I then replaced the ephemeral test environment with Testcontainers. Although I saw an improvement, the time lost waiting for the Docker containers to boot was not negligible. I soon realised that the time spent waiting for test resources to become available was longer than the actual test execution time.

One could ask: why didn’t you use the dev cluster to run your tests? There were two main reasons:

I wanted my tests to run in isolation, therefore I needed a clean environment for each run.
I was testing new code, hence I was not 100% sure about its correctness. Given how crucial Kafka is to my architecture, I did not want to introduce any bad messages into the topics I was writing to. Broken messages could become poison pills for downstream consumers and destabilise the dev environment, affecting other teams’ work.

Still not happy with the speed offered by Testcontainers, and conscious that my tests had no external dependencies other than a Kafka cluster and a Confluent Schema Registry, I started looking for an in-memory alternative. That’s how I came across kafka-junit. Quoting the kafka-junit README:

This library wraps Apache Kafka’s KafkaServerStartable class and allows you to easily create and run tests against one or more “real” kafka brokers. No longer do you need to setup and coordinate with an external kafka cluster for your tests! The library transparently supports running a single or multi-broker cluster. Running a multi-broker cluster allows you to validate how your software reacts under various error scenarios, such as when one or more brokers become unavailable.

Source: https://media.giphy.com/media/11sBLVxNs7v6WA/giphy.gif

Sadly, my joy was short-lived. I was still missing the Schema Registry. Although I evaluated the hybrid approach of using kafka-junit plus a Dockerised Confluent Schema Registry, the time required by the Confluent Schema Registry to start up inside Docker was still on the order of tens of seconds. These are the reasons that led to the development of schemaregistry-junit.

How does it work

Both JUnit4 and JUnit5 offer a way to manage external resources as part of the test class definition. JUnit4 exposes this functionality via ClassRule, while JUnit5 provides a more powerful approach called Extension.
A resource registered as a ClassRule (typically one built on JUnit4's ExternalResource) exposes two functions, before and after: the former sets the resource up, and the latter cleans it up. If a test class contains a ClassRule definition, JUnit invokes before prior to running any test and invokes after once all the tests in the class have run, regardless of their results. Extensions in JUnit5 follow the same pattern with richer semantics.
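
The before/after contract can be sketched in plain Java without any JUnit dependency. This is only a minimal model of the pattern: ManagedResource, around and LifecycleDemo are hypothetical names invented for illustration, not JUnit types. It shows that the clean-up step still runs when a test fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JUnit-managed resource; NOT the JUnit API itself.
abstract class ManagedResource {
    protected abstract void before() throws Exception; // set the resource up
    protected abstract void after();                   // clean the resource up

    // Wraps a block of tests the way a class-level rule wraps a test class:
    // before() runs first, after() runs last even if a test throws.
    final void around(Runnable tests) throws Exception {
        before();
        try {
            tests.run();
        } finally {
            after();
        }
    }
}

class LifecycleDemo {
    // Records the order of lifecycle events for one simulated run.
    static List<String> run(boolean failingTest) {
        List<String> events = new ArrayList<>();
        ManagedResource resource = new ManagedResource() {
            @Override protected void before() { events.add("before"); }
            @Override protected void after() { events.add("after"); }
        };
        try {
            resource.around(() -> {
                events.add("test");
                if (failingTest) {
                    throw new IllegalStateException("simulated test failure");
                }
            });
        } catch (Exception e) {
            // The failure surfaces only after the clean-up has already run.
            events.add("failure reported");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In both runs the "after" event is recorded, which is the property that makes it safe to start a server in before and rely on after to shut it down.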

In the context of schemaregistry-junit, the before function configures and starts the Schema Registry server while the after function stops and cleans up.

In more detail, the before function proxies to the test resource's start function, which mimics the behaviour of the Schema Registry's main function. start executes four steps:

1. instantiate the SchemaRegistryConfig, which validates the user-specified properties
2. instantiate the SchemaRegistryRestApplication using the config created at step 1
3. create the Jetty server using the SchemaRegistryRestApplication made at step 2
4. start the Jetty server created at step 3

After step 4 is executed, a fully working in-memory Confluent Schema Registry is up and running.
Once all tests have run, the after method proxies to the stop function, which stops the Jetty server.
Both start and stop take just a few milliseconds.
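
The shape of the start and stop functions can be sketched as follows. To keep the example self-contained, every type below (StubConfig, StubRestApplication, StubServer, StubSchemaRegistryResource) is a hypothetical stand-in that only mirrors the order of the four steps; none of them is the Confluent or Jetty class named in the text, and the real library may be structured differently. The only real name used is the Schema Registry setting kafkastore.bootstrap.servers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the config object: validates properties up front.
final class StubConfig {
    final Map<String, String> props;

    StubConfig(Map<String, String> props) {
        String servers = props.get("kafkastore.bootstrap.servers");
        if (servers == null || servers.trim().isEmpty()) {
            throw new IllegalArgumentException("kafkastore.bootstrap.servers is required");
        }
        this.props = new HashMap<>(props);
    }
}

// Hypothetical stand-in for the REST application built from the config.
final class StubRestApplication {
    final StubConfig config;

    StubRestApplication(StubConfig config) { this.config = config; }

    StubServer createServer() { return new StubServer(); }
}

// Hypothetical stand-in for the embedded HTTP server.
final class StubServer {
    private boolean running;

    void start() { running = true; }
    void stop() { running = false; }
    boolean isRunning() { return running; }
}

// Hypothetical test resource: start() walks through the four steps in order.
final class StubSchemaRegistryResource {
    private final Map<String, String> props;
    private StubServer server;

    StubSchemaRegistryResource(Map<String, String> props) { this.props = props; }

    void start() {
        StubConfig config = new StubConfig(props);                 // step 1
        StubRestApplication app = new StubRestApplication(config); // step 2
        server = app.createServer();                               // step 3
        server.start();                                            // step 4
    }

    void stop() {
        if (server != null) {
            server.stop();
        }
    }

    boolean isRunning() { return server != null && server.isRunning(); }
}
```

Because validation happens in step 1, a misconfigured resource fails before any server object is created, so there is nothing to clean up in that case.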

Thanks to kafka-junit and schemaregistry-junit, the time lost waiting for test resources is reduced to just hundreds of milliseconds.

Source: https://media.giphy.com/media/7srpeY4TZMrO8/giphy.gif

Only one gotcha

schemaregistry-junit needs kafka-junit to be running before it can start, because Confluent Schema Registry uses Kafka as its storage layer and therefore needs the broker addresses of a running cluster. kafka-junit exposes this information only after it is up and running. This requirement defines a clear ordering between the two resources. While in JUnit5 the order between Extensions can easily be set using the Order annotation, JUnit4 does not offer such a mechanism. Quoting the JUnit4 ClassRule documentation:

If there are multiple annotated ClassRules on a class, they will be applied in an order that depends on your JVM's implementation of the reflection API, which is undefined, in general. However, Rules defined by fields will always be applied before Rules defined by methods.

Hence, in JUnit4, kafka-junit should be defined via a field, while a method should be used to expose schemaregistry-junit. This way kafka-junit is started before schemaregistry-junit.
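
The ordering constraint can be illustrated without either library. In this sketch FakeBroker and FakeRegistry are hypothetical stand-ins, and the address localhost:9092 is an arbitrary illustrative value. The registry receives a Supplier rather than a String, so the broker address is looked up only when the registry starts. Starting the registry first fails, which is why the start order matters.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the Kafka test resource.
final class FakeBroker {
    private String connectString; // unknown until start() has run

    void start() { connectString = "localhost:9092"; } // illustrative value

    String getConnectString() {
        if (connectString == null) {
            throw new IllegalStateException("broker not started yet");
        }
        return connectString;
    }
}

// Hypothetical stand-in for the Schema Registry test resource.
final class FakeRegistry {
    private final Supplier<String> bootstrapServers;
    private String resolved;

    FakeRegistry(Supplier<String> bootstrapServers) {
        this.bootstrapServers = bootstrapServers;
    }

    // The supplier is evaluated here, not when the registry is constructed.
    void start() { resolved = bootstrapServers.get(); }

    String bootstrapServers() { return resolved; }
}

class OrderingDemo {
    // Broker first, registry second: the lookup succeeds.
    static String startInOrder() {
        FakeBroker broker = new FakeBroker();
        FakeRegistry registry = new FakeRegistry(broker::getConnectString);
        broker.start();
        registry.start();
        return registry.bootstrapServers();
    }

    // Registry first: the broker address is not available yet.
    static boolean startOutOfOrderFails() {
        FakeBroker broker = new FakeBroker();
        FakeRegistry registry = new FakeRegistry(broker::getConnectString);
        try {
            registry.start();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

Note that constructing the registry before the broker has started is harmless; only the start calls need to happen in the right order.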

How to add schemaregistry-junit to your project

schemaregistry-junit is available on Maven Central. You can add it to your project like any other Java dependency. Below is an example using Gradle and JUnit5:

dependencies {
  implementation "io.confluent:kafka-avro-serializer:6.0.0"
  implementation "io.confluent:kafka-schema-registry-client:6.0.0"

  testImplementation "com.salesforce.kafka.test:kafka-junit5:3.2.2"

  // schema-registry junit
  testImplementation 'io.github.data-rocks-team:schemaregistry-junit5:0.1.1'
  testImplementation "io.confluent:kafka-schema-registry:6.0.0"
}

Note how schemaregistry-junit is followed by io.confluent:kafka-schema-registry. This extra dependency is required because schemaregistry-junit is Schema Registry version agnostic: it does not bundle any Confluent Schema Registry, and instead loads whichever version is available on the classpath.
schemaregistry-junit works with every Confluent Schema Registry version from 4.0.0 onward (4.0.0 was released in 2017).

For a JUnit4 setup, there are only two differences: kafka-junit5 should be replaced with kafka-junit4, and schemaregistry-junit5 should be replaced with schemaregistry-junit4.

dependencies {
  implementation "io.confluent:kafka-avro-serializer:6.0.0"
  implementation "io.confluent:kafka-schema-registry-client:6.0.0"

  testImplementation "com.salesforce.kafka.test:kafka-junit4:3.2.2"

  // schema-registry junit
  testImplementation 'io.github.data-rocks-team:schemaregistry-junit4:0.1.1'
  testImplementation "io.confluent:kafka-schema-registry:6.0.0"
}

How to use schemaregistry-junit with JUnit 5

In order to use schemaregistry-junit alongside JUnit5, you need to add the following code

// Started first: the Schema Registry needs a running broker.
@RegisterExtension
@Order(1)
static final SharedKafkaTestResource kafka =
    new SharedKafkaTestResource().withBrokers(1);

// Started second: resolves the broker address from the Kafka resource.
@RegisterExtension
@Order(2)
static final SharedSchemaRegistryTestResource schemaRegistry =
    new SharedSchemaRegistryTestResource()
        .withBootstrapServers(kafka::getKafkaConnectString);

Note how SharedKafkaTestResource is used to configure the SharedSchemaRegistryTestResource; this is why we need the ordering annotation.
Inside your tests, you can use schemaRegistry.schemaRegistryUrl() to access the in-memory Schema Registry, or schemaRegistry.schemaRegistryTestUtils().schemaRegistryClient() to obtain a preconfigured Schema Registry client.
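
As an illustration of where the two addresses end up, here is a minimal sketch that builds Avro producer settings from them. AvroClientProps and producerProps are hypothetical helper names, not part of either library. The property keys and serializer class names are the standard Kafka and Confluent ones, and the sketch assumes String keys with Avro values.

```java
import java.util.Properties;

// Hypothetical helper: collects the settings an Avro producer needs in a test.
class AvroClientProps {
    static Properties producerProps(String bootstrapServers, String schemaRegistryUrl) {
        Properties props = new Properties();
        // Address of the in-memory Kafka broker.
        props.put("bootstrap.servers", bootstrapServers);
        // Assumption: String keys and Avro values.
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "io.confluent.kafka.serializers.KafkaAvroSerializer");
        // Address of the in-memory Schema Registry.
        props.put("schema.registry.url", schemaRegistryUrl);
        return props;
    }
}
```

In a test this would typically be called as AvroClientProps.producerProps(kafka.getKafkaConnectString(), schemaRegistry.schemaRegistryUrl()), and the result passed to a KafkaProducer.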

How to use schemaregistry-junit with JUnit 4

The JUnit4 setup does not differ much from the JUnit 5 one; the only difference is the absence of an ordering annotation.

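// Field-based rule: applied before any method-based rule.
@ClassRule
public static final SharedKafkaTestResource kafka =
    new SharedKafkaTestResource().withBrokers(1);

// Plain field, deliberately NOT annotated with @ClassRule.
private static final SharedSchemaRegistryTestResource schemaRegistry =
    new SharedSchemaRegistryTestResource()
        .withBootstrapServers(kafka::getKafkaConnectString);

// Method-based rule: applied after the Kafka rule above.
@ClassRule
public static TestRule schemaRegistry() {
  return schemaRegistry;
}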

In this case, the SharedSchemaRegistryTestResource field is not annotated with ClassRule; instead, the schemaRegistry() function holds the ClassRule annotation. This approach guarantees the ordering we need between the two ClassRules.

Conclusion

schemaregistry-junit together with kafka-junit has proved to be an effective way to speed up tests and shorten the feedback loop by removing the dependence on Docker containers or ephemeral environments.

If you are interested in using schemaregistry-junit, but you still don’t have a clear idea of how it works, take a look at the examples listed in the project README here.

schemaregistry-junit is fully open source and licensed under the MIT License. The latest version is available on Maven Central under the io.github.data-rocks-team group (direct link here). The latest JavaDoc is automatically published on javadoc.io under the io.github.data-rocks-team group (direct link here).

If you have any ideas or suggestions to improve the library, need a feature that is not currently supported, or have found a bug, feel free to open a GitHub issue, submit a PR or get in touch!
You can find me on LinkedIn or Twitter.

Acknowledgements

Thanks to Nathan Phillips for reviewing this post, being an early beta tester and the first contributor of schemaregistry-junit.