Integration Testing with Kafka & Authentication
Introduction
This article shares the troubles I went through while writing integration tests for operations on Kafka, in the hope that it helps someone who needs to go down the same route.
You might have read the first paragraph thinking: “you silly, it is so easy, just launch a container with Kafka and call it a day”, and you would be right, unless… you decided to add authentication to the mix!
What I mean by “adding authentication” is writing integration tests that validate that your code correctly performs the operations it is supposed to AND that it can connect and authenticate against a cluster configured with authentication, as “every” production cluster is.
So, let’s get to it
Let me tell you right off the bat that every code snippet you will see from this point forward is written in Kotlin… I hope that is okay with you. If you know Java you should have little issue understanding the code.
The Basics
Okay, let’s start looking into the actual tests. As you know, tests should be independent from each other, meaning that the passing or failing of a test should not impact any other test’s result (and if you didn’t know, now you do). In order to achieve this you basically have two routes:
- Launch a Kafka Cluster, but whatever you do in each test you have to undo after the test is completed. This means you will either have a “clean-up function” per test or your test function will be bigger than needed for the scope of the test. Also keep in mind that you can still forget to undo something after some test and affect the results of others. Good luck debugging that when they start failing randomly as the running order randomly changes (as it should).
- Launch a Kafka Cluster with the same config before every test and destroy it afterwards. Meaning, you have a clean canvas before every test, effectively isolating it from the others.
I hope you liked my highly biased points to force you into understanding that the second option is the better one in my opinion.
In case it was not clear, there is an obvious advantage and also a pretty clear disadvantage with the second approach: The “advantage” is that I am sure tests are independent and I need no extra code for that. The “disadvantage” is the time to launch a container between every two tests, which basically becomes more of an issue as the number of tests increases.
So, how do you do the “launch - use - kill - repeat” for the Kafka ITs? It is pretty straightforward, using JUnit 5 annotations and TestContainers.
I told you it was easy: Before each test start a Kafka Container and create an adminClient Instance (with a very basic configuration for now) pointing to the Container that was just initiated. After each test close both of them!
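A minimal sketch of that lifecycle, assuming the `org.testcontainers:kafka`, `kafka-clients` and JUnit 5 dependencies are on the test classpath and Docker is available. The class name, the image tag and the sample test are illustrative assumptions, not necessarily what I used:

```kotlin
import org.apache.kafka.clients.admin.AdminClient
import org.apache.kafka.clients.admin.AdminClientConfig
import org.apache.kafka.clients.admin.NewTopic
import org.junit.jupiter.api.AfterEach
import org.junit.jupiter.api.Assertions.assertTrue
import org.junit.jupiter.api.BeforeEach
import org.junit.jupiter.api.Test
import org.testcontainers.containers.KafkaContainer
import org.testcontainers.utility.DockerImageName
import java.util.Properties

class KafkaAdminIT {
    private lateinit var kafka: KafkaContainer
    private lateinit var adminClient: AdminClient

    @BeforeEach
    fun setUp() {
        // Assumption: any Kafka image supported by KafkaContainer works here.
        kafka = KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.4.0"))
        kafka.start()

        // Very basic client configuration: only the bootstrap servers of the fresh container.
        val props = Properties()
        props[AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG] = kafka.bootstrapServers
        adminClient = AdminClient.create(props)
    }

    @AfterEach
    fun tearDown() {
        // Close the client first, then throw the whole cluster away.
        adminClient.close()
        kafka.stop()
    }

    @Test
    fun `creates a topic on a clean cluster`() {
        // Hypothetical example test: the topic name is arbitrary.
        adminClient.createTopics(listOf(NewTopic("orders", 1, 1.toShort()))).all().get()
        assertTrue(adminClient.listTopics().names().get().contains("orders"))
    }
}
```

Because the container is created in `@BeforeEach` and stopped in `@AfterEach`, no test can see topics or other state left behind by another one, at the cost of one container start per test.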
Authentication & Authorization
TL;DR: The final solution is right before the conclusion paragraph near the end.
Is authentication part of what we should test?
In my opinion, it is! Getting straight to the point: what do you gain by risking code that correctly does whatever you want on a Kafka Cluster, Database, etc. but cannot even get to the resources in the first place? So, bottom line: in integration tests you should test what your application does but also, as the name of this type of test might suggest, how it integrates with the systems it should connect to.
If the real-world scenario in which your application will operate involves authentication, test authentication in your ITs!
Additionally, if you happen to need to perform operations which require authorization and authentication, you need to have them configured for your ITs, otherwise you cannot really test your application. In the Kafka setting, an example of that would be anything related to ACLs: to create, change or delete ACLs on Kafka you must have authentication and authorization configured on the cluster, so basically “Is authentication part of what we should test?” was a trick question, as you need it to have ACL operations tested! 😅
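To make the ACL case concrete, here is a rough sketch of the kind of operation I mean, using the `kafka-clients` admin API. The function names `allowRead` and `createAndListAcls`, as well as any principal and topic you pass in, are hypothetical, and the second function only succeeds against a cluster that has an authorizer configured:

```kotlin
import org.apache.kafka.clients.admin.AdminClient
import org.apache.kafka.common.acl.AccessControlEntry
import org.apache.kafka.common.acl.AclBinding
import org.apache.kafka.common.acl.AclOperation
import org.apache.kafka.common.acl.AclPermissionType
import org.apache.kafka.common.resource.PatternType
import org.apache.kafka.common.resource.ResourcePattern
import org.apache.kafka.common.resource.ResourceType

// Builds an ACL that lets `principal` (e.g. "User:alice") read from `topic` from any host.
fun allowRead(principal: String, topic: String): AclBinding =
    AclBinding(
        ResourcePattern(ResourceType.TOPIC, topic, PatternType.LITERAL),
        AccessControlEntry(principal, "*", AclOperation.READ, AclPermissionType.ALLOW)
    )

// Creates the ACL and reads it back; blocks until the broker has answered both calls.
fun createAndListAcls(adminClient: AdminClient, binding: AclBinding): Collection<AclBinding> {
    adminClient.createAcls(listOf(binding)).all().get()
    return adminClient.describeAcls(binding.toFilter()).values().get()
}
```

An IT for this would call `createAndListAcls` with an authenticated `adminClient` and assert that the returned collection contains the binding.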
The convoluted ride through documentation and Google searches
As I started to try and configure my container to enforce authentication & authorization I bumped into the first barrier: I was hoping this would be easy, like, calling some functions on the KafkaContainer to configure it or something, but boy was I wrong!
TestContainers does not really specify whether their class supports any of those things, nor do they give you any hint on their website about how to do it in any other way, which left me immediately worried and thinking “This should totally be here, I don’t believe no one ever needed this!”. This thought was rapidly followed by “maybe they just expect us to go figure it out with the documentation for whatever image we decide to use within the container” and this led me into Confluent’s documentation pages.
Confluent is great, I believe they are the reference in terms of Apache Kafka services, they have a lot of things also working around Kafka and they provide their docker images publicly for you to use. I’ve also had professional contact with them and the guys were awesome. Now, that being said, their documentation looks very complete, which led me to believe that configuring a Kafka container with their image and simple plain authentication would be super easy to sort out… it wasn’t.
I am going to cut it short regarding my experience following the documentation and figuring things out: Basically I had everything as documented on their website, but still something was missing because I just could not get it to work. From that point on it was a struggle with Google searches for error messages and threads of similar issues that led nowhere! Nothing worked! I’ve had a similar experience when trying to launch a KSQL cluster with a config that was not the default, and it also was a mess to get it working.
I don’t fault Confluent too much, they make money on support as well as Confluent Cloud services, so it makes sense to me that their documentation is done in a “this is almost everything, but the final touches you’ll have to be a bit clever to get to them for free” way. I don’t know if this is the case though… it is just my reading of the situation, but I am a moron.
Now, let’s get into what you actually came here to read about…
How did I do it
So the first step for me to get it to work was to get ZooKeeper out of the way: since Apache Kafka 2.8.0 you can run Kafka without ZooKeeper (the KRaft mode, which shipped as early access in that release)!
By doing so, a lot of complexity in configuration goes away immediately, so I just went for a docker image that was already using Kafka 2.8.0. Is this cheating? I wonder… without caring too much about the answer since the client config is the same and that was good enough for me. 😂
Also, another thing I did was to get rid of external JAAS config files, both client and server side: You can provide the JAAS config directly via your config as a string or point to external files.
The files looked easier to build manually, but you have to take into consideration indentation constraints and stuff like that. I just thought it was yet another point of failure for the whole thing and dropped the external files. I preferred to create auxiliary functions to build the strings as they should be and use those in the configs for the server and client.
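Such auxiliary functions could look roughly like this. It is a sketch for the SASL/PLAIN mechanism only; the function names are made up for illustration, and it assumes usernames and passwords contain no double quotes:

```kotlin
// Login module class used by Kafka for the SASL/PLAIN mechanism.
val PLAIN_LOGIN_MODULE = "org.apache.kafka.common.security.plain.PlainLoginModule"

// Client side: a single username/password pair.
fun clientJaasConfig(username: String, password: String): String =
    "$PLAIN_LOGIN_MODULE required username=\"$username\" password=\"$password\";"

// Server side: the broker's own credentials plus one user_<name>="<password>"
// entry for every account that is allowed to log in.
fun brokerJaasConfig(brokerUser: String, brokerPassword: String, users: Map<String, String>): String {
    val userEntries = users.entries.joinToString(" ") { (name, secret) -> "user_$name=\"$secret\"" }
    return "$PLAIN_LOGIN_MODULE required username=\"$brokerUser\" password=\"$brokerPassword\" $userEntries;"
}
```

For example, `clientJaasConfig("alice", "alice-secret")` produces the one-line string that goes into the client's `sasl.jaas.config` entry, with no file and no indentation to get wrong.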
Finally, regarding actual configuration, on the server side it was done via environment variables passed to the container running the Kafka single-node cluster and I am not going to sugarcoat it: To find which ones to use, it was mostly a process of consulting documentation and going full trial and error on it until I found a working configuration. On the client side it was just adding the JAAS config entry.
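On the client side, the extra entries amount to something like the sketch below. It assumes SASL/PLAIN over an unencrypted listener (`SASL_PLAINTEXT`), which is reasonable for a throwaway test container but not for production; the function name and parameters are hypothetical:

```kotlin
import java.util.Properties

// Builds client properties for a SASL/PLAIN login; usable for an AdminClient, producer or consumer.
fun saslPlainClientProperties(bootstrapServers: String, username: String, password: String): Properties {
    val props = Properties()
    props["bootstrap.servers"] = bootstrapServers
    // Assumption: the test broker exposes a SASL listener without TLS.
    props["security.protocol"] = "SASL_PLAINTEXT"
    props["sasl.mechanism"] = "PLAIN"
    // The JAAS config goes in as a plain string instead of an external file.
    props["sasl.jaas.config"] =
        "org.apache.kafka.common.security.plain.PlainLoginModule required " +
        "username=\"$username\" password=\"$password\";"
    return props
}
```

The credentials passed in must match one of the accounts declared in the broker's own JAAS config, otherwise the connection fails during authentication.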
It was not pretty, it was not a process of surgical precision, but eventually I got it working, and here is the result for anyone in need of a working base for integration tests, with Apache Kafka using Authentication & Authorization:
Conclusion
I wrote this article to share my pains in a fun-to-read, storytelling kind of way, but also to share with the community a working configuration for integration tests with Kafka and Authentication, as it did not seem to be commonly done, and I do not understand why at all.
I hope this was entertaining in some weird way, but most of all I hope this was helpful.