Consuming from a secured Kafka with Spark streaming
We use Cloudera’s CDH for our Hadoop servers at work. Up until recently, we’d been using Spark 1.6, since that was the default Spark version included with CDH.
However, we recently had a need to read from a secured Kafka (version 0.10, since prior Kafka versions do not support security) maintained by another team. Thus we had to install Cloudera Spark 2.1, which includes the Spark Streaming integration for Kafka 0.10. Here’s the official link comparing Spark Streaming with Kafka 0.8 vs Kafka 0.10.
Without further ado, below is how we were able to read from a Kafka cluster secured with SASL/PLAIN over an unencrypted connection (the SASL_PLAINTEXT security protocol). We use Scala, and build our jar using sbt assembly.
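The key piece is the consumer configuration handed to the Kafka 0.10 direct stream: besides the usual deserializers and group id, you set `security.protocol` to `SASL_PLAINTEXT` and `sasl.mechanism` to `PLAIN`. A minimal sketch is below; the broker addresses, group id, and topic name are placeholders, and the actual `createDirectStream` wiring (which needs `spark-streaming-kafka-0-10` on the classpath) is shown in the comment:

```scala
object Spark2Kafka {
  // Consumer properties for a Kafka 0.10 cluster secured with SASL/PLAIN.
  // Broker list and group id here are placeholders, not real values.
  def securedKafkaParams(brokers: String): Map[String, Object] = Map(
    "bootstrap.servers" -> brokers,
    "key.deserializer"   -> "org.apache.kafka.common.serialization.StringDeserializer",
    "value.deserializer" -> "org.apache.kafka.common.serialization.StringDeserializer",
    "group.id"           -> "spark2-kafka-example",
    "auto.offset.reset"  -> "latest",
    "enable.auto.commit" -> (false: java.lang.Boolean),
    // These two settings enable SASL/PLAIN without TLS; the username and
    // password come from the jaas.conf file shipped via --files.
    "security.protocol"  -> "SASL_PLAINTEXT",
    "sasl.mechanism"     -> "PLAIN"
  )

  // With spark-streaming-kafka-0-10 on the classpath, the stream is then
  // created roughly like this (sketch, not compiled here):
  //
  //   val ssc = new StreamingContext(new SparkConf(), Seconds(10))
  //   val stream = KafkaUtils.createDirectStream[String, String](
  //     ssc,
  //     LocationStrategies.PreferConsistent,
  //     ConsumerStrategies.Subscribe[String, String](
  //       Seq("my-topic"), securedKafkaParams("broker1:9092")))
  //   stream.map(_.value).print()
  //   ssc.start(); ssc.awaitTermination()
}
```

Note that the JAAS settings themselves are not in the Kafka params; they reach the JVM through the `-Djava.security.auth.login.config` system property set in the spark-submit command below.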
To run the spark-submit job:
spark2-submit --files jaas.conf --driver-java-options "-Djava.security.auth.login.config=./jaas.conf" --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./jaas.conf" --class Spark2Kafka Spark2Kafka-assembly-1.0.jar
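The `--files` flag ships `jaas.conf` to the driver and every executor, and the two `-Djava.security.auth.login.config` options point each JVM at its local copy. For SASL/PLAIN, the file might look like this (the credentials here are hypothetical; use whatever the Kafka team issues you):

```
KafkaClient {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="your-username"
  password="your-password";
};
```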