Overview of Custom Kafka Connectors
First of all, a big “Hi” to everyone reading this blog. This is my first blog across all blogging mediums, and thanks for spending some time reading it.
This blog gives some basic information on the need for custom Kafka connectors and how to create one.
First of all What is Kafka?
To answer the above question in technical terms, Apache Kafka is one of the most widely used open-source publish-subscribe based distributed event streaming platforms for storing, reading, and analyzing streaming data.
In simpler terms, Apache Kafka is a tool used as a pipeline between two applications to transfer data.
The basic features provided by Kafka are:
- Kafka Producers
- Kafka Consumers
- Kafka Connect
- Kafka Streams
And today we will look into Kafka Connect.
Kafka Connect
Kafka Connect is a framework that helps Kafka connect to external systems like databases, file systems, etc., and vice versa.
There are two types of connectors we deal with:
- Source Connector: This type of connector helps to stream data from external systems into a Kafka topic.
- Sink Connector: This type of connector helps to stream data from a Kafka topic into external systems.
The Internal functionality of the Connector
The connector generally divides the given work into smaller tasks and assigns them to workers internally. Workers are the processes that execute the tasks assigned by the connector.
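To make this concrete, here is a minimal, self-contained sketch (plain Java, no Kafka dependency) of how a connector might partition its work across tasks. `TaskSplitter`, the `"tables"` key, and the table names are hypothetical, purely for illustration; in a real connector this logic lives in the `taskConfigs` method shown later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TaskSplitter {
    // Partition a list of tables round-robin across at most maxTasks tasks;
    // each task receives its share as a configuration map.
    public static List<Map<String, String>> taskConfigs(List<String> tables, int maxTasks) {
        int numGroups = Math.min(tables.size(), maxTasks);
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < numGroups; i++) {
            Map<String, String> cfg = new HashMap<>();
            cfg.put("tables", "");
            configs.add(cfg);
        }
        for (int i = 0; i < tables.size(); i++) {
            Map<String, String> cfg = configs.get(i % numGroups);
            String existing = cfg.get("tables");
            cfg.put("tables", existing.isEmpty() ? tables.get(i)
                                                 : existing + "," + tables.get(i));
        }
        return configs;
    }

    public static void main(String[] args) {
        // Three tables split across two tasks.
        for (Map<String, String> cfg : taskConfigs(List.of("orders", "users", "payments"), 2)) {
            System.out.println(cfg.get("tables"));
        }
    }
}
```

Each map in the returned list is handed to one task, so the work is spread evenly without the tasks needing to coordinate with each other.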
Need for Custom Kafka Connectors
There are a lot of connectors readily available, for example on the official Confluent Hub and on GitHub.
But the need for custom Kafka connectors arises in two cases:
- When the readily available Kafka connectors don’t serve the purpose of your use-case
- Licensing problems
So the concept of a custom Kafka connector is introduced: a connector created by you to meet your exact requirements and to opt out of complex licensing strategies.
How to create a Custom Kafka Connector using JAVA
Create a Maven Java project and add the below Maven dependency (with the Kafka version you are targeting) in the pom.xml file.
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>connect-api</artifactId>
</dependency>
1. Configuration Class: This class describes and validates the configuration properties that will be used by our connector. The Java class needs to extend the AbstractConfig class.
import org.apache.kafka.common.config.AbstractConfig;
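A minimal sketch of such a class, assuming the connect-api dependency above. The class name `FileSourceConnectorConfig` and the property names `file.path` and `topic` are illustrative choices for an imaginary file source connector, not part of the Kafka API:

```java
import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;

import java.util.Map;

public class FileSourceConnectorConfig extends AbstractConfig {

    public static final String FILE_PATH = "file.path";
    public static final String TOPIC = "topic";

    // ConfigDef declares each property's type, importance, and documentation;
    // Kafka Connect uses it to validate the values supplied at startup.
    public static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define(FILE_PATH, Type.STRING, Importance.HIGH,
                    "Path of the file to read from")
            .define(TOPIC, Type.STRING, Importance.HIGH,
                    "Kafka topic to write records to");

    public FileSourceConnectorConfig(Map<String, String> props) {
        // Validates the supplied properties against CONFIG_DEF.
        super(CONFIG_DEF, props);
    }
}
```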
2. SourceConnector or SinkConnector Class: This class holds the lifecycle logic of the connector: it receives the validated configuration, decides how the work should be split, and hands the resulting task configurations to the Task class. Note that the actual data does not flow through this class; that is the Task class's job. Based on the type of connector being created, the respective class should be extended.
import org.apache.kafka.connect.source.SourceConnector;import org.apache.kafka.connect.sink.SinkConnector;
These methods should be overridden when extending the class:
1)start
2)stop
3)taskClass
4)taskConfigs
5)config
6)version
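Putting the six methods together, here is a hedged skeleton of a source connector. It will not compile on its own because `FileSourceTask` (the task class for this imaginary connector) is only defined in the next step; all names other than the overridden Kafka Connect methods are illustrative:

```java
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FileSourceConnector extends SourceConnector {

    // Declared inline here so the skeleton is self-describing;
    // the property names are examples only.
    static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define("file.path", ConfigDef.Type.STRING,
                    ConfigDef.Importance.HIGH, "Path of the file to read from")
            .define("topic", ConfigDef.Type.STRING,
                    ConfigDef.Importance.HIGH, "Kafka topic to write records to");

    private Map<String, String> configProps;

    @Override
    public void start(Map<String, String> props) {
        // Called once at startup: keep the validated configuration
        // so it can be handed out to the tasks.
        configProps = new HashMap<>(props);
    }

    @Override
    public Class<? extends Task> taskClass() {
        // Tells the framework which Task implementation to instantiate.
        return FileSourceTask.class;
    }

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        // Split the work across at most maxTasks tasks; in this simple
        // sketch every task gets a copy of the full configuration.
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            configs.add(new HashMap<>(configProps));
        }
        return configs;
    }

    @Override
    public void stop() {
        // Release any resources acquired in start(); nothing to do here.
    }

    @Override
    public ConfigDef config() {
        // The ConfigDef used to validate the connector's properties.
        return CONFIG_DEF;
    }

    @Override
    public String version() {
        return "1.0.0";
    }
}
```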
3. SourceTask or SinkTask Class: This class is where the actual functionality of the task (the purpose of the connector) resides. In easy terms, the processing logic of the connector is written here. Based on the type of connector being created, the respective class should be extended.
import org.apache.kafka.connect.source.SourceTask;import org.apache.kafka.connect.sink.SinkTask;
These methods should be overridden when extending the class:
1)start
2)stop
3)poll or put (based on the connector type)
4)version
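A hedged sketch of the corresponding source task. `FileSourceTask`, the `readNextLine` helper, and the partition/offset keys are all hypothetical; real code would read from the actual external system inside `poll`:

```java
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

import java.util.Collections;
import java.util.List;
import java.util.Map;

public class FileSourceTask extends SourceTask {

    private String topic;

    @Override
    public void start(Map<String, String> props) {
        // Receives one of the maps produced by the connector's taskConfigs().
        topic = props.get("topic");
    }

    @Override
    public List<SourceRecord> poll() throws InterruptedException {
        // Called repeatedly by the framework; return new records, or null
        // if there is nothing to emit right now.
        String line = readNextLine(); // hypothetical helper
        if (line == null) {
            return null;
        }
        SourceRecord record = new SourceRecord(
                Collections.singletonMap("source", "file"), // source partition
                Collections.singletonMap("position", 0L),   // source offset
                topic, Schema.STRING_SCHEMA, line);
        return Collections.singletonList(record);
    }

    private String readNextLine() {
        // Placeholder: this sketch has no real source to read from.
        return null;
    }

    @Override
    public void stop() {
        // Close connections / file handles opened in start().
    }

    @Override
    public String version() {
        return "1.0.0";
    }
}
```

In a sink task, `put(Collection<SinkRecord>)` takes the place of `poll()`: the framework hands you batches of records consumed from the topic, and you write them to the external system.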
Apart from these classes, two property files are required for this connector to run in the Kafka environment.
- config.properties: This file contains the configurations required for the connector to run. The connector's configuration class is populated with this data and validated.
- standalone.properties or distributed.properties: The Kafka environment can generally be set up in standalone mode or distributed (clustered) mode. Depending on the mode, the respective file is populated with the required properties like bootstrap.servers, rest.port, group.id etc…
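As an illustration, the two files for the imaginary file source connector might look like this when running in standalone mode; the class name, topic, and paths are example values, not requirements:

```properties
# config.properties — the connector's own configuration
name=file-source-connector
connector.class=com.example.FileSourceConnector
tasks.max=1
file.path=/tmp/input.txt
topic=my-topic
```

```properties
# standalone.properties — the worker's configuration
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Standalone mode stores offsets in a local file instead of a Kafka topic
offset.storage.file.filename=/tmp/connect.offsets
rest.port=8083
```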
Conclusion
Hope you got some overview of custom Kafka connectors in terms of their need and creation.
Thanks for spending your time reading this blog. Feel free to comment on the topics covered above and suggestions are welcome.