Kafka Connect API

Let's start like this: if a reliable resource is already available, we can use it instead of reinventing the wheel. That's the whole idea behind Kafka Connect. There are many sources we can import data into Kafka from, and you are not the first person to pull data from a source and direct it to Kafka.

So what is Kafka Connect?

Kafka Connect is a framework that is included in Apache Kafka.

Its main job is to connect other systems to Kafka.

To exchange data between Kafka and another system, we run Kafka connectors.

There are 2 types of connectors:

Source connectors

Sink connectors

Source connectors import data into Kafka from other systems. For example, to push real-time tweets from Twitter to Kafka we can use a Twitter source connector. We can pull data from many sources, including Elasticsearch, files and directories, Jenkins, GitHub, CouchDB, Apache Ignite, Solr and many more. There are connectors available for each.

Sink connectors, on the other hand, export data from Kafka to another system. This target may be a relational database, a NoSQL database or even Elasticsearch.

These connectors are written by Confluent, other vendors or third parties to facilitate this exchange of data between Kafka and other systems.

Here you will find a complete list of the connectors available, both sink and source.

So you don't need to write new code. Instead, you fetch the connector that already exists and adjust its configuration to suit your purpose.
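To give a feel for what adjusting a configuration looks like, here is a minimal sketch of a sink connector config, modeled on the file sink sample that ships with Kafka. The file name file-sink.properties, the connector name, the output file and the topic connect-test are placeholders picked for illustration; none of them are used in the Twitter walkthrough.

```shell
# Write a small sink connector config to file-sink.properties (hypothetical file name).
cat > file-sink.properties <<'EOF'
# Any unique name for this connector instance (placeholder)
name=local-file-sink
# Sink connector from the Kafka project that writes records to a local file
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
# Where the records read from Kafka end up (placeholder path)
file=test.sink.txt
# Topic(s) to export from Kafka (placeholder topic)
topics=connect-test
EOF
```

Notice there is no code of our own here, only a handful of key=value settings telling an existing connector what to do.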

Isn't it easy?

So the Kafka Connect API makes life easy and helps you connect Kafka to almost any popular system. This flexibility is a big part of what makes Kafka so powerful.

Kafka Connect is all about reusing connectors and simplifying getting data in and out of Kafka.

So our objective is to pull data from Twitter into Kafka.

For that we will need a Twitter source connector.

If you follow the above link and scroll way down, you will find the Twitter source connector with 2 links next to it. I will use this one.

Read it carefully and go to the configurations section.

There you will find the configurations explained:

name=connector1
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector

# Set these required values
twitter.oauth.accessTokenSecret=
process.deletes=
filter.keywords=
kafka.status.topic=
kafka.delete.topic=
twitter.oauth.consumerSecret=
twitter.oauth.accessToken=
twitter.oauth.consumerKey=

The first setting, twitter.oauth.accessTokenSecret, and the last three (twitter.oauth.consumerSecret, twitter.oauth.accessToken and twitter.oauth.consumerKey) take the tokens you receive when you register for a Twitter developer account.

Go here and register if you haven't. They will review your application and then allow you to extract tweets from Twitter.

process.deletes=
filter.keywords=
kafka.status.topic=
kafka.delete.topic=

These four we have to fill in ourselves.

Go to the releases and get the latest zip file.

Include all the jars from the zip file in our project.

In your config folder there is a file named connect-standalone.properties. Open it; it has the defaults, and at the bottom, in the plugin.path field, set the path to the jars you added to the project.
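As a rough sketch, the plugin.path edit can also be done from the shell. I am assuming here that connect-standalone.properties sits in the current directory and that the jars were unzipped into a connectors folder next to it; both locations are assumptions, so adjust them to your setup.

```shell
# Assumed locations; change these to match your machine.
PROPS="connect-standalone.properties"
CONNECTORS_DIR="$PWD/connectors"

# Create an empty stand-in if the file is not here, so the sketch runs anywhere.
[ -f "$PROPS" ] || touch "$PROPS"

# Drop any active plugin.path line, then append one pointing at our jars.
grep -v '^plugin\.path=' "$PROPS" > "$PROPS.tmp" || true
mv "$PROPS.tmp" "$PROPS"
echo "plugin.path=$CONNECTORS_DIR" >> "$PROPS"

# Show the resulting line
grep '^plugin\.path=' "$PROPS"
```

An absolute path is used so that Connect can find the jars no matter which directory it is started from.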

Make another file, twitter.properties, and include these settings with their values:

name=connector1
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector

# Set these required values
twitter.oauth.accessTokenSecret=
process.deletes=
filter.keywords=
kafka.status.topic=twitter_status_connect
kafka.delete.topic=
twitter.oauth.consumerSecret=
twitter.oauth.accessToken=
twitter.oauth.consumerKey=

Create a topic for the Kafka connector to write to, and set it as kafka.status.topic:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic twitter_status_connect

Create the other topic too:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic twitter_deletes_connect

Now go ahead and add this newly created topic to the config as well:

name=connector1
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector

# Set these required values
twitter.oauth.accessTokenSecret=<from Twitter developer credentials>
process.deletes=false
filter.keywords=cricket
kafka.status.topic=twitter_status_connect
kafka.delete.topic=twitter_deletes_connect
twitter.oauth.consumerSecret=<from Twitter developer credentials>
twitter.oauth.accessToken=<from Twitter developer credentials>
twitter.oauth.consumerKey=<from Twitter developer credentials>

I set process.deletes to false and the keyword to cricket, so that we pull tweets containing the word cricket.
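Since it is easy to leave one of these values blank, here is a small optional helper that lists any key with nothing after the = sign. The function name check_props is my own invention, and it only catches empty values; it cannot tell whether a token you pasted is actually valid.

```shell
# Print keys that have no value in a properties file; return non-zero if any are found.
check_props() {
  if [ ! -f "$1" ]; then
    echo "No such file: $1"
    return 2
  fi
  # A line like "filter.keywords=" (optionally followed by spaces) counts as empty.
  missing=$(grep -E '^[A-Za-z0-9._]+=[[:space:]]*$' "$1" | cut -d= -f1)
  if [ -n "$missing" ]; then
    echo "Missing values for:"
    echo "$missing"
    return 1
  fi
  echo "All keys in $1 have values."
}
```

Call it as check_props twitter.properties before starting the connector.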

Make a folder named kafka-connect and put in it: connect-standalone.properties, a connectors folder with the jars in it, a run.sh file holding the commands to run, and the twitter.properties file.

bin/connect-standalone.sh connect-standalone.properties twitter.properties

What this command basically says is: start Connect in standalone mode, using connect-standalone.properties for the worker settings and twitter.properties for the connector.
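The run.sh file mentioned earlier isn't shown, so here is a minimal sketch of what it could contain. The KAFKA_HOME variable and its default of $HOME/kafka are assumptions of mine; point it at wherever you unpacked Kafka.

```shell
# Generate a hypothetical run.sh; run this inside the kafka-connect folder.
cat > run.sh <<'EOF'
#!/bin/sh
# Assumed Kafka install location; override with: KAFKA_HOME=/path/to/kafka ./run.sh
KAFKA_HOME="${KAFKA_HOME:-$HOME/kafka}"

# Run from the folder holding the two properties files
cd "$(dirname "$0")" || exit 1

# Worker config first, connector config second
"$KAFKA_HOME/bin/connect-standalone.sh" connect-standalone.properties twitter.properties
EOF
chmod +x run.sh
```

After that, ./run.sh starts the connector without having to remember the full command.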

When you run this command you will see data being pulled from Twitter, and you can watch the data arrive in a console consumer.
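If you don't have a console consumer open yet, something like the sketch below should do. It assumes you are in the Kafka install directory and that the broker listens on localhost:9092 (Kafka's default port); the function name consume_topic is just a convenience I made up.

```shell
# Print records from a topic using Kafka's console consumer.
# Assumes: run from the Kafka install root, broker on localhost:9092.
consume_topic() {
  bin/kafka-console-consumer.sh \
    --bootstrap-server localhost:9092 \
    --topic "$1" \
    --from-beginning
}
```

Run consume_topic twitter_status_connect in a second terminal while the connector is running, and stop it with Ctrl+C.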

So how easy was that? Get your connector from the Confluent page, set its properties and configurations, include the jars in your project, create the topics the Twitter connector should send data to, then run your connector. A walk in the park.

These are the fundamentals. I strongly suggest you do some research of your own on Kafka Connect. It's a vast area, even though the idea sounds simple.

So good luck!! You can refer to the official Kafka documentation if you want to dive deep into this topic.

Sithija Thewa Hettige

Sometimes I write stuff people like

Software Engineering Intern @ mubasher technologies
