Kafka Connect, Postgres, and Amazon S3 — Part 2
As discussed in Part 1, we will create an end-to-end pipeline that retrieves data from Postgres using a source connector and saves it to AWS S3 using a sink connector.
Part 1: Produce Postgres records to a Kafka topic
Part 2 (current page): Consume Kafka topic records into an S3 bucket
Our main focus now is building the sink connector. As mentioned earlier, we will be using Confluent Connect. The setup should not differ much from native Kafka Connect.
Steps to follow:
Step 1 - Download and install the S3 sink plugin
confluent-hub install confluentinc/kafka-connect-s3:10.5.6
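After installing a plugin, the Connect worker has to be restarted before it is picked up. A quick sketch, assuming a local development setup managed with the Confluent CLI and a Connect REST API on the default port 8083 (both are assumptions about your environment):

```shell
# Restart the local Connect worker so it scans the plugin path again
confluent local services connect stop
confluent local services connect start

# Sanity check: the S3 sink connector class should now be listed
# by the worker's plugin endpoint
curl -s http://localhost:8083/connector-plugins | grep -i S3SinkConnector
```

If the `grep` returns nothing, the plugin was likely installed into a directory that is not on the worker's `plugin.path`.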
Step 2 - Configure the S3 sink properties
We are going to use the same topic that the Postgres source connector wrote to in Part 1.
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
topics=postgres-jdbc-source-city
s3.region=eu-north-1
s3.bucket.name=confluent-kafka-connect-s3-testing-2023-05
s3.part.size=5242880
flush.size=3
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.avro.AvroFormat
partitioner.class=io.confluent.connect.storage.partitioner.DefaultPartitioner
schema.compatibility=NONE
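Instead of a properties file, the same configuration can be submitted to the Connect REST API as JSON. A minimal sketch, assuming a worker running on `localhost:8083` (the endpoint and port are the Kafka Connect defaults; the bucket and region mirror the properties above):

```shell
# Write the connector configuration as JSON for the Connect REST API;
# each field corresponds one-to-one to the properties file above
cat > s3-sink.json <<'EOF'
{
  "name": "s3-sink",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "1",
    "topics": "postgres-jdbc-source-city",
    "s3.region": "eu-north-1",
    "s3.bucket.name": "confluent-kafka-connect-s3-testing-2023-05",
    "s3.part.size": "5242880",
    "flush.size": "3",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
    "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
    "schema.compatibility": "NONE"
  }
}
EOF

# Submit it to a running Connect worker (uncomment when the worker is up):
# curl -X POST -H "Content-Type: application/json" \
#      --data @s3-sink.json http://localhost:8083/connectors
```

With `flush.size=3`, the connector commits a file to S3 after every three records per topic partition, which is handy for testing but far too small for production.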
Important Note: you need to set 2…