KAFKA in docker container and command line

Andriy Lesch

Hello everyone. In this story I would like to share how to produce and consume Kafka messages using just the command line. It's easy, but nice to try and play with a little.

It so happens that my first experience with Kafka was the landoop/fast-data-dev Docker image, so that's the image I will use for all the examples.

So, are you ready to start? I hope so.

So what do you need before starting?

In our perfect world, the OS is not a problem: most of the tools involved are cross-platform, and Docker Desktop is no exception (feel free to download and install it on your local PC from the Docker website). Once everything is installed, check that these commands work:

docker -v
docker-compose -v

If they succeed, you will see the installed versions printed in the console.
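
For example (the exact versions and build hashes will differ on your machine; this output is just illustrative):

docker -v
# Docker version 20.10.21, build baeda1f
docker-compose -v
# docker-compose version 1.29.2, build 5becea4c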

Prepare docker-compose file and execute it

Open a directory and create a new file named docker-compose.yml. Copy the contents below into the file and save it.

version: "3.7"
services:
  kafka:
    image: landoop/fast-data-dev:latest
    container_name: kafka
    ports:
      - 3181:3181
      - 3040:3040
      - 7082:7082
      - 7083:7083
      - 9092:9092
    environment:
      ADV_HOST: 127.0.0.1
      RUNTESTS: 0
      SAMPLEDATA: 0
      BROKER_PORT: 9092
      REST_PORT: 7082
      CONNECT_PORT: 7083
      ZK_PORT: 3181
      WEB_PORT: 3040
      REGISTRY_PORT: 8081
    restart: always

Start Kafka with the -d option to run in detached mode:

docker-compose up -d

The command above starts the Kafka container. If everything is OK, you can open http://localhost:3040 in your browser and see the fast-data-dev web UI.
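
You can also verify from the terminal that the container is running (the STATUS column should show "Up"):

# list running containers, filtered by name
docker ps --filter name=kafka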

Produce a message to Kafka

In a production setup, a DevOps engineer or developer would create the Kafka topic up front and provide the configuration for consumers and producers. For this demo, the topic will be created automatically when the first message is produced.
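
If you prefer not to rely on auto-creation, you can create the topic explicitly with kafka-topics (run it inside the container; the exec command to get there follows below):

# optional: pre-create the demo topic with a single partition
kafka-topics --bootstrap-server localhost:9092 --create --topic test-message-in --partitions 1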

Open a terminal and execute the following command; it will give you access to a bash shell inside the container:

docker container exec -it kafka /bin/bash

then, inside the container, execute:

kafka-console-producer --broker-list localhost:9092 --topic test-message-in

Then type a message and press Enter. In our case it will be:

{"name":"Ronda Shepard", "email":"rondashepard@solaren.com"}

You can also use any other text; for this story the exact content is not important.
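
The console producer can also send keyed messages. Here is a small sketch (parse.key and key.separator are standard kafka-console-producer properties; the key user-1 is just an example):

# produce keyed messages; key and value are separated by ":"
kafka-console-producer --broker-list localhost:9092 --topic test-message-in --property "parse.key=true" --property "key.separator=:"
# then type, for example:
# user-1:{"name":"Ronda Shepard", "email":"rondashepard@solaren.com"}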

Consume messages from Kafka

There are two ways to check whether the messages were sent to the Kafka topic:

via the Web UI: open http://localhost:3040 -> Topics -> choose your topic name (in our case test-message-in) -> in the opened view, choose the Topic or Raw Data tab.

via the terminal: open a new terminal window, run the exec command again, then start a console consumer:

docker container exec -it kafka /bin/bash
kafka-console-consumer --bootstrap-server localhost:9092 --topic test-message-in --from-beginning
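
A few useful variations of the console consumer (all of these are standard kafka-console-consumer options; the group name my-demo-group is just an example):

# print keys and timestamps alongside the values
kafka-console-consumer --bootstrap-server localhost:9092 --topic test-message-in --from-beginning --property print.key=true --property print.timestamp=true
# read only the first 5 messages, then exit
kafka-console-consumer --bootstrap-server localhost:9092 --topic test-message-in --from-beginning --max-messages 5
# consume as part of a named consumer group
kafka-console-consumer --bootstrap-server localhost:9092 --topic test-message-in --group my-demo-group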

As a bonus, some useful commands:

# create kafka topic with custom partitions
kafka-topics --bootstrap-server localhost:9092 --topic <YOUR_TOPIC> --create --partitions <NUMBER>
# list kafka topics
kafka-topics --bootstrap-server localhost:9092 --list
# describe all kafka topics (partitions, leaders, replicas)
kafka-topics --bootstrap-server localhost:9092 --describe
# delete kafka topic
kafka-topics --bootstrap-server localhost:9092 --delete --topic <YOUR_TOPIC>
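
In the same spirit, consumer groups can be inspected from the command line (the standard kafka-consumer-groups tool; <YOUR_GROUP> is a placeholder):

# list all consumer groups
kafka-consumer-groups --bootstrap-server localhost:9092 --list
# show partitions, current offsets, and lag for one group
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group <YOUR_GROUP>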

Also, inside the container:

/var/log directory - all log files.
/data/kafka/logdir directory - all Kafka topics (represented as directories).
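
You can peek into a partition directory to see the files discussed below (a sketch; the exact file names depend on the segment's base offset, and extra files such as leader-epoch-checkpoint may also be present):

# inside the container: list the files of partition 0 of our topic
ls /data/kafka/logdir/test-message-in-0
# expect something like:
# 00000000000000000000.log  00000000000000000000.index  00000000000000000000.timeindex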

FYI: the Kafka topic test-message-in was created with default settings (partitions = 1), so you will find a test-message-in-0 directory. The zero suffix is the partition number. With n partitions, the directories <KAFKA-TOPIC>-0, <KAFKA-TOPIC>-1, ... <KAFKA-TOPIC>-(n-1) are created.

In the *.log file you can find all the messages of the Kafka topic. Let's look at what each file contains.

*.log file - a table with our messages and additional information:

Offset | Position | Timestamp         | Message
0      | 0        | <long_timestamp1> | <msg1>
1      | 200      | <long_timestamp2> | <msg2>
...      etc.

where offset is the number of the message within the partition (starting from 0) and position is its byte offset in the file: if our first message is 200 bytes long, the next record starts with offset 1 and position 200, and so on.
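
You can inspect these records yourself with the kafka-dump-log tool that ships with Kafka (a sketch; I am assuming the tool is on the PATH inside the fast-data-dev image and that the segment file name matches the one in your logdir):

# dump the messages stored in the first log segment of partition 0
kafka-dump-log --files /data/kafka/logdir/test-message-in-0/00000000000000000000.log --print-data-log
# the same tool can decode the *.index and *.timeindex files too
kafka-dump-log --files /data/kafka/logdir/test-message-in-0/00000000000000000000.index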

*.index file - an index file with data such as:

Offset | Position
0      | 0
1      | 200
...      etc.

It maps offsets to positions: if we want to read from offset N (for example 1), Kafka goes to this file, reads the offset and its position, then goes to the *.log file and seeks to byte 200. So our message starts at byte 200 with offset 1.

*.timeindex file - maps offsets to timestamps:

Offset | Timestamp
0      | <long_timestamp1>
1      | <long_timestamp2>
...      etc.

This can be useful when we want to read a message but know only its date, not its offset: starting from <long_timestamp2> (or the closest timestamp after it), we can find the offset number. Note that this only works if the producer did not set timestamps randomly. So a time-based read goes *.timeindex -> *.index -> *.log -> and finally reads our message.
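
To resolve a timestamp to an offset without reading the files by hand, Kafka's GetOffsetShell can do the lookup (a sketch; <TIMESTAMP_MS> is a Unix timestamp in milliseconds, and I am assuming kafka-run-class is available in the container as it is in standard Kafka distributions):

# find the earliest offset whose timestamp is >= <TIMESTAMP_MS>
kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test-message-in --time <TIMESTAMP_MS>
# --time -1 returns the latest offset, --time -2 the earliest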

====> To be continued: KAFKA producer & consumer (part 1)

Thank you for reading 👍. Kindly follow me for more interesting articles.
