Two Ways of Installing Kafka on Ubuntu 18.04 for the Novice Data Architect

Suraj Saha
Published in The Startup · Jun 8, 2020

I have kept this blog as short as possible. It covers two commonly used ways of running Kafka producer and consumer processes over a single-node, single-broker architecture, with only the minimal features required to keep things simple.

Prerequisite:

  1. Ubuntu 18.04 server and a non-root user with sudo privileges.
  2. At least 4 GB of RAM on the server. Installing with less may cause the Kafka service to fail, with the Java Virtual Machine (JVM) throwing an “Out Of Memory” exception during startup. Even when using Docker, one needs to make sure the host machine has more than 4 GB of RAM (8 GB is advisable), as this is an absolute requirement: Kafka will consume a big part of it. A quick way to verify this is shown after this list.
  3. OpenJDK 8 or 11 installed on the server. Kafka is written in Java, so it requires a JVM; however, in order to use the Confluent Open Source binaries one needs to use Java 8.
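Before going further, it is worth checking that the machine actually meets the RAM prerequisite. A minimal sanity check using standard Ubuntu tooling (nothing Kafka-specific is assumed here):

$ free -h    # the "total" column under Mem should read 4G or more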

Remember: while using sudo, you will be prompted for your password.

1. Use Kafka setup files licensed by Apache

In order to install Apache Kafka as described in the official Kafka documentation, we follow a series of steps:

  • First, update the package repository cache of the Ubuntu server with the following command:
$ sudo apt-get update
  • Then we will install OpenJDK 8 or 11 on the Ubuntu 18.04 server:
$ sudo apt-get install openjdk-8-jdk
$ sudo apt-get install openjdk-11-jdk
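Once the JDK is installed, it is worth confirming which Java version is active before proceeding:

$ java -version    # should report openjdk version 1.8.x or 11.x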
  • Then we download Kafka 2.4.0 (or any later version as available) from the official Kafka mirrors:
$ sudo wget http://apachemirror.wuchna.com/kafka/2.4.0/kafka_2.12-2.4.0.tgz
or
$ sudo wget http://mirrors.estointernet.in/apache/kafka/2.4.0/kafka_2.12-2.4.0.tgz
  • After downloading Kafka, we also have to download ZooKeeper:
$ sudo wget http://apachemirror.wuchna.com/zookeeper/stable/apache-zookeeper-3.5.6-bin.tar.gz
or the equivalent archive from the estointernet mirror (http://mirrors.estointernet.in/apachezookeeper/).
  • Now we need to extract kafka_2.12-2.4.0.tgz and apache-zookeeper-3.5.6-bin.tar.gz:
$ sudo tar -xzf Downloads/kafka_2.12-2.4.0.tgz
$ sudo tar -xzf Downloads/apache-zookeeper-3.5.6-bin.tar.gz
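If you want to be sure the archives unpacked cleanly, a quick listing of the two new directories does no harm:

$ ls kafka_2.12-2.4.0 apache-zookeeper-3.5.6-bin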
  • Now we enter the Kafka directory (note that we cd into the extracted directory, not the .tgz archive):
$ cd kafka_2.12-2.4.0
  • As Kafka uses ZooKeeper, we first need to start a ZooKeeper server if we don’t already have one. Always remember that the Kafka broker runs on top of ZooKeeper; it is the ZooKeeper server that is responsible for managing Kafka leaders and followers.
$ sudo bin/zookeeper-server-start.sh config/zookeeper.properties
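This command keeps the terminal occupied. If you would rather run ZooKeeper in the background, the same start script also accepts a -daemon flag:

$ sudo bin/zookeeper-server-start.sh -daemon config/zookeeper.properties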
  • Now that the ZooKeeper instance is up, we start the Kafka server in another terminal:
$ bin/kafka-server-start.sh config/server.properties
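To quickly confirm the broker is accepting connections (assuming netcat is installed, as it usually is on Ubuntu):

$ nc -vz localhost 9092    # 9092 is Kafka's default port, assuming an unchanged config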
  • We create a topic named “Test123”:
$ sudo bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic Test123
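To inspect the partition and replication details of the new topic, the same script offers a --describe option:

$ sudo bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic Test123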
  • We can check the created topic by running the command:
$ sudo bin/kafka-topics.sh --list --bootstrap-server localhost:9092
  • After starting the producer, we can write some messages:
$ sudo bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Test123
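The console producer presents a > prompt, and each line typed becomes one message, for example:

> hello kafka
> this is my first message

Press Ctrl+C to exit the producer when done.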

and then we can consume the data by running a consumer process in another terminal.

$ sudo bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic Test123 --from-beginning
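Assuming the sample messages above were sent, the consumer should echo them back, one per line:

hello kafka
this is my first message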

That concludes the hands-on usage of Kafka for a single broker and a single node.

2. Install Kafka from Confluent Open Source

Confluent is an organisation founded by Neha Narkhede, Jun Rao and Jay Kreps, who originally developed Kafka while at LinkedIn. The organisation provides us with many additional open-source libraries and offers advantages over plain Kafka in the following ways:

  • Additional Clients: supports C, C++, Python, .NET and several other non-Java clients.
  • REST Proxy: provides universal access to Kafka from any network-connected device via HTTP (see the example after this list).
  • Schema Registry: a central registry for the format of Kafka data, guaranteeing that all data is always consumable.
  • Pre-Built Connectors: HDFS, JDBC, Elasticsearch, Amazon S3 and other connectors fully certified and supported by Confluent.
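As a small taste of the REST Proxy, once the platform is running (installation steps follow below), the existing topics can be listed over plain HTTP. This sketch assumes the REST Proxy is on its default port:

$ curl http://localhost:8082/topics    # 8082 is the REST Proxy's default port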

Now we will revisit the steps for the basic installation of Kafka, and later we will see the commands for using it.

  • First, update the package repository cache of the Ubuntu server with the following command:
$ sudo apt-get update
  • Next we will install the Confluent public key:
$ wget -qO - http://packages.confluent.io/deb/3.3/archive.key | sudo apt-key add -
  • Next we add the repository to the sources list:
$ sudo add-apt-repository "deb [arch=amd64] http://packages.confluent.io/deb/3.3 stable main"
  • Now we install the Confluent Open Source Platform:
$ sudo apt-get install confluent-platform-oss-2.11
  • Now we can start our Confluent OSS platform. Starting the platform brings up the Kafka server, ZooKeeper server, Schema Registry, REST Proxy and Kafka Connect at the same time in the background.
$ sudo confluent start
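The same bundled confluent CLI can report which of these services are actually up, which is a handy sanity check right after starting:

$ sudo confluent status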
  • As the servers are up, we can create a topic in our CLI using the command below (note: on the older Kafka versions bundled with Confluent 3.3, kafka-topics may require --zookeeper localhost:2181 in place of --bootstrap-server):
$ cd /usr/bin
$ sudo kafka-topics --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic Test123
  • We can already observe that we don’t have to add the .sh extension to our commands, but we do need to be in the /usr/bin folder (or have it on our PATH, which it normally is) to access these commands from now on. We can start our producer process using:
$ sudo kafka-console-producer --broker-list localhost:9092 --topic Test123
  • Write some messages in the Kafka producer. In another terminal, we run the consumer process in order to consume the messages:
$ cd /usr/bin
$ sudo kafka-console-consumer --bootstrap-server localhost:9092 --topic Test123 --from-beginning
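When you are done experimenting, the same confluent CLI can shut everything down; confluent stop halts the services, while confluent destroy additionally deletes their data, so use the latter with care:

$ sudo confluent stop
$ sudo confluent destroy    # also wipes the services' data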

Hopefully this is clear to all the Debian Linux users out there; let me know if any doubts come up.
