Krishna Parsarampuria
Feb 4

Forsk Technologies

In this blog, I will explain how to build an end-to-end ELK (Elasticsearch, Logstash and Kibana) pipeline integrated with Kafka, and use it to analyze real-time weather conditions.

ELK Pipeline

The ELK Stack or Pipeline is a collection of three open-source products — Elasticsearch, Logstash, and Kibana — all developed, managed and maintained by Elastic. Elasticsearch is a NoSQL database based on the Lucene search engine. Logstash is a log pipeline tool that accepts inputs from various sources, executes different transformations, and exports the data to various targets. Kibana is a visualization layer that works on top of Elasticsearch.
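Every Logstash pipeline, including the one we build at the end of this post, follows the same three-stage structure (the filter stage is optional):

input { ... } filter { ... } output { ... }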

Setting up the ELK Pipeline:

We set up and use this pipeline on the Ubuntu Linux platform.

Installing Java

ELK requires Java 8 or higher.

Install Java with:

sudo apt-get install default-jre

To check the version:

java -version

Checking your Java version now should give you the following output or similar:

openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

Installing Elasticsearch

First, download the Elasticsearch package from the Elastic downloads page.

After that, install the package using "sudo dpkg -i elasticsearch-6.6.0.deb".

Then start Elasticsearch using:

sudo service elasticsearch start
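
Elasticsearch listens on port 9200 by default, so you can verify it is running with a quick query; it should answer with a small JSON document describing the node:

curl http://localhost:9200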

Installing Kibana

Download the Kibana package from the Elastic downloads page.

After that, install the package using "sudo dpkg -i kibana-6.6.0-amd64.deb".

Then start Kibana using:

sudo service kibana start
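
Kibana serves its UI on port 5601 by default; note that it can take a minute after startup before it begins responding:

curl -I http://localhost:5601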

Installing Logstash

Download the Logstash tar.gz archive from the Elastic downloads page. After downloading, unpack the tar using:

tar -xzf logstash-6.6.0.tar.gz

We will start Logstash last, after everything else is set up.

Now the ELK Pipeline is ready to work; next we set up Kafka.

Kafka

Apache Kafka is a distributed streaming platform. At its core, it allows systems that generate data (called Producers) to persist their data in real-time in an Apache Kafka Topic. Any topic can then be read by any number of systems who need that data in real-time (called Consumers). Therefore, at its core, Kafka is a Pub/Sub system. Behind the scenes, Kafka is distributed, scales well, replicates data across brokers (servers), can survive broker downtime, and much more.

Setting up Apache Kafka:

Download Kafka and extract the archive.

wget http://apache.claz.org/kafka/0.10.2.2/kafka_2.12-0.10.2.2.tgz
tar -zxf kafka_2.12-0.10.2.2.tgz

Start the ZooKeeper Server:

cd kafka_2.12-0.10.2.2
bin/zookeeper-server-start.sh config/zookeeper.properties

Start Kafka Broker:

bin/kafka-server-start.sh config/server.properties

Now create a topic on the Kafka broker using the command:

bin/kafka-topics.sh --create \
--zookeeper localhost:2181 \
--replication-factor 1 \
--partitions 1 \
--topic weather
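
To verify the topic was created, list the topics registered in ZooKeeper:

bin/kafka-topics.sh --list --zookeeper localhost:2181

Later, once the producer below is running, you can also watch messages arrive with the console consumer:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic weather --from-beginning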

Now Kafka is ready to work. We will write a Python script that fetches weather information from OpenWeatherMap through its API and sends it to a Kafka producer. Logstash then takes input from the Kafka consumer and feeds the data into Elasticsearch, where we can visualize it using Kibana.

Python script for fetching the data and sending it to the Kafka producer:

from confluent_kafka import Producer
import requests
import time

p = Producer({'bootstrap.servers': 'localhost:9092'})

while True:
    # Fetch the current weather for Jaipur (replace ** with your OpenWeatherMap API key)
    response1 = requests.get("http://api.openweathermap.org/data/2.5/weather?q=jaipur&appid=**")
    # Publish the raw JSON response to the "weather" topic
    p.produce('weather', key='jaipur', value=response1.text)
    p.flush()       # ensure delivery before the next request
    time.sleep(60)  # poll the API once a minute
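
Before wiring up Logstash, a minimal consumer sketch is a quick way to confirm that messages are actually reaching the topic (the group id "weather-check" is just an arbitrary name for this test):

from confluent_kafka import Consumer

c = Consumer({'bootstrap.servers': 'localhost:9092',
              'group.id': 'weather-check',  # arbitrary group id for this test
              'auto.offset.reset': 'earliest'})
c.subscribe(['weather'])

msg = c.poll(10.0)  # wait up to 10 seconds for a message
if msg is not None and msg.error() is None:
    print(msg.value().decode('utf-8'))
c.close()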

The data is sent to the topic "weather". Now we will start Logstash, take input from the Kafka consumer, and save it to Elasticsearch.

To start Logstash:

  • Go to the Logstash folder.
  • Type this command in the terminal:
bin/logstash -e 'input { kafka { bootstrap_servers => "localhost:9092" topics => "weather" } } filter { json { source => "message" } } output { elasticsearch { hosts => ["localhost:9200"] index => "weather" } stdout {} }'

Now Logstash is started, and it will send the real-time data coming from the API to Elasticsearch, where we can visualize it in Kibana.
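
You can confirm that documents are being indexed by querying Elasticsearch directly; it should return one of the weather readings:

curl "localhost:9200/weather/_search?size=1&pretty"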

To visualize the data, open Kibana at "localhost:5601".

  • Now go to Management => Index Patterns.
  • Click Create Index Pattern; it will show the "weather" index.
  • On the next step, select @timestamp as the time filter field.
  • The new index pattern is created.

Go to the Discover page and visualize the data.


Stay connected with Forsk Technologies to learn trending industry technologies like Python, ML, DL, AI, IoT, etc.

Forsk Labs

We inspire, educate and equip young minds with the computing skills to pursue 21st-century opportunities. 300+ hours of immersive instruction in Data Science (Python, Analytics, ML, DL, Streaming Analytics), mentorship and exposure from engineers and entrepreneurs.

