Twitter Data Analysis for the Lazy in Elastic Stack (Xbox VS PlayStation)

Maciej Szymczyk
The Startup

Twitter data can be obtained in many ways, but who wants to write code 😉, especially code that has to run 24/7? With the Elastic Stack, you can collect and analyze Twitter data easily. Logstash has an input plugin for collecting tweets. Kafka Connect, discussed in the previous story, offers this as well, but Logstash can send data to many destinations (including Apache Kafka) and is easier to use.

In the article:

  • Saving a tweet stream to Elasticsearch in Logstash
  • Visualizations in Kibana (Xbox vs PlayStation)
  • Removing HTML tags from a keyword field with a normalizer

Elastic Stack Environment

All the necessary components are defined in a single docker-compose file. If you already have an Elasticsearch cluster, you only need Logstash.

version: '3.3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    restart: unless-stopped
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata:/usr/share/elasticsearch/data
    ports:
      …
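
For the case mentioned above, where an Elasticsearch cluster already exists and only Logstash is needed, a Logstash-only compose file might look roughly like the sketch below. It is not the author's file. The image tag is simply matched to the Elasticsearch version used above, while the environment variable names, the ELASTICSEARCH_HOSTS address, and the ./logstash/pipeline host directory are assumptions made for illustration. The pipeline definition itself (a Twitter input and an Elasticsearch output) would live in a .conf file inside the mounted directory.

```yaml
version: '3.3'
services:
  logstash:
    # Assumed to match the Elasticsearch version used in this article
    image: docker.elastic.co/logstash/logstash:7.9.2
    restart: unless-stopped
    environment:
      # Hypothetical variable names: the pipeline .conf file can read them
      # through ${VARIABLE} substitution, so secrets stay out of the config.
      - TWITTER_CONSUMER_KEY=${TWITTER_CONSUMER_KEY}
      - TWITTER_CONSUMER_SECRET=${TWITTER_CONSUMER_SECRET}
      - TWITTER_OAUTH_TOKEN=${TWITTER_OAUTH_TOKEN}
      - TWITTER_OAUTH_TOKEN_SECRET=${TWITTER_OAUTH_TOKEN_SECRET}
      # Hypothetical address of the existing Elasticsearch cluster
      - ELASTICSEARCH_HOSTS=http://my-elasticsearch-host:9200
    volumes:
      # Hypothetical host directory; the official image loads pipeline
      # files from /usr/share/logstash/pipeline
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
```

Passing the Twitter credentials in from the shell or an .env file keeps them out of version control, and mounting the pipeline directory read-only lets the configuration be edited on the host without rebuilding the image.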

Software Developer, Big Data Engineer, Blogger (https://wiadrodanych.pl), Amateur Cyclist & Triathlete, @maciej_szymczyk