How to Use Luigi and Docker to Build a Simple Data Engineering Pipeline for Medium

Introduction

The idea for this project started when some Python scripts I wrote for a data engineering pipeline got out of control:

  • Some parts of the pipeline took a long time to run and the process would sometimes fail.

My process looked like this:


This tutorial describes an approach for building a simple ChatOps bot that uses Slack and Grafana to query system status. The idea is to check the status of your system through a conversational interface when you’re away from your desk but still have basic connectivity, e.g. on your phone.

This tutorial is split into two parts: the first part will set up the infrastructure for monitoring Kafka with Prometheus and Grafana, and the second part will build a simple bot with Python which can respond to questions and return Grafana graphs over Slack.

Notifications are a native…


Blockchain technology and Apache Kafka share characteristics which suggest a natural affinity. For instance, both are built around the concept of an ‘immutable append-only log’. In the case of a Kafka partition:

Each partition is an ordered, immutable sequence of records that is continually appended to — a structured commit log. The records in the partitions are each assigned a sequential id number called the offset that uniquely identifies each record within the partition [Apache Kafka]

Whereas a blockchain can be described as:

a continuously growing list of records, called blocks, which are linked and secured using cryptography. Each block typically…
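To make the comparison concrete, here is a minimal Python sketch of the two structures described above. It is a toy illustration only: `PartitionLog` and `HashChain` are hypothetical names invented for this example, it does not use the Kafka client API or any real blockchain library, and the block layout (index, data, previous hash) is an assumption chosen for brevity.

```python
import hashlib
import json


class PartitionLog:
    """Toy append-only log (hypothetical): each record gets a sequential offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # The offset is the record's position in the log; it never changes.
        offset = len(self._records)
        self._records.append(record)
        return offset

    def read(self, offset):
        return self._records[offset]


class HashChain:
    """Toy chain of blocks (hypothetical): each block stores its predecessor's hash."""

    GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor

    def __init__(self):
        self._blocks = []

    @staticmethod
    def _digest(index, data, prev_hash):
        # Hash a canonical JSON encoding of the block's contents.
        payload = json.dumps(
            {"index": index, "data": data, "prev_hash": prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, data):
        index = len(self._blocks)
        prev_hash = self._blocks[-1]["hash"] if self._blocks else self.GENESIS_HASH
        block = {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": self._digest(index, data, prev_hash),
        }
        self._blocks.append(block)
        return block

    def is_valid(self):
        # Recompute every hash and check each link back to its predecessor.
        prev_hash = self.GENESIS_HASH
        for block in self._blocks:
            if block["prev_hash"] != prev_hash:
                return False
            expected = self._digest(block["index"], block["data"], block["prev_hash"])
            if block["hash"] != expected:
                return False
            prev_hash = block["hash"]
        return True
```

In both cases records are only ever added at the end. The difference in this sketch is that the chain makes tampering detectable: changing the data in an earlier block causes `is_valid()` to fail, whereas the plain log relies on its offsets alone.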

Luc Russell

Software Engineering @DellEMC | Contact me at https://www.linkedin.com/in/lucrussell
