Building an end-to-end data pipeline from scratch

Rafay Aleem
Jun 10, 2020 · 1 min read

This series covers a broad range of topics involved in setting up a production-grade ETL pipeline. I have broken it down into chapters, and I will add more as the series progresses.

I am a Mac user, so everything here is written in the context of software and tools installed on OS X. You should be able to find and install equivalent versions for your own setup.

Feel free to leave feedback on Twitter or through my website.

Chapter 1 - Orchestration basics: Setting up Airflow in a local Kubernetes cluster using Helm

Chapter 2 - Introduction to KubernetesExecutor and KubernetesPodOperator

Uncanny Recursions

A software engineer’s journey around tech and product
