A guide to all Data Engineering Practitioners and Starters

Ketan Sahu
Plumbers Of Data Science
6 min read · May 20, 2021

If you’re planning to move into data engineering, or have already started by doing some data engineering projects, then welcome: this article is for you.

Photo by tian kuan on Unsplash

Let me be clear up front: this article does not discuss technical details. It simply describes the journey of a data engineering practitioner who is a step closer to calling himself a data engineer.

First, a short intro about my current background: I consider myself a data engineering practitioner, not a data engineering starter. A practitioner because I’ve gained valuable knowledge and skills by building data engineering projects over the last six months, so I’m not a beginner anymore. But my journey started almost a year ago. At my previous company, I worked on some simple ETL pipeline projects. I used only Python and the in-house local infrastructure and tools, so no cloud and no other open-source tools. But I have to admit that it was only after this project that I realized my self-worth. I also found the field (data engineering) I want to work in during my daily professional life. Then, in January 2021, I planned everything and decided to move permanently into data engineering. Below, I describe what I’ve done in the last five months.

Started to Imagine Myself as a Data Engineer

To build courage and confidence, you must start to imagine yourself as a data engineer. This is how I built my confidence.

Reading Blogs/Articles: I love reading. The first thing I did was take out a Medium membership. I read as many beginner’s guides to data engineering as I could and jotted down every important point from them. I also made a list of the tools, platforms, and infrastructure that data engineers use. You can start by googling as well; there are tons of free articles to help and guide absolute beginners. You can read about the essentials: ETL/ELT, batch processing, and stream processing pipelines.

Watching YouTube: Watch some data engineering gurus; their discussions and guidance help you sharpen your view. Personally, by following and watching them, I got answers to many questions I had never asked. I recommend Andreas Kretz and Ternary Data. Other excellent data engineering YouTube channels exist, but I mostly watch these two because my time is limited.

Join Meetups: I started joining online meetups related to data engineering. One good thing about them is that you meet like-minded people, and the discussions are always interesting to hear. One meetup that is organized every week is Data Engineer’s Lunch by Data Wrangler DC.

Listening to Podcasts: I recommend you start listening to podcasts once you are familiar with the field. By then you will enjoy them, and they won’t feel daunting.

Join Slack Groups: Slack groups are a great platform where useful discussions happen. Most open-source platforms have their own Slack group, and you can join them. Again, I recommend joining once you have some good knowledge of the field. I’m currently trying to find a Slack group of data engineers working in the Randstad region (Amsterdam, Utrecht, The Hague, and Rotterdam); if anyone knows of one, please let me know in the comments.

Write your Blogs: I wanted to test my confidence, so I started writing and publishing blogs based on my experience. This article is one such example, where I can share my experience with you all without any hesitation. Write your own blogs: think about something you learned or found out while doing your projects, and write about it.

In short, the approaches above will not give you technical knowledge, but they will make your view clearer and answer the questions that might pop up in your mind every other day.

How I Sharpened My Axe

I used a one-step-at-a-time methodology.

Learn Data Engineering Basics: The first thing I did was make sure that my SQL and Python fundamentals were clear. For example, in my case, I revised my SQL skills by solving problems on HackerRank.

If you’re an absolute beginner with Python and SQL, I recommend following CodewithMosh for Python and SQL fundamentals and Kevin Markham’s DataSchool for the Pandas library.

Learn Tools: Once I was confident in my fundamentals, I moved to the second stage, where I made a list of the open-source tools and managed open-source platforms I needed to learn. I watched some YouTube videos to get started with AWS, Docker, Git, and Airflow. I started with AWS because, in my region, most companies seem to ask for AWS in their job descriptions, but you can also consider Google Cloud Platform and Azure. For example, one of my friends told me that most companies in Norway work with Azure, so if you’re in Norway, you should learn Azure.

Tip: If you’re planning to learn AWS, check out this website.

Build Projects: In the third stage, I started coaching with Data Engineering Academy. In the coaching, we started by selecting the right tools for my project. Next, we focused on choosing a suitable dataset. After that, we designed the project, and finally, I implemented it. Check out all my projects on my GitHub.

As I said, I took one step at a time. After finishing my first project, I focused on other tools and topics and on how to build a new project around them, for example, Kafka, Spark, and data warehouse dimensional modelling.

Write about your Projects: Every time I finished a project, I added it to GitHub and described it in detail.

Anxiety Moment: At this point, I’d like to share the anxiety I faced. While reading multiple articles related to my projects or to some tools, I started to feel I should also learn other tools ASAP. However, I realized it is crucial to stick to the plan and do things step by step. Before trying something new, make sure that you have developed a fundamental understanding of the current topic, tool, or platform, and only then move on to the next one.

Build a Portfolio

Share your Work: If you feel you have worked on some good projects, they are worth sharing. Share them with your network on LinkedIn or by publishing an article about them.

Create a Website: GitHub offers you an excellent option to host your website for free. All you need is an HTML page. Check out here how to host your website on GitHub. Check out my website for reference. You can build your portfolio on your website, so people can find out about you in one place.

Create a Resume and Cover Letter: The only tip I can give for writing an outstanding resume and cover letter is to follow Wonsulting (Jonathan & Jerry). Don’t forget to check out their Instagram accounts (Jonathan & Jerry); they are super awesome, and their guidance will help you write a strong resume.

Don’t Lose Hope

It is essential to keep your morale high. Show perseverance and resilience. You will undoubtedly face many frustrating moments while working on projects or applying for jobs, but if you can hold on, you will surely achieve your target.

At the moment of writing this article, I’m still in the phase of turning my imagination into reality or, in other words, moving to a full-time data engineering position.

If you want to connect with me, let’s connect on LinkedIn or Instagram. If you want to know more about me, check out my website.

If you have questions to discuss, you can send me an email or a LinkedIn message. I’m happy to help with what I’ve learned.

Lastly, if you have any data engineering roles open, feel free to reach out to me on LinkedIn.
