Learning Python for Data Engineering 2023

Darshil Parmar
Feb 23, 2023

Complete Roadmap to Learn Python for Data Engineering

Photo by James Harrison on Unsplash

If you want to learn everything I mention in this blog in one place, I have recently launched my course on Python for Data Engineering.

This course will take you from a very basic to an advanced level.

Python is one of the Top 5 skills for Data Engineers.

Source: opendatascience

If you have decided to learn Python, here’s how you can go about it.

Python has so many different use cases
— Web Development
— Data Science/Analytics/Engineering
— Game Development
— Web Scraping
— Machine Learning
and many more…

Here is the framework I suggest you use

  1. Learn Basics
  2. Advanced Concepts & Hands-On
  3. Picking Niche

Learn Basic Fundamentals

There are a few concepts that are common across all programming languages
✅ Variables
✅ Operators
✅ Loops
✅ Conditional Statements
✅ Data Types
✅ Functions
and many more

You will find these concepts in every major programming language.

YOU CANNOT SKIP THESE!

Advanced Concepts & Hands-On

Once you have learned the basics, all you have done is collect a set of tools. Now it is time to put them to use.

✅ Learn Object Oriented Programming
✅ Exception Handling
✅ Working with different packages
✅ Functions
✅ Lambda functions
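
As a small illustration of two items from this list, here is a minimal sketch that combines exception handling with a lambda function. The function name `parse_amounts` and the sample values are made up for this example.

```python
def parse_amounts(raw_values):
    """Convert strings to floats, collecting the ones that fail instead of crashing."""
    parsed, rejected = [], []
    for value in raw_values:
        try:
            parsed.append(float(value))
        except (TypeError, ValueError):
            # Bad input is recorded so it can be inspected later.
            rejected.append(value)
    return parsed, rejected


# Hypothetical sample input: two valid numbers and two bad values.
amounts, bad = parse_amounts(["3", "abc", "10.5", None])

# A lambda is a small anonymous function, handy as a sort key.
largest_first = sorted(amounts, key=lambda x: -x)
```

The `try`/`except` block keeps one bad value from stopping the whole run, which is a common need when input data is messy.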

Practice your skills
— Competitive Coding
— Implementing Algorithms (for learning)
— Doing small projects (e.g., building a calculator)

This will give you the confidence to write code and improve your logic.

Picking Niche

As I said, Python has many different use cases, and you don’t have to learn every package that exists.

— If you are learning Python for Data Science then focus on packages such as NumPy, Pandas, Matplotlib, etc…
— If you are learning Python for Web Scraping then focus on packages such as BeautifulSoup or Scrapy
— If you are learning Python for Web Development then focus on frameworks such as Django, FastAPI, or Flask

Learning Python for Data Engineering

What does a Data Engineer do on a day-to-day basis?
— Read different types of files and write them
— Do cleaning/manipulation of data
— Write transformation jobs
— Query databases using libraries
— Automate some tasks, etc…

Once your fundamentals and logic are clear, focus on these things

1. Understand the different types of file formats (Read and Write)
— CSV
— AVRO
— JSON
— ORC
— PARQUET
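
To make the read and write idea concrete, here is a minimal sketch using only the standard library for CSV and JSON. The file names and sample rows are made up for this example. Avro, ORC, and Parquet need third-party libraries (for example `fastavro` or `pyarrow`), so they are left out here.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical sample records.
rows = [
    {"id": "1", "city": "Mumbai"},
    {"id": "2", "city": "Pune"},
]

work_dir = Path(tempfile.mkdtemp())  # throwaway folder for the demo
csv_path = work_dir / "cities.csv"
json_path = work_dir / "cities.json"

# Write CSV: DictWriter maps dictionary keys to columns.
with csv_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "city"])
    writer.writeheader()
    writer.writerows(rows)

# Read CSV back: every value comes back as a string.
with csv_path.open(newline="", encoding="utf-8") as f:
    csv_rows = list(csv.DictReader(f))

# Write the same records as JSON, then read them back.
json_path.write_text(json.dumps(csv_rows, indent=2), encoding="utf-8")
json_rows = json.loads(json_path.read_text(encoding="utf-8"))
```

Note that CSV has no type information, so numbers come back as strings unless you convert them yourself. That is one reason typed formats such as Parquet and Avro are popular in data pipelines.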

2. Learn how to connect to and query databases using code
— SQLAlchemy
— pymysql
— psycopg2
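
The connect, execute, fetch pattern looks roughly like the sketch below. To keep it runnable without a database server, it uses the built-in `sqlite3` module instead of the libraries listed above; that swap is my assumption for the example. `pymysql` and `psycopg2` follow the same general pattern, but they take real connection details and use `%s` placeholders instead of `?`. The table and values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
cur = conn.cursor()

cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
cur.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("asha", 120.0), ("ravi", 80.0), ("asha", 30.0)],
)
conn.commit()

# Parameterized query: the driver fills in the placeholder safely,
# which avoids SQL injection from string formatting.
cur.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "WHERE amount > ? GROUP BY customer ORDER BY customer",
    (50,),
)
order_totals = cur.fetchall()  # list of (customer, total) tuples
conn.close()
```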

3. Working with different DateTime formats and timezones
— Your columns will usually be stored in the UTC timezone, but you will often need to convert them to the local timezone
— Sometimes dates and times are not properly formatted, so make sure you know how to handle that
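
Both points can be sketched with the standard `datetime` module. The list of formats, the helper name `parse_utc`, and the sample value are assumptions for this example. It also assumes the incoming values are in UTC.

```python
from datetime import datetime, timedelta, timezone

# Formats we expect to see; this list is an assumption for the example.
KNOWN_FORMATS = ["%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M", "%Y-%m-%dT%H:%M:%S"]


def parse_utc(text):
    """Try each known format and return a timezone-aware UTC datetime."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"Unrecognized date format: {text!r}")


# Fixed +05:30 offset (India). For zones with daylight saving time,
# zoneinfo.ZoneInfo("Europe/London") and similar are the usual choice.
IST = timezone(timedelta(hours=5, minutes=30))

utc_time = parse_utc("23/02/2023 12:00")
local_time = utc_time.astimezone(IST)
print(local_time.isoformat())  # 2023-02-23T17:30:00+05:30
```

Keeping datetimes timezone-aware from the moment you parse them makes later conversions explicit instead of relying on the machine's local settings.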

4. Doing Transformation
— Many libraries in Python can help you read files and run operations on top of them. One of them is Pandas.
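
Here is a minimal sketch of a read, clean, transform step with Pandas. It assumes `pandas` is installed. The column names and values are made up, and the inline text stands in for a real CSV file.

```python
import io

import pandas as pd

# Inline CSV text stands in for a real file on disk.
raw_csv = io.StringIO(
    "order_id,customer,amount\n"
    "1,asha,120\n"
    "2,ravi,\n"
    "3,asha,30\n"
)

df = pd.read_csv(raw_csv)

# Cleaning: drop rows where the amount is missing.
clean = df.dropna(subset=["amount"])

# Transformation: total amount per customer.
customer_totals = clean.groupby("customer", as_index=False)["amount"].sum()
```

With a real file you would pass the path to `pd.read_csv` directly, and you could save the result with `customer_totals.to_csv(...)`.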

5. Learn to Automate things
— This includes setting up cron jobs or deploying code automatically to the cloud using scripts (yes, this leans a little towards DevOps, but it is still important to know)
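
A script that is meant to run under cron could be structured roughly like this sketch. The function names, the placeholder job body, and the path in the cron line are all hypothetical.

```python
# Example crontab entry (hypothetical path) that runs the script daily at 02:00:
#   0 2 * * * /usr/bin/python3 /path/to/daily_job.py
import logging
from datetime import datetime, timezone

logging.basicConfig(
    level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s"
)


def run_job():
    """Placeholder for the real work (load, transform, save)."""
    started = datetime.now(timezone.utc)
    logging.info("Job started at %s", started.isoformat())
    # ... real pipeline steps would go here ...
    logging.info("Job finished")
    return 0  # exit code 0 means the run succeeded


def main():
    try:
        return run_job()
    except Exception:
        # Log the full traceback so a failed scheduled run can be diagnosed.
        logging.exception("Job failed")
        return 1


exit_code = main()  # in a real script, pass this to sys.exit()
```

Nobody is watching when a scheduled job runs, so logging and a meaningful exit code are what let you find out later whether it worked.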

6. Learn to read the documentation and connect with different tools
— Many times you will have to make connections to different tools
— Working on the cloud? Connect to the service using code to create, update, or remove resources

About the course:

You will also get to do an End-To-End Project on Data Engineering using Python!

End-To-End Data Engineering Project

Let me know if you have more questions.

Darshil Parmar

Data Engineering | Building @DataVidhya | YouTube (120k+)