Learning Python for Data Engineering 2023
Complete Roadmap to Learn Python for Data Engineering
If you want to learn everything covered in this blog in one place, I recently launched my course on Python for Data Engineering.
This course will take you from the very basics to an advanced level.
Python is one of the top 5 skills for Data Engineers.
If you have decided to learn Python, here’s how to go about it.
Python has many different use cases:
— Web Development
— Data Science/Analytics/Engineering
— Game Development
— Web Scraping
— Machine Learning
and many more…
Here is the framework I suggest you use
- Learn Basics
- Advanced Concepts & Hands-On
- Picking Niche
Learn Basic Fundamentals
A few concepts are common across almost all programming languages:
✅ Variables
✅ Operators
✅ Loops
✅ Conditional Statements
✅ Data Types
✅ Functions
and many more
You will find these concepts in every major programming language.
YOU CAN NOT SKIP THESE!
Advanced Concepts & Hands-On
Learning the basics only means you have collected a set of tools. Next, learn how to put them to use:
✅ Learn Object Oriented Programming
✅ Exception Handling
✅ Working with different packages
✅ Functions
✅ Lambda functions
Practice your skills
— Competitive Coding
— Implementing Algorithms (for learning)
— Doing small projects (Building calculator)
This will give you the confidence to write code and improve your logic.
Picking Niche
As I said, Python has many different use cases, and you don’t have to learn every package that exists.
— If you are learning Python for Data Science then focus on packages such as NumPy, Pandas, Matplotlib, etc…
— If you are learning Python for Web Scraping then focus on packages such as BeautifulSoup or Scrapy
— If you are learning Python for Web Development then focus on frameworks such as Django, FastAPI, or Flask
Learning Python for Data Engineering
What does a Data Engineer do on a day-to-day basis?
— Read different types of files and write them
— Do cleaning/manipulation of data
— Write transformation jobs
— Query databases using libraries
— Automate some tasks, etc…
Once your fundamentals and logic are clear, focus on these things:
1. Understand the different types of file formats (Read and Write)
— CSV
— AVRO
— JSON
— ORC
— PARQUET
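As a starting point for reading and writing files, here is a minimal sketch that reads a CSV file and writes the same rows out as JSON using only the standard library. The file names, the sample data, and the helper `csv_to_json` are made up for illustration. The binary formats (Avro, ORC, Parquet) are usually handled with third-party libraries such as fastavro or pyarrow, which this sketch does not cover.

```python
import csv
import json
import tempfile
from pathlib import Path


def csv_to_json(csv_path, json_path):
    """Read every row of a CSV file and write the rows out as a JSON array."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # DictReader uses the header row as keys; all values come back as strings.
        rows = list(csv.DictReader(f))
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)
    return rows


# Hypothetical sample data written to a temporary folder.
workdir = Path(tempfile.mkdtemp())
csv_path = workdir / "users.csv"
json_path = workdir / "users.json"
csv_path.write_text("id,name\n1,Asha\n2,Ravi\n", encoding="utf-8")

rows = csv_to_json(csv_path, json_path)
loaded = json.loads(json_path.read_text(encoding="utf-8"))
print(loaded)
```

Note that CSV carries no type information, so `id` is read as the string `"1"`. Converting columns to the right types is part of the cleaning work.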
2. Learn how to connect to and query databases using code
— SQLAlchemy
— pymysql
— psycopg2
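The libraries above need a running database server, so the sketch below swaps in `sqlite3` from the standard library to show the same connect, execute, fetch pattern. The `orders` table, its data, and the helper `top_customers` are hypothetical.

```python
import sqlite3


def top_customers(conn, min_total):
    """Return (customer, total) pairs whose summed order amount is at least min_total."""
    query = """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING SUM(amount) >= ?
        ORDER BY total DESC
    """
    # Pass values as parameters instead of formatting them into the SQL string.
    return conn.execute(query, (min_total,)).fetchall()


# In-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("asha", 120.0), ("ravi", 40.0), ("asha", 30.0)],
)
conn.commit()

result = top_customers(conn, 100)
conn.close()
print(result)
```

With pymysql or psycopg2 the flow is similar, but the connection needs host and credential details, and the parameter placeholder is `%s` rather than `?`.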
3. Work with different DateTime formats and timezones
— Your columns will often be stored in UTC, and you will need to convert them to the local timezone
— Sometimes date and time are not properly formatted so make sure you know how to handle it
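Both points can be sketched with the standard `datetime` module. The helpers `utc_to_local` and `parse_messy`, the list of formats, and the sample values are assumptions for illustration. A fixed UTC+05:30 offset stands in for a real local timezone here.

```python
from datetime import datetime, timedelta, timezone

# Formats we assume might show up in the data; extend this for your own sources.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M")


def parse_messy(text, formats=KNOWN_FORMATS):
    """Try each known format in turn and return the first successful parse."""
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")


def utc_to_local(naive_utc, local_tz):
    """Mark a naive datetime as UTC, then convert it to the given timezone."""
    return naive_utc.replace(tzinfo=timezone.utc).astimezone(local_tz)


# Fixed offset used only for the demo (UTC+05:30).
demo_tz = timezone(timedelta(hours=5, minutes=30))

parsed = parse_messy(" 15/01/2023 10:00 ")
local = utc_to_local(parsed, demo_tz)
print(local.isoformat())
```

For named timezones with daylight saving rules, `zoneinfo.ZoneInfo("Asia/Kolkata")` (Python 3.9+) can be passed in place of the fixed offset.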
4. Do Transformations
— Many Python libraries can help you read files and run operations on top of them
One of them is Pandas
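Here is a small sketch of a cleaning and aggregation step with Pandas. It assumes pandas is installed, and the column names, sample rows, and the helpers `clean_orders` and `total_by_city` are invented for the example.

```python
import io

import pandas as pd

# Made-up CSV content with stray whitespace, mixed casing, and a missing amount.
raw_csv = io.StringIO(
    "order_id,city,amount\n"
    "1, Pune ,100\n"
    "2,Delhi,\n"
    "3,pune,250\n"
)


def clean_orders(source):
    """Load orders, normalise city names, and fill missing amounts with 0."""
    df = pd.read_csv(source)
    df["city"] = df["city"].str.strip().str.title()
    df["amount"] = df["amount"].fillna(0)
    return df


def total_by_city(df):
    """Sum the order amount per city and return a plain dict."""
    return df.groupby("city")["amount"].sum().to_dict()


orders = clean_orders(raw_csv)
totals = total_by_city(orders)
print(totals)
```

Whether a missing amount should become 0 or the row should be dropped depends on the dataset; the fill here is only one possible choice.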
5. Learn to Automate things
— This includes setting up cron jobs or deploying code automatically to the cloud using scripts (yes, this leans a little towards DevOps, but it is still important to know)
6. Learn to read the documentation and connect with different tools
— You will often have to make connections to different tools
Working on the cloud? Connect to the service using code to create, update, or remove things
About the course:
You will also get to do an end-to-end project on Data Engineering using Python!
Let me know if you have more questions.