Data Science Learning Path with Python

AC
Data Folks Indonesia
Aug 25, 2021

I often receive messages from new members of the Jakarta AI Research Discord server asking, “Where do I start learning data science?” People are often unsure whether to begin with statistics or with programming. Those who do not come from an academic background in computer science, engineering, statistics, or mathematics find it especially difficult to work out a learning path for data science.

On the other hand, for those of us who come from a STEM background, some of our courses overlap with data science: programming, descriptive statistics, statistical testing, SQL, and so on. If you come from stats or math, you may only need to learn Python, its environment, and its libraries so that you can implement statistical and mathematical methods to solve problems. If you come from CS or engineering, you need to learn more advanced stats and machine learning algorithms.

To answer the question above, we need to understand what data science is. Broadly, data science has three major components: Programming, Math & Stats, and Domain Knowledge.

Programming

At this stage, you need to understand how to write a program that performs an analysis or builds a statistical model. The most popular programming languages for data science are Python and R. Others, such as Julia and C++, are usually used for specific use cases. SQL is another useful language, used to gather data from the data warehouse.
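To make this concrete, here is a minimal sketch of how SQL and Python can work together. It is only an illustration: the `orders` table, its columns, and the helper function names are made up for this example, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3
from statistics import mean


def load_order_amounts(conn):
    """Pull (customer, amount) rows with SQL, as you would from a warehouse."""
    query = "SELECT customer, amount FROM orders ORDER BY customer, amount"
    return conn.execute(query).fetchall()


def average_by_customer(rows):
    """Group the amounts by customer in Python and average each group."""
    amounts = {}
    for customer, amount in rows:
        amounts.setdefault(customer, []).append(amount)
    return {customer: mean(values) for customer, values in amounts.items()}


# Hypothetical data: an in-memory SQLite table standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 10.0), ("ana", 30.0), ("budi", 5.0)],
)

rows = load_order_amounts(conn)
averages = average_by_customer(rows)
print(averages)
```

In practice the grouping could also be done in SQL with `GROUP BY`. Doing it in Python here just shows where each language fits: SQL to fetch the data, Python to analyze it.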

Mathematics & Statistics

Maths and stats play an important role in data science. Without a fundamental understanding of them, you will struggle to debug or to understand the behavior of certain methods, because essentially all the techniques in data science come from these fields: handling missing values, correlation coefficients, statistical testing, data distributions, probability, dimensionality reduction, and more.
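As a small illustration of two of these techniques, the sketch below fills a missing value with the mean of the observed values and then computes a Pearson correlation coefficient by hand. The data and function names are invented for this example, and mean imputation is only one simple option among many, not a recommendation for every dataset.

```python
from math import sqrt
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    observed_mean = mean(observed)
    return [observed_mean if v is None else v for v in values]


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical data: hours studied (one value missing) and exam scores.
hours = [1.0, 2.0, None, 4.0, 5.0]
scores = [52.0, 60.0, 71.0, 78.0, 90.0]

hours_filled = fill_missing_with_mean(hours)
r = pearson(hours_filled, scores)
print(hours_filled, round(r, 3))
```

Libraries such as pandas and SciPy provide these operations ready-made, but writing them out once helps you understand what a method is doing when its result surprises you.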

Domain Knowledge

The last thing you need to round out your data science skills is domain knowledge. Domain knowledge means having an area of expertise that you have truly mastered and being able to reframe a business problem as a data science problem. The domain can be customer analytics (acquisition, retention), anti-fraud, finance, recommendation systems in e-commerce, supply chain, distribution, and more. People who join a company focused on a specific industry usually stay within that industry. However, you may also see people who have minimal business knowledge but are experts in unstructured data such as text, images, and speech.

Data Science Learning Path

But hey, that overview is not enough to show you how to learn data science in detail. So, two weeks ago, I created a comprehensive learning path for data science. It covers programming, statistics, machine learning, and deep learning. It leaves out the business domain, because that is probably best learned in the workplace. Full list on GitHub.

Conclusion

Self-learning alone may not be enough: learning from books, tutorials, and YouTube videos may still leave gaps. Even so, my suggestion is still self-learning. Learn at your own speed, take your time, and don't fall for the data science boot camps advertised on social media. They won't make you a data scientist, and they won't make it easy to get a job. If you have questions, are struggling, or need guidance, you can drop your question here

If you enjoyed this post, feel free to hit the clap button 👏🏽, and if you're interested in posts to come, make sure to follow me on Medium.
