The Ultimate Data Science Course List
Start Your Data Science Career Here!
Data science is a booming industry, and data scientist has famously been called “the sexiest job of the 21st century”. However, manually researching and finding the right learning resources can be a pain. We here at Data Science Library have put together an extensive index of online courses, videos, Medium posts, and books to help you get started. Please enjoy!
Learn how to use Git:
Git is a complicated piece of software to explain, but a few good Medium posts explain it in depth, such as this one and this one. We highly recommend reading both before you work through the two Git resources listed below.
- Git and GitHub for Beginners (video series): This short video series provides beginners with an approachable introduction to Git and GitHub, and shows how to use them together.
- Pro Git (online book): The first few chapters provide a thoughtful introduction to Git, while the later chapters handle more complicated topics like GitHub integration and workflows on large projects. This is one of the most accessible online Git resources, as it teaches the concepts (as well as the code) in a clear and logical way.
Data Science
Data science is a gigantic field that combines statistics, computer science, and business knowledge, all in an effort to extract valuable information from data. There are numerous resources for learning it, and we have compiled some extremely useful courses and videos on the subject. A few of these come from highly reputable universities like Stanford and MIT, so any aspiring data scientist in college should definitely give them a look. They may be just what you will encounter in the near future!
- Foundations of Data Science by Microsoft Research
- MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016
- Data Science at Stony Brook University
- Data Science with Datacamp
- Data Science with Udacity
- Edureka Data Science Course (project-based learning)
- Introduction to Big Data with PySpark
- Machine Learning | Stanford
- CS190.1x: Scalable Machine Learning | UC Berkeley
- Creative Applications of Deep Learning with TensorFlow
- Introduction to Machine Learning by Udacity
- Data Wrangling with MongoDB Online Course | Udacity
SQL
Structured Query Language, otherwise known as SQL, is an essential language for anyone interested in data science to learn. It is the standard language for accessing and manipulating relational databases. There are hundreds of places to learn it, so here are a few of them. Definitely learn this first, since it isn’t that difficult to pick up and is surprisingly easy to master!
- SQL Beginner Course
- W3Schools SQL Intro
- Relational Databases by Professor Greg Hay
- SQL Server Tutorials for Beginners
- SQL Server Training Videos
- SQL Server Interview Questions and Answers
- SQL Server DBA Interview Questions and Answers by Tech Brothers
- Extremely Difficult SQL Questions
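To give a feel for what "accessing and manipulating databases" looks like in practice, here is a minimal sketch that runs a few SQL statements through Python's built-in sqlite3 module. The courses table, its columns, and its rows are all invented for illustration and do not come from any of the resources above.

```python
import sqlite3

# Use an in-memory database so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, made up for this example.
conn.execute("CREATE TABLE courses (title TEXT, provider TEXT, hours INTEGER)")
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [
        ("Intro to SQL", "Provider A", 4),
        ("Advanced SQL", "Provider A", 10),
        ("Databases 101", "Provider B", 6),
    ],
)

# Filter rows with WHERE and sort them with ORDER BY.
short_courses = conn.execute(
    "SELECT title FROM courses WHERE hours < 8 ORDER BY hours"
).fetchall()

# Aggregate with GROUP BY: total hours per provider.
hours_by_provider = conn.execute(
    "SELECT provider, SUM(hours) FROM courses GROUP BY provider ORDER BY provider"
).fetchall()

print(short_courses)      # [('Intro to SQL',), ('Databases 101',)]
print(hours_by_provider)  # [('Provider A', 14), ('Provider B', 6)]
conn.close()
```

SQLite is used here only because it ships with Python. The SELECT, WHERE, ORDER BY, and GROUP BY clauses shown are standard SQL and should carry over to other database systems, although each system adds its own extensions.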
Data Visualization and Analytics
The beauty of data science is that once you have your information, you can do virtually anything with it. From creating your own personalized scatter plots to developing functions to find trends in your data, these sources will help you master the art of presenting your findings to the world!
- Tableau — Do it Yourself Tutorial — Getting Started — project based learning
- Advanced Tableau
- Data viz with Edureka Tableau
- Data Analytics with Tableau
- Analyzing and Visualizing Data with Microsoft Power BI | Getting Started with BI
- The Tableau Reference Guide
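As one example of a "function to find trends", the sketch below fits a straight line to a handful of points using ordinary least squares. It is written in plain Python rather than Tableau or Power BI, and the monthly numbers are made up purely for illustration.

```python
def linear_trend(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: month number and sign-ups in that month.
months = [1, 2, 3, 4, 5]
signups = [12, 15, 19, 22, 27]

slope, intercept = linear_trend(months, signups)
print(f"trend: about {slope:.2f} extra sign-ups per month")
```

A positive slope suggests an upward trend. Visualization tools typically offer this kind of trend line as a built-in option, so a hand-written version like this is mainly useful for understanding what the tool is computing.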
D3
From streamgraphs to Voronoi to polar clocks, there’s essentially no limit to the types of data visualizations you can make with the JavaScript library D3.
Here’s how to get started with D3!
Learn D3:
- Intro: https://www.dashingd3js.com/introductory-d3-course
- Intermediate: https://www.dashingd3js.com/intermediate-d3-course
- Official D3 Tutorials: https://github.com/d3/d3/wiki/Tutorials
- D3 by Scott Murray: http://alignedleft.com/tutorials/d3
- D3 API reference: https://github.com/d3/d3/blob/master/API.md
Here’s another resource for interactive data visualization, taken from a UW Informatics course!
Web Scraping
Web scraping is essentially the act of extracting large amounts of data from a website and then storing that information in a table or spreadsheet on your computer. It is somewhat complicated, but it can be very useful for building datasets to use in your own projects, so it’s worth checking out once your data science skills become more refined.
- Python Web Scraping
- Web Scraping with Node.js
- Web Scraping with Node.js by jfWiz
- Web Scraping with Scrapy
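The extract-then-store idea can be sketched with nothing but Python's standard library. The example below parses a small HTML table and turns it into CSV text. The page content is hard-coded and hypothetical; a real scraper would first download the page, and the tutorials above cover libraries that make this far more convenient.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


# Hypothetical page content, invented for this example.
PAGE = """
<table>
  <tr><th>Course</th><th>Level</th></tr>
  <tr><td>SQL Basics</td><td>Beginner</td></tr>
  <tr><td>Deep Learning</td><td>Advanced</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(PAGE)

# Write the rows as CSV text, ready to be saved as a spreadsheet file.
buffer = io.StringIO()
csv.writer(buffer).writerows(scraper.rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Before scraping a real site, it is worth checking its terms of service and robots.txt file, since not every site permits automated collection.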
Machine Learning
Machine learning is a method of data analysis that uses algorithms and statistical models to let computers learn from data, so that they can act and predict outcomes with little to no human intervention. It sounds complex, and generally is, but the courses we’ve listed will help you start diving into this deep topic. The payoff is worth it, as any data scientist worth their salt will tell you!
- 15 hours of Expert Machine Learning by Data School
- Statistical Learning (9-week online course): Taught by Trevor Hastie and Rob Tibshirani of Stanford using their new “Introduction to Statistical Learning” textbook. It covers a wide gamut of supervised learning methods and a few unsupervised learning methods. They cover the math and concepts behind each method and then work through example implementations in R. Although this course skewed a bit heavy on math and light on application for my taste, the textbook is fantastic and they are clearly masters of this material.
- Introduction to Statistical Learning: free PDF download of the textbook
- Solutions to the textbook exercises: unofficial and incomplete answers on GitHub
- Machine Learning applications (links): This is a curated list of links to news articles and research papers about how machine learning has been used to solve interesting, real-world problems.
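To make "predicting outcomes from data" a little more concrete, here is a minimal sketch of one of the simplest supervised learning methods, nearest-neighbor classification. It is not taken from any of the courses above, and the training data and labels are invented for illustration.

```python
import math


def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    distances = [math.dist(point, query) for point in train_points]
    closest = distances.index(min(distances))
    return train_labels[closest]


# Hypothetical training data: (study hours per week, projects completed).
train_points = [(1, 0), (2, 1), (8, 4), (9, 5)]
train_labels = ["beginner", "beginner", "experienced", "experienced"]

# The model "learns" only by storing examples; prediction compares distances.
prediction_a = nearest_neighbor_predict(train_points, train_labels, (7, 3))
prediction_b = nearest_neighbor_predict(train_points, train_labels, (2, 2))
print(prediction_a)  # experienced
print(prediction_b)  # beginner
```

Nobody wrote a rule such as "more than five hours means experienced". The prediction comes entirely from the labeled examples, which is the core idea that the more sophisticated methods in these courses build on.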
Natural Language Processing
(recommended by Randy Lao)
Abbreviated as NLP, this is a subfield of computer science and information engineering that focuses on artificial intelligence. More specifically, it revolves around getting computers to understand and process human languages so that they can get closer to our level of understanding. It is a fascinating, high-level field that is definitely worth investigating if you have the interest. Some say AI is the future, so here’s your chance to get on the wave while you still can!
Core Concepts:
- Introduction to Bag of Words (CountVectorizer, TFIDF, HashVectorizer)
- Text Preprocessing (Stopword Removal, Tokenization, Stemming/Lemmatization)
- Word Vectors
- Regex Tutorial
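The first few concepts in this list fit together naturally, so here is a minimal sketch that tokenizes text with a regular expression, removes stopwords, and builds bag-of-words count vectors. It uses only the Python standard library. The stopword list and the sample sentences are assumptions made for this example, and tools such as CountVectorizer handle many more details.

```python
import re
from collections import Counter

# A tiny hand-picked stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Count each non-stopword token, ignoring word order."""
    return Counter(tok for tok in tokenize(text) if tok not in STOPWORDS)


# Hypothetical documents, invented for this example.
docs = [
    "The cat chased the mouse.",
    "The mouse is afraid of the cat and the dog.",
]

bags = [bag_of_words(doc) for doc in docs]

# A shared, sorted vocabulary gives every document a vector of the same length.
vocabulary = sorted(set().union(*bags))
vectors = [[bag[word] for word in vocabulary] for bag in bags]

print(vocabulary)  # ['afraid', 'cat', 'chased', 'dog', 'mouse']
print(vectors)     # [[0, 1, 1, 0, 1], [1, 1, 0, 1, 1]]
```

Each document becomes a row of word counts, which is the numeric form most machine learning models expect. Stemming, lemmatization, and TF-IDF weighting are left out here to keep the sketch short.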
Common NLP Libraries:
NLP Projects
- Build a Simple Chatbot from Scratch
- Web Scraping and Sentiment Analysis
- Textual Feature Importance with ELI5
- Topic Modeling — Latent Dirichlet Allocation (LDA)
Click here to learn how to solve 90% of NLP problems!
Thank you for reading! If you liked these resources, please give us a clap. Also, a special thank you to Randy Lao for recommending a lot of these resources and Sanjay Unni for editing this article. Add Randy on LinkedIn here! Add Sanjay on LinkedIn here!