Top Resources for Starting Out in Data Science

We asked the Edge Team what resources they recommend for those starting out in data science

Ren Gibbons
Edge Analytics
Mar 15, 2022

If you’re just getting started in data science, you may be overwhelmed by the many options for learning the fundamentals and best practices. From textbooks to online courses to data camps, it can be hard to figure out where to start. We asked the Edge Team what resources they recommend and curated the following list based on their responses.

An Introduction to Statistical Learning

An Introduction to Statistical Learning takes a less technical approach to key concepts in statistical learning, including fundamental methods like regression, decision trees, clustering, and deep learning. Two of its authors also co-wrote the classic Elements of Statistical Learning, and the book grew out of a need to make the contents of Elements accessible to a broader audience. The PDF is free online, and the associated lecture videos are also highly recommended.

Learning From Data: A Short Course

Learning From Data: A Short Course covers the fundamentals of machine learning, balancing the theoretical and the practical, the mathematical and the heuristic. The book includes in-depth discussions of linear models, overfitting to stochastic and deterministic noise, and regularization.

Numerical Python

Numerical Python provides methods and case studies on how to numerically compute solutions and mathematically model applications. This text features case studies on computing techniques including array-based and symbolic computing, visualization, numerical file I/O, equation solving, optimization, interpolation and integration, and domain-specific computational problems.

Algorithms for Communications Systems and their Applications

Algorithms for Communications Systems and their Applications is a practical guide to applying algorithms in communications systems. Written for researchers and professionals in the areas of digital communications, signal processing, and computer engineering, this text presents algorithmic and computational procedures within communications systems that overcome a wide range of problems facing system designers.

The Data Science Primer

The Data Science Primer GitHub repository is a curated set of online resources on the following topics: programming, linear algebra, statistics, probability, SQL, and machine learning. The stated goal of the repository is “to provide an on-ramp to becoming a data scientist no matter someone’s background”.

DeepLearning.AI courses

Featuring courses from leading practitioners in machine learning, DeepLearning.AI provides excellent hands-on coursework. From one of the Edge team members: “DeepLearning.AI courses on Coursera are well worth the subscription cost, and the Jupyter-based assignments help you build strong intuitions as you build real deep learning models.”

CS231N: Convolutional Neural Networks for Visual Recognition

This is Fei-Fei Li’s famous Stanford class on convolutional neural networks and computer vision. The university open-sourced the lectures and class notes. The class notes are especially good at explaining machine learning in an intuitive and technically sound way. Some consider them a must-read for anybody wanting to get into ML.

Keras Examples

Keras Examples features short, focused demonstrations of vertical deep learning workflows in Python. Computer vision, natural language processing, generative deep learning, reinforcement learning, and graph data examples are all included. For machine learning specifically, we find that Keras Examples is a fantastic resource.

Kaggle

Kaggle provides code and data to learn data science and is famous for its hosted competitions. From their website: “Use over 50,000 public datasets and 400,000 public notebooks to conquer any analysis in no time.” From one of the Edge team members: “Kaggle is a great resource for ML tools. I’ve never taken part [in competitions], but I have looked through a lot of solutions.”

StatQuest with Josh Starmer

StatQuest with Josh Starmer is a YouTube channel with videos on countless concepts in statistics and machine learning. The information is presented in an approachable, ground-up way. It is a great resource to brush up on topics in ten minutes or less, and the stats-centric guitar jingles add to the fun.

Stack Overflow

Would any list of data science resources be complete without Stack Overflow?

Honorable Mentions

Twitter

In addition to following Edge (@_EdgeAnalytics), there are countless practitioners and academics sharing valuable insights on Twitter. Specifically, we recommend following @svpino; one Edge team member describes his Twitter as “always having thought provoking ideas and 60 second brain teasers”.

Hands-on experience to build intuition

We like working on actual projects! Classes, videos, and books will only get you so far. Intuition is really important in data science, and you can only build it by working through challenging problems in code.

The Edge Team

From one of the Edge team members: “I’ve learned a lot from talking to other members of the Edge team to understand how they operate and what kinds of tools they use.” Feel free to reach out with your data science questions!

What resources would you recommend for getting started in data science? Leave your picks in the comments!

Edge Analytics is a consulting company that specializes in data science, machine learning, and algorithm development both on the edge and in the cloud. We partner with our clients, who range from Fortune 500 companies to innovative startups, to turn their ideas into reality. Have a hard problem in mind? Get in touch at info@edgeanalytics.io.
