Journeying into Data Science

Ani Madurkar
May 14 · 5 min read
Sunset in Iceland. Image by author

A low-cost & high-quality guide to begin your own journey.

First, what is Data Science? Wikipedia defines it as:

“Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains.”

Several job roles and subfields are forming in this domain as it evolves. I imagine that, over time, saying you ‘do data science’ will be equivalent to saying you ‘do science’ or ‘do programming’: it will take more nuance and specification to mean anything precise. For now, I think it is imperative that we leave the definition a little broad and provide as many on-ramps into this world as possible. The Information Age has made learning from data, and making it as useful as possible in the space you operate in, the best skill you can adopt.

And the best part? You can adopt it for free. Or quite close.

Here’s how I recommend going about your learning journey:

Find a Domain of Interest -> Programming -> Application in Domain -> Math/Statistics -> Application in Domain -> Iterate with Feedback

Programming

Courses (not all free, but lowest-cost high-quality options I’ve found):

  1. Introduction to Python Programming — Udacity: Free
  2. Python 3 Specialization — University of Michigan on Coursera (5 month-long courses): $50/month, ~$250 total, free to audit
  3. Computational Thinking using Python XSeries Program — MITx on edX (est. 5 months long): $150 total, $135 current discounted price
  4. Applied Data Science with Python — University of Michigan on Coursera (5 month-long courses): $50/month, ~$250 total, free to audit
  5. Intro to SQL: Querying and managing data — Khan Academy: Free

Books:

  1. Automate the Boring Stuff with Python by Al Sweigart
  2. Effective Python: 90 Specific Ways to Write Better Python, 2nd Edition by Brett Slatkin
  3. Fluent Python, 2nd Edition by Luciano Ramalho
  4. Python for Data Analysis, 2nd Edition by Wes McKinney
  5. Learning SQL, 3rd Edition by Alan Beaulieu

Mathematics/Statistics

Courses (not all free, but lowest-cost high-quality options I’ve found):

  1. Statistics with Python Specialization — University of Michigan on Coursera (5 month-long courses): $50/month, ~$250 total, free to audit
  2. Fundamentals of Statistics — MITx on edX: Free
  3. Introduction to Statistics: Probability — Berkeley on edX: Free
  4. Essence of linear algebra — 3Blue1Brown on YouTube: Free
  5. Mathematics for Machine Learning: Linear Algebra — Imperial College London on Coursera (1-month-long course): $50/month, free to audit

Books:

  1. Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong
  2. Practical Statistics for Data Scientists, 2nd Edition by Peter Bruce, Andrew Bruce, and Peter Gedeck
  3. A Common-Sense Guide to Data Structures and Algorithms by Jay Wengrow
  4. Bayesian Analysis with Python, 2nd Edition by Osvaldo Martin
  5. Math for Programmers by Paul Orland

Conclusion

If you just want to take a deep dive into the more advanced concepts of Machine Learning, here are some excellent courses to start with:

  1. Machine Learning — Stanford on YouTube: Free
  2. Probabilistic Machine Learning — Philipp Hennig on YouTube: Free
  3. Deep Learning Specialization — DeepLearning.AI on Coursera (5 month-long courses): $50/month, ~$250 total, free to audit
  4. Causal Inference — Columbia University on Coursera (6-week-long course): $50/month, free to audit
  5. PyTorch Basics for Machine Learning — IBM on edX: Free

One thing to remember as you embark on your journey into this field is that ‘beginner’ probably isn’t what you think it is. The low bar for excellence can still be quite high, but as long as you keep iteratively raising that bar of ‘making data useful’ in your domain, you’re on the right track. This article doesn’t cover getting a graduate degree in this field (Computer Science, Statistics, Mathematics, Data Science, etc.), but as a graduate of the Master of Applied Data Science program at the University of Michigan, I can personally vouch for the structure and speed that option can provide for your journey if you do it right. It’s not necessary, but it has given me what I was looking for: a strong grasp of foundations, an advanced set of data skills to use in projects I’m interested in, and an inspiring, supportive community to be a part of. However you go about it, if you consistently focus on getting and reinforcing those three things, you’ll advance your data journey significantly.

CodeX

Everything connected with Tech & Code
