Inspiring Resources for Data Science

Aakash Tandel
Published in Inspire + Advance
Sep 5, 2017

I suffered from information paralysis when I started learning about data science. There are a hundred and one different websites, books, and videos to choose from. I didn’t know if DataCamp was better than DataQuest.io or if I should just start reading one of the many O’Reilly books. I spent a fair amount of time on both of those sites and have read a good part of a few O’Reilly books like Think Bayes and Doing Data Science. I’ve read An Introduction to Statistical Learning and plan on reading The Elements of Statistical Learning in a few months. But all of those resources are very technical. As a new data scientist, you will often be encouraged to learn some of the many technical aspects of the field. I am always looking at the big picture. How does this new technique fit into solving problems with data science? I think this question is sometimes lost when you spend your time in the recesses of books and journal articles. I have a few places I go when I feel bogged down in the weeds of data science.

The first place I like to go is YouTube. Past the entertaining DriveTribe videos and fail compilations, there exists a great channel called PyData. PyData “provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other.” The YouTube channel offers a multitude of lectures spanning the full spectrum of data science topics. I will often find myself listening to a lecture and following along with the lecturer’s Jupyter Notebook on my local machine. This is a great way to learn from folks using data science in the field. I recently listened to a great lecture by Patrick Harrison on modern natural language processing techniques. I would never have been able to learn from Harrison if it weren’t for PyData.

Books have always offered me a certain level of inspiration. Nate Silver’s The Signal and the Noise, Seth Stephens-Davidowitz’s Everybody Lies, and Christian Rudder’s Dataclysm are all great examples of books that provide context for data science. Each of these veterans of data science offers a unique perspective and subject matter expertise on a wide range of topics. All three books leave out the technical details, which may be unfortunate for the data scientist. Nonetheless, each book provides the reader with a new use case for data science and can offer you context for your newfound data science skills.

I believe in having both a macro and a micro view of data science. Your current data science project is a micro problem. How it fits into the world is a macro problem. At its core, data science is about solving problems and acting like a detective with data. If you ever find yourself bogged down hunting for the regularization tuning parameter in scikit-learn (the lambda of the textbooks, which is called alpha in estimators such as Ridge and Lasso), remember that data science problems need context. You are solving interesting real-world problems. Always remember that.
