Reading is imperative. So is Data Science.

Akhil Gupta
Data Science Group, IITR
Jan 9, 2017
Try and buy them for your shelf!

There are two types of people in this world. Those who like to read and those who don’t. Alas, the world doesn’t entirely work in binary.

“The journey of a lifetime starts with the turning of a page.”

This post is for people who prefer the old-school method of reading books over short-and-crisp MOOCs.

It is important to get a feel for what you are about to dive into. These books will help motivate you for this field. If you have some time on your hands, do go through them; otherwise, jump to the real deal!

  1. Moneyball by Michael Lewis (4.2): This book talks in detail about how statistical analysis revolutionised the sport of baseball and demonstrated newer ways of evaluating talent. It gets a bit dense in between, but you can skip those parts and read about how players were affected.
  2. The Signal and the Noise by Nate Silver (4): This guy correctly predicted the winner in all 50 states in the 2012 US Presidential election. Yes, he is the master of predictions. In this book, he explores different areas where predictions can be successful. You can follow his blog, FiveThirtyEight.
  3. Freakonomics by Steven Levitt and Stephen Dubner (3.9): The title is a bit misleading. It’s not really about economics, but about the interesting things you can discover when you apply statistical analysis to problems where you wouldn’t normally think of using it. It’s a must-read for budding entrepreneurs.

Statistics for ML

  1. An Introduction to Statistical Learning by Gareth James et al. (4.6): The bible of statistics in machine learning. This book is a definite read for everyone. It explains the concepts of statistical learning from the very beginning, with plenty of R code. The advanced version of this is The Elements of Statistical Learning (ESL). Finish ISLR before ESL.
  2. Think Stats by Allen Downey (3.6): A very accessible introduction to computational statistics. It covers data analysis from beginning to end in Python. Learn Python before starting this book.
  3. CT-3 Probability and Mathematical Statistics: This material is used by students preparing for actuarial exams. TBH, this is one of the best and most straightforward resources for probability and statistics. It will help you get your concepts clear.

Data Analytics (Business Intelligence)

  1. R Cookbook by Paul Teetor (4): Helps you perform data analysis with R quickly and efficiently. R gives you all the statistical power, but its structure can be difficult to master. The book facilitates learning by doing and is designed for clear-cut problem solving in R. However, pick up the basic concepts of R before starting it.
  2. Python for Data Analysis by Wes McKinney (4): Concerned with the nuts and bolts of manipulating, processing, cleaning and crunching data in Python. Wes is the principal author of the pandas library, so this is heavily focussed on pandas and NumPy. Good for beginners in this field.

Machine Learning

  1. Python Machine Learning by Sebastian Raschka (4.3): This book teaches the fundamentals of ML and how to utilise them in real-world applications using Python. It doesn’t treat ML as a black box, and contains the math and equations. It uses scikit-learn, the most beautiful and practical machine learning library.
  2. Deep Learning Book by Ian Goodfellow, Yoshua Bengio and Aaron Courville (4.4): The most rigorous and up-to-date reference on deep learning algorithms. It’s divided into 3 parts, namely:
    -> Basic Maths and ML Concepts
    -> Most established DL Algorithms
    -> Ideas for future research in DL

Note that the above-mentioned books are my personal favourites, and have also been tested by professionals. You can safely follow them for preliminary studies in this field. And always remember that people who say they don’t have time to read simply don’t want to.

Open to suggestions and comments. :)
And, ❤ if this was a good read. Enjoy!

Akhil Gupta
Data Science Group, IITR

Graduate Student at the University of Illinois. ML @ deepair. Working towards social good using AI.