27 Amazing Data Science Books Every Data Scientist Should Read

Pranav Dar
Jan 17, 2019 · 11 min read

A carefully curated list of books from around the world that deal with the different and vast branches of data science.

Every person has their own way of learning. What helped me break into data science was books. There is nothing like opening your mind to a world of knowledge condensed into a few hundred pages. There is a magic and allure to books that I have never found in any other medium of learning.

“If you only read the books that everyone else is reading, you can only think what everyone else is thinking.” — Haruki Murakami

Learning data science on your own can be a very daunting task! There are numerous ways to learn today: MOOCs, workshops, degrees, diplomas, articles, and so on. But organizing them into a structured path toward becoming a data scientist is of paramount importance.

But there are hundreds of books out there about data science. How do you choose where to start? Which books are ideal for learning a certain technique or domain? While there’s no one-size-fits-all answer to this, I have done my best to cut the list down to the 27 books we’ll see shortly.

I have divided the books into different domains to make things easier for you:

  • Books on Statistics
  • Books on Probability
  • Books on Machine Learning
  • Books on Deep Learning
  • Books on Natural Language Processing (NLP)
  • Books on Computer Vision
  • Books on Artificial Intelligence
  • Books on Tools/Languages

Bonus: an infographic covering all the books, at the end of the article.

Without any further ado, let’s dive right in.

Books on Statistics

Statistics in Plain English

Author: Timothy C. Urdan

I started my journey into the world of statistics with this beauty of a book. It’s written for absolute beginners and in a way that makes you come back for more. The writing style and explanations do justice to the title, Statistics in Plain English. You could recommend it to any non-technical person and they would get the hang of these topics; it’s that good!

Think Stats: Probability and Statistics for Programmers

Author: Allen B. Downey

You’ll find this book at the top of most data science book lists. The book comes with plenty of resources: on the book’s home page you’ll find data files, code, solutions, etc. It will be especially useful for folks who know the basics of Python, since the language is used to demonstrate real-world examples.

Introduction to Statistical Learning

An all-time classic. This book is recommended or referenced in most machine learning courses I’ve come across; it’s just that well written. It covers basic statistics as well as machine learning techniques. The awesome thing about this book is that each concept is explained with case studies in R. So once you have a handle on programming, you can always come back and try out each concept again. What better way to ingrain a concept than by practicing it multiple times?

Books on Probability

Probability: For the Enthusiastic Beginner

Ideal book for beginners. It is written for college students so all of you looking to learn probability from scratch will appreciate the way this is written. All the basics are covered — combinatorics, the rules of probability, Bayes’ theorem, expectation value, variance, probability density, common distributions, the law of large numbers, the central limit theorem, correlation, and regression.
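To give a flavor of two of the topics listed above, here is a minimal Python sketch of Bayes’ theorem and the law of large numbers. It is my own illustration rather than anything taken from the book, and the function names and the numbers (a 1% prior, a 95% true-positive rate, a 5% false-positive rate) are made up for the example.

```python
import random


def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(condition | positive test) using Bayes' theorem."""
    # Total probability of seeing a positive test at all.
    evidence = (true_positive_rate * prior
                + false_positive_rate * (1 - prior))
    return true_positive_rate * prior / evidence


def mean_of_die_rolls(n_rolls, seed=0):
    """Average of n_rolls fair six-sided die rolls (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(rng.randint(1, 6) for _ in range(n_rolls))
    return total / n_rolls


# Hypothetical numbers: a rare condition and a fairly accurate test.
posterior = bayes_posterior(prior=0.01,
                            true_positive_rate=0.95,
                            false_positive_rate=0.05)
print(f"P(condition | positive) = {posterior:.3f}")

# Law of large numbers: the sample mean drifts toward the expected value 3.5.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n))
```

The posterior comes out far lower than the test’s 95% accuracy might suggest, which is exactly the kind of intuition a beginner’s probability book helps you build.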

Introduction to Probability

Another introductory book covering basic probability concepts. Like the book above, this one is a comprehensive text written with college graduate students in mind. Why do I keep repeating that, you might be wondering. It’s because I want to emphasize that if there’s a place to start learning from scratch, it’s a book that’s written for students who haven’t ever ventured into this field before.

An Introduction to Probability Theory and its Applications

As the book’s description states, it’s a complete guide to the theory and practical applications of probability theory. I recommend reading this if you really want to deep dive into the world of probability. It’s a VERY comprehensive text and might not be to a beginner’s taste. If you’re learning probability just to get into data science, you can get away with reading either of the two probability books mentioned above.

Books on Machine Learning

The Hundred-Page Machine Learning Book

I love this book. Having read a ton of books trying to teach machine learning from various angles and perspectives, I struggled to find one that could succinctly summarize difficult topics and equations. Until Andriy Burkov managed to do it in some 100-odd pages. It is beautifully written, is easy to understand and has been endorsed by thought leaders like Peter Norvig. Need I say more? Beginner or established, every data scientist should get their hands on this book.

Machine Learning

Before all the hype came about, Tom Mitchell’s book on machine learning was the go-to text to understand the math behind various techniques and algorithms. I would suggest brushing up on your math before taking this up. But you don’t need any background in AI or statistics to understand these concepts. It was the first-ever book I read on ML! It’s modestly priced so it’s definitely worth adding to your collection.

Elements of Statistical Learning

Authors: Trevor Hastie, Robert Tibshirani and Jerome Friedman

And we’re back with another classic by Hastie and Tibshirani! It’s the natural successor to the ‘Introduction to Statistical Learning’ book we covered earlier. While there are a few overlaps with that book, this one takes a more advanced look at what we call machine learning algorithms. Topics like neural networks, matrix factorization, and spectral clustering are covered in addition to the common ML techniques.

Books on Deep Learning

Deep Learning

Authors: Ian Goodfellow, Yoshua Bengio and Aaron Courville

What a list of rockstar authors! The ‘Deep Learning’ book is widely regarded as the best resource for beginners. It’s divided into three sections: Applied Math and Machine Learning Basics, Modern Practical Deep Networks, and Deep Learning Research. It is, to date, the most cited book in the deep learning community. Keep it by your bedside, worship it and reference it often; this will be your companion whenever you start your deep learning journey.

Deep Learning with Python

A really cool way of learning deep learning (or machine learning for that matter) is by programming side-by-side with the theory. And that’s the approach Francois Chollet follows in the ‘Deep Learning with Python’ book. Concepts are taught using the popular Keras library. Francois is the creator of Keras so who better to teach you this topic? I also recommend following Francois on Twitter — there is a lot we can learn from him.

Neural Networks and Deep Learning

This is a free online book to learn about the core component that powers deep learning: neural networks. I quite like the way this book has been written. It takes a practical approach to teaching and looks at deep learning topics through the lens of a beginner. You will not learn any programming language in this book; it’s a good old-fashioned textbook on the underlying insights behind neural networks.

Books on Natural Language Processing (NLP)

Natural Language Processing with Python

Another book in this collection that sticks to the learn-by-doing approach. You’ll pick up Python concepts you otherwise wouldn’t have and will navigate the world of NLP using the NLTK library (Natural Language Toolkit). While this shouldn’t be the only resource you refer to for learning NLP (it’s far too complex a field for that), it offers a pretty decent introduction to the topic.

Foundations of Statistical Natural Language Processing

Published almost two decades ago, this text still serves as an excellent introduction to natural language processing. It’s a very comprehensive guide to the broader sub-topics in NLP, like Text Categorization, Part-of-Speech Tagging, and Probabilistic Parsing, among various other things. The authors have provided rigorous coverage of the mathematical and linguistic foundations. Again, the book is quite detailed so keep that in mind.

Speech and Language Processing

The emphasis of this book is on practical applications and scientific evaluation in the scope of natural language and speech. I included this book to expand our horizons beyond text and look at speech recognition as well. And why not? It’s an area of research that is thriving nowadays with a plethora of applications coming out every day. Jurafsky and Martin have written an in-depth book on NLP and computational linguistics. This one is from the masters themselves.

Books on Computer Vision

Computer Vision: Algorithms and Applications

Explore a variety of common computer vision techniques in this book, especially ones used for analyzing and interpreting images. While this was published almost 9 years ago, the examples and methodology illustrated by Richard Szeliski are applicable today as well. It’s a comprehensive text that takes a scientific approach to solving basic vision challenges. The website I have linked to above contains a free PDF copy of the book.

Programming Computer Vision with Python

Before you dive into this awesome book, go to the website I’ve linked above, download the datasets and the code notebooks, and clone the GitHub repository mentioned there. They are excellent companions in this REALLY hands-on introduction to the world of computer vision. As the author states, “You’ll learn techniques for object recognition, 3D reconstruction, stereo imaging, augmented reality, and other computer vision applications as you follow clear examples written in Python.”

Computer Vision: Models, Learning, and Inference

The book starts off from scratch by introducing us to the concepts of probability and quickly picks up pace from there. While some of the frameworks introduced here have seen more advanced versions come out, this book is nonetheless relevant in the current context. More than 70 algorithms have been introduced and the text is beautifully complemented by over 350 illustrations. The website also contains PowerPoint slides, if that’s the kind of learning you prefer.

Books on Artificial Intelligence

Artificial Intelligence: A Modern Approach

A book written by Stuart Russell and Peter Norvig? I am sold. It is the leading book in Artificial Intelligence. More than 1300 universities in over 100 countries reference/cite this book in their curriculum. Given who the authors are, it isn’t surprising to see the book length — 1100 pages. Covering the length and breadth of AI components — speech recognition, autonomous vehicles, machine translation, and computer vision among other things, this can be considered the Bible of AI.

Artificial Intelligence for Humans

What are the foundational algorithms underneath artificial intelligence? This book packs a lot of technical know-how into just 222 pages. This is volume 1 of a series of books on the techniques behind AI (dimensionality, distance metrics, clustering, error calculation, hill climbing, Nelder-Mead, and linear regression). There is an accompanying site as well, which contains the examples cited in the book, and a GitHub repository containing the code.
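If hill climbing is new to you, here is a rough sketch of the idea in Python. To be clear, this is my own toy version (a simple stochastic variant), not code from the book or its repository; the function names, the step size, and the toy score function are all assumptions made for illustration.

```python
import random


def hill_climb(score, start, step=0.1, iterations=2000, seed=0):
    """Stochastic hill climbing in one dimension.

    Repeatedly proposes a small random move and keeps it only
    if it improves the score. All parameters are illustrative.
    """
    rng = random.Random(seed)
    best = start
    best_score = score(best)
    for _ in range(iterations):
        # Propose a nearby candidate solution.
        candidate = best + rng.uniform(-step, step)
        candidate_score = score(candidate)
        # Accept the move only if it climbs uphill.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(x):
    """Hypothetical objective with a single peak at x = 3."""
    return -(x - 3.0) ** 2


best_x, best_value = hill_climb(toy_score, start=0.0)
print(best_x, best_value)
```

On a function with a single peak like this one, the search settles close to the maximum. On bumpier functions it can get stuck on a local peak, which is one reason methods like Nelder-Mead come up alongside it.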

The Master Algorithm

If you’re looking for a technical book on AI, this isn’t it. What it is, however, is a masterful text on how machine learning is remaking business, politics, science and war. It is a thoughtful and thought-provoking book on where AI is right now, and where it might end up taking the human race. Will we ever find a single algorithm (or ‘The Master Algorithm’) that is capable of deriving all knowledge from data? Join Pedro Domingos in his quest to find out.

Books on Python

Fluent Python: Clear, Concise, and Effective Programming

There are way too many resources out there to learn Python but nothing teaches you programming like a good old-fashioned book. As you might expect from a coding book, it’s a hands-on guide to help you understand how Python works and how to write awesome and effective Python code. Luciano Ramalho also covers a few popular libraries you’ll find yourself regularly using in data science projects. With a length of 794 pages, this book is worth the spend.

Programming Python: Powerful Object-Oriented Programming

Wait, another Python book?! If you thought the above book taught you everything you need to know about Python, think again. This is a vast programming language with a lot more left to cover. Once you’ve mastered the fundamentals from the above book by Luciano Ramalho, take a gander at this one by Mark Lutz. There are in-depth tutorials on a wide variety of topics: databases, networking, text processing, GUIs, etc. Tons and tons of examples are included. A must-read for programming geeks.

Mastering Python for Data Science

The two books we have covered so far for learning Python looked at the language from a programming perspective. Now it’s time to learn it from the data science angle. Which data science libraries are commonly used and how? How can you create data visualizations and mine for patterns in Python? And how can you code advanced data science/machine learning techniques to build models? These questions and more are answered by Samir Madhavan in this excellent write-up.

Books on R

R for Data Science

Authors: Garrett Grolemund and Hadley Wickham

Anyone who has even remotely heard of R programming will have come across Hadley Wickham’s work. His work in this language is unparalleled; I could go on and on about him. I can’t recommend this book highly enough. You’ll learn how to import different kinds of data into R, the different data structures, and how to transform, visualize and model your data. The perfect book to learn data science through coding in R.

R for Everyone

I learned R way before I even heard about Python. I have a special place for it in my heart and Jared Lander’s R for Everyone played a big part in that. I got this book through one of my acquaintances and was immediately taken by how well it was written. It claims to be for ‘everyone’ and lives up to its name. This is a great book if you’re from a non-technical and non-statistical background.

R Cookbook

The R Cookbook is an excellent addition to your budding data science reading list. It contains more than 200 practical recipes to help you get started with analyzing and manipulating data in R. Each recipe looks at a different problem. It’s meant for beginners, intermediate users and advanced practitioners alike. Whether it’s learning new programming skills or brushing up your concepts, this cookbook is for everyone.

And as promised, here is the full infographic covering all the books we saw in this article:

[Infographic: the books covered in this article]

Originally published at www.analyticsvidhya.com on January 17, 2019.

Pranav Dar

Written by

Machine Learning and football writer. An odd combination? Time will tell. Follow my football writings on thelastlibero.net.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com