10 Free Data Science Books You Must Read in 2019

Rebecca Vickery
Dec 29, 2018 · 6 min read
Photo by Susan Yin on Unsplash

There are many great free resources for learning data science and analysis. Over the last year I have read quite a few data science books, and I wanted to share some of the best here. If you are studying or practicing data science and haven’t read these books, I really think they are worth adding to your reading list for 2019. Below are the 10 I have found most useful over the last few years, all of which are currently available online.

Automate the boring stuff

I really love this book. It is a simple, practical introduction to getting started with Python. Although it is not specifically a data science book, it covers most of the basic concepts you need to use Python for data science, including flow control, functions, web scraping, working with CSV and JSON files, and running programs. It is very much aimed at absolute beginners, so it is a great book for getting started with Python. As well as step-by-step instructions for each technique, each chapter ends with practice questions and problems.
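
To give a flavour of the kind of task involved, here is a minimal sketch of reading CSV data and writing it out as JSON using only the Python standard library. It is not taken from the book: the function name `csv_text_to_json` and the sample data are made up for illustration.

```python
import csv
import io
import json


def csv_text_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    # DictReader uses the first row as the keys for every following row.
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [dict(row) for row in reader]
    return json.dumps(rows, indent=2)


# Made-up sample data, purely for illustration.
sample = "name,score\nalice,3\nbob,5\n"

if __name__ == "__main__":
    print(csv_text_to_json(sample))
```

Note that the `csv` module reads every value as a string, so the scores above come back as `"3"` and `"5"` unless you convert them yourself.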

Data science at the command line

I started using Python for data analysis purely in Jupyter Notebooks. However, over time I found that using the command line made me much more efficient in my work. For example, I can very quickly obtain data, run programs and search through files, all by typing commands and pressing enter in the terminal window. This book is a highly accessible and comprehensive guide to data science at the command line. Its chapters cover, alongside working examples, how to obtain, clean, explore, model and interpret data via the command line.

Think stats

This is a really practical overview of statistics for data science. The book uses a data set from the National Institutes of Health throughout to explain the core concepts in probability and statistics needed for data science and analysis. It includes lots of example Python code and simple programs to explain the concepts. It is much more lightweight than many of the more theoretical textbooks on this subject, and I found that it really suited my learning style.

Python data science handbook

This is a really comprehensive guide to Python for data science, building from beginner to advanced concepts. There is a chapter on IPython, which made a real difference to my efficiency as a data science practitioner. The book also covers NumPy, data manipulation with pandas, visualisation methods, and Machine Learning. The Machine Learning chapter in particular is really good, and covers both the practical use of the various libraries and the nuts and bolts of how they work.

R for data science

I mainly work in Python, but I still find it really useful to have at least a working knowledge of R. I have often found that if a good library for a particular method is not available in Python, R usually has one. This book is a really comprehensive guide to doing data science with R, and covers everything from data visualisation and transformation, to the R workflow, to data modelling.

Probabilistic Programming and Bayesian methods for hackers

In the author’s own words, this book is an attempt to “bridge the gap between Bayesian mathematics and probabilistic programming”, and I believe it does this very well. As with Think Stats, it moves away from heavily theoretical textbooks and offers practical use cases for Bayesian inference. The approach is a computational understanding first, and a mathematical understanding second. It is another Python-based book with lots of practical examples, and it predominantly uses the PyMC library.
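
To illustrate what a "computational first" approach to Bayesian inference can look like, here is a minimal sketch that estimates the bias of a coin by grid approximation. It is not taken from the book and does not use PyMC: it is plain Python, it assumes a uniform prior, and the function names and the example counts (7 heads, 3 tails) are made up for illustration.

```python
def grid_posterior(heads, tails, grid_size=101):
    """Posterior over a coin's probability of heads, evaluated on a grid.

    Assumes a uniform prior, so the posterior is proportional to the
    likelihood p**heads * (1 - p)**tails at each grid point.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    unnormalised = [p ** heads * (1 - p) ** tails for p in grid]
    total = sum(unnormalised)
    # Normalise so the weights over the grid sum to 1.
    posterior = [u / total for u in unnormalised]
    return grid, posterior


def posterior_mean(grid, posterior):
    """Weighted average of the grid points under the posterior."""
    return sum(p * w for p, w in zip(grid, posterior))


if __name__ == "__main__":
    # Made-up observations: 7 heads and 3 tails.
    grid, posterior = grid_posterior(heads=7, tails=3)
    most_likely = grid[posterior.index(max(posterior))]
    print("posterior mean:", round(posterior_mean(grid, posterior), 3))
    print("most likely value:", most_likely)
```

No calculus is needed here: the posterior is simply computed point by point and normalised. Libraries such as PyMC exist because this brute-force grid stops being practical once a model has more than a few parameters.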

Machine learning yearning

This book has been released in draft by Andrew Ng this year. It is designed to teach data scientists how to structure Machine Learning projects, and set direction for a data science team. It is a good overview of when and how to use Machine Learning, and how to handle the complexities involved in implementing AI in the real world.

Ethics and data science

There has been a lot in the news this year relating to bias in machine learning applications, and data protection and privacy concerns. I read this book as I wanted to ensure that I had the required knowledge to practice good data science. This book covers how to put ethical principles into data science projects. It includes a really good checklist to go through when designing a project as well as lots of suggestions for building ethics into a general data culture. Another resource released this year along very similar lines was the deon command line tool from drivendata.org. This tool allows you to build an ethics checklist into data science projects. This is definitely something I will be incorporating into my work in the new year.

Deep learning

This is an excellent book that is now available to read for free online. It covers applied maths for Machine Learning, with a large emphasis on deep learning in particular. It covers the mathematics behind key concepts in deep learning such as convolutional networks, regularisation, and recurrent and recursive nets. It is very much a theory-based book, but it gives a deep understanding of the subject. It also includes chapters on the practical implementation of these techniques.

Rules for machine learning

This is really an ebook/paper and only about 24 pages long. However, I have to include it here as it is such a great resource, and I found it by chance on Twitter this year. It covers some best practices from Google for implementing a machine learning project. It emphasises the importance of data engineering, to create great features and a solid data pipeline, over machine learning expertise.

These books have been really useful to me over the last couple of years. I am always amazed at the quantity and quality of free resources available online. I am sure that I will continue to refer back to these into 2019 and beyond, and hopefully find some more brilliant resources to share. Happy New Year!

vickdata

Sharing my journey in Data Science

Written by Rebecca Vickery

Data Scientist, Holiday Extras | https://twitter.com/vickdata | https://www.linkedin.com/in/rebecca-vickery-20b94133/
