My Recommendations to Learn Machine Learning in Production

Here is a compilation of books, courses, and repositories to get you started.

elvis
DAIR.AI
6 min read · Oct 19, 2020


For the last couple of months, I have been researching machine learning (ML) in production. I have shared a few resources on the topic on Twitter, ranging from courses to books.

For ML in production, I have found some of the best content in books, repositories, and a few courses. Here are my recommendations for learning the subject.

This is not an exhaustive list, but I have carefully curated it based on my research, experience, and observations.

📘 Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition

by Aurélien Géron

This is one of the most popular machine learning books, and with good reason. If you are just getting started with machine learning, I suggest you go through this book and explore the examples. The book doesn’t focus heavily on how to deploy ML models (although there is a nice chapter about it towards the end), but it provides a solid foundation in machine learning and deep learning concepts, including decision trees, SVMs, CNNs, and much more.

🌐 Dive into Deep Learning

by Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola

Before jumping into the topic of getting ML models into production, I strongly believe it is important to understand the fundamentals of machine learning and deep learning. While this particular book doesn’t focus directly on deploying models, it offers a good foundation in the theoretical aspects of deep learning, with code examples that help students learn how to train deep learning models more effectively. The best part is that the book is free and contains examples in PyTorch, TensorFlow, and MXNet.

Source: https://d2l.ai/

📘 Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD

by Jeremy Howard and Sylvain Gugger

This book probably doesn’t need much introduction. If you are a student of fast.ai, you are probably familiar with the authors and the tools used in this book. For people getting started with deep learning, this is another great resource that focuses on practical tips for training effective deep learning models and turning those models into usable web applications. The book is accompanied by a fast.ai course.

👩‍🏫 Full Stack Deep Learning

by Pieter Abbeel, Sergey Karayev, and Josh Tobin

At its core, this public online course teaches the best practices, technologies, and techniques used to implement and deploy deep learning models in production. Topics range from infrastructure and tooling to testing and deployment. I particularly like the lectures on data management and machine learning teams, both of which are important aspects of creating a world-class machine learning infrastructure for deploying models.

Source: https://course.fullstackdeeplearning.com/

📘 Designing Data-Intensive Applications

by Martin Kleppmann

An important aspect of shipping machine learning models to production is building effective data pipelines. Given the huge volume of data that companies deal with today, it is key to design data-intensive applications with issues such as scalability and consistency in mind. This book discusses the pros and cons of different solutions for storing and processing data.

📘 Building Machine Learning Pipelines

by Hannes Hapke and Catherine Nelson

This book is an excellent guide to automating a machine learning pipeline. It focuses on helping data scientists deploy models more easily and helping managers run and accelerate machine learning projects more efficiently. Note that the book is focused on TensorFlow.

📘 Building Machine Learning Powered Applications

by Emmanuel Ameisen

This is another great book on building machine learning-powered applications. It emphasizes starting with very basic models, following a simple framework, and then working up to shipping more sophisticated applications. I really enjoyed the flow and how approachable the content is.

📘 Introducing MLOps: How to Scale Machine Learning in the Enterprise

by Clément Stenac, Léo Dreyfus-Schmidt, Kenji Lefèvre, Nicolas Omont, and Mark Treveil

While still in early release at the time of writing this article, this is a fantastic book to start your journey into operationalizing machine learning models. It contains best practices, tips, and a framework for building ML pipelines and workflows. Topics range from productionalization and deployment to monitoring ML systems. An interesting subject in the book is a discussion of the different roles involved in the MLOps process. I really like that the book also discusses Responsible AI. Models deployed in production as a service can influence the behavior of users or discriminate against them, so you want to avoid any harm that could arise from these systems.

🐙 Awesome MLOps

by Larysa Visengeriyeva

This is a great repository of resources on operationalizing machine learning, including books, talks, papers, and newsletters. Larysa maintains it regularly, and it’s one of my go-to places for finding new and interesting educational resources related to MLOps.

Source: https://github.com/visenger/awesome-mlops

🐙 Awesome production machine learning

This repository lists tools for all stages of a model’s lifecycle (deployment, monitoring, versioning, scaling, and security).

Source: https://github.com/EthicalML/awesome-production-machine-learning

📘 Kubeflow for Machine Learning

by Trevor Grant, Holden Karau, Boris Lublinsky, Richard Liu, and Ilan Filonenko

This book focuses strictly on using Kubeflow to take projects from research to production. It provides guidance on getting your models into production while emphasizing portability, scalability, and reliability. The premise is to rely on cloud-native tools to drive the model’s lifecycle.

That ends my recommendations for learning about machine learning in production, at least for now. Feel free to share other resources with me on Twitter or in the comments section. I am always looking to improve these recommendations.

If you want to follow my updates on the topics of machine learning engineering and research, please connect with me on Twitter.
