Exceptional Resources for Data Science Interview Preparation. Part 3: Specialized Machine Learning

17 min readJun 23, 2024

Hi! My name is Artem. I work as a Data Scientist at MegaFon (a platform for secure data monetization, OneFactor). We build credit scoring, lead generation and anti-fraud models using telecom data, and we also do geoanalytics.

In the previous article, I shared resources for preparing for the stage on classical machine learning.

Let’s remember what sections the interview process for the position of Data Scientist consists of:

Live Coding
Classic Machine Learning
Specialized Machine/Deep Learning
Machine Learning System Design (middle+/senior)
Behavioral Interview (middle+/senior)

A typical interview process through the eyes of DALL-E 3

In this article, we will look at resources that can be used to prepare for the section on specialized machine learning.

Remarks

Most of the resources in this article are free, but there are a few paid ones. I recommend buying them only if you clearly understand that you do not want or cannot spend your personal time searching for information on your own.

I have highlighted my favorite materials ⭐.

Resources
- Deep learning
- Natural Language Processing
- Computer Vision
- Graph Neural Networks
- Reinforcement Learning
- Recommender Systems
- Time Series
- Big Data
Let’s sum it up
What’s next?

Resources

In addition to the basics of machine learning (we reviewed useful materials in the previous article), many tasks also require specific knowledge, which you will probably be tested during one of the interview stages.

Let’s list the most popular ML trends at the moment:

Natural Language Processing (NLP)
Computer Vision (CV)
Speech Recognition (Automated Speech Recognition (ASR) / speech-to-text (STT))
Speech Synthesis / text-to-speech (TTS)
Reinforcement Learning (RL)
Time series (TS)
Recommender System (RecSys)

In all these areas, Deep Learning (DL) methods are currently being actively used, although initially, many of them used classical ML approaches over hand-crafted features. Therefore, we will first look at materials for studying the theory of deep learning, and then dive into more applied areas.

Deep Learning

Neural Network approaches now dominate in solving problems that require analysis of unstructured data (text, speech, video, image). Below we will consider materials that will help in studying the theory and practice of deep learning.

Books

Deep Learning with Python by François Chollet

My first edition (and you’d better choose the second one)

An excellent book that serves as the perfect introduction to deep learning. By the way, I have the first edition of this book in printed format and it came in handy more than once when I was just starting to study DL.

⭐ Dive into Deep Learning

Interactive book on deep learning with code (in PyTorch, NumPy/MXNet, JAX and TensorFlow) and mathematics. Each section in the book is a separate Jupyter notebook, the code in which can be run and changed at your discretion. Used in 500 universities in 70 countries.

Understanding Deep Learning by Simon J.D. Prince

This book is about the ideas behind deep learning. The first part of the book introduces deep learning models and discusses how to train them, measure their performance, and improve it. The next part looks at architectures for working with images, text, and graphs. These chapters require only introductory knowledge of linear algebra, calculus and probability theory and should be accessible to any second-year technical university student. Subsequent parts of the book are devoted to generative models and reinforcement learning. These chapters require advanced knowledge of probability and calculus and are intended for more advanced students.

The Little Book of Deep Learning by François Fleuret

Schematic representation of a neural network

Many important topics are briefly (but meaningfully) discussed here: backpropagation, dropout, normalization, activation functions, attention layers, etc. At the same time, the book is suitable even for beginners: at the beginning, the author explains the basic concepts and even talks about GPUs and tensors. It is also worth noting that this version is adapted for reading on a phone.

⭐ What are embeddings by Vicki Boykis

This book is dedicated to embeddings:

Explains what embeddings are (using the example of a recommendation system)
Classic and modern approaches to creating embeddings are considered
The use of embeddings in production is discussed (how to receive, store, evaluate, etc.)

Courses

Deep Learning Specialization paid

My Certificate of Completion for Deep Learning Specialization

Good deep learning specialization on Coursera. It is suitable as an introductory course, but at the time of its completion (2019), practice was not the strong point of this specialization (you can read the reviews).

I would recommend watching the lectures and not paying for them.

MIT 6.S191. Introduction to Deep Learning

MIT’s introductory program in deep learning techniques with applications to natural language processing, computer vision, biology, and more! Here you’ll learn about deep learning algorithms, gain hands-on experience building neural networks, and master the latest topics, including large language models and generative artificial intelligence.

⭐ Practical Deep Learning for Coders от fast.ai

Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD

A free course designed for people who want to learn how to apply deep learning and machine learning to solve practical problems:

Training DL models for computer vision, natural language processing, solving tabular problems and collaborative filtering problems
Deploying Models
Using PyTorch as well as popular libraries such as fastai and Hugging Face

The course is perfectly complemented by the free book Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD.

⭐ Learn PyTorch for Deep Learning: Zero to Mastery

This course will teach you the basics of machine learning and deep learning using PyTorch:

Fundamentals
Neural Network Classification
Computer Vision
Custom Datasets
Going Modular
Transfer Learning
Experiment Tracking
Paper Replicating
Model Deployment

Full Stack Deep Learning

The course is suitable for people who already know the basics of DL and would like to better understand how to apply DL in real-world problems:

Statement of the problem and estimation of the project cost
Search, cleaning, processing, tagging, synthesis and augmentation of data
Selecting the right platform and computing infrastructure
Troubleshoot training and ensure reproducibility
Deploying the model to production
Monitor and continuously improve the deployed model
How ML teams work and how to manage ML projects
Using Large Language Models (LLM) and Other Foundation Models

Efficient Deep Learning Systems

This course, like the previous one, is more aimed at the production and optimization of DL models:

Introduction
Experiment tracking, model and data versioning, testing DL code in Python
Training optimizations, profiling DL code
Basics of distributed ML
Data-parallel training and All-Reduce
Training large models
Python web application deployment
LLM inference optimizations and software
Efficient model inference

⭐ Neural Networks: Zero to Hero by Andrej Karpathy

Andrej Karpathy reflects on what backpropagation is

Andrey Karpati’s course on building neural networks from scratch — from the basics of backpropagation to modern deep neural networks such as GPT.

According to the author, language models are a great place to learn deep learning, even if you intend to eventually move on to other areas such as computer vision, because much of what you learn will be immediately applicable to the new one. you area.

Cheatsheets

Brothers Afshin and Sherwin Amidi have prepared guides that highlight the important points of each subject that Sherwin taught at Stanford.

These materials are suitable when you need to remember the main points in a certain topic in a short time, for example, before an interview.

Convolutional Neural Networks

Layer types, hyperparameters for filters, activation functions
Object Detection, Face Verification and Recognition
Neural Style Transfer; architectures using computational tricks

Recurrent Neural Networks

Vanishing/exploding gradient, GRU, LSTM, varieties of RNNs
Word2vec, skip-gram, negative sampling, GloVe, attention model
Language model, beam search, Bleu speed

Deep Learning Tips and Tricks

Data augmentation, batch normalization, regularization
Xavier initialization, transfer learning, adaptive learning rates\
Overfitting small batch, gradient checking

Other

⭐ Deep Learning Models by Sebastian Raschka

Example of materials that the site offers

A collection of various deep learning architectures, models, and training tips (TensorFlow and PyTorch) in Jupyter Notebooks.

Deep Learning Tuning Playbook

A guide to tuning hyperparameters for deep learning models.

The Incredible PyTorch

This is a curated list of tutorials, projects, libraries, videos, articles, books and everything related to PyTorch.

Natural Language Processing

In this section, we will consider resources (mainly courses) on natural language text processing.

Resources (40+ links) entirely devoted to Large Language Models (LLM) and Prompt Engineering do not fit into the article, so you can find them in the Data Science Resources repository.

Courses

⭐ NLP Course For You

An excellent course covering both classical approaches and modern topics:

Word Embeddings
Text Classification
Language Modeling
Seq2seq and Attention
Transfer Learning
LLMs and Prompting
Transformer architecture & training tricks
Conversation systems, instruction fine-tuning & RLHF
Efficient inference of NLP models

Stanford Courses

⭐ Stanford CS224N: NLP with Deep Learning + Video + Notes
A classic NLP course from Stanford that requires no introduction:
- Word Vectors
- Word Vectors and Language Models
- Backpropagation and Neural Network Basics
- Dependency Parsing
- Recurrent Neural Networks
- Sequence to Sequence Models and Machine Translation
- Transformers
- Pretraining
- Post-training (RLHF, SFT)
- Efficient Adaptation
- Question Answering
- Security and Privacy
- Benchmarking and Evaluation
- Code Generation
- Multimodal Language Models
- Human Centered NLP
- Deployment and Efficiency
- Open Questions in NLP 2024
Stanford CS224U: Natural Language Understanding
Stanford CS 224V: Conversational Virtual Assistants with Deep Learning

⭐ Hugging Face NLP course

This course will teach you about natural language processing (NLP) using libraries from the Hugging Face ecosystem — 🤗 Transformers, 🤗 Datasets, 🤗 Tokenizers, and 🤗 Accelerate — as well as the Hugging Face Hub. It’s completely free and without ads:

Chapters 1 to 4 provide an introduction to the main concepts of the 🤗 Transformers library. By the end of this part of the course, you will be familiar with how Transformer models work and will know how to use a model from the Hugging Face Hub, fine-tune it on a dataset, and share your results on the Hub!
Chapters 5 to 8 teach the basics of 🤗 Datasets and 🤗 Tokenizers before diving into classic NLP tasks. By the end of this part, you will be able to tackle the most common NLP problems by yourself.
Chapters 9 to 12 go beyond NLP, and explore how Transformer models can be used to tackle tasks in speech processing and computer vision. Along the way, you’ll learn how to build and share demos of your models, and optimize them for production environments. By the end of this part, you will be ready to apply 🤗 Transformers to (almost) any machine learning problem!

Linguistics for Language Technology

This course will be useful for those who want not only to solve NLP problems using machine learning, but also to understand the basics of linguistic theory: words, morphology, syntax, interlingual variations, semantics (word meanings, sentence meanings), discourse and pragmatics.

Books

⭐ Speech and Language Processing by Dan Jurafsky and James H. Martin

A textbook that covers both classical and modern approaches from Daniel Jurafsky — this is an immortal classic that is constantly being updated.

As a supplement, you can also take a look at the course LSA 311: Computational Lexical Semantics from the same author.

⭐ Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra adn Thomas Wolf

Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you’re a data scientist or coder, this practical book shows you how to train and scale these large models using Hugging Face Transformers, a Python-based deep learning library.

Transformers have been used to write realistic news stories, improve Google Search queries, and even create chatbots that tell corny jokes. In this guide, authors Lewis Tunstall, Leandro von Werra, and Thomas Wolf use a hands-on approach to teach you how transformers work and how to integrate them in your applications. You’ll quickly learn a variety of tasks they can help you solve.

Build, debug, and optimize transformer models for core NLP tasks, such as text classification, named entity recognition, and question answering
Learn how transformers can be used for cross-lingual transfer learning
Apply transformers in real-world scenarios where labeled data is scarce
Make transformer models efficient for deployment using techniques such as distillation, pruning, and quantization
Train transformers from scratch and learn how to scale to multiple GPUs and distributed environments

Transformers for Natural Language Processing by Denis Rothman

This book will teach you how to train and configure deep neural network architectures (for NLP) using Python, Hugging Face, and OpenAI (GPT-3, ChatGPT, and GPT-4).

Papers

I thank Ilya Gusev for the list of articles:

Word2Vec, Mikolov et al., Efficient Estimation of Word Representations in Vector Space
FastText, Bojanowski et al., Enriching Word Vectors with Subword Information
Attention, Bahdanau et al., Neural Machine Translation by Jointly Learning to Align and Translate
Transformers, Vaswani et al., Attention Is All You Need
BERT, Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GPT-2, Radford et al., Language Models are Unsupervised Multitask Learners
GPT-3, Brown et al, Language Models are Few-Shot Learners
LaBSE, Feng et al., Language-agnostic BERT Sentence Embedding
CLIP, Radford et al., Learning Transferable Visual Models From Natural Language Supervision
RoPE, Su et al., RoFormer: Enhanced Transformer with Rotary Position Embedding
LoRA, Hu et al., LoRA: Low-Rank Adaptation of Large Language Models
InstructGPT, Ouyang et al., Training language models to follow instructions with human feedback
Scaling laws, Hoffmann et al., Training Compute-Optimal Large Language Models
FlashAttention, Dao et al., FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
NLLB, NLLB team, No Language Left Behind: Scaling Human-Centered Machine Translation
Q8, Dettmers et al., LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Self-instruct, Wang et al., Self-Instruct: Aligning Language Models with Self-Generated Instructions
Alpaca, Taori et al., Alpaca: A Strong, Replicable Instruction-Following Model
LLaMA, Touvron, et al., LLaMA: Open and Efficient Foundation Language Models

Other

⭐ 100 questions NLP

A list of 100 questions (on classic and modern NLP, as well as large language models) will help you structure your NLP learning and interview preparation process.

NLP chat

NLP chat in Telegram with 600+ participants, in which:

Ask questions (and sometimes answer)
Publish news
Announce competitions, seminars and conferences
Post vacancies

Computer Vision

In this section we will consider resources on computer vision.

Computer vision combines approaches to image processing and analysis, where the main goal is to teach a computer to perceive images like a human. Technologies based on computer vision are now actively used in industry, medicine, automotive and many other industries.

⭐ CS231n: Deep Learning for Computer Vision + Videos

A classic CV course from Stanford that requires no introduction:

Deep Learning Basics
Image Classification with Linear Classifiers
Regularization and Optimization
Neural Networks and Backpropagation
Image Classification with CNNs
CNN Architectures
Training Neural Networks
Recurrent Neural Networks
Attention and Transformers
Video Understanding
Object Detection and Image Segmentation
Visualizing and Understanding
Self-supervised Learning
Robot Learning
Generative Models
3D Vision

EECS 442: Computer Vision + Videos

An introductory course on computer vision:

Image formation / projective geometry / lighting
Practical linear algebra
Image processing / descriptors
Image warping
Linear models + optimization
Neural networks
Applications of neural networks
Motion and flow
Single-view geometry
Multi-view geometry
Applications

Graph Neural Networks

In this section, we will consider materials on graph neural networks.

Graph neural networks are a type of neural network architecture that allows learning from graph data. In the most general sense, a graph is a set of points (vertices, nodes) that are connected by a set of lines (edges, arcs). Graph neural networks are used in recommendation systems, combinatorial optimization, computer vision, physics and chemistry, drug development, and other industries.

CS224W: Machine Learning with Graphs + Video

This course focuses on computational, algorithmic, and modeling problems common to large graph analysis. By exploring the basic structure of a graph and its features, students are introduced to machine learning techniques and data mining tools that provide insight into various networks.

Graph Neural Networks

Introductory tutorial on graph neural networks.

Graph Neural Networks for RecSys

Tutorial on using graph neural networks in recommendation systems.

Reinforcement Learning

In this section we will consider materials on reinforcement learning.

Reinforcement learning is a machine learning method that trains a model that has no knowledge of the system, but has the ability to perform some actions in it. Actions transfer the system to a new state and the model receives some reward from the system.

It’s not as popular an area as NLP or CV, but it’s now gaining traction and is being used in autonomous cars, finance, medicine, industrial automation, and other areas.

Spinning Up in Deep RL

This is an educational resource produced by OpenAI that makes it easier to learn about deep reinforcement learning (deep RL).

For the unfamiliar: reinforcement learning (RL) is a machine learning approach for teaching agents how to solve tasks by trial and error. Deep RL refers to the combination of RL with deep learning.

This module contains a variety of helpful resources, including:

a short introduction to RL terminology, kinds of algorithms, and basic theory,
an essay about how to grow into an RL research role,
a curated list of important papers organized by topic,
a well-documented code repo of short, standalone implementations of key algorithms,
and a few exercises to serve as warm-ups.

🤗 Deep Reinforcement Learning Course

This course will teach you about Deep Reinforcement Learning from beginner to expert.

In this course, you will:

📖 Study Deep Reinforcement Learning in theory and practice.
🧑‍💻 Learn to use famous Deep RL libraries such as Stable Baselines3, RL Baselines3 Zoo, Sample Factory and CleanRL.
🤖 Train agents in unique environments such as SnowballFight, Huggy the Doggo 🐶, VizDoom (Doom) and classical ones such as Space Invaders, PyBullet and more.
💾 Share your trained agents with one line of code to the Hub and also download powerful agents from the community.
🏆 Participate in challenges where you will evaluate your agents against other teams. You’ll also get to play against the agents you’ll train.
🎓 Earn a certificate of completion by completing 80% of the assignments.

Recommender Systems

Recommender systems is a fairly popular area in ML and is used everywhere, from streaming services to marketplaces, that is, they are used where you need to recommend something to a buyer/client based on the history of interaction with the system of all its users.

Books

Practical Recommender Systems by Kim Falk

Practical Recommender Systems explains how recommender systems work and shows how to create and apply them for your site. After covering the basics, you’ll see how to collect user data and produce personalized recommendations. You’ll learn how to use the most popular recommendation algorithms and see examples of them in action on sites like Amazon and Netflix. Finally, the book covers scaling problems and other issues you’ll encounter as your site grows.

Personalized Machine Learning by Julian McAuley

A fairly large section of this book (pages 79–214) is devoted to recommender systems, so I decided to recommend it.

Other

Recommenders

This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. The examples detail our learnings on five key tasks:

Prepare Data: Preparing and loading data for each recommendation algorithm.
Model: Building models using various classical and deep learning recommendation algorithms such as Alternating Least Squares (ALS) or eXtreme Deep Factorization Machines (xDeepFM).
Evaluate: Evaluating algorithms with offline metrics.
Model Select and Optimize: Tuning and optimizing hyperparameters for recommendation models.
Operationalize: Operationalizing models in a production environment on Azure.

Time Series

⭐ Topic 9. Time Series Analysis with Python

The Open Machine Learning Course, which we talked about in the second article, has chapter 9 dedicated to time series analysis.
It tells how to work with them in Python, what possible methods and models can be used for forecasting; what is double and triple exponential weighting? what to do if stationarity is not for you; how to build SARIMA and not die; and how to predict with xgboost.

Time Series

Kaggle Learn’s introduction to time series.

Forecasting time series with gradient boosting: Skforecast, XGBoost, LightGBM, Scikit-learn and CatBoost by Joaquín Amat Rodrigo, Javier Escobar Ortiz

This guide shows how to use skforecast library methods for time series forecasting using models from the XGBoost, LightGBM, Scikit-learn and CatBoost libraries.

Big Data

This direction is a little out of the selection because it characterizes not the problem that we solve, but the tool with which we do it, but since companies with a large amount of data at their disposal often require knowledge of these tools (namely, Spark ), and they ask about them in interviews, I decided to include this section in the article.

⭐ Spark in Action by Jean-Georges Perrin

Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms.

Learning Spark by Jules S. Damji, Brooke Wenig, Tathagata Das & Denny Lee

This book offers a structured approach to learning Apache Spark, covering new developments in the project.

⭐ Data Analysis with Python and PySpark by Jonathan Rioux

Data Analysis with Python and PySpark helps you solve the daily challenges of data science with PySpark. You’ll learn how to scale your processing capabilities across multiple machines while ingesting data from any source — whether that’s Hadoop clusters, cloud data storage, or local data files. Once you’ve covered the fundamentals, you’ll explore the full versatility of PySpark by building machine learning pipelines, and blending Python, pandas, and PySpark code.

Let’s Sum It Up

If you haven’t read it yet, I recommend reading the “Learning How to Learn” and “Let’s Sum It Up” blocks from the first article, since everything said there is also applicable for preparing for the machine learning section.

What’s next?

In the next article we will analyze materials for preparing for the section on machine learning system design.

You can find the latest resources for this series of articles in the Data Science Resources repository, which will be maintained and updated. You can also subscribe to my Telegram channel Data Science Weekly, where I share interesting and useful materials every week.

If you know of any cool resources that I didn’t include in this list, please write about them in the comments.

Exceptional Resources for Data Science Interview Preparation. Part 3: Specialized Machine Learning

Remarks

Table of contents

Resources

Deep Learning

Books

Courses

Cheatsheets

Other

Natural Language Processing

Courses

Books

Papers

Other

Computer Vision

Graph Neural Networks

Reinforcement Learning

Recommender Systems

Books

Other

Time Series

Big Data

Let’s Sum It Up

What’s next?

Written by Artem Ryblov