Summary of Chapter 01 (Overview of Machine Learning Systems)

Aakash Goel
Designing Machine Learning Systems Summary
Mar 13, 2024


The aim of this article is to summarize Chapter 01, “Overview of Machine Learning Systems,” of the book “Designing Machine Learning Systems” by Chip Huyen (O’Reilly; copyright 2022 Huyen Thi Khanh Nguyen, 978-1-098-10796-3), as a quick review for revision purposes.

When to use ML

  • ML is a powerful tool but not a universal solution; its use should be determined by necessity and cost-effectiveness.
  • ML learns complex patterns from data to make predictions on unseen data.
  • For ML to be applicable, the system must have the capacity to learn, and there must be complex patterns in the data for it to learn from.
  • ML is suitable for tasks with complex patterns like object detection and speech recognition.
  • ML is distinct from traditional software: it learns patterns from data rather than being explicitly programmed, which is why it is often referred to as Software 2.0.
  • What counts as complex differs between humans and machines: tasks that are hard for humans, such as raising a number to the power of 10, can be trivial for machines, while tasks that are easy for humans, such as deciding whether there is a cat in a picture, can be hard for machines.
  • Unseen data: ML models perform well when unseen data shares patterns with the training data, ensuring accurate predictions.
  • Low-cost prediction errors: ML is suitable when the consequences of incorrect predictions are minimal, as seen in recommender systems.
  • Scalability: ML pays off when a task requires predictions at large scale, because the volume justifies the up-front investment in data, compute, and infrastructure.
  • Dynamic patterns: ML adapts well to tasks with evolving patterns, allowing for continuous learning and model updates.
  • Ethical considerations, cost-effectiveness, and simpler solutions should be evaluated before employing ML.
  • ML can complement non-ML solutions by addressing specific components of a problem, enhancing overall efficiency.
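
To make the "learns patterns rather than being explicitly programmed" point concrete, here is a minimal sketch that contrasts a hand-written rule with a threshold learned from labeled examples. The toy data, the function names, and the single-threshold "model" are all assumptions made for illustration; they are not taken from the book, and a real system would use a proper ML library.

```python
# Hypothetical toy data: (feature value, label) pairs, invented for illustration.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]


def hand_written_rule(x):
    """Traditional software: the programmer picks the decision boundary."""
    return 1 if x > 5.0 else 0


def learn_threshold(examples):
    """Pick the threshold with the fewest training errors.

    Candidate thresholds are the midpoints between consecutive sorted values.
    """
    xs = sorted(x for x, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(t):
        # Count examples whose predicted label (x > t) disagrees with the true label.
        return sum((1 if x > t else 0) != y for x, y in examples)

    return min(candidates, key=errors)


threshold = learn_threshold(train)


def learned_model(x):
    """Software 2.0 style: the boundary comes from the data, not the programmer."""
    return 1 if x > threshold else 0
```

The learned model is only as good as the assumption that unseen data follows the same pattern as the training data, which is the "unseen data" condition listed above.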

ML Use Cases

  • Consumer applications, like search engines and recommender systems, harness ML to personalize recommendations and search results.
  • Smartphones employ ML for predictive typing, photo editing suggestions, and biometric authentication.
  • Machine translation exemplifies ML’s potential to bridge language barriers, enabling cross-cultural communication.
  • Smart home devices, including personal assistants and security cameras, leverage ML for enhanced functionalities like health monitoring and security alerts.
  • Enterprise ML applications prioritize accuracy and efficiency, serving purposes such as cost reduction, customer insights generation, and process automation.
  • Fraud detection: ML is utilized for anomaly detection in transactions to predict and prevent fraudulent activity based on historical data.
  • Price optimization: ML algorithms estimate dynamic prices to maximize objectives such as revenue or growth rate, suitable for industries with fluctuating demand like internet ads and ride-sharing.
  • Demand forecasting: ML forecasts customer demand to optimize inventory, resource allocation, and pricing strategies, crucial for businesses like grocery stores.
  • Customer acquisition: ML helps identify potential customers, target ads effectively, and optimize discounts to reduce acquisition costs and increase profit margins.
  • Churn prediction: ML predicts when customers are likely to stop using products or services, enabling proactive measures to retain them.
  • Automated support ticket classification: ML analyzes ticket content to efficiently route them to relevant departments, reducing response time and enhancing customer satisfaction.
  • Brand monitoring: ML tracks brand mentions and sentiment across various platforms to maintain a positive brand image and address negative sentiment promptly.
  • Healthcare applications: ML aids in skin cancer detection, diabetes diagnosis, and other healthcare tasks, typically facilitated through healthcare providers due to strict accuracy and privacy requirements.
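
As a rough illustration of the fraud detection use case, the sketch below flags a transaction whose amount is far from a customer's historical amounts. Note that this swaps in a simple statistical z-score rule as a stand-in for a learned anomaly detector; the transaction amounts, the function name, and the cutoff of 3 standard deviations are assumptions for illustration only.

```python
import statistics


def is_anomalous(amount, history, z_cutoff=3.0):
    """Return True if `amount` lies more than `z_cutoff` sample standard
    deviations away from the mean of `history`.

    A z-score rule is a stand-in here, not a trained ML model.
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # needs at least two data points
    if stdev == 0:
        # All historical amounts are identical: anything different is unusual.
        return amount != mean
    return abs(amount - mean) / stdev > z_cutoff


# Hypothetical past transaction amounts for one customer.
history = [20.0, 25.0, 22.0, 30.0, 18.0, 24.0, 21.0, 27.0]
```

A production fraud system would typically learn from many features and labeled historical fraud cases rather than a single amount, but the structure is the same: historical data defines what "normal" looks like, and new transactions are scored against it.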

Research Vs. Production ML

  • In research, the emphasis is on achieving state-of-the-art model performance on benchmark datasets, while in production the focus shifts to meeting the requirements of various stakeholders. Consider a mobile app that recommends restaurants: ML engineers want a model that recommends the restaurants users are most likely to order from, and may reach for a more complex model and more data to get there, while the sales team prefers recommending more expensive restaurants to maximize service fees. Similarly, ensembling techniques that are popular in research may not be practical in production because of the added complexity and reduced interpretability.
  • Research prioritizes fast training and high throughput, whereas production prioritizes fast inference and low latency. In the restaurant app example, the product team wants recommendations returned within 100 milliseconds, while the ML platform team focuses on system scalability and stability.
  • Research often deals with static data, while production systems must adapt to constantly shifting data.
  • Fairness and interpretability are typically not central concerns in research but are critical considerations in production ML systems.
  • In research projects, stakeholders typically align on a single objective, often model performance on benchmark datasets. However, in production ML systems, various stakeholders have different, and sometimes conflicting, requirements.
  • Decoupling objectives, such as recommending restaurants for user clicks versus maximizing app revenue, may require developing separate models for each objective and combining their predictions.
  • Latency is not a single number but a distribution, with percentiles providing a more nuanced understanding. Higher percentiles, such as p90, p95, and p99, help identify outliers and performance issues affecting a small percentage of requests.
  • Interpretability is essential for users and developers to understand why decisions are made by ML models. While researchers prioritize model performance, interpretability is a requirement in production settings to build trust and detect biases.
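
The point that latency is a distribution rather than a single number can be sketched with a small percentile calculation. The example below uses the nearest-rank definition of a percentile (other definitions interpolate and may give slightly different values); the function name and the sample latencies are hypothetical and chosen only to show how one slow request distorts the average.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the data is less than or equal to it."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds for ten requests; one is an outlier.
latencies_ms = [100, 102, 100, 100, 99, 104, 110, 90, 3000, 95]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)
```

With this data the mean is 390 ms even though the median (p50) is 100 ms, so the average describes neither the typical request nor the slow one. The p99 value of 3000 ms is what surfaces the outlier.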

ML Vs. Traditional Software

  • ML systems are part code, part data, and part artifacts created from both, requiring unique tools and approaches. Testing and versioning data becomes as important as testing and versioning code, which raises challenges such as versioning large datasets and monitoring data quality.
  • Challenges in ML systems include the size of models, which can have hundreds of millions or billions of parameters, requiring significant memory and processing power. Monitoring and debugging complex models in production are also challenging due to the lack of visibility into their workings.
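
To give a feel for why parameter counts translate into memory pressure, here is a back-of-the-envelope sketch that multiplies the number of parameters by the bytes needed per parameter. The function name and the dtype table are assumptions for illustration, and the estimate covers only the weights themselves; activations, optimizer state, and framework overhead are ignored.

```python
# Bytes needed to store one parameter in some common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def param_memory_gb(num_params, dtype="float32"):
    """Estimate the memory (in GB, using 1 GB = 1e9 bytes) needed just to
    hold a model's parameters."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# A model with one billion parameters stored as 32-bit floats.
one_billion_fp32_gb = param_memory_gb(1_000_000_000)

# The same model stored as 16-bit floats needs half the memory.
one_billion_fp16_gb = param_memory_gb(1_000_000_000, dtype="float16")
```

Under these assumptions, a one-billion-parameter model needs about 4 GB for its weights alone at 32-bit precision, which already exceeds the memory of many edge devices.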

This chapter emphasizes the importance of understanding ML systems holistically, encompassing not only algorithms but also data, deployment, monitoring, maintenance, and infrastructure. This systems approach is elaborated in subsequent chapters of the book.