Orchestrating machine learning experiments for MLOps using Apache Airflow

Published in Analytics Vidhya

5 min read · Jul 26, 2020

Now that more and more machine learning models are going to production, operationalizing the overall machine learning workflow is becoming crucial for companies that adopt artificial intelligence capabilities.

We can use the Apache Airflow platform to orchestrate the different phases of a machine learning workflow.

Machine learning experiments usually follow a predefined set of phases, such as:

Data ingestion: Collect and integrate data from different sources
Data validation: Ensure the collected data is valid and consistent with expectations
Data preparation: Validate, preprocess, extract features and transform the data to get it ready for the machine learning task
Model training: Train the machine learning models and tune their hyperparameters
Model evaluation: Evaluate model performance, accept or reject its results
Model deployment: Deploy the model to production if its performance in the previous step is acceptable
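
To make the dependency structure of these phases concrete, here is a minimal sketch in plain Python. It does not use Airflow's API: the phase functions, the shared context dictionary, the toy data and the acceptance threshold are all hypothetical, chosen only for illustration. It assumes Python 3.9 or later for the standard library graphlib module. In Airflow, each function would typically become a task and each dependency an edge of the DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical phase functions: each one reads from and writes to a shared dict.
def ingest(ctx):
    ctx["raw"] = [1.0, 2.0, 3.0, 4.0]  # stand-in for data collected from sources

def validate(ctx):
    if not ctx["raw"]:
        raise ValueError("no data ingested")

def prepare(ctx):
    peak = max(ctx["raw"])
    ctx["features"] = [x / peak for x in ctx["raw"]]  # scale values into [0, 1]

def train(ctx):
    # Toy "model": the mean of the features
    ctx["model"] = sum(ctx["features"]) / len(ctx["features"])

def evaluate(ctx):
    # Toy score, compared against an assumed acceptance threshold
    ctx["score"] = 1.0 - abs(ctx["model"] - 0.5)
    ctx["accepted"] = ctx["score"] >= ctx["threshold"]

def deploy(ctx):
    ctx["deployed"] = True

# Each phase maps to (function, list of phases it depends on).
PHASES = {
    "data_ingestion": (ingest, []),
    "data_validation": (validate, ["data_ingestion"]),
    "data_preparation": (prepare, ["data_validation"]),
    "model_training": (train, ["data_preparation"]),
    "model_evaluation": (evaluate, ["model_training"]),
    "model_deployment": (deploy, ["model_evaluation"]),
}

def run_pipeline(threshold=0.8):
    ctx = {"threshold": threshold, "executed": []}
    graph = {name: set(deps) for name, (_, deps) in PHASES.items()}
    # static_order() yields every phase after all of its dependencies
    for name in TopologicalSorter(graph).static_order():
        # Deployment runs only if evaluation accepted the model
        if name == "model_deployment" and not ctx.get("accepted", False):
            continue
        PHASES[name][0](ctx)
        ctx["executed"].append(name)
    return ctx

if __name__ == "__main__":
    result = run_pipeline()
    print(result["executed"])
```

Declaring the dependencies as data, rather than hard-coding the call order, is what lets an orchestrator decide when each phase may run and skip deployment when evaluation rejects the model.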

Orchestrating the different phases through a well-defined and repeatable workflow can boost productivity across your machine learning pipelines, both by promoting a well-structured codebase and by creating a way to systematically reproduce the steps. Hence, it provides capabilities such as Continuous training and…

Written by Andrea Capuano