MLflow, Hyperopt, Prefect, Evidently, and Grafana: The Ultimate Guide to Building, Tracking, Orchestrating, and Monitoring Machine Learning Pipelines
The steps are the following:
- Introduction
- Set up the Environment
- Configure MLflow
- Load and split the data
- Train and tune the model
- Choose the best model
- Promote the best model for production
- Deploy Grafana Dashboard and Postgres database using Docker Compose
- Setting up the Postgres Database
- Monitor model performance and serve model
- Orchestrate the pipeline using Prefect
- Simulating a Production Environment
Introduction
In this comprehensive guide, we explore the seamless integration of MLflow, Hyperopt, Prefect, Evidently, and Grafana. You’ll discover how these tools empower you to:
- Build Robust Models: MLflow simplifies model development, allowing you to experiment and iterate efficiently, while Hyperopt optimizes model hyperparameters for peak performance.
- Track and Version Models: Keep meticulous records of your models with MLflow’s tracking capabilities, ensuring reproducibility and collaboration across teams.
- Orchestrate Workflows: Prefect enables you to design, schedule, and automate complex ML workflows, ensuring your models are trained and deployed systematically.
- Monitor Model Performance: Evidently helps you gain deep insights into model behavior and detect issues early, ensuring models remain reliable in production.
- Visualize and Alert: Grafana provides real-time visualization and alerting, giving you the tools to continuously monitor your ML pipelines and respond to anomalies swiftly.
In this age of data-driven decision-making, these tools are your allies, streamlining the ML lifecycle from development and experimentation to deployment and monitoring. Our guide equips you with the knowledge to harness their potential, elevating your machine-learning projects to new heights of efficiency and reliability.
Set up the Environment
1- Install the libraries. In cmd, run:
```
pip install mlflow
pip install hyperopt
pip install xgboost
pip install prefect
pip install evidently
```
2- Launch the MLflow server. In cmd, run:
```
mlflow server --backend-store-uri sqlite:///backend.db --default-artifact-root ./mlruns
```
This command starts an instance of the MLflow server with the following configurations:
- `--backend-store-uri sqlite:///backend.db`: specifies the backend store URI where the MLflow server should persist metadata related to experiments, runs, parameters, metrics, and artifacts. In this case, the backend store is an SQLite database file named `backend.db`.
- `--default-artifact-root ./mlruns`: specifies the default artifact store location where the MLflow server should store artifacts generated by runs. In this case, it is the `./mlruns` directory relative to the current working directory.
3- Launch the Prefect server. In cmd, run:
```
prefect server start
```
The `prefect server start` command starts the Prefect Server, a central daemon that provides a variety of features for managing and executing Prefect flows, including:
- Flow execution: The Prefect Server can be used to execute flows, both locally and in a distributed fashion.
- Flow monitoring: The Prefect Server can be used to monitor the execution of flows, providing information such as the status of each task, the logs for each task, and the metrics for each task.
- Flow scheduling: The Prefect Server can be used to schedule the execution of flows, either on a recurring basis or on demand.
- Flow versioning: The Prefect Server can be used to version flows, providing a way to track changes to flows over time.
To view the Prefect UI, open a web browser and navigate to http://127.0.0.1:4200/
4- Install Docker
- Download the Docker Desktop for Windows installer from the Docker website.
- Run the installer and follow the installation wizard.
- Ensure that you have enabled virtualization in your BIOS settings if required.
Configure MLflow
This Python code defines a Prefect task, marked with `@task`, that sets up the environment for using MLflow, a tool for managing machine learning experiments. It does the following:
- Sets the MLflow tracking URI to "http://127.0.0.1:5000", assuming a local MLflow tracking server is running on that address.
- Specifies the active experiment by name. If the experiment doesn't exist, it creates one. All subsequent MLflow operations within this task are associated with this experiment.
- Retrieves and returns the ID of the experiment, which can be useful for further interactions.
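A minimal sketch of such a task, assuming the experiment name is passed in as a parameter, might look like this:

```python
import mlflow
from prefect import task

@task
def init_mlflow(experiment_name: str) -> str:
    # Point MLflow at the local tracking server started earlier
    mlflow.set_tracking_uri("http://127.0.0.1:5000")
    # Activate the experiment, creating it if it does not exist yet
    mlflow.set_experiment(experiment_name)
    experiment = mlflow.get_experiment_by_name(experiment_name)
    return experiment.experiment_id
```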
Load and split the data
This code defines two Prefect tasks within a workflow for handling a machine learning dataset:

`load_data` Task:
- This task loads a dataset using Scikit-learn's `datasets.load_digits()` function. The dataset contains handwritten digit images.
- It extracts the features (pixel values) from the dataset and the corresponding target labels.
- The data is organized into a Pandas DataFrame for further processing.
- The task returns this DataFrame.

`split_data` Task:
- This task takes the DataFrame returned by the `load_data` task as input.
- It splits the dataset into training and testing subsets using Scikit-learn's `train_test_split` function. This is a common step in preparing data for machine learning.
- The split is performed with 80% of the data used for training and 20% for testing, and a random seed is set for reproducibility.
- The task returns four variables: `x_train`, `x_test`, `y_train`, and `y_test`, representing the training features, testing features, training labels, and testing labels, respectively.
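A sketch of these two tasks, assuming the label column is named `target` and a seed of 42, could look like this:

```python
import pandas as pd
from prefect import task
from sklearn import datasets
from sklearn.model_selection import train_test_split

@task
def load_data() -> pd.DataFrame:
    # Load the handwritten digits dataset and organize it as a DataFrame
    digits = datasets.load_digits()
    df = pd.DataFrame(digits.data)
    df["target"] = digits.target
    return df

@task
def split_data(df: pd.DataFrame):
    # 80/20 train/test split with a fixed seed for reproducibility
    x = df.drop(columns=["target"])
    y = df["target"]
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.2, random_state=42
    )
    return x_train, x_test, y_train, y_test
```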
Train and tune the model
Hyperparameter Search Space Definition:
- This task defines a search space for hyperparameter optimization. It specifies different hyperparameters and their possible values as options for optimization. These hyperparameters include `learning_rate`, `max_depth`, `gamma`, `colsample_bytree`, `reg_alpha`, `reg_lambda`, and `seed`.

Objective Function:
- An objective function (`objective`) is defined within the `train_hyperparameter_tuning` task. This function takes a set of hyperparameters as input.
- Inside the objective function:
  - A new MLflow run is started to log parameters and metrics.
  - An XGBoost classifier is created with the given hyperparameters.
  - The classifier is trained on the training data (`x_train`, `y_train`) and evaluated on the testing data (`x_test`, `y_test`).
  - Metrics like accuracy and F1 score are calculated and logged to MLflow.
  - The trained model is also logged to MLflow as an artifact.
  - The objective function returns a dictionary with the negative accuracy value, which Hyperopt tries to minimize.

Hyperparameter Optimization:
- The `fmin` function from Hyperopt is called to perform Bayesian hyperparameter optimization (`tpe.suggest`) using the defined search space.
- The optimization aims to find the hyperparameters that minimize the negative accuracy (i.e., maximize accuracy).

Return Best Result:
- The best set of hyperparameters found by Hyperopt is returned as `best_result`.
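A sketch of the tuning task under these assumptions (the search ranges, 20 evaluation rounds, and macro-averaged F1 are illustrative choices, not the article's exact values):

```python
import mlflow
import mlflow.xgboost
import xgboost as xgb
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
from hyperopt.pyll import scope
from prefect import task
from sklearn.metrics import accuracy_score, f1_score

@task
def train_hyperparameter_tuning(x_train, x_test, y_train, y_test):
    # Search space: ranges here are illustrative assumptions
    search_space = {
        "learning_rate": hp.loguniform("learning_rate", -3, 0),
        "max_depth": scope.int(hp.quniform("max_depth", 3, 12, 1)),
        "gamma": hp.uniform("gamma", 0, 5),
        "colsample_bytree": hp.uniform("colsample_bytree", 0.5, 1.0),
        "reg_alpha": hp.loguniform("reg_alpha", -5, 1),
        "reg_lambda": hp.loguniform("reg_lambda", -5, 1),
        "seed": 42,
    }

    def objective(params):
        with mlflow.start_run():
            mlflow.log_params(params)
            model = xgb.XGBClassifier(**params)
            model.fit(x_train, y_train)
            preds = model.predict(x_test)
            accuracy = accuracy_score(y_test, preds)
            f1 = f1_score(y_test, preds, average="macro")
            mlflow.log_metric("accuracy", accuracy)
            mlflow.log_metric("f1_score", f1)
            # Log the trained model as an artifact of this run
            mlflow.xgboost.log_model(model, artifact_path="model")
        # Hyperopt minimizes, so return the negative accuracy
        return {"loss": -accuracy, "status": STATUS_OK}

    best_result = fmin(
        fn=objective,
        space=search_space,
        algo=tpe.suggest,
        max_evals=20,
        trials=Trials(),
    )
    return best_result
```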
Choose the best model
MLflow Client Setup:
- The task sets up an MLflow client to interact with an MLflow tracking server located at "http://127.0.0.1:5000".

Retrieve the Best Run:
- It searches for runs within a specific MLflow experiment (`experiment_id`).
- The runs are sorted by accuracy in descending order (`order_by=["metrics.accuracy DESC"]`), and the top run with the highest accuracy is selected as the best run.

Get the Run ID and Model URI:
- The run ID of the best run is extracted.
- A model URI is constructed based on the run ID.

Search for Model Versions:
- The task searches for model versions associated with the specific run.
- It constructs a filter string to search for versions linked to the identified run.
- The results are returned as a list.

Return the Best Model Version and Model URI:
- The version number of the best model (`model_version`) is obtained from the search results.
- The constructed model URI (`model_uri`) and the model version are returned as a tuple.
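A sketch of this task; the explicit `mlflow.register_model` call is an assumption about where registration happens (the article only says a registered version exists for the run):

```python
import mlflow
from mlflow.tracking import MlflowClient
from prefect import task

@task
def choose_best_model(experiment_id: str, model_name: str):
    client = MlflowClient(tracking_uri="http://127.0.0.1:5000")
    # Pick the run with the highest logged accuracy
    best_run = client.search_runs(
        experiment_ids=[experiment_id],
        order_by=["metrics.accuracy DESC"],
        max_results=1,
    )[0]
    run_id = best_run.info.run_id
    model_uri = f"runs:/{run_id}/model"
    # Register the best run's model, then look up its version
    # via a filter string on the run ID
    mlflow.set_tracking_uri("http://127.0.0.1:5000")
    mlflow.register_model(model_uri, model_name)
    versions = client.search_model_versions(f"run_id='{run_id}'")
    model_version = versions[0].version
    return model_uri, model_version
```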
Promote the best model for production
Define the New Stage:
- The task specifies the target stage to which the model version will be promoted, which is set to "Production" as indicated by `new_stage`.

MLflow Client Setup:
- It initializes an MLflow client to interact with an MLflow tracking server located at "http://127.0.0.1:5000".

Promote the Model Version:
- The task calls the `transition_model_version_stage` method of the MLflow client to change the stage of a specific model version (`model_version`) associated with a given model name (`model_name`) to the new stage specified.
- The `archive_existing_versions` parameter is set to `False`, which means that existing versions of the model will not be archived when promoting this version.
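A minimal sketch of the promotion task, following the client calls described above:

```python
from mlflow.tracking import MlflowClient
from prefect import task

@task
def promote_model(model_name: str, model_version: str):
    new_stage = "Production"
    client = MlflowClient(tracking_uri="http://127.0.0.1:5000")
    # Move the chosen version to Production without archiving older versions
    client.transition_model_version_stage(
        name=model_name,
        version=model_version,
        stage=new_stage,
        archive_existing_versions=False,
    )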
Deploy Grafana Dashboard and Postgres database using Docker Compose
1- Check container status
This step defines a helper that checks the status of specified Docker containers by their names.
- It uses the `docker.from_env()` method to create a Docker client to interact with the Docker daemon running on the local system.
- The function takes a list of container names (`container_names`) as input.
- It initializes a counter `running_containers` to zero, which will keep track of how many of the specified containers are currently running.
- For each container name in the input list:
  - It uses the Docker client to list containers that match the provided name using the `client.containers.list()` method.
  - If there is at least one container with that name, it checks if the first container in the list (assuming no duplicate names) is running by inspecting its state with `client.api.inspect_container()`.
  - If the inspected container is running, it increments the `running_containers` counter.
  - After each container check, it waits for 1 second using `time.sleep(1)` before checking the next container.
- Finally, it closes the Docker client connection with `client.close()` and returns the count of running containers (`running_containers`).
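A sketch of this helper, using the `docker` SDK for Python as described:

```python
import time
import docker

def container_status(container_names):
    client = docker.from_env()
    running_containers = 0
    for name in container_names:
        # Match containers by name (includes stopped ones)
        matches = client.containers.list(all=True, filters={"name": name})
        if matches:
            state = client.api.inspect_container(matches[0].id)["State"]
            if state.get("Running"):
                running_containers += 1
        time.sleep(1)  # brief pause between checks
    client.close()
    return running_containers
```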
2- Build Docker
- It takes a list of container names (`container_names`) as input.
- It calls the `container_status` function to check the current status of the specified containers. The result is stored in `running_containers`.
- It checks if the count of running containers (`running_containers`) is not equal to the expected count. This condition is used to determine whether the desired containers are already running.
- If the count of running containers is not equal to the expected count (indicating that the desired containers are not running), it uses the `docker-compose` command to start the Docker containers defined in the `docker-compose.yml` file, potentially rebuilding the associated Docker images if necessary.
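A sketch of this task, reusing the `container_status` helper sketched above and assuming `docker-compose up --build -d` as the start command:

```python
import os
from prefect import task

@task
def build_docker(container_names):
    # Rebuild and start only if the expected containers are not all running
    running_containers = container_status(container_names)
    if running_containers != len(container_names):
        # --build rebuilds images if needed; -d runs in detached mode
        os.system("docker-compose up --build -d")
```

The containers themselves are defined in the following `docker-compose.yml`: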
```yaml
version: '3.7'

volumes:
  grafana_data: {}

networks:
  front-tier:
  back-tier:

services:
  db:
    container_name: postgres
    image: postgres
    restart: always
    environment:
      POSTGRES_PASSWORD: example
    ports:
      - "5432:5432"
    networks:
      - back-tier

  adminer:
    container_name: adminer
    image: adminer
    restart: always
    ports:
      - "8080:8080"
    networks:
      - back-tier
      - front-tier

  grafana:
    container_name: grafana
    image: grafana/grafana
    user: "472"
    ports:
      - "3000:3000"
    volumes:
      - ./config/grafana_datasources.yaml:/etc/grafana/provisioning/datasources/datasource.yaml:ro
    networks:
      - back-tier
      - front-tier
    restart: always
```
This Docker Compose file defines a multi-container application. Here's a brief explanation of what it does:
- Version: Specifies the version of the Docker Compose file format being used, which is '3.7' in this case.
- Volumes: Defines a Docker volume named `grafana_data`. Volumes are used to persist data generated by containers.
- Networks: Defines two Docker networks named `front-tier` and `back-tier`. These networks can be used to isolate and connect containers.
- Services: Specifies the different services (containers) that make up the application:
  - db (PostgreSQL): This service uses the official PostgreSQL image, sets a password, maps port 5432 to the host, and connects it to the `back-tier` network. It's named "postgres".
  - adminer: This service uses the Adminer image, maps port 8080 to the host, and connects it to both the `back-tier` and `front-tier` networks. It's named "adminer".
  - grafana: This service uses the Grafana image, sets a user, maps port 3000 to the host, mounts a configuration file, and connects it to both the `back-tier` and `front-tier` networks. It's named "grafana".
3- Wait for the containers to start running.
- It uses the `docker.from_env()` method to create a Docker client to interact with the Docker daemon running on the local system.
- The task takes a list of container names (`container_names`) as input.
- For each container name in the input list, it enters a `while` loop.
- Inside the loop, it repeatedly checks the status of the specified container: it lists containers that match the provided name using the `client.containers.list()` method. If there is exactly one container with that name and it's in a running state, the loop exits with `break`.
- If the container is not found or not in a running state, the task sleeps for 120 seconds (`time.sleep(120)`) before checking the container status again.
- After the loop exits, the Docker client connection is closed with `client.close()`.
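A sketch of this waiting task:

```python
import time
import docker
from prefect import task

@task
def wait_for_containers(container_names):
    client = docker.from_env()
    for name in container_names:
        while True:
            # list() without all=True returns only running containers
            matches = client.containers.list(filters={"name": name})
            if len(matches) == 1 and matches[0].status == "running":
                break
            time.sleep(120)  # wait two minutes before re-checking
    client.close()
```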
Setting up the Postgres Database
- It defines a SQL statement `create_table_statement` that drops a table if it exists and creates a new table named "predictions_metrics" with specified columns.
- It establishes a connection to a PostgreSQL database running locally with connection details such as the host, user, password, database name, and port.
- It sets the isolation level to `ISOLATION_LEVEL_AUTOCOMMIT`, ensuring that database operations like creating a new database can be executed.
- It executes a SQL query to check if a database named 'test' exists.
- If the 'test' database does not exist, it creates it using a SQL query.
- It establishes a connection to the newly created 'test' database and creates the 'predictions_metrics' table within it.
- It loads a machine learning model specified by the `model_uri` using MLflow's `mlflow.pyfunc.load_model` method.
- It uses the loaded model to make predictions on the `x_train` dataset and adds these predictions as a new column called 'prediction' in `x_train`.
- It establishes a connection to a PostgreSQL database running locally with connection details such as the username ('postgres'), password ('example'), host ('localhost'), and port ('5432').
- It stores the `x_train` dataset with the added 'prediction' column into a table named 'reference' in the 'test' database. If the 'reference' table already exists, it is replaced.
- It also stores the `x_test` dataset in a table named 'production' in the same 'test' database, again replacing it if it already exists.
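A sketch of this setup task; the metrics table schema (a timestamp plus a drift value) and the SQLAlchemy connection string are assumptions:

```python
import mlflow
import psycopg2
from prefect import task
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
from sqlalchemy import create_engine

# Assumed schema: one row of drift metrics per prediction batch
create_table_statement = """
DROP TABLE IF EXISTS predictions_metrics;
CREATE TABLE predictions_metrics (
    "timestamp" timestamp,
    prediction_drift float
);
"""

@task
def prep_db(model_uri, x_train, x_test):
    # Create the 'test' database if it does not exist
    conn = psycopg2.connect(host="localhost", port=5432,
                            user="postgres", password="example")
    conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
    with conn.cursor() as cur:
        cur.execute("SELECT 1 FROM pg_database WHERE datname = 'test'")
        if cur.fetchone() is None:
            cur.execute("CREATE DATABASE test")
    conn.close()

    # Create the metrics table inside 'test'
    with psycopg2.connect(host="localhost", port=5432, dbname="test",
                          user="postgres", password="example") as conn:
        with conn.cursor() as cur:
            cur.execute(create_table_statement)

    # Build reference data: training features plus model predictions
    model = mlflow.pyfunc.load_model(model_uri)
    x_train = x_train.copy()
    x_train["prediction"] = model.predict(x_train)

    engine = create_engine("postgresql://postgres:example@localhost:5432/test")
    x_train.to_sql("reference", engine, if_exists="replace", index=False)
    x_test.to_sql("production", engine, if_exists="replace", index=False)
```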
Monitor model performance and serve model
It uses the `os.system` function to execute two separate commands:
- The first command (`start killport 8000`) is intended to release or "kill" port 8000 if it's already in use. This ensures that the specified port is available for the subsequent FastAPI service.
- The second command (`start uvicorn main:app`) starts the Uvicorn web server, serving the FastAPI app defined in the `main` module. This effectively launches the model-serving API.
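A minimal sketch of that task (assuming the commands run on Windows, where `start` opens a new shell window, and that the `killport` utility is installed):

```python
import os
from prefect import task

@task
def serve_model():
    # Free port 8000 if something is already bound to it
    os.system("start killport 8000")
    # Serve the FastAPI app defined in main.py
    os.system("start uvicorn main:app")
```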
This code defines a FastAPI endpoint for making predictions using a machine learning model, recording prediction drift metrics, and storing data in a PostgreSQL database. Here's a concise explanation:
- The code sets the MLflow tracking URI and initializes a FastAPI application.
- Utility functions are defined:
  - `load_model`: Loads an MLflow model from a specified URI.
  - `get_data_from_db`: Retrieves data from a PostgreSQL database.
  - `calculate_metrics_postgresql`: Calculates drift metrics between reference and current data and stores them in the database.
- An input data model `InputData` is defined for the POST request. It includes fields for input data.
- The `/predict/` endpoint receives POST requests with input data, processes the data, and makes predictions using the loaded MLflow model.
- Prediction drift metrics are calculated by comparing the input data's predictions to reference data stored in the database.
- The prediction and metrics are returned as a response from the endpoint.
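A sketch of the service; the registered model name, the `InputData` fields, and the metrics table schema are assumptions carried over from the earlier sketches, and computing drift on a single incoming row is purely illustrative (in practice you would accumulate a window of requests):

```python
import datetime

import mlflow
import pandas as pd
import psycopg2
from evidently.metrics import ColumnDriftMetric
from evidently.report import Report
from fastapi import FastAPI
from pydantic import BaseModel
from sqlalchemy import create_engine

mlflow.set_tracking_uri("http://127.0.0.1:5000")
app = FastAPI()

# Assumption: the model was registered as "xgboost_model" and promoted earlier
model = mlflow.pyfunc.load_model("models:/xgboost_model/Production")
engine = create_engine("postgresql://postgres:example@localhost:5432/test")

class InputData(BaseModel):
    features: list[float]  # one row of pixel values

def get_data_from_db(table_name: str) -> pd.DataFrame:
    # Read a whole table from the 'test' database
    return pd.read_sql(f"SELECT * FROM {table_name}", engine)

def calculate_metrics_postgresql(current: pd.DataFrame) -> float:
    # Compare current predictions against the stored reference data
    reference = get_data_from_db("reference")
    report = Report(metrics=[ColumnDriftMetric(column_name="prediction")])
    report.run(reference_data=reference, current_data=current)
    drift = report.as_dict()["metrics"][0]["result"]["drift_score"]
    with psycopg2.connect(host="localhost", dbname="test",
                          user="postgres", password="example") as conn:
        with conn.cursor() as cur:
            cur.execute("INSERT INTO predictions_metrics VALUES (%s, %s)",
                        (datetime.datetime.now(), drift))
    return drift

@app.post("/predict/")
def predict(data: InputData):
    row = pd.DataFrame([data.features])
    prediction = int(model.predict(row)[0])
    current = row.copy()
    current["prediction"] = prediction
    drift = calculate_metrics_postgresql(current)
    return {"prediction": prediction, "drift_score": drift}
```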
Orchestrate the pipeline using Prefect
This code defines a Prefect flow and deploys it with a schedule for a machine learning pipeline. Here's a concise explanation:
The `main` function is the core of the flow. It orchestrates the following steps:
- Initializes an MLflow experiment and retrieves its ID.
- Loads data and splits it into training and testing sets.
- Performs hyperparameter tuning and selects the best model.
- Promotes the best model for production use.
- Starts the Docker containers specified in `container_names`.
- Waits for the containers to be up and running.
- Prepares a PostgreSQL database.
- Prepares reference data and serves the machine learning model using FastAPI.
The `if __name__ == "__main__":` block builds a Prefect deployment:
- It constructs a Prefect `Deployment` object, specifying the `main` function as the flow, with parameters for the experiment name, model name, and container names.
- A schedule is set using a Cron expression (e.g., every Thursday at 12:00 AM) and a specific timezone.
- The deployment is named "model_training_and_tuning_weekly" and given a version.
- It specifies the work queue name as "ml".
- Finally, the deployment is applied, meaning the flow will execute as scheduled.
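A sketch of the flow and deployment, wiring together the tasks sketched earlier; the Cron expression, timezone, and parameter values are illustrative, and `Deployment.build_from_flow` corresponds to the Prefect 2.x API:

```python
from prefect import flow
from prefect.deployments import Deployment
from prefect.server.schemas.schedules import CronSchedule

@flow
def main(experiment_name: str, model_name: str, container_names: list):
    # End-to-end pipeline built from the tasks sketched above
    experiment_id = init_mlflow(experiment_name)
    df = load_data()
    x_train, x_test, y_train, y_test = split_data(df)
    train_hyperparameter_tuning(x_train, x_test, y_train, y_test)
    model_uri, model_version = choose_best_model(experiment_id, model_name)
    promote_model(model_name, model_version)
    build_docker(container_names)
    wait_for_containers(container_names)
    prep_db(model_uri, x_train, x_test)
    serve_model()

if __name__ == "__main__":
    deployment = Deployment.build_from_flow(
        flow=main,
        name="model_training_and_tuning_weekly",
        version=1,
        # "0 0 * * 4" = every Thursday at 12:00 AM
        schedule=CronSchedule(cron="0 0 * * 4", timezone="UTC"),
        work_queue_name="ml",
        parameters={
            "experiment_name": "digits_experiment",
            "model_name": "xgboost_model",
            "container_names": ["postgres", "adminer", "grafana"],
        },
    )
    deployment.apply()
```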
Run the following command in cmd:
```
prefect agent start --pool default-agent-pool --work-queue ml
```
This command starts a Prefect agent to manage the execution of Prefect flows on a specific pool and work queue.
The `prefect agent start` command starts a new Prefect agent process. The `--pool` flag specifies the name of the pool that the agent should use for executing flows; in this case, the pool is named `default-agent-pool`. The `--work-queue` flag specifies the name of the work queue that the agent should use for receiving work; in this case, the work queue is named `ml`.
Run the following command in cmd:
```
python app.py
```
Since the `app.py` script contains a Prefect flow defined with the `@flow` decorator, running it registers the deployment with the Prefect backend and schedules it to run according to its defined schedule.
Open a web browser and navigate to http://127.0.0.1:4200/ to follow the flow runs in the Prefect UI.
Simulating a Production Environment
`get_data_from_db(table_name)`:
- This function connects to a PostgreSQL database on the local host with specified credentials.
- It reads data from a table named `table_name` and returns it as a Pandas DataFrame.

`simulate_production()`:
- This function simulates a production environment by sending data to a specified API URL for predictions.
- It first retrieves production data from the 'production' table in the database using `get_data_from_db`.
- Then, it iterates over the rows of the production data, converts each row to a JSON-like dictionary, and sends it to the API using a POST request.
- The predictions received from the API response are printed, and there's a sleep of 120 seconds (2 minutes) between each request.

The `if __name__ == "__main__":` block executes the `simulate_production()` function when the script is run.
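A sketch of such a `test.py`, consistent with the `InputData` model assumed earlier (the API URL is an assumption):

```python
import time

import pandas as pd
import requests
from sqlalchemy import create_engine

API_URL = "http://127.0.0.1:8000/predict/"  # assumed serving address

def get_data_from_db(table_name: str) -> pd.DataFrame:
    # Connect to the local 'test' database and read the whole table
    engine = create_engine("postgresql://postgres:example@localhost:5432/test")
    return pd.read_sql(f"SELECT * FROM {table_name}", engine)

def simulate_production():
    production_data = get_data_from_db("production")
    for _, row in production_data.iterrows():
        payload = {"features": row.tolist()}  # matches the InputData model
        response = requests.post(API_URL, json=payload)
        print(response.json())
        time.sleep(120)  # two minutes between requests

if __name__ == "__main__":
    simulate_production()
```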
Run this script in cmd:
```
python test.py
```
Refresh Adminer at http://localhost:8080/ to see the `predictions_metrics` table being populated.
Open Grafana (http://localhost:3000/) to build a dashboard.
Choose the Postgres database connection, then choose the columns you need to build the graph.
Conclusion
In conclusion, this blog has introduced a powerful combination of tools for building, tracking, orchestrating, and monitoring machine learning pipelines. MLflow, Hyperopt, Prefect, Evidently, and Grafana offer a comprehensive solution for improving the efficiency, reproducibility, and performance of your machine learning projects. By implementing these tools in your workflow, you can enhance collaboration, automate pipeline management, and ensure the ongoing reliability of your machine-learning models in production environments. This ultimate guide equips you with the essential tools and knowledge to elevate your machine-learning pipelines to a new level of sophistication and effectiveness.
You can find the full code in the GitHub repo.