Part 5

ML model serving

Users can receive the melanoma prediction results from my trained models in production through model serving tools. According to (tensorflow.org, serving, 2022), model serving is a system that allows ML models to be run in production.

There are several ML model serving tools suitable for production. One of the most popular is TensorFlow Serving (tensorflow.org, serving, 2022). Another popular tool is KServe (kubeflow.org, kserve, 2022). There are also open-source tools for serving deep learning models, such as Multi Model Server (github.com, multi-model-server, 2022).

KServe provides serving functionality built on Kubernetes (kubernetes.io, 2022). It is a good choice when the software product is already deployed with an orchestrator such as Kubernetes. KServe also has detailed documentation.
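To illustrate how KServe declares a served model, the following is a minimal InferenceService manifest sketch; the resource name and the `storageUri` bucket path are placeholders, not values from this project.

```yaml
# Minimal KServe InferenceService for a TensorFlow model.
# "melanoma-classifier" and the gs:// path are illustrative placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: melanoma-classifier
spec:
  predictor:
    model:
      modelFormat:
        name: tensorflow
      storageUri: "gs://my-bucket/melanoma-model"
```

Applying such a manifest with `kubectl apply` asks KServe to create and expose a prediction endpoint for the model stored at the given URI.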

“TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments.” (tensorflow.org, serving, architecture, 2022). TensorFlow Serving supports serving multiple versions of a model, and it can be deployed quickly using Docker containers.
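Once TensorFlow Serving is running, predictions are requested over its REST API by POSTing a JSON body to the model's `:predict` endpoint. The sketch below only builds such a request; the model name `melanoma_model` and the host are assumptions for illustration, while port 8501 is TensorFlow Serving's default REST port.

```python
import json

def build_predict_request(host, model_name, instances, version=None):
    """Build the URL and JSON body for a TensorFlow Serving :predict call.

    The model name and host here are illustrative assumptions; 8501 is
    TensorFlow Serving's default REST API port.
    """
    model_path = f"v1/models/{model_name}"
    if version is not None:
        # A specific model version can be addressed explicitly.
        model_path += f"/versions/{version}"
    url = f"http://{host}:8501/{model_path}:predict"
    body = json.dumps({"instances": instances})
    return url, body

# Example: a request for one (toy) feature vector.
url, body = build_predict_request("localhost", "melanoma_model", [[0.1, 0.2]])
```

The returned `url` and `body` can then be sent with any HTTP client (for example, the `requests` library) to obtain the model's predictions.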

The resource (martinfowler.com, 2019) describes a new kind of Continuous Delivery called Continuous Delivery for Machine Learning (CD4ML). CD4ML is an approach that combines code, data, and models in small, adaptable delivery cycles. By reducing human mistakes and improving the flow between dataset preparation and ML application delivery, CD4ML can help professional teams improve their development processes.

TensorFlow Serving must then be deployed. According to (martinfowler.com, 2019), there are several deployment patterns, such as multiple models, shadow models, competing models, and online learning models. The AWS resource (aws.amazon.com, shadow deployment, 2021) provides an example system design for shadow deployment.
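The shadow pattern can be sketched in a few lines: both models receive the same live request, but only the primary model's response reaches the user, while the shadow model's output is logged for offline comparison. This is a minimal illustration assuming the models are plain callables, not any particular serving framework's API.

```python
import logging

def serve_with_shadow(request, primary_model, shadow_model,
                      log=logging.getLogger("shadow")):
    """Shadow deployment sketch: serve from the primary, mirror to the shadow.

    `primary_model` and `shadow_model` are assumed to be callables that take
    a request and return a prediction (an illustrative simplification).
    """
    # The primary model's prediction is the one actually returned to the user.
    primary_response = primary_model(request)
    try:
        # The shadow model sees the same traffic, but its output is only
        # logged for later comparison; it never reaches the user.
        shadow_response = shadow_model(request)
        log.info("primary=%s shadow=%s", primary_response, shadow_response)
    except Exception:
        # A failing shadow model must not affect the served response.
        log.exception("shadow model failed")
    return primary_response
```

The key design point is isolation: the shadow candidate is evaluated on real traffic without any risk to users, since its result (or its failure) is confined to the logs.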

References

tensorflow.org, serving, 2022. Serving Models. Available at: https://www.tensorflow.org/tfx/guide/serving [Accessed 10 July 2022]

kubeflow.org, kserve, 2022. KServe. Available at: https://www.kubeflow.org/docs/external-add-ons/kserve/kserve/ [Accessed 10 July 2022]

github.com, multi-model-server, 2022. Multi Model Server. Available at: https://github.com/awslabs/multi-model-server [Accessed 10 July 2022]

kubernetes.io, 2022. Production-Grade Container Orchestration. Available at: https://kubernetes.io/ [Accessed 10 July 2022]

tensorflow.org, serving, architecture, 2022. Architecture. Available at: https://www.tensorflow.org/tfx/serving/architecture [Accessed 13 July 2022]

martinfowler.com, 2019. Continuous Delivery for Machine Learning. Available at: https://martinfowler.com/articles/cd4ml.html [Accessed 14 July 2022]

aws.amazon.com, shadow deployment, 2021. Deploy shadow ML models in Amazon SageMaker. Available at: https://aws.amazon.com/blogs/machine-learning/deploy-shadow-ml-models-in-amazon-sagemaker/ [Accessed 14 July 2022]
