How to deploy (almost) any Hugging face model 🤗 on NVIDIA’s Triton Inference Server with an application to Zero-Shot-Learning for Text Classification

SUMMARY
In this blog post, We examine Nvidia’s Triton Inference Server (formerly known as TensorRT Inference Server) which simplifies the deployment of AI models at scale in production. For the purpose of…