Deploy a Hugging Face model at scale quickly and seamlessly using Syndicai

I’ve trained my model, and need to deploy it!

Developing Machine Learning solutions comes with many complexities, especially in the deployment phase. In general, developers need to worry about designing the architecture, building the infrastructure, and taking care of environments and all the necessary dependencies. With tools like Flask and Docker, the whole process is much faster; however, it still requires a solid understanding of the topic. As a result, many AI creators stop at a trained model, since deployment is time-consuming and costly.

To solve that problem, I would like to explore a new solution called Syndicai. It makes the ML deployment phase easier and faster by managing your infrastructure for you. Therefore, the scope of this article focuses on the Machine Learning deployment phase, where we will explore the Syndicai approach of going from a model to a production-ready API.

To make the whole workflow interesting, let me start with some use-case…

Say we are building a new product for an EduTech company, and we need to create a new feature that helps fill in the blank space in sentences. In addition, the complete solution needs to be in the form of an API. So all we need to do is deploy our model to Syndicai and integrate the resulting API into our product. Easy peasy, right? Well, we're about to find out shortly.

If you don’t want to go through the whole article, you can simply fork the ready-made code and jump straight to the deployment section.

The first step: prepare a model for deployment

The first thing we need is a trained machine learning model. For this tutorial, we will use the RoBERTa masked language modeling model from Hugging Face. It comes pre-trained with weights and is one of the most popular models on the hub.

For those who don’t know, Hugging Face is an open-source hub with a library of state-of-the-art ML models. Most of these models operate in the space of natural language processing (NLP), audio, and computer vision, and they are written in PyTorch, TensorFlow, and JAX.

Create a git repository

The Syndicai Platform requires us to have a git repository containing a requirements.txt file and a Python script with the model. Typically, we would start a new project, but since the model is already on GitHub, we can simply fork the repository. You can access it here.

Add requirements.txt

First, let’s add the requirements.txt file to our repository. It contains all libraries and frameworks required to recreate the model’s environment in the cloud.

Please have a look at the file below.
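For illustration, a requirements.txt for this model could look like the one below. The exact versions pinned in the repository may differ; pin whatever versions you develop against:

```
torch==1.7.1
transformers==4.2.2
```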


Note that all libraries are pinned to exact versions!


Next, add a Python script that contains the PythonPredictor class with an __init__ constructor and a predict method.

import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

class PythonPredictor:
    def __init__(self, config):
        device = "cuda" if torch.cuda.is_available() else "cpu"
        print(f"using device: {device}")
        self.device = device
        self.model = RobertaForMaskedLM.from_pretrained('roberta-base', return_dict=True)
        self.tokenizer = RobertaTokenizer.from_pretrained('roberta-base')

The constructor is the best place to initialize our weights, since Syndicai runs it only once — when a deployment is created. The predict method is responsible for taking the input, passing it through the model, and sending back a response. It runs every time we send a request to the model, so from a technical point of view it needs to be fast and lightweight. Therefore, in our case, it only tokenizes the input data, extracts the masked tokens, and processes the output using the Hugging Face API.

In addition, the predict method has only one required argument (payload), which gives us access to our input data in the form of a dictionary when we send a request with a JSON body that looks as follows.

{"text": "My name is Nwoke Tochukwu, I'm a Machine Learning Engineer What skillsets are <mask> to be a software Engineer."}
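Putting the pieces together, a minimal sketch of the predictor script could look like this. It assumes a top-1 fill for each masked position and single-request inference; the repository's actual implementation may differ in details:

```python
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer


class PythonPredictor:
    def __init__(self, config):
        # Runs once, when the deployment is created: load weights here.
        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.device = device
        self.model = RobertaForMaskedLM.from_pretrained("roberta-base", return_dict=True)
        self.tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

    def predict(self, payload):
        # Tokenize the raw sentence from the request payload
        inputs = self.tokenizer(payload["text"], return_tensors="pt")
        # Find the position(s) of the <mask> token
        mask_positions = (
            inputs["input_ids"][0] == self.tokenizer.mask_token_id
        ).nonzero(as_tuple=True)[0]
        with torch.no_grad():
            logits = self.model(**inputs).logits
        # Decode the highest-scoring token for each masked position
        top_ids = logits[0, mask_positions].argmax(dim=-1)
        return self.tokenizer.decode(top_ids).strip()
```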

Check the model before deployment

Check your model locally before deploying by running the script below with python3. It will send JSON data through your model and test whether everything is okay.

import argparse

from syndicai import PythonPredictor

sample_data = (
    "My name is Nwoke Tochukwu, I'm a Machine Learning Engineer "
    "What skillsets are <mask> to be a software Engineer."
)

def run(opt):
    # Convert the input text to a JSON-style dictionary
    sample_json = {"text": opt.text}

    # Run the model using PythonPredictor from syndicai.py
    model = PythonPredictor([])
    response = model.predict(sample_json)

    # Print the response in the terminal
    if opt.response:
        print(response)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--text', default=sample_data, type=str, help='Sample input text')
    parser.add_argument('--response', default=True, type=bool, help='Print a response in the terminal')
    opt = parser.parse_args()
    run(opt)

It’s time to deploy RoBERTa!

Finally, we have reached the long-awaited last step: deployment on Syndicai. We will connect the git repository to the platform, create a deployment, and run it to validate that everything works correctly.

Connect the git repository to the platform

In order to connect a git repository, we need to log in to the Syndicai Platform, go to the Models page, and then click the Add model button. In the pop-up form, paste the following data and click Add:

The path needs to target the folder containing the model script & requirements.txt.

Adding a new model by connecting a git repository with a path.

After connecting a repository, you will be redirected to the Model Profile.

Model Profile — Connected repository with a path “my-model”

Create a deployment

On the Model Profile page, click the Deploy button. Fill out the deployment form with a Name and a branch. In general, a deployment is connected to a branch.

Creating a new deployment on the “main” branch

Then click Add, and you will be redirected to the Deployment Profile with the new release in the Releases tab. If you click on the recent release, e.g. #1, you will see logs from the building and starting process. The building phase covers wrapping the model with a web service and a Docker container, while the starting phase covers serving that Docker container in the cloud.

Deployment Profile — The release is still in progress…
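Conceptually, the web-service wrapper produced in the building phase resembles a small Flask app that loads the model once and exposes a predict endpoint. The sketch below only illustrates the idea (with a dummy predictor standing in for the real model); it is not Syndicai's actual code:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

class DummyPredictor:
    """Stand-in for the real PythonPredictor so this sketch runs anywhere."""
    def predict(self, payload):
        return payload["text"].replace("<mask>", "required")

model = DummyPredictor()  # initialized once, at startup

@app.route("/predict", methods=["POST"])
def predict():
    # Parse the JSON body and pass it through the model
    payload = request.get_json()
    return jsonify({"prediction": model.predict(payload)})
```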

Now, you need to wait for a couple of minutes to get the Running status of the deployment.

Run and Validate the deployment

When a deployment is Running, we can send requests to it and later integrate it, in the form of a REST API, with the product we described at the beginning.

We can go to Postman or the terminal and paste a complete request to test the deployment. However, this is not necessary, since you can quickly test the running deployment on the platform itself.
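For reference, such a request could be assembled as follows. The endpoint URL is a placeholder; the real one is shown in your Deployment Profile:

```python
import json

# Placeholder endpoint: copy the real URL from the "Validate & Integrate"
# section of your Deployment Profile.
API_URL = "https://<your-syndicai-endpoint>/predict"

payload = {"text": "What skillsets are <mask> to be a software Engineer."}

# An equivalent cURL command you could paste into a terminal:
curl_cmd = (
    f"curl -X POST {API_URL} "
    "-H 'Content-Type: application/json' "
    f"-d '{json.dumps(payload)}'"
)
print(curl_cmd)
```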

Go to the Overview tab in the Deployment Profile, scroll down to the Validate & Integrate section, and click the “pencil” icon in the dark box for sample data. Paste the JSON input shown earlier.

Deployment Profile — Add input data in the form of JSON in the “Validate & Integrate” section

Then click Update to save. Now, click Send a request to test the deployment. You should get the following response.

Deployment Profile — Response from the deployment

Remember that your deployment needs to have a Running status to work!


Congratulations! You were able to deploy a scalable machine learning model seamlessly, without worrying about the underlying infrastructure that generally requires specialized knowledge in DevOps and software engineering. Syndicai helped us go from a trained model to a scalable API in minutes. And the beautiful thing is that you don’t have to worry about maintenance.

I highly recommend joining the Syndicai Slack channel and exploring the platform on your own!



