Deploy a Hugging Face model at scale quickly and seamlessly using Syndicai
I’ve trained my model and need to deploy it!
Developing Machine Learning solutions comes with many complexities, especially in the deployment phase. In general, developers need to worry about designing the architecture, building the infrastructure, and taking care of environments and all the necessary dependencies. With tools like Flask and Docker, the whole process is much faster; however, it still requires a solid understanding of the topic. As a result, many AI creators stop at a trained model, since deployment is time-consuming and costly.
To solve that problem, I would like to explore a new solution called Syndicai. It makes the ML deployment phase easier and faster by managing your infrastructure for you. The scope of this article will therefore focus on the Machine Learning deployment phase, where we will explore the Syndicai approach to going from a model to a production-ready API.
To make the whole workflow interesting, let me start with a use case…
Say we are building a new product for an EduTech company, and we need to create a new feature that fills in the blank space in sentences. In addition, the complete solution needs to be exposed as an API. So all we will need to do is deploy our model to Syndicai and integrate that API into our product. Easy peasy, right? Well, we’re about to find out shortly.
If you don’t want to go through the whole article, you can just fork the ready-made code and jump straight to the deployment section.
The first step: prepare a model for deployment
The first thing we need is a machine learning model that is already trained. For this purpose, we will use the RoBERTa masked language modeling model from Hugging Face. It comes with pre-trained weights and is one of the most popular models in the hub.
For those who don’t know, Hugging Face is an open-source hub with a library of state-of-the-art ML models. Most of those models operate in the spaces of natural language processing (NLP), audio, and computer vision, and they are written in PyTorch, TensorFlow, and JAX.
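If you have never used the hub, here is a quick illustration of what it gives you. This snippet is not part of our deployment code; it only assumes the transformers library is installed, and it loads a ready-made fill-mask pipeline backed by the same roberta-base model we will deploy.

from transformers import pipeline

# Load a fill-mask pipeline backed by roberta-base from the Hugging Face hub
unmasker = pipeline("fill-mask", model="roberta-base")
# Each prediction comes back with a score and a candidate token for <mask>
print(unmasker("The goal of life is <mask>."))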
Create a git repository
Syndicai Platform requires us to have a git repository with requirements.txt and syndicai.py. Typically, we would start a new project, but since the model is already on GitHub, we can simply fork the repository. You can access it here.
Add requirements.txt
First, let’s add the requirements.txt file to our repository. It contains all libraries and frameworks required to recreate the model’s environment in the cloud.
Please have a look at the file below.
numpy==1.19.4
transformers==4.15.0
torch==1.10.2
Note that all libraries are pinned to exact versions!
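If you developed the model in a dedicated virtual environment, one simple way to capture those exact versions is pip freeze; just trim the output down to the libraries the model actually needs.

pip freeze > requirements.txt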
Add syndicai.py
Next, add syndicai.py, a Python script that contains the PythonPredictor class with an __init__ constructor and a predict method.
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

class PythonPredictor:
    def __init__(self, config):
        # Runs once, when the deployment is created: pick a device and load weights
        device = "cuda" if torch.cuda.is_available() else "cpu"
        print(f"using device: {device}")
        self.device = device
        self.model = RobertaForMaskedLM.from_pretrained('roberta-base', return_dict=True)
        self.tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
The constructor is the best place to initialize our weights since Syndicai runs it only once, when a deployment is created. The predict method is responsible for taking the input, passing it through the model, and sending a response. It runs every time we send a request to the model, so from a technical point of view it needs to be fast and lightweight. Therefore, in our case, it only tokenizes the input data, extracts the masked tokens, and processes the output using the Hugging Face API.
In addition, the predict method has only one required argument (payload), which gives us access to our input data as a dictionary when we send a request with a JSON body that looks as follows.
{"text": "My name is Nwoke Tochukwu, I'm a Machine Learning Engineer What skillsets are <mask> to be a software Engineer."}
Check the model before deployment
Check your model locally before deploying by executing the following command: python3 run.py. It will send sample JSON data through your model and test whether everything is okay.
import argparse

from transformers import RobertaTokenizer
from syndicai import PythonPredictor

# tokenizer = RobertaTokenizer.from_pretrained('roberta-base')

sample_data = (
    "My name is Nwoke Tochukwu, I'm a Machine Learning Engineer "
    "What skillsets are <mask> to be a software Engineer."
)


def run(opt):
    # Wrap the input text in a JSON-style dictionary
    sample_json = {"text": opt.text}

    # Run the model using PythonPredictor from syndicai.py
    model = PythonPredictor([])
    response = model.predict(sample_json)

    # Print the response in the terminal
    if opt.response:
        print(response)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--text', default=sample_data, type=str, help='Sample input text')
    parser.add_argument('--response', default=True, type=bool, help='Print the response in the terminal')
    opt = parser.parse_args()
    run(opt)
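Since the script exposes a --text flag, you can also test it with your own sentence, for example:

python3 run.py --text "The capital of France is <mask>."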
It’s time to deploy RoBERTa!
Finally, we arrive at the long-awaited last step: deployment on Syndicai. We will connect the git repository to the platform, create a deployment, and run it to validate that everything works correctly.
Connect the git repository to the platform
In order to connect a git repository, we need to log in to the Syndicai Platform, go to the Models page, and then click the Add model button. In the pop-up form, paste the following data and click Add:
- git repository: https://github.com/Tob-iee/syndicai-tutorial/
- path: my-model/
The path needs to point to the folder containing syndicai.py and requirements.txt.
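After forking, the relevant part of the repository should look roughly like this (the exact layout, including where run.py lives, is assumed from the path above):

syndicai-tutorial/
└── my-model/
    ├── requirements.txt
    ├── run.py
    └── syndicai.py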

After connecting a repository, you will be redirected to the Model Profile.

Create a deployment
On the Model Profile page, click the Deploy button. Fill out the deployment form with a Name and a branch. In general, a deployment is connected to a branch.

Next, click Add, and you will be redirected to the Deployment Profile with the new release in the Releases tab. If you click on the recent release, e.g. #1, you will see logs from the building and starting process. The building phase covers wrapping the model in a web service and a Docker container, while the starting phase covers serving that Docker container in the cloud.

Now, you need to wait a couple of minutes for the deployment to reach the Running status.
Run and Validate the deployment
Once a deployment is Running, we can send requests to it and later integrate it, in the form of a REST API, with the product we described at the beginning.
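For reference, a request from Python could look roughly like the sketch below; the endpoint URL is a placeholder that you would copy from the Validate & Integrate section of your Deployment Profile.

import requests

# Placeholder endpoint; copy the real URL from your Deployment Profile
URL = "https://<your-deployment-endpoint>"

payload = {
    "text": "My name is Nwoke Tochukwu, I'm a Machine Learning Engineer "
            "What skillsets are <mask> to be a software Engineer."
}

response = requests.post(URL, json=payload)
print(response.json())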
We can go to Postman or the terminal and paste a complete request to test the deployment. However, this is not strictly necessary, since you can quickly test the running deployment on the platform itself.
Go to the Overview tab in the Deployment Profile, scroll down to the Validate & Integrate section, and click on the “pencil” icon in the dark box for sample data. Paste the sample input JSON shown earlier.

Then click Update to save. Now, click Send a request to test the deployment. You should get a response with the model’s predictions for the masked token.

Remember that your deployment needs to have a Running status to work!
Conclusion
Congratulations! You were able to deploy a scalable machine learning model seamlessly, without worrying about the underlying infrastructure that generally requires specialized knowledge of DevOps and Software Engineering. Syndicai helped us go from a trained model to a scalable API in minutes. And the beautiful thing is that you don’t have to worry about maintenance.
I highly recommend joining the Syndicai Slack channel and exploring the platform on your own!