TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Deploying Gradio App on Spaces Using DagsHub: A Beginner’s Tutorial

8 min read · Mar 1, 2022


Cover by author

The demand for MLOps tools is high as companies are looking for easy-to-deploy solutions for their complex machine learning (ML) models — AIM. To make things simple and effective, we will be integrating DVC, DagsHub, Gradio, and Hugging Face Spaces into our collaborative ML project.

In this tutorial, we will learn about end-to-end machine learning integrations and use them to build an image generation web application. We will use Gradio to develop model inference for the open-source project SavtaDepth, giving it a user-friendly web interface. SavtaDepth converts 2D images into 3D by predicting their depth dimension. This is a beginner-friendly tutorial, so along the way we will learn tricks that simplify the deployment process for complex ML models.

SavtaDepth Project

SavtaDepth is a collaborative open-source data science project for monocular depth estimation. It takes a 2D image and estimates the depth of objects. The depth estimation is used for creating 3D images, 3D mapping, security surveillance, and self-driving cars. The goal in monocular depth estimation is to predict the depth value of each pixel or infer depth information, given only a single RGB image as input — (keras.io).

The project uses the U-Net model, with the last layer changed from object segmentation to depth estimation. The model was trained on NYU Depth Dataset V2 under the CC BY 4.0 license. The dataset contains video sequences from a variety of indoor scenes as recorded by both the RGB and depth cameras of the Microsoft Kinect. The project is still active, so contributors can help improve the results. If you are interested in the project, please read the contribution guide.

Image by author | Monocular Depth Estimation

Integrations

In this section, we will learn about various integrations that will make our project unique. We will also learn how these tools fit into our project ecosystem.

  • DVC is an open-source version control system for machine learning projects. It provides data and model versioning, model metrics monitoring, and experiment reproducibility. It works as an extension to Git, so learning DVC will come naturally.
  • DagsHub is a community-first platform like GitHub for machine learning and data science projects. It enables its users to leverage popular open-source tools to version datasets & models, track experiments, label data, and visualize results.
  • Gradio is the fastest way to create a user-friendly web interface for your machine learning models. You can build and share your app within minutes. It also comes with FastAPI support which means you can access the model anywhere.
  • Spaces is a new machine learning application sharing platform by Hugging Face. You can build Streamlit, Gradio, or HTML web apps and deploy them to Spaces with a few lines of code.

In this project, we will use DVC for data and model versioning, DagsHub for remote storage, Gradio for the ML application interface, and Spaces as the web server.

Gradio WebApp

Gradio is a lightweight, powerful library for building web interfaces for machine learning demos. In this section, we are going to learn how to accept an image, run model inference using FastAI, and output the generated result. To run the web app without dependency issues, we first need to install PyTorch, FastAI, and Gradio.
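For example (versions unpinned here; the CPU-specific pins come later in the requirements.txt section):

```shell
pip install torch torchvision fastai gradio
```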

Model Inference

To make things simple, I have added only the important part and removed the data loader. You can learn more about the image data loader and create_data function by following the SavtaDepth notebook.

We simply use FastAI’s unet_learner function to create the model architecture and then load the latest model checkpoint. Finally, we create a gen function that takes an image as a NumPy array and runs a prediction to generate a black-and-white depth image.
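The original code gist is not reproduced here; the sketch below shows the shape of the idea. Function names, the `resnet34` backbone, `n_out=1`, and the checkpoint path are assumptions for illustration, not the project’s exact code:

```python
import numpy as np

def to_depth_image(pred):
    # Normalize a single-channel prediction to an 8-bit grayscale depth map.
    pred = np.asarray(pred, dtype=np.float32)
    lo, hi = float(pred.min()), float(pred.max())
    rng = (hi - lo) or 1.0  # guard against constant input
    return (((pred - lo) / rng) * 255).astype(np.uint8)

def build_gen(dls, checkpoint="model"):
    # dls: the project's FastAI DataLoaders (built by create_data in the repo).
    # Creates the U-Net learner, loads the checkpoint, and wraps prediction
    # into the function Gradio will call.
    from fastai.vision.all import unet_learner, resnet34  # requires fastai
    learn = unet_learner(dls, resnet34, n_out=1)
    learn.load(checkpoint)

    def gen(img):
        # img arrives as a NumPy array from the Gradio image input.
        pred = learn.predict(img)[0]
        return to_depth_image(np.array(pred))

    return gen
```

See the SavtaDepth notebook for the real data loader and training setup.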

Web Interface

Creating a Gradio web interface is easy. We just need to customize the gradio.Interface function for our use case, and voilà! Your web app is ready.

Let’s look closely at the parameters used in the Interface function:

  • The first parameter is fn. We need to provide it with the model inference function, which takes the input and returns the output; in our case, the gen function.
  • The second parameter is inputs. We resize the input image to 640×480 and receive it as a NumPy array. Gradio automatically does most of the image preprocessing for us.
  • The third parameter is outputs. In our case, it’s an image. All of the post-processing is done by Gradio automatically, so we do not have to worry about using the PIL package to display images.
  • title takes a string to display the app name or title.
  • description takes plain text, Markdown, or HTML to display a subheading or image under the title.
  • article is the footer of the app, where you write application information such as links to your research and project repository. It also takes plain text, Markdown, or HTML.
  • examples provides sample inputs so that we don’t have to find images to run the app. In our case, we created a folder and copied two images from the test subset of the dataset. examples takes an array of relative file paths.
  • theme is for customizing the UI; you can find more about themes in the docs.
  • allow_flagging is a handy feature that helps you keep track of model performance. Users can flag wrong predictions so that the developer can improve the model based on the feedback.
  • enable_queue prevents long-running inference requests from timing out.
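Putting the parameters together, here is a hedged sketch of the Interface call. The parameter names follow the Gradio 2.x API used at the time (newer releases have renamed or removed some of them), and the gen function below is a self-contained stand-in for the real inference function:

```python
import numpy as np

def gen(img):
    # Stand-in for the SavtaDepth inference function: returns a grayscale
    # version of the input so this sketch runs without the model.
    img = np.asarray(img, dtype=np.float32)
    gray = img.mean(axis=-1) if img.ndim == 3 else img
    return gray.astype(np.uint8)

# Keyword arguments for gradio.Interface, mirroring the parameter list above.
# allow_flagging/enable_queue are Gradio 2.x names.
interface_kwargs = dict(
    fn=gen,
    inputs="image",  # in 2.x: gradio.inputs.Image(shape=(640, 480))
    outputs="image",
    title="SavtaDepth WebApp",
    description="Monocular depth estimation: turn a 2D image into a depth map.",
    article="<p>Links to the research and project repository go here.</p>",
    examples=[["examples/1.jpg"], ["examples/2.jpg"]],  # relative file paths
    theme="default",
    allow_flagging="manual",
    enable_queue=True,
)

# To launch (requires gradio installed):
# import gradio as gr
# gr.Interface(**interface_kwargs).launch()
```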

In the final version, I made some changes to description and article using HTML. Feel free to check out my code.

Gif by author

Deployment

In this section, we are going to learn how to deploy our Gradio app to Hugging Face Spaces. For that, we need to create a new Space on the Hugging Face website and then make changes to the savta_app.py, README.md, and requirements.txt files.

Hugging Face Spaces

Spaces is Hugging Face’s machine learning app sharing platform, where people can build apps on various Python web frameworks and deploy them using simple Git commands. You can also browse featured Spaces and experience state-of-the-art ML models in action.

Image from Spaces — Hugging Face

Before we start with deployment, we need to create a new Space, give it a name, choose a license, and select an SDK. After that, we can clone the Space or add it as a remote to our current Git repository.

Image by author

Setting Up DVC

Integrating DVC is easy with Spaces. We just need to run a shell script within the main Python file. The code below only pulls a model and a few samples from the train and test dataset. In the end, it removes the .dvc folder to optimize the storage. Make sure you are adding this code before the model inference function.
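The original snippet is an embedded gist; the following is a minimal sketch of the idea, assuming the DagsHub DVC remote is already configured in the repository. The file paths are illustrative, not the project’s exact ones:

```python
import subprocess

def bootstrap_dvc():
    # Run once at app start-up, before the model inference code.
    # Pulls only the model checkpoint and a few sample images, then
    # removes the .dvc folder to free Space storage.
    commands = [
        "dvc pull models/model.pth src/data/processed/test -f",
        "rm -rf .dvc",
    ]
    for cmd in commands:
        subprocess.run(cmd, shell=True, check=False)
```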

Custom Environment

For deploying the Gradio app to Spaces, we need to make a few changes to avoid errors and dependency issues.

Adding Hugging Face Remote

First, we need to add our Space remote to the current project. The Space remote address follows the pattern https://huggingface.co/spaces/&lt;username&gt;/&lt;space-name&gt;.
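For example (demonstrated here in a throwaway repository; in practice run the git remote add line inside your project repository, with your own user and Space names):

```shell
# Throwaway repo so the command can be demonstrated safely.
cd "$(mktemp -d)"
git init -q
git remote add space "https://huggingface.co/spaces/<user>/<space-name>"
git remote -v
```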

README.md

Then go to your README.md file and add the metadata in the form of YAML. This metadata tells Spaces the location of the app file, the emoji for the cover, the color gradient for the app thumbnail, the SDK, and license information. You can customize the color and emoji to make your thumbnail stand out.
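The metadata block sits at the very top of README.md and looks like this (the values shown are illustrative):

```yaml
---
title: SavtaDepth
emoji: 🔮
colorFrom: blue
colorTo: green
sdk: gradio
app_file: savta_app.py
pinned: false
license: mit
---
```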

requirements.txt

We are limited to a CPU, and to optimize storage, we will use the CPU-only version of PyTorch.

Image from Get Started | PyTorch

We will include in the requirements.txt file only the packages necessary for running model inference and the web interface.
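A requirements.txt along these lines works; the version pins are illustrative, and the -f line points pip at the CPU-only PyTorch wheel index:

```text
-f https://download.pytorch.org/whl/cpu/torch_stable.html
torch==1.10.2+cpu
torchvision==0.11.3+cpu
fastai
gradio
dvc
```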

Finalizing

After making all the changes, it’s time to commit and push the code to the Space remote server. The initial remote Space has a README.md and .gitattributes, so to avoid conflicts, we will use the -f flag. We will push all the files from the local master branch to the remote main branch using master:main.

Warning: Please use the -f flag only once, at the start; avoid using it afterward, as it will override other people’s work.
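The push can be demonstrated end-to-end against a local bare repository standing in for the Space remote; only the final git push line is what you actually run in practice:

```shell
# Set up a throwaway repo plus a bare repo that stands in for the Space.
cd "$(mktemp -d)"
git init -q -b master app
git init -q --bare space.git
cd app
git config user.email you@example.com
git config user.name you
echo "print('app')" > savta_app.py
git add . && git commit -q -m "deploy savta app"
git remote add space ../space.git
# The line that matters: -f only for the very first push;
# master:main maps the local master branch onto the Space's main branch.
git push -q -f space master:main
```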

After pushing the code, you will see a “Building” sign on your app. It will take approximately 3 minutes to build.

Image by author

Congratulations, your app is successfully deployed, and you can share it with friends and colleagues.

Image by author | Hugging Face Space

Bonus

The bonus part is for MLOps enthusiasts who are always eager to learn new ways to integrate tools and databases. In this section, we are going to integrate a Hugging Face Dataset to collect all the flags. Each flag will include the input image, the output image, and a CSV file. The flag option helps us track model performance, and we can later use this dataset to improve the model.

We need to add the code below to the savta_app.py file for the integration to work. It uses HuggingFaceDatasetSaver to create and update the flag dataset. The function requires two parameters: HF_TOKEN, which you can find in your settings, and the name of the dataset. Finally, you pass the object to flagging_callback in the Gradio Interface function.
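A hedged sketch of that wiring is below. HuggingFaceDatasetSaver is the flagging callback from the Gradio 2.x/3.x API used at the time; newer Gradio versions have changed this API, and the dataset name is the one used in the article:

```python
import os

# HF_TOKEN is set as an environment variable in the Space settings.
HF_TOKEN = os.environ.get("HF_TOKEN", "")

def make_flag_callback(dataset_name="savtadepth-flags"):
    # Returns the flagging callback that pushes each flag (input image,
    # output image, CSV row) to a dataset on the Hugging Face Hub.
    import gradio as gr  # imported lazily; requires Gradio 2.x/3.x
    return gr.HuggingFaceDatasetSaver(HF_TOKEN, dataset_name)

# In the Interface call:
# gr.Interface(..., allow_flagging="manual",
#              flagging_callback=make_flag_callback())
```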

You also need to set the HF_TOKEN environment variable in Space settings. Just copy and paste the user access token.

Image by author

After that, go to the app and flag a few predictions so that you can see the flagged dataset under your profile. In our case, it’s savtadepth-flags, which can be accessed publicly.

Image from savtadepth-flags

Conclusion

Hugging Face Spaces provides an easy-to-use deployment platform where data scientists can test and share machine learning models through a user-friendly interface. If you are new to machine learning and want to experience end-to-end product development, try developing your app with Gradio, integrating it with DagsHub, and finally deploying it to Spaces. It will also help you build a strong portfolio where you can showcase your experience deploying machine learning models.

In this tutorial, we have learned how to use Git, DVC, DagsHub, Gradio, and Hugging Face Space to create and deploy complex machine learning applications. It is your first step toward becoming an MLOps engineer.

Project Resources:

Reference to dataset: Silberman, N., Hoiem, D., Kohli, P., and Fergus, R. (2012). "Indoor Segmentation and Support Inference from RGBD Images," in ECCV.


Written by Abid Ali Awan

I love building machine learning solutions and write blogs on Data Science. abid.work