Empowering Your Models: Uploading to Hugging Face and Creating Dynamic Gradio Apps
In my previous article, I delved into the world of large language models, exploring how to enhance their performance through a quantization approach. Now, the exciting part comes after your model is polished and ready: deploying it on Hugging Face and crafting a user-friendly interface for seamless testing.
Over the past few months, my focus has centered around these expansive language models. From the initial selection of the perfect model for our unique challenge to meticulously fine-tuning it for specific tasks, the journey has been both intriguing and rewarding. With the model now primed and prepared, the pivotal task at hand involves creating an intuitive testing interface. This interface needs to seamlessly integrate the model, provide a user-friendly experience, and ultimately host it for widespread access. Thankfully, Hugging Face effortlessly fulfills all these functions.
This article aims to guide you through the process of uploading your comprehensive model and shaping a user interface with the help of Gradio. So, let’s embark on this enlightening journey together.
Uploading the Model to Hugging Face
1. Logging in to Hugging Face:
To begin, you’ll need to log in to your Hugging Face account, which allows you to securely manage your models on the platform. Here is the Colab code; when you run it, it asks for the credentials of your Hugging Face account.
from huggingface_hub import notebook_login

username = ""  # Replace with your Hugging Face username
notebook_login()  # Prompts for your Hugging Face access token
2. Saving and Pushing the Fine-Tuned Model:
With the login completed, you can save your fine-tuned model and push it to your Hugging Face repository for publication. Replace “hugging_face_repo/model_name” with your own repository and model name. We can either use save_pretrained with push_to_hub=True or call model.push_to_hub directly.
# Option 1: save locally and push in one call
model.save_pretrained("hugging_face_repo/model_name", push_to_hub=True)
# Option 2: push directly to the Hub
model.push_to_hub("hugging_face_repo/model_name")
3. Reloading the Base Model and Tokenizer:
I loaded the model in its 4-bit quantized version, so I need to reload the base model and tokenizer from Hugging Face’s repository to get the config files of the original model. This is crucial for the subsequent steps. Adjust “base_model” to the appropriate model name.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("base_model", device_map={"": 0}, trust_remote_code=True, torch_dtype=torch.float16)
4. Loading the PEFT Model with the New Adapters:
Here I am using PEFT to load the base model together with my fine-tuned model’s adapters. This enhances the capabilities of the model by incorporating the new adapters trained to boost performance on my task. “hugging_face_repo/model_name” is the model that you just pushed to Hugging Face.
from peft import PeftModel

# Load the PEFT model, i.e. the base model with the new adapters
model = PeftModel.from_pretrained(
    model,
    "hugging_face_repo/model_name",
)
5. Loading the Tokenizer:
Load the tokenizer corresponding to your base model and push it to the same Hugging Face repository.
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("base_model", trust_remote_code=True)
tokenizer.push_to_hub("hugging_face_repo/model_name", use_auth_token=True)
6. Merging Adapters and Unloading:
The final step is to merge the adapters with the already-pushed model and upload the merged model to Hugging Face. After that, whenever we want to load the model, we can simply use AutoModelForCausalLM and point it at the model’s repository, as sketched after the code below.
model = model.merge_and_unload()  # Merge the adapters into the base model weights
model.push_to_hub("hugging_face_repo/model_name")
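For completeness, here is a minimal sketch of reloading the merged model from the Hub; the repository id is the placeholder used throughout this article, and the loading options mirror the ones from the earlier steps.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model straight from the Hub; no PEFT adapters are needed anymore
model = AutoModelForCausalLM.from_pretrained(
    "hugging_face_repo/model_name",
    device_map={"": 0},
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("hugging_face_repo/model_name")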
Building a Gradio App
Now that you’ve successfully fine-tuned and uploaded your model to Hugging Face’s repository, the next exciting phase involves creating a user-friendly interface to showcase the model’s capabilities. Gradio, a powerful Python library, makes this process incredibly intuitive and efficient. Let’s dive into the steps involved, with accompanying code and extra insights:
1. Understanding Gradio and Its Significance:
Gradio is a game-changer when it comes to creating interactive interfaces for machine-learning models. It allows us to bridge the gap between complex models and users who may not be familiar with coding or intricate model operations.
2. Installing Gradio:
First things first, you need to install the Gradio library if you haven’t already. This can be done with a simple pip command:
pip install gradio
3. Creating the Interface with Gradio:
Now, you are ready to craft a captivating interface for your model. I’ll start by importing the necessary libraries and defining a function that takes an input, tokenizes it, generates an output with my fine-tuned model, and decodes the result back into text. For faster inference, I am using torch’s inference mode.
import torch
import gradio as gr

def generate_response(input_text):
    # Tokenize the input, generate with the fine-tuned model, and decode the output
    inputs = tokenizer(input_text, return_tensors="pt").to(my_model.device)
    with torch.inference_mode():  # no gradient tracking -> faster inference
        output_ids = my_model.generate(**inputs, max_new_tokens=200)  # illustrative generation length
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
4. Defining Input and Output Components:
Gradio makes it straightforward to define input and output components. In this case, I’m setting up a simple textbox for users to input text and another textbox to display the generated response (recent Gradio versions expose these as gr.Textbox directly, rather than through the older gr.inputs and gr.outputs namespaces).
input_textbox = gr.Textbox(lines=5, label="Input Text")
output_text = gr.Textbox(label="Generated Response")
5. Creating the Gradio Interface:
Combining the input and output components, I can now create the Gradio interface using the function I defined earlier. In launch(), we can pass parameters such as share=True, which generates a shareable link, and debug=True, which surfaces any errors in the UI.
demo = gr.Interface(fn=generate_response, inputs=input_textbox, outputs=output_text)
demo.launch(share=True, debug=True)  # share=True gives a public link; debug=True surfaces errors
6. Sharing and Deploying:
Once I run the code, a local web app launches (or an inline one, if running on Colab), allowing users to input text and see the model’s generated response. But the magic doesn’t stop there: Gradio also provides an easy way to share and deploy this app online, making it accessible to anyone with a web connection.
7. Deploying on Hugging Face Spaces:
Go to your Hugging Face account, open the “New” menu, and select “Space” to create a new Space. Add the name of the Space, specify its license, select Gradio as the SDK, and choose between the private and public options. Under “Space hardware”, a free CPU is provided by default; you need to pay for more powerful hardware.
Once the Space is created, create an app.py file and paste the Gradio code there. As a best practice, create a requirements.txt file listing all the required libraries, and add additional files if needed, such as a model.py holding all the model-related functions or a style.css for a better UI design. After you commit the changes, app.py automatically starts running, and any errors can be seen in the logs section. A sketch of such an app.py follows below.
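Here is a hedged, minimal sketch of what such an app.py could look like; the repository id is the placeholder used earlier, and max_new_tokens is an illustrative setting. The accompanying requirements.txt would then list torch and transformers (Gradio itself comes pre-installed on Gradio Spaces).
# app.py -- minimal sketch for a Gradio Space (repository id is a placeholder)
import torch
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("hugging_face_repo/model_name")
model = AutoModelForCausalLM.from_pretrained("hugging_face_repo/model_name", torch_dtype=torch.float16)

def generate_response(input_text):
    # Tokenize, generate, and decode, mirroring the interface function defined earlier
    inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
    with torch.inference_mode():
        output_ids = model.generate(**inputs, max_new_tokens=200)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

gr.Interface(fn=generate_response, inputs=gr.Textbox(lines=5), outputs=gr.Textbox()).launch()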
We can change the hardware or make the Space private or public after creating it: go to the settings of that particular Space, where all these options are available.
Additional Insights:
- Custom Styling: Gradio lets us customize the appearance of our interface to match our preferences or brand identity. This can enhance the user experience and make the app more visually appealing.
- Interactive Visualizations: Gradio supports various output formats, including images and plots. This means we can create interactive visualizations that respond to user input, offering a dynamic and engaging experience.
- Multi-Model Interfaces: Gradio allows us to build interfaces with multiple models, enabling users to compare different models’ outputs side by side; a short sketch combining custom styling and a two-model comparison follows this list.
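The following is a small illustrative sketch, not a definitive recipe: generate_with_model_a and generate_with_model_b are hypothetical stand-ins for real model calls, a single custom CSS rule is applied via gr.Blocks, and the two outputs are laid out side by side.
import gradio as gr

# Hypothetical stand-ins -- replace with calls to your own models
def generate_with_model_a(text):
    return f"Model A says: {text}"

def generate_with_model_b(text):
    return f"Model B says: {text}"

def compare(text):
    # Return one response per output component, in order
    return generate_with_model_a(text), generate_with_model_b(text)

custom_css = ".gradio-container {background-color: #f7f7f9}"  # custom styling

with gr.Blocks(css=custom_css) as demo:
    prompt = gr.Textbox(lines=3, label="Prompt")
    with gr.Row():  # side-by-side outputs for model comparison
        out_a = gr.Textbox(label="Model A")
        out_b = gr.Textbox(label="Model B")
    gr.Button("Generate").click(compare, inputs=prompt, outputs=[out_a, out_b])

demo.launch()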
Conclusion:
As I conclude this rewarding journey, it’s evident that the fusion of Gradio’s interactive prowess and Hugging Face’s collaborative platform has brought forth a new era of AI engagement. From intricate model refinement to user-friendly interfaces, each phase has been driven by the shared goal of accessibility and impact.
Gradio’s elegance lies in its ability to render complex models into interactive interfaces. Through just a few lines of code, I’ve opened doors to user interaction, irrespective of coding skills. This coupling of finely-tuned models and interactive interfaces holds immense potential, fostering innovation and practical solutions.
Yet, this journey wasn’t solitary; Hugging Face played a pivotal role. Its repository enabled me to share my model globally, encouraging collaboration and knowledge exchange. The platform’s convenience in storing and publishing models bolsters AI progress by nurturing a community of like-minded researchers and developers.
As I wrap up, I’m exhilarated by the prospect of AI’s accessibility evolving further. Gradio and Hugging Face, united, have not only empowered me to curate, fine-tune, and exhibit my model but have also cultivated an avenue for interaction, exploration, and shared discovery. In this synergy, we’re shaping a future where AI is not just proficient but approachable to all.