Build an Interactive Gradio App for Python LLMs and FastAPI Microservices in less than 2 minutes!

Siddharth Verma

A Gradio web app for your ML/LLM project: prototyping in less than 2 minutes.

In the world of Machine Learning (ML) and Large Language Models (LLMs), building user-friendly, interactive applications is critical to demonstrating the capabilities of your models or services. Often, developers need an interface to interact with their models in real-time without having to build complex frontend applications. This is where Gradio comes in handy.

Gradio is a powerful Python library that lets you quickly create web-based user interfaces (UIs) for your ML models or APIs with minimal effort. Whether you’re working with a language model like GPT, a machine learning algorithm, or a FastAPI microservice, Gradio can be a simple and effective tool to enhance user interaction.

In this guide, we’ll walk you through how to use Gradio to create an interactive web app that can be integrated with an LLM, ML script, or a FastAPI microservice.

Why Use Gradio for LLMs and ML Projects?

Gradio is ideal for creating a user interface for ML models and services for several reasons:

  • Quick setup: With just a few lines of code, you can create a fully functioning web app for your machine learning models.
  • Real-time interaction: You can test and interact with your models in real-time, allowing for immediate feedback.
  • Easy deployment: Gradio apps can be deployed locally or hosted on a public server for easy sharing.
  • Customizable UI: Gradio provides simple ways to customize the user interface to fit your project’s needs.

Now, let’s dive into a scenario where you might use Gradio in combination with a backend FastAPI service for LLM or other ML-based models.

Scenario: Using Gradio with an LLM Model or FastAPI Microservice

The Scenario

Imagine you’re working on a project involving a large language model (LLM) that is powered by a FastAPI microservice. The FastAPI service runs on a backend server, processing user messages and returning model responses. You want to build an interactive web interface where users can type in messages and receive responses from the LLM in real-time.

This is a perfect scenario for Gradio, where it acts as a frontend that communicates with your FastAPI backend. You can use the LLM model hosted on FastAPI and handle various actions like sending messages, confirming actions, and getting responses, all within the Gradio app.

Here’s a step-by-step guide on how you can create such an app.

Setting Up the Project

To get started, ensure you have the necessary Python packages installed:

pip install gradio requests fastapi uvicorn

Basic Project Structure

Your project will consist of two main components:

  1. FastAPI Backend: A microservice that handles session creation, message exchanges, and action handling.
  2. Gradio Frontend: A web-based user interface that interacts with the FastAPI backend.

In this tutorial, we’ll focus on building the Gradio app and how it interacts with the backend API.

Building Backend API Interactions with FastAPI

Let’s assume you have a FastAPI backend running at http://0.0.0.0:8080. The backend has endpoints for:

  • Creating a session
  • Sending user messages
  • Accepting or rejecting actions
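Before writing the FastAPI code, it can help to pin down the request/response contract in plain Python. The sketch below mirrors the endpoints as ordinary functions (a hypothetical in-memory stand-in, not the real service) so the payload shapes the Gradio client will send and receive are explicit:

```python
import json

# Hypothetical stand-in for the backend, mirroring the endpoints as plain
# functions. The payload shapes match what the Gradio client exchanges.

def create_session(agent_id):
    # POST /api/v1/agents/{agent_id}/sessions -> {"id": ...}
    return {"id": "session123"}

def send_message(session_id, payload):
    # POST /api/v1/sessions/{session_id}/messages with {"message": ...}
    return {"response": f"Bot reply to '{payload['message']}'", "state": "ACTIVE"}

def accept_action(session_id):
    # POST /api/v1/sessions/{session_id}/actions/accept
    return {"response": "Action accepted"}

def reject_action(session_id, payload):
    # POST /api/v1/sessions/{session_id}/actions/reject with {"reason": ...}
    return {"response": f"Action rejected because: {payload['reason']}"}

session = create_session("your-agent-id")
reply = send_message(session["id"], {"message": "hello"})
print(json.dumps(reply))  # prints {"response": "Bot reply to 'hello'", "state": "ACTIVE"}
```

Once this contract is clear, translating it into FastAPI route handlers is mostly mechanical.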

Below is a simplified example of what your FastAPI backend might look like:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Message(BaseModel):
    message: str

class RejectRequest(BaseModel):
    reason: str

class ActionResponse(BaseModel):
    response: str
    state: str

@app.post("/api/v1/agents/{agent_id}/sessions", status_code=201)
def create_session(agent_id: str):
    return {"id": "session123"}

@app.post("/api/v1/sessions/{session_id}/messages")
def send_message(session_id: str, message: Message):
    # Simulate an LLM response
    return {"response": f"Bot reply to '{message.message}'", "state": "ACTIVE"}

@app.post("/api/v1/sessions/{session_id}/actions/accept")
def accept_action(session_id: str):
    return {"response": "Action accepted"}

@app.post("/api/v1/sessions/{session_id}/actions/reject")
def reject_action(session_id: str, body: RejectRequest):
    # The reason arrives in the JSON body, matching what the client sends
    return {"response": f"Action rejected because: {body.reason}"}

Now that we have our backend API set up, let’s build the Gradio frontend that interacts with it.

Creating the Gradio Interface

Here’s the Gradio app that will act as the frontend for our FastAPI backend. The main components of the app include a chatbot for interactions, buttons for accepting or rejecting actions, and a textbox for input.

Step 1: Set Up API Communication

First, we define the API endpoints and create functions to interact with the FastAPI backend.

import requests
import gradio as gr

BASE_URL = 'http://0.0.0.0:8080'
AGENT_ID = 'your-agent-id'
HEADERS = {'Content-Type': 'application/json'}

def create_session():
    url = f"{BASE_URL}/api/v1/agents/{AGENT_ID}/sessions"
    response = requests.post(url, headers=HEADERS)
    if response.status_code == 201:
        return response.json().get("id")
    return None

def send_message(session_id, message):
    url = f"{BASE_URL}/api/v1/sessions/{session_id}/messages"
    payload = {'message': message}
    response = requests.post(url, headers=HEADERS, json=payload)
    return response.json()

def accept_action(session_id):
    url = f"{BASE_URL}/api/v1/sessions/{session_id}/actions/accept"
    response = requests.post(url, headers=HEADERS)
    return response.json()

def reject_action(session_id, reason):
    url = f"{BASE_URL}/api/v1/sessions/{session_id}/actions/reject"
    payload = {'reason': reason}
    response = requests.post(url, headers=HEADERS, json=payload)
    return response.json()

Step 2: Design the Gradio Interface

Now, let’s build the Gradio interface where users can interact with the chatbot.

with gr.Blocks() as demo:
    gr.Markdown("# Chat with an LLM via FastAPI")
    chatbot = gr.Chatbot()
    state = gr.State({})  # start with an empty dict so membership checks work

    with gr.Row():
        user_input = gr.Textbox(show_label=False, placeholder="Type your message here...")
        submit_button = gr.Button("Send")

    with gr.Row(visible=False) as action_buttons:
        accept_button = gr.Button("Accept")
        reject_button = gr.Button("Reject")

    def handle_message(user_message, chat_history, state):
        chat_history = chat_history or []
        if "session_id" not in state:
            state["session_id"] = create_session()
        session_id = state.get("session_id")
        response_data = send_message(session_id, user_message)
        bot_reply = response_data.get("response", "No response")
        chat_history.append((user_message, bot_reply))
        if response_data.get("state") == "WAITING_FOR_CONFIRMATION":
            return chat_history, state, gr.update(visible=True)
        return chat_history, state, gr.update(visible=False)

    submit_button.click(
        handle_message,
        inputs=[user_input, chatbot, state],
        outputs=[chatbot, state, action_buttons],
    )

    def handle_accept(chat_history, state):
        session_id = state.get("session_id")
        response = accept_action(session_id)
        chat_history.append(("Action accepted", response.get("response")))
        return chat_history, state, gr.update(visible=False)

    def handle_reject(chat_history, state):
        session_id = state.get("session_id")
        response = reject_action(session_id, "User rejected the action")
        chat_history.append(("Action rejected", response.get("response")))
        return chat_history, state, gr.update(visible=False)

    accept_button.click(handle_accept, inputs=[chatbot, state], outputs=[chatbot, state, action_buttons])
    reject_button.click(handle_reject, inputs=[chatbot, state], outputs=[chatbot, state, action_buttons])

demo.launch()

1. Import Gradio and Initialize Blocks

To begin with, you need to import Gradio and use the Blocks() container. Gradio’s Blocks is a layout API that allows you to arrange multiple UI components in a structured manner, making it easy to build complex user interfaces.

import gradio as gr
with gr.Blocks() as demo:
  • gr.Blocks(): This is a flexible container that allows for advanced layouts like rows, columns, and nested components. It helps in defining multiple UI components like buttons, textboxes, and chatbots in a well-organized fashion.
  • as demo: The demo is a reference to the Gradio app, which we’ll use later when launching the application.

2. Adding Markdown for Titles and Descriptions

You can add a title or introductory text to your app using Gradio’s Markdown() component. This is helpful for providing users with some context or instructions on how to use the app.

gr.Markdown("# Chat with an LLM via FastAPI")
  • gr.Markdown(): This is a Gradio component that allows you to include formatted text (using Markdown). In this case, we use it to display a header for the app.
  • The # symbol creates an <h1> header, which will be displayed at the top of the app.

3. Creating a Chatbot Component

The core component of this interface is the Chatbot(). This is where users will see the conversation history with the model or backend service.

chatbot = gr.Chatbot()
  • gr.Chatbot(): This component is used to simulate a chat conversation between the user and the backend service (LLM or any other model). It renders as a box where user and bot messages are displayed.
  • By default, the chatbot component displays each turn as a pair: the user’s message and the bot’s response.

The chatbot will be updated dynamically as the user interacts with the app, and new messages are added to the conversation.

4. Creating a State to Manage Data

In Gradio, the State() component is used to persist information between interactions. Here, we use it to store data such as the session ID and any intermediate state (e.g., whether a user needs to confirm an action).

state = gr.State({})
  • gr.State(): This creates a state object that persists across user interactions. Initializing it with an empty dict means handlers can safely check membership (e.g., "session_id" in state) on the very first call. Here it stores the current session ID, which keeps track of the conversation context with the backend.
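One subtlety worth noting: if the state starts out as None (the default when gr.State() is created with no initial value), a membership check like "session_id" in state raises a TypeError. Initializing with an empty dict avoids this. A plain-Python illustration (no Gradio required):

```python
# gr.State() with no argument hands your handler None on the first call;
# membership tests on None fail, while an empty dict behaves as expected.

def has_session(state):
    return "session_id" in state

empty_state = {}                   # what gr.State({}) would hand the handler
print(has_session(empty_state))    # prints False

try:
    has_session(None)              # what gr.State() with no initial value would hand it
except TypeError as e:
    print("TypeError:", e)
```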

5. Adding a Textbox for User Input

We need a place for users to input their messages. The Textbox() component provides an easy way to capture text input from users.

user_input = gr.Textbox(show_label=False, placeholder="Type your message here...")
  • gr.Textbox(): This component is used to create an input field where users can type their messages.
  • show_label=False: This hides the label for the textbox to create a cleaner UI.
  • placeholder="Type your message here...": This is the text that appears inside the textbox before the user types anything.

This component will capture the user’s input and send it to the backend (FastAPI service or LLM) when the user submits it.

6. Adding a Submit Button

Once the user types a message, they need a way to submit it. For this, we add a button:

submit_button = gr.Button("Send")
  • gr.Button("Send"): This component creates a clickable button labeled “Send”. When clicked, it triggers the action to send the user’s message to the backend.

This button will be linked to a function that processes the message and updates the chatbot conversation.

7. Adding Accept/Reject Buttons (Optional)

In some cases, your backend might ask the user for confirmation (e.g., for an action approval). You can add two buttons — Accept and Reject — that become visible only when needed:

with gr.Row(visible=False) as action_buttons:
    accept_button = gr.Button("Accept")
    reject_button = gr.Button("Reject")
  • gr.Row(): This is a layout container that arranges components (buttons) in a horizontal row.
  • visible=False: Initially, the buttons are hidden. They only become visible when the backend requires confirmation from the user.
  • gr.Button("Accept") and gr.Button("Reject"): These buttons let the user either accept or reject an action. When clicked, they trigger backend functions that handle these actions (e.g., sending an accept or reject API call to FastAPI).

8. Defining the Message Handling Logic

Once the user sends a message via the Textbox and clicks the "Send" button, we need to handle this interaction. This is where we define how to update the chatbot conversation and communicate with the backend.

def handle_message(user_message, chat_history, state):
    chat_history = chat_history or []
    if "session_id" not in state:
        state["session_id"] = create_session()  # Create a new session if it doesn't exist
    session_id = state.get("session_id")
    response_data = send_message(session_id, user_message)  # Send the message to the backend
    bot_reply = response_data.get("response", "No response")  # Get the bot's response
    chat_history.append((user_message, bot_reply))  # Update the conversation history
    # Show accept/reject buttons only when the backend asks for confirmation
    if response_data.get("state") == "WAITING_FOR_CONFIRMATION":
        return chat_history, state, gr.update(visible=True)
    return chat_history, state, gr.update(visible=False)
  • handle_message(): This function handles the user’s input and interacts with the backend.
  • If there’s no session yet, it creates one using create_session().
  • The message is sent to the backend via send_message(), which calls the FastAPI service.
  • The response from the backend (e.g., from the LLM) is extracted, and the chatbot’s history is updated to show the conversation.
  • If the backend requires user confirmation (e.g., for an action), the accept/reject buttons become visible.
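The control flow above can be exercised without Gradio or a live backend by stubbing the two API helpers. The sketch below uses hypothetical stubs and replaces gr.update(visible=...) with a plain boolean flag, just to show how the confirmation branch behaves:

```python
# Stubbed versions of the API helpers so the handler logic runs stand-alone.
def create_session():
    return "session123"

def send_message(session_id, message):
    # Pretend the backend asks for confirmation on messages containing "delete"
    if "delete" in message:
        return {"response": "Confirm deletion?", "state": "WAITING_FOR_CONFIRMATION"}
    return {"response": f"Bot reply to '{message}'", "state": "ACTIVE"}

def handle_message(user_message, chat_history, state):
    chat_history = chat_history or []
    if "session_id" not in state:
        state["session_id"] = create_session()
    response_data = send_message(state["session_id"], user_message)
    chat_history.append((user_message, response_data.get("response", "No response")))
    # In the real app this is gr.update(visible=...); here it's a bare flag.
    show_buttons = response_data.get("state") == "WAITING_FOR_CONFIRMATION"
    return chat_history, state, show_buttons

history, state, show = handle_message("hello", [], {})
print(show)   # prints False
history, state, show = handle_message("delete my files", history, state)
print(show)   # prints True
```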

9. Connecting the Components with Interactivity

Next, we link the submit_button to the handle_message() function, so when a user clicks the "Send" button the message gets processed. (To also submit when the user presses Enter, you would additionally wire user_input.submit() to the same handler.)

submit_button.click(
    handle_message,
    inputs=[user_input, chatbot, state],
    outputs=[chatbot, state, action_buttons],
)
  • submit_button.click(): This is how Gradio connects buttons to their corresponding functions.
  • inputs: Specifies which components to pass as inputs to the function. Here, we pass the user_input (text typed by the user), chatbot (current chat history), and state (session info).
  • outputs: Specifies which components are updated by the function. In this case, the chatbot (conversation history) and action_buttons (whether the buttons should be shown) will be updated.

Similarly, we connect the accept and reject buttons to their respective handlers:

accept_button.click(handle_accept, inputs=[chatbot, state], outputs=[chatbot, state, action_buttons])
reject_button.click(handle_reject, inputs=[chatbot, state], outputs=[chatbot, state, action_buttons])

10. Launching the App

Finally, to make the app live, you call the launch() method:

demo.launch()

This starts a local Gradio server where the user can access and interact with the application via a web interface. (Passing share=True to launch() additionally generates a temporary public link for easy sharing.)

Final Gradio Interface Structure

Here’s a visual structure of how the interface works:

  1. Markdown Header: Displays the app title.
  2. Chatbot Component: Displays the ongoing conversation between the user and the backend service.
  3. Textbox: Allows the user to type and submit their message.
  4. Send Button: Submits the user’s message to the backend.
  5. Accept/Reject Buttons: Optional buttons that appear when the backend requires user confirmation for an action.

This simple and intuitive design provides real-time interaction with the backend, enhancing the usability of your LLM or ML models.

Step 3: Running the Gradio App

You can run the Gradio app locally by launching the script:

python llm_ui_chat.py

Once launched, you will be able to interact with the LLM through the Gradio interface; behind the scenes, the app will communicate with your FastAPI service. Make sure the backend is running first (for example, uvicorn main:app --host 0.0.0.0 --port 8080, assuming your backend code lives in main.py).

Advantages of Using Gradio

Using Gradio in combination with a backend service like FastAPI offers multiple benefits:

  • Seamless Frontend for FastAPI: Gradio easily integrates with FastAPI or any backend service, providing an intuitive interface without writing HTML, CSS, or JavaScript.
  • Rapid Prototyping: Test your LLM or ML models quickly by simply wrapping them in a Gradio interface.
  • Customization and Flexibility: Customize the UI to match your brand or project needs using Gradio’s simple API.
  • Real-time Feedback: With Gradio, users can interact with your model in real-time, providing immediate feedback and improving the user experience.

Conclusion

In this blog, we walked through building an interactive web app using Gradio to communicate with a FastAPI microservice for an LLM-based project. We demonstrated how you can set up backend interactions and create a user-friendly interface for sending and receiving messages. Whether you’re working with LLMs or other ML models, Gradio offers an elegant solution for building and deploying web apps quickly and easily.

Gradio and FastAPI together provide a powerful combination for delivering scalable, interactive machine learning experiences.

Happy coding! Let me know if you have any questions or need help setting up your Gradio app.
