Gemini has entered the chat: building an LLM-powered Discord bot

Maciej Strzelczyk
Google Cloud - Community
8 min read · Apr 25, 2024

After ignoring the LLM-revolution for most of 2023, I finally decided to check out what this GenAI hype is all about. Since I like to show off the things I do and I’m active on Discord, I decided to build a Discord Bot which will use various AI products from Google Cloud to do cool things, amaze my friends and help us with moderation of our server. While working on the bot, I realized that using AI is easier than I thought and that it’s an awesome introduction to the AI world for new developers.

In this series, I will show you how to build a Discord Bot in Python using Google’s Vertex AI API and other AI-related products. The articles assume you have a solid grasp of Python and know at least the basics of how Discord and Discord bots work. By following along, you can accomplish several things like: learn about easy-to-use AI tools, learn how to integrate those tools with your programs, impress your Discord friends and end up with a useful and fun Discord bot on your server.

Screenshot of a Discord chat conversation where user Maciek asks GeminiBot to say something about itself. GeminiBot replies: “Hi there! I’m still under development, and I’m always learning and improving. I can’t wait to see what the future holds for me, and I’m excited to see how I can help people in all sorts of ways.”
Hello, GeminiBot!

Preparations

There are a couple of steps to complete before you can start coding a Discord bot in Python. Here’s a quick list:

Discord

  • If you don’t have an account, create one.
  • Discord servers are the spaces where communities come together and can be private or public. In this tutorial, you’ll need your own private server. You can set it up for free.
  • Register an application in the Discord Dev portal and follow the instructions to configure your bot. Complete the first step of the Getting Started guide, up to the point where your bot joins your server.
  • Create a new application.
New application creation screen
Creating new application
  • Configure the details of your application by providing a description or an icon.
  • Open the Bot tab of the Application.
Selecting the Bot tab
  • On that page, pick a username for your bot. This name will be visible to users interacting with your bot and also listed in the server members list.
  • Use the Reset token button to generate a new token for your bot. Copy it and store it somewhere safe. Keep it secret; this is the “key” to your bot.
Reset token confirmation
  • Go to the OAuth2 menu.
Selecting the OAuth2 tab
  • In the Scopes checkbox field, pick Bot, and in the Bot Permissions section, select Send Messages and Send Messages in Threads.
Configuring the bot default permissions
  • At the bottom of the page you will see the generated URL. Copy that and open it in your browser. This will allow you to invite your newly created bot to your private server.
Bot invitation dialog
  • At this point, your bot should be visible on the Member List in your server as being offline.
List of server members, the bot is currently offline

Vertex AI SDK

  • If you don’t have one already, create a Google Cloud project with billing enabled.
  • Enable the Vertex AI API for that project.
  • Install the gcloud CLI and authenticate with gcloud auth application-default login, so the Vertex AI SDK can pick up your credentials automatically (these are the gcloud setup steps referenced later in this article).

Local development environment

The bot will be written in Python, so you need to start a new virtual environment. Once you have the environment configured, you can install the two key libraries that will be used: hikari, a Discord microframework for Python, and google-cloud-aiplatform, which provides the AI functionality.

If you don’t want to work on the bot locally on your machine, you can leverage Google Cloud Shell with its Cloud Shell Editor. This little VM you get by default with your Google Cloud project is more than enough to try out this simple bot implementation. If you decide you want to keep the bot running in a more permanent fashion, you’ll need a Compute Engine instance. See this article for instructions on how you can set up a VM to run your bot.

You can create the new virtual environment and install both libraries with the following commands (written for Linux):

mkdir <your-working-directory>
cd <your-working-directory>
python3 -m venv venv # Creates new virtual environment in folder named venv
source venv/bin/activate # Activates virtual environment in shell session
pip install -U hikari google-cloud-aiplatform # Installs libraries to be used

Let’s get coding!

Let’s start with a basic example from the hikari documentation to get a working starting point. You will need the secret token that you copied from the Discord developer portal.

import hikari

bot = hikari.GatewayBot(token="<paste your bot token here>")

@bot.listen()
async def ping(event: hikari.GuildMessageCreateEvent) -> None:
    """If a non-bot user mentions your bot, respond with 'Pong!'."""
    # Do not respond to bots nor webhooks pinging us, only user accounts
    if not event.is_human:
        return

    me = bot.get_me()

    if me.id in event.message.user_mentions_ids:
        await event.message.respond("Pong!")

bot.run()

Save the code snippet above as main.py and run it with python main.py.

Once you run it, you should see the hikari startup process, and the bot should appear as online in the member list of your server. To try it out, ping (@-mention) the bot by writing a message with @<bot name> in it. If everything worked, the bot should reply with a simple “Pong!” message.

To stop the bot, you just need to send it an interrupt signal in your console (Ctrl+c in Linux terminals).

Let’s quickly go over what’s happening here. When you execute the script, it starts by importing the hikari module and initializing a GatewayBot. This is the object that represents your bot and is used to define the bot’s functionality. Next, we see a coroutine (async def ping(…)) decorated with @bot.listen(). This decorator registers event handlers that the bot uses for its interactions with Discord. It relies on the type annotations of the decorated coroutines to route events. In this case, the GuildMessageCreateEvent annotation tells it to execute the ping coroutine whenever the bot receives a new guild message event (guild is the legacy name for a Discord server that is still used in the API).
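To make this routing a bit more concrete: hikari only looks at the annotated event class to decide which coroutine to call. A small, hypothetical extra listener (not part of the bot we are building here) that reacts to the bot starting up could look like this:

# Hypothetical extra listener, shown only to illustrate annotation-based routing.
# Because of the hikari.StartedEvent annotation, hikari calls this coroutine once
# the bot has started, not when guild messages arrive.
@bot.listen()
async def on_started(event: hikari.StartedEvent) -> None:
    print("The bot is connected to Discord and ready!")

Back to the ping listener itself.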

Once a new message is received, the bot checks if it came from a human user (you don’t want it to react to other bots). Then it checks if it was mentioned in the received message. Finally, if it was mentioned, it responds with the “Pong!” message.

The last line, bot.run(), initiates the connection between your machine and the Discord servers and starts an event loop in which all interactions are handled.
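One practical note before moving on: the example hardcodes the bot token directly in the source file. If you plan to share your code or keep it in version control, a common alternative is to read the token from an environment variable instead. This is just a sketch; the variable name DISCORD_BOT_TOKEN is an arbitrary choice, not something Discord or hikari require:

import os

import hikari

# DISCORD_BOT_TOKEN is an example name; set it in your shell before starting
# the bot, e.g. export DISCORD_BOT_TOKEN="<your bot token>"
bot = hikari.GatewayBot(token=os.environ["DISCORD_BOT_TOKEN"])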

Adding Gemini

With a working basic bot, you can now get to wielding the power of Gemini! This will allow your bot to reply to almost anything users write with polite and, if possible, useful answers. An amazing improvement compared to a Pong! response!

I will demonstrate all the required steps and provide you with the new main.py at the end, so don’t worry about implementing those changes in your code right away.

To be able to communicate with Gemini, you need to import the library (the authentication should happen automatically, if you followed the gcloud setup steps).

import vertexai.generative_models as genai
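Depending on where you run the bot, the SDK may not be able to figure out on its own which Google Cloud project and region it should use. If you run into errors about a missing project, one way to handle it is to initialize the SDK explicitly; the project ID and region below are placeholders to replace with your own values:

import vertexai

# Placeholder values; use your own project ID and preferred region.
vertexai.init(project="<your-project-id>", location="us-central1")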

The next step is to create a model object that will be used to communicate with Gemini:

model = genai.GenerativeModel(
    model_name="gemini-1.0-pro",
    generation_config={"max_output_tokens": 1900}
)

For simplicity, this code creates a model object with mostly default settings. The only non-default setting is the max_output_tokens value. By default it would be 2048, but Discord doesn’t allow messages longer than 2000 characters, so you need to make sure Gemini’s responses fit within that limit. See the documentation to learn more about the available parameters.
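If you want to experiment further, the same generation_config can also carry parameters like temperature or top_p. A quick sketch of what that might look like, with purely illustrative values rather than recommendations:

# Illustrative values only, not tuned recommendations.
model = genai.GenerativeModel(
    model_name="gemini-1.0-pro",
    generation_config=genai.GenerationConfig(
        max_output_tokens=1900,  # same limit as above, to fit Discord messages
        temperature=0.7,         # higher values give more varied answers
        top_p=0.95,              # nucleus sampling cutoff
    ),
)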

With the model object initialized, all that’s left is to make use of it. In the section where your bot replies to an incoming message, use the following code to communicate with Gemini:

# trigger_typing() lets Discord users know that your bot is preparing an answer
# since the call to Gemini might take a couple of seconds.
await event.get_channel().trigger_typing()
result = await model.generate_content_async(event.message.content)
# Cutting the answer short, if it's longer than 2000 characters.
await event.message.respond(result.text[:2000])

And that’s it. A couple of lines of code and your bot is using an advanced large language model!

Note: Right now your bot has no memory, so it will reply to every message as if it were the first message it had ever received. It also has no idea that it is a Discord bot on your server; it will reply as “Gemini, a multimodal AI language model”. It is also limited in what it knows about the world.
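One more caveat: if Gemini refuses to answer (for example, when a response is blocked by safety filters) or the API call fails, accessing result.text will raise an exception and your bot will simply stay silent. A minimal way to guard against that, not included in the final listing below, could look like this:

try:
    result = await model.generate_content_async(event.message.content)
    reply = result.text[:2000]
except Exception:
    # Covers API errors as well as responses with no usable text
    # (e.g. blocked by safety filters).
    reply = "Sorry, I can't answer that one."
await event.message.respond(reply)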

The final code in your main.py at this point should look like this:

import hikari
import vertexai.generative_models as genai

model = genai.GenerativeModel(
    model_name="gemini-1.0-pro",
    generation_config=genai.GenerationConfig(max_output_tokens=1900)
)

bot = hikari.GatewayBot(token="<paste your bot token here>")

@bot.listen()
async def ping(event: hikari.GuildMessageCreateEvent) -> None:
    """If a non-bot user mentions your bot, forward the message to Gemini."""

    # Do not respond to bots nor webhooks pinging us, only user accounts
    if not event.is_human:
        return

    me = bot.get_me()

    if me.id in event.message.user_mentions_ids:
        await event.get_channel().trigger_typing()
        result = await model.generate_content_async(event.message.content)
        # Cutting the answer short, if it's longer than 2000 characters.
        await event.message.respond(result.text[:2000])

bot.run()

Give it a try by starting the bot again with python main.py and asking some questions or giving it a task to complete. Replies will be slower than in the “Pong!” version, but much more enjoyable 🙂.

What next?

In this article, I showed you how to integrate AI capabilities into your Discord bot. In the next post, I’ll explain configuration options and show some basic prompting techniques to support better bot responses to your community.
