Gemini Pro 1.5 multi-modal Slack bot
In my previous article, I showed how to create a Slack bot that uses Gemini 1.0 Pro Vision LLM to provide users with interactive multi-modal discussion capabilities on Slack.
Since Google Cloud Next 2024 just kicked off, we got a nice surprise by having public access to the Gemini 1.5 Pro model. It has a bunch of really cool new capabilities (such as audio) and improved performance. I wanted to update the bot to the new version, which turned out to be relatively straightforward.
And as always, you can deploy the bot into your own Slack workspace (see the instructions in the previous article and make sure you have the latest code from the repository).
The bot can be also integrated with Vertex AI Search (and other REST APIs) to give the model direct access into topics specific to your business.
So let’s explore some of new capabilities that Gemini 1.5 can give to your Slack bot:
Basic context-aware answers
Integration with Vertex Search
Working with PDFs
Answers using images
Answers using voice
Last but not the least (and for the coolest Gemini 1.5 capability!), we can also just ask Gemini directly using a voice message on Slack:
That’s it. Let me know if you deploy a Slack bot using my code or if you take inspiration from it. You can reach me for example of X: @rosmo or just drop a comment.