Creating an AI Agent That Masters Various Tools with Streamlit × LangChain🦜
Until now, I have been developing an AI agent with LangChain. This time, I decided to introduce a GUI for more intuitive operation. Along the way, I came across an example of an agent that combines Streamlit and LangChain. Building on it, I incorporated custom tools and took on the challenge of producing a multifunctional agent. This article focuses on the newly introduced tools. Development is still ongoing and I am exploring new directions; I plan to share the lessons learned from this project and future prospects.
Table of Contents:
- Self-introduction from AI
- Search (DuckDuckGo)
- YoutubeSearch
- XPost
- SpotifySearch
- LongTermMemory (BigQuery)
- Voice recognition input
- Demo
- Code Release Update (2024/05/18)
Self-introduction from AI
Search (DuckDuckGo)
I use the search tool implemented in LangChain.
YoutubeSearch
This is a custom tool. It can search for videos and retrieve details such as view counts, "like" counts, and the captions at the beginning of a video. The AI then decides how to use and present this information.
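The article does not show the tool's code, but the idea can be sketched as follows. This is an illustrative sketch only: it assumes the video metadata and captions have already been fetched elsewhere (for example via the YouTube Data API and a transcript library), and all names are hypothetical, not the actual implementation.

```python
# Sketch: formatting already-fetched YouTube metadata into a compact
# string the agent's LLM can reason over. Fetching itself (YouTube
# Data API, transcript retrieval) is assumed to happen elsewhere.

def summarize_video(video: dict, caption_chars: int = 200) -> str:
    """Turn raw video metadata into a short summary string."""
    # Truncate the caption to keep the agent's token usage small.
    caption = video.get("caption", "")[:caption_chars]
    return (
        f"Title: {video['title']}\n"
        f"Views: {video['views']:,} | Likes: {video['likes']:,}\n"
        f"Opening caption: {caption}"
    )

video = {
    "title": "Intro to LangChain Agents",
    "views": 12345,
    "likes": 678,
    "caption": "In this video we build an agent step by step...",
}
print(summarize_video(video))
```

Returning a single formatted string (rather than raw JSON) keeps the tool output cheap in tokens and easy for the LLM to quote from.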
XPost
A tool utilizing X’s API. The only feature available on the free API tier is posting, so only post operations are supported. Per X’s specifications, a post can contain up to 280 characters when using 1-byte (English) characters. In the demo, the YouTube search results above were posted.
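Before calling the post endpoint (e.g. via tweepy's `Client.create_tweet`), it is worth pre-checking the length, since X counts characters by weight: most Latin-script characters count as 1, while Japanese text and other wide characters count as 2 against the 280 limit. The sketch below is an approximation of that rule, not X's exact algorithm (it also ignores that URLs count as a fixed length).

```python
# Approximate X (Twitter) weighted character counting: characters in
# the low Unicode ranges weigh 1, everything else (CJK, emoji) weighs 2.
# This is a simplified pre-check, not the official twitter-text logic.

def x_post_length(text: str) -> int:
    """Rough weighted length of a post under X's counting rules."""
    return sum(1 if ord(ch) <= 0x10FF else 2 for ch in text)

def can_post(text: str, limit: int = 280) -> bool:
    """True if the text fits within X's character limit."""
    return x_post_length(text) <= limit
```

A pre-check like this lets the agent shorten or summarize a post before the API rejects it.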
https://twitter.com/AstroPomeAi/status/1702845073522876546
SpotifySearch
This tool searches my Spotify listening history based on the characteristics of songs.
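One plausible way to do this is to filter tracks by the audio features Spotify exposes (energy, valence, danceability, and so on). In the sketch below, the feature dicts are inlined so the logic is self-contained; in practice they would come from Spotify's audio-features endpoint (for example via the spotipy library). The function and field names are illustrative assumptions.

```python
# Sketch: filtering listening history by song characteristics.
# Feature values here mimic Spotify audio features (0.0-1.0 floats);
# real data would be fetched via the Spotify Web API.

def filter_tracks(tracks, min_energy=0.0, min_valence=0.0):
    """Return tracks whose audio features meet the given thresholds."""
    return [
        t for t in tracks
        if t["energy"] >= min_energy and t["valence"] >= min_valence
    ]

history = [
    {"name": "Track A", "energy": 0.9, "valence": 0.8},
    {"name": "Track B", "energy": 0.3, "valence": 0.9},
    {"name": "Track C", "energy": 0.7, "valence": 0.2},
]
# Ask for upbeat tracks: only "Track A" clears both thresholds.
upbeat = filter_tracks(history, min_energy=0.6, min_valence=0.6)
```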
LongTermMemory (BigQuery)
Motivated by the desire to retain past conversation logs, I implemented log saving using BigQuery. I chose BigQuery because it does not incur running costs. However, token-size constraints apply when reading the logs back, so conversations are summarized before being added as records.
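The "summarize before storing" step might look like the sketch below. The 4-characters-per-token estimate, the threshold, and the row shape are all assumptions for illustration; `summarize` stands in for an LLM summarization call, and the actual insert would go through the google-cloud-bigquery client.

```python
import datetime

# Sketch: cap the stored conversation size so it can later be read
# back within the model's context window. Token estimation here is a
# crude characters/4 heuristic, not a real tokenizer.

MAX_TOKENS = 500  # illustrative threshold, not the article's value

def estimate_tokens(text: str) -> int:
    """Very rough token count: ~4 characters per token."""
    return max(1, len(text) // 4)

def build_memory_record(conversation: str, summarize) -> dict:
    """Summarize long conversations, then shape a row for BigQuery."""
    if estimate_tokens(conversation) > MAX_TOKENS:
        conversation = summarize(conversation)  # LLM call in practice
    return {
        "created_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "summary": conversation,
    }
```

Storing the summary (plus a timestamp) keeps each row small enough that many past conversations can be loaded back into a prompt at once.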
There is also a feature to search BigQuery, retrieving records by period or by keyword.
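Retrieval by period or keyword maps naturally onto a parameterized query. The sketch below only builds the SQL string and its named parameters; the table and column names are hypothetical, and in practice the query would be executed with the google-cloud-bigquery client using query parameters rather than string interpolation of values.

```python
# Sketch: building a parameterized BigQuery query over the memory
# table. Only values are parameterized (@kw, @start, @end); the table
# name is interpolated because BigQuery cannot parameterize it.

def build_memory_query(table, keyword=None, start=None, end=None):
    """Return (sql, params) for a period and/or keyword search."""
    conditions, params = [], {}
    if keyword:
        conditions.append("LOWER(summary) LIKE LOWER(@kw)")
        params["kw"] = f"%{keyword}%"
    if start:
        conditions.append("created_at >= @start")
        params["start"] = start
    if end:
        conditions.append("created_at < @end")
        params["end"] = end
    where = " AND ".join(conditions) if conditions else "TRUE"
    sql = (
        f"SELECT created_at, summary FROM `{table}` "
        f"WHERE {where} ORDER BY created_at DESC"
    )
    return sql, params

sql, params = build_memory_query("proj.chat.memories", keyword="spotify")
```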
I am considering implementing personalization and suggestion features in the future. As a first step toward this, I had the AI analyze user preferences from past conversation logs.
Voice recognition input
In addition to text input, voice input is also possible using OpenAI’s Whisper.
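With the OpenAI Python SDK (v1+), transcription is a single call to the audio transcriptions endpoint. The wrapper below is a minimal sketch, not the article's actual code; the client is passed in (e.g. `client = openai.OpenAI()`), and in a Streamlit app the recorded microphone bytes would first be written to a temporary audio file.

```python
# Sketch: sending a recorded audio file to Whisper via the OpenAI
# Python SDK. The client is injected so this function stays testable;
# the model name "whisper-1" is the SDK's Whisper transcription model.

def transcribe_audio(client, audio_path: str) -> str:
    """Return the transcript text for the audio file at audio_path."""
    with open(audio_path, "rb") as f:
        resp = client.audio.transcriptions.create(
            model="whisper-1",
            file=f,
        )
    return resp.text
```

The returned text can then be fed into the agent exactly like typed input, so the rest of the pipeline does not need to know the message came from voice.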
I referred to this source for the implementation.
Demo
Please watch at 1.5x or 2x speed.
Thank you for taking the time to read!
Code Release Update
Addition on 2024/05/18:
I have published the code used in this project in the following repository. Although the code was written more than six months ago, I hope it will be useful to anyone interested. You can find it here:
https://github.com/pome223/ModalMixLab/tree/main/agent_with_tool