Creating an AI Agent That Masters Various Tools with Streamlit × LangChain🦜
Until now, I have been developing an AI agent with LangChain. This time, I decided to introduce a GUI for more intuitive operation. Along the way, I came across an example of an agent that combines Streamlit and LangChain. Building on it, I incorporated custom tools and took on the challenge of producing a multifunctional agent. This article focuses on the newly introduced tools. Development is still ongoing and I am exploring new directions; I plan to share the lessons learned from this project and future prospects.
Table of Contents:
- Self-introduction from AI
- Search (DuckDuckGo)
- YoutubeSearch
- XPost
- SpotifySearch
- LongTermMemory (BigQuery)
- Voice recognition input
- Demo
- Code Release Update (2024/05/18)
Self-introduction from AI
Search (DuckDuckGo)
I use the search tool implemented in LangChain.
YoutubeSearch
This is a custom tool. It can search for videos and retrieve details such as view counts, "like" counts, and the captions at the beginning of a video. The AI then decides how to use and present this information.
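The article does not show the tool's code, but the idea can be sketched as follows. This is an illustrative sketch only: it assumes the video metadata and captions have already been fetched elsewhere (for example via the YouTube Data API and a transcript library), and all names are hypothetical, not the actual implementation.

```python
# Sketch: formatting already-fetched YouTube metadata into a compact
# string the agent's LLM can reason over. Fetching itself (YouTube
# Data API, transcript retrieval) is assumed to happen elsewhere.

def summarize_video(video: dict, caption_chars: int = 200) -> str:
    """Turn raw video metadata into a short summary string."""
    # Truncate the caption to keep the agent's token usage small.
    caption = video.get("caption", "")[:caption_chars]
    return (
        f"Title: {video['title']}\n"
        f"Views: {video['views']:,} | Likes: {video['likes']:,}\n"
        f"Opening caption: {caption}"
    )

video = {
    "title": "Intro to LangChain Agents",
    "views": 12345,
    "likes": 678,
    "caption": "In this video we build an agent step by step...",
}
print(summarize_video(video))
```

Returning a single formatted string (rather than raw JSON) keeps the tool output cheap in tokens and easy for the LLM to quote from.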
XPost
A tool utilizing X’s API. The only feature available on the free API tier is posting, so only post operations are supported. Per X’s specifications, a post can contain up to 280 characters when using 1-byte (English) characters. In the demo, the YouTube search results above were posted.
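Before calling the post endpoint (e.g. via tweepy's `Client.create_tweet`), it is worth pre-checking the length, since X counts characters by weight: most Latin-script characters count as 1, while Japanese text and other wide characters count as 2 against the 280 limit. The sketch below is an approximation of that rule, not X's exact algorithm (it also ignores that URLs count as a fixed length).

```python
# Approximate X (Twitter) weighted character counting: characters in
# the low Unicode ranges weigh 1, everything else (CJK, emoji) weighs 2.
# This is a simplified pre-check, not the official twitter-text logic.

def x_post_length(text: str) -> int:
    """Rough weighted length of a post under X's counting rules."""
    return sum(1 if ord(ch) <= 0x10FF else 2 for ch in text)

def can_post(text: str, limit: int = 280) -> bool:
    """True if the text fits within X's character limit."""
    return x_post_length(text) <= limit
```

A pre-check like this lets the agent shorten or summarize a post before the API rejects it.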
https://twitter.com/AstroPomeAi/status/1702845073522876546
SpotifySearch
This tool searches my Spotify listening history based on the characteristics of songs.
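One plausible way to do this is to filter tracks by the audio features Spotify exposes (energy, valence, danceability, and so on). In the sketch below, the feature dicts are inlined so the logic is self-contained; in practice they would come from Spotify's audio-features endpoint (for example via the spotipy library). The function and field names are illustrative assumptions.

```python
# Sketch: filtering listening history by song characteristics.
# Feature values here mimic Spotify audio features (0.0-1.0 floats);
# real data would be fetched via the Spotify Web API.

def filter_tracks(tracks, min_energy=0.0, min_valence=0.0):
    """Return tracks whose audio features meet the given thresholds."""
    return [
        t for t in tracks
        if t["energy"] >= min_energy and t["valence"] >= min_valence
    ]

history = [
    {"name": "Track A", "energy": 0.9, "valence": 0.8},
    {"name": "Track B", "energy": 0.3, "valence": 0.9},
    {"name": "Track C", "energy": 0.7, "valence": 0.2},
]
# Ask for upbeat tracks: only "Track A" clears both thresholds.
upbeat = filter_tracks(history, min_energy=0.6, min_valence=0.6)
```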
LongTermMemory (BigQuery)
Motivated by the desire to retain past conversation logs, I implemented log saving using BigQuery. I chose BigQuery because it does not incur running costs. However, token-size constraints apply when reading the logs back, so conversations are summarized before being added as records.
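The "summarize before storing" step might look like the sketch below. The 4-characters-per-token estimate, the threshold, and the row shape are all assumptions for illustration; `summarize` stands in for an LLM summarization call, and the actual insert would go through the google-cloud-bigquery client.

```python
import datetime

# Sketch: cap the stored conversation size so it can later be read
# back within the model's context window. Token estimation here is a
# crude characters/4 heuristic, not a real tokenizer.

MAX_TOKENS = 500  # illustrative threshold, not the article's value

def estimate_tokens(text: str) -> int:
    """Very rough token count: ~4 characters per token."""
    return max(1, len(text) // 4)

def build_memory_record(conversation: str, summarize) -> dict:
    """Summarize long conversations, then shape a row for BigQuery."""
    if estimate_tokens(conversation) > MAX_TOKENS:
        conversation = summarize(conversation)  # LLM call in practice
    return {
        "created_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "summary": conversation,
    }
```

Storing the summary (plus a timestamp) keeps each row small enough that many past conversations can be loaded back into a prompt at once.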
There is also a feature to search BigQuery, retrieving records by period or by keyword.
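Retrieval by period or keyword maps naturally onto a parameterized query. The sketch below only builds the SQL string and its named parameters; the table and column names are hypothetical, and in practice the query would be executed with the google-cloud-bigquery client using query parameters rather than string interpolation of values.

```python
# Sketch: building a parameterized BigQuery query over the memory
# table. Only values are parameterized (@kw, @start, @end); the table
# name is interpolated because BigQuery cannot parameterize it.

def build_memory_query(table, keyword=None, start=None, end=None):
    """Return (sql, params) for a period and/or keyword search."""
    conditions, params = [], {}
    if keyword:
        conditions.append("LOWER(summary) LIKE LOWER(@kw)")
        params["kw"] = f"%{keyword}%"
    if start:
        conditions.append("created_at >= @start")
        params["start"] = start
    if end:
        conditions.append("created_at < @end")
        params["end"] = end
    where = " AND ".join(conditions) if conditions else "TRUE"
    sql = (
        f"SELECT created_at, summary FROM `{table}` "
        f"WHERE {where} ORDER BY created_at DESC"
    )
    return sql, params

sql, params = build_memory_query("proj.chat.memories", keyword="spotify")
```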
I am considering implementing personalization and suggestion features in the future. As a first step toward this, I had the AI analyze user preferences from past conversation logs.
Voice recognition input
In addition to text input, voice input is also possible using OpenAI’s Whisper.
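With the OpenAI Python SDK (v1+), transcription is a single call to the audio transcriptions endpoint. The wrapper below is a minimal sketch, not the article's actual code; the client is passed in (e.g. `client = openai.OpenAI()`), and in a Streamlit app the recorded microphone bytes would first be written to a temporary audio file.

```python
# Sketch: sending a recorded audio file to Whisper via the OpenAI
# Python SDK. The client is injected so this function stays testable;
# the model name "whisper-1" is the SDK's Whisper transcription model.

def transcribe_audio(client, audio_path: str) -> str:
    """Return the transcript text for the audio file at audio_path."""
    with open(audio_path, "rb") as f:
        resp = client.audio.transcriptions.create(
            model="whisper-1",
            file=f,
        )
    return resp.text
```

The returned text can then be fed into the agent exactly like typed input, so the rest of the pipeline does not need to know the message came from voice.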
I referred to this source for the implementation.
Demo
Please watch at 1.5x or 2x speed.
Thank you for taking the time to read!
Code Release Update
Addition on 2024/05/18:
I have published the code used in this project in the following repository. Although the code was written more than six months ago, I hope it will be useful to anyone interested. You can find it here:
https://github.com/pome223/ModalMixLab/tree/main/agent_with_tool