How to Create Tools for Your AI Team: A YouTube Blog Post Generator using CrewAI and Gemini Pro

Fatih KIR
4 min read · Mar 3, 2024


Imagine a team of highly specialized assistants, each with a unique skillset, working together to complete a complex task. This is the essence of AI agents: virtual collaborators within CrewAI that tackle specific objectives in a coordinated fashion.

Each agent possesses its own expertise. One might be a master of information retrieval, sifting through mountains of data to find the perfect nugget. Another might be a gifted writer, crafting compelling narratives from raw materials. Yet another could be a critical evaluator, ensuring the final product is polished and error-free.

These agents communicate and collaborate, passing information back and forth, to achieve a shared goal. Like a well-oiled machine, each agent contributes its unique strength, resulting in a more efficient and effective outcome.

But why are tools so crucial for this AI team? Simply put, tools empower each agent to excel at its specific role.

Think of it like giving your assistants the right equipment. A transcript retrieval agent might use a tool such as a YouTube API to fetch video transcripts quickly. A content writer agent might use a grammar checker or style guide to polish its writing. By providing the right tools, we unlock the full potential of each agent within the CrewAI team.

This collaborative approach, powered by specialized agents and the right tools, promises exciting possibilities for content creation, research, and more. In the next section, we’ll see how this translates into a practical tool for building educational content using the power of YouTube and CrewAI.

Before we begin, I highly recommend checking out my initial article about CrewAI and Gemini Pro.

Setting Up the Environment

First, we need a few libraries. You can install them via pip:

pip install crewai
pip install langchain-google-genai
pip install youtube-transcript-api

Now we can start building our project. We start by importing the necessary libraries and functions and creating our custom tool:

First, we need a helper function that parses the given YouTube URL into a YouTube video ID, since the youtube_transcript_api module works with video IDs, not with URLs.

Since URLs come with a fixed structure, we can extract the video ID using a regular expression (regex). Python has a built-in module for this, re, and if you are not familiar with regular expressions you can check that module's documentation.
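The helper described above could look roughly like the following. This is a minimal sketch, not the original helper from the article: the function name extract_video_id, the URL shapes it accepts, and the assumption that a video ID is 11 characters long are all illustrative choices.

```python
import re

# Assumption: the id is 11 URL-safe characters and follows "v=", "youtu.be/",
# "/embed/" or "/shorts/". Other URL shapes are not handled by this sketch.
VIDEO_ID_PATTERN = re.compile(
    r"(?:v=|youtu\.be/|/embed/|/shorts/)([A-Za-z0-9_-]{11})"
)


def extract_video_id(url: str) -> str:
    """Return the video id found in a YouTube URL, or raise ValueError."""
    match = VIDEO_ID_PATTERN.search(url)
    if match is None:
        raise ValueError(f"Could not find a video id in: {url!r}")
    return match.group(1)
```

Raising an error for an unrecognized URL, rather than returning an empty string, keeps a bad input from silently reaching the transcript call.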

After converting the YouTube video URL into a video ID, we can get the transcript with timestamps directly using a simple function call like this:

transcript = YouTubeTranscriptApi.get_transcript(video_id)

This transcript will be a list of dictionaries, each holding a piece of text and the timestamp information for that text. To make it easy for our agents to use, we convert this data into a single string by joining the text pieces together.
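As a small illustration of that joining step, the sketch below uses hand-written sample data in the shape the library returned at the time of writing (dictionaries with text, start and duration keys). The sample entries and the function name transcript_to_text are made up for the example.

```python
# Hypothetical sample data, mimicking the structure of a fetched transcript.
sample_transcript = [
    {"text": "Hello and welcome", "start": 0.0, "duration": 1.8},
    {"text": "to this tutorial.", "start": 1.8, "duration": 2.1},
]


def transcript_to_text(entries):
    """Join the text of every transcript entry into one space-separated string."""
    return " ".join(entry["text"].strip() for entry in entries)


print(transcript_to_text(sample_transcript))
# prints: Hello and welcome to this tutorial.
```

The timestamps are dropped here because the writing agents only need the spoken text.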

Once the logic of the function is ready, we need a decorator to mark our custom function as a LangChain tool. For that, we will use the tool decorator from LangChain like this:

@tool("Useful to retrieve youtube video transcriptions from a youtube url")

An important point here is that the text we attach to the tool should clearly state what the custom function does. This covers both the string passed to the decorator (which LangChain treats as the tool's name) and the function's docstring (which serves as its description). CrewAI agents use this information to decide which tool to call when one is needed.
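Putting the pieces together, the custom tool might be sketched as follows. Treat this as an assumption-laden outline rather than the article's original code: the import path langchain.tools, the docstring, and the simplified inline regex are my own choices, and get_transcript is the call shown earlier (newer releases of youtube-transcript-api may expose a different interface). The third-party imports are deferred into a factory function, make_transcript_tool (a hypothetical name), only so the snippet loads even where the packages are missing. In a normal script they would sit at the top of the file.

```python
import re


def make_transcript_tool():
    """Build the custom tool; third-party imports are deferred until called."""
    # Assumed import paths for the packages installed earlier with pip.
    from langchain.tools import tool
    from youtube_transcript_api import YouTubeTranscriptApi

    # The decorator string follows the article; the docstring is an addition.
    @tool("Useful to retrieve youtube video transcriptions from a youtube url")
    def retreive_youtube_transcription(url: str) -> str:
        """Return the full transcript of the YouTube video at `url` as one string."""
        # Simplified lookup: only handles watch?v=... and youtu.be/... links.
        match = re.search(r"(?:v=|youtu\.be/)([A-Za-z0-9_-]{11})", url)
        if match is None:
            return f"No video id found in {url}"
        entries = YouTubeTranscriptApi.get_transcript(match.group(1))
        return " ".join(entry["text"] for entry in entries)

    # Function name keeps the spelling used in the article's tools=[...] line.
    return retreive_youtube_transcription
```

Returning a short message instead of raising when no ID is found gives the agent something readable to react to. Whether that is preferable to an exception depends on how you want the crew to handle bad input.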

Then, we can continue by connecting to Gemini Pro and building our agents. We will create 4 agents for our task: a search agent that retrieves the transcription from YouTube, a content writer that uses that transcription to write content, a content evaluator that reviews the content before it is output, and a blog post creator that organizes the other agents.

We can do all these things using this code block:

The important part here is that, for the search agent, we added our tool by passing an extra property like this:

tools=[retreive_youtube_transcription]

Every agent can use multiple tools, which are passed as a list. Our custom tool will be used when necessary. After defining our agents, we need a user input prompt where the URL is entered so that the agents can use it. You could create a nice visual interface using Streamlit or Gradio, but for the sake of simplicity we will only add a terminal prompt.
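A terminal prompt of this kind can be as small as a single input() call. The sketch below adds a light sanity check on top. The function name ask_for_url, the prompt wording, and the check itself are illustrative assumptions, and the read parameter exists only so the function can be exercised without a keyboard.

```python
def ask_for_url(read=input):
    """Keep prompting until the answer looks like a YouTube URL."""
    while True:
        url = read("Enter a YouTube video URL: ").strip()
        # Very loose check; the regex helper does the real validation later.
        if "youtube.com" in url or "youtu.be" in url:
            return url
        print("That does not look like a YouTube URL, please try again.")
```

In the script you would simply call ask_for_url() and pass the result into the task descriptions.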

Then, we will create 2 separate tasks: the first retrieves the transcription, and the second creates the content. We can do both with this code snippet:

Finally, we will construct our crew from our agents, submit our tasks to it, and start the process like this:

The important part here is that this is a sequential process: the transcription data has to be retrieved before the content can be written. We therefore need to list the retrieval task before the content task in the tasks parameter, like this:

tasks=[retreival_task, content_task]

And that’s it. Our crew is ready to create content from YouTube videos. Using this structure, you can create any number of tools, use them to build AI teams of your own, and automate your tasks easily. You can find the full code below:

If you liked this article and are excited about more advanced implementations, you can visit the CrewAI website. Finally, if you want to connect with me on LinkedIn and share thoughts about ML/data engineering applications, you can find me here.
