Expanding the Horizon for Open Source LLMs

Aman Sharma
Sep 6, 2024 · 5 min read


Acting LLMs: give your LLM tools to work with and extend its capabilities. Function calling with open-source LLMs!

Hey there, fellow code wizards and AI enthusiasts! 🧙‍♂️✨ Are you tired of your open-source LLMs being all talk and no action? Well, grab your digital wands, because we’re about to perform some serious AI magic!

The Open Source Struggle Is Real

We’ve all been there. You’re chatting with your favorite open-source LLM, asking it to book a flight to Canada or create a clone of yourself to attend that boring party (we’ve all dreamed of it, right?). But alas, your silicon friend just stares back at you blankly, muttering something about not having API access. Womp womp. 😞

Enter the Hero We Deserve: EdgeRunner Command!

But fear not, brave coders! The knights of EdgeRunner AI, with their noble quest to make AI “safe, secure, and transparent,” have bestowed upon us a gift: EdgeRunner Command! This lightweight champion weighs in at a mere 7 billion parameters (practically a digital featherweight) and comes with superpowers that’ll make even ChatGPT jealous.

What’s Function Calling, Anyway?

Imagine if your LLM had a Swiss army knife of APIs at its disposal. That’s function calling in a nutshell! It’s like giving your AI assistant a phonebook of expert friends it can call for help. Need to book a flight? Call the travel agent API! Want to generate an image? Ring up the art department! It’s like your LLM just gained a whole team of interns.
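Concretely, a tool call usually boils down to the model emitting a small JSON payload instead of prose, which your code then executes. Here’s a minimal sketch; the `book_flight` tool, its registry, and its arguments are purely illustrative, and the exact wire format varies by model (we’ll see EdgeRunner’s `[TOOL_CALLS]` flavor shortly):

    # Purely illustrative: a tiny {name: callable} registry of tools.
    def book_flight(destination: str, date: str) -> str:
        return f"Booked a flight to {destination} on {date}!"

    available_tools = {"book_flight": book_flight}

    # A tool call the model might emit instead of a plain-text answer.
    tool_call = {
        "name": "book_flight",         # which "expert friend" to phone
        "arguments": {                 # the parameters that function expects
            "destination": "Canada",
            "date": "2024-12-24",
        },
    }

    # Your wrapper runs the matching function and feeds the result
    # back to the model so it can finish its reply.
    result = available_tools[tool_call["name"]](**tool_call["arguments"])
    print(result)  # Booked a flight to Canada on 2024-12-24!

Let’s Get This Party Started!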

Enough chit-chat! Let’s dive into the code and turn your local machine into an AI powerhouse. We’ll focus on adding image generation to our open-source LLM because, let’s face it, sometimes words just aren’t enough to express our AI-induced joy.

Step 1: The Magical System Prompt

First, we need to give our LLM its marching orders. Think of this as the job description for your new AI assistant:

import json

def _create_system_prompt(self):
    """Advertise the available tools to the model via the system prompt."""
    tools_json = json.dumps([
        {
            "name": "generate_image",
            "description": "Generates an image based on the given prompt using a Stable Diffusion model.",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {
                        "type": "string",
                        "description": "The text prompt to generate the image from."
                    }
                },
                "required": ["prompt"]
            }
        }
        # ... other tools omitted for brevity
    ])
    return f"You are a helpful assistant with access to the following functions. Use them if required:\n[AVAILABLE_TOOLS] {tools_json}[/AVAILABLE_TOOLS]"

This is like giving your AI a toolbelt. We’re saying, “Hey AI, you’ve got this cool `generate_image` function. Use it wisely, and don’t try to generate images of my browser history!”
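If you’re curious what the model actually sees, you can render the prompt outside the class. This is a hypothetical standalone harness (with the tool schema abbreviated), not part of the original code:

    import json

    # Abbreviated tool schema, just to show the prompt shape.
    tools = [{"name": "generate_image", "parameters": {"type": "object"}}]
    tools_json = json.dumps(tools)
    prompt = (
        "You are a helpful assistant with access to the following functions. "
        f"Use them if required:\n[AVAILABLE_TOOLS] {tools_json}[/AVAILABLE_TOOLS]"
    )
    print(prompt)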

Step 2: Decoding the AI’s Mysterious Ways

Now, we need to interpret what our AI is trying to tell us. It’s like being a teenager’s parent — you need to decode their grunts and eye rolls:

import json
import re

def parse_funcs(result: str):
    """Pull the JSON list of tool calls out of the model's raw output."""
    pattern = r"\[TOOL_CALLS\] (.*?)$"
    match = re.search(pattern, result, re.DOTALL)
    if match is None:
        # No [TOOL_CALLS] marker: the model answered in plain text.
        return []
    raw_input = match.group(1).strip("\n")
    funcs = json.loads(raw_input)
    return funcs

# Later in the code, inside the model wrapper...
tools_messages = []
for func in funcs:
    if func["name"] in self.tools:
        tool_content = self.tools[func["name"]](**func["arguments"])
        tools_messages.append(
            {"role": "tool", "name": func["name"], "content": tool_content}
        )
self._messages.extend(tools_messages)
assistant_message = self._generate_current()

This part is like having a translator for AI-speak. When it says “I want to use the `generate_image` function,” we say “Say no more, fam” and make it happen.
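To make that concrete, here’s roughly what a raw completion with a tool call looks like and what `parse_funcs` extracts from it (the prose before the marker is illustrative):

    # Illustrative raw completion from the model.
    raw = (
        "Sure, one sombrero cat coming up. "
        '[TOOL_CALLS] [{"name": "generate_image", '
        '"arguments": {"prompt": "a cat wearing a sombrero"}}]'
    )

    funcs = parse_funcs(raw)
    print(funcs)
    # [{'name': 'generate_image', 'arguments': {'prompt': 'a cat wearing a sombrero'}}]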

Step 3: The Picasso of Pixels

Here’s where the magic really happens. We’re turning our AI into a digital Picasso (minus the cubism, unless that’s your thing):

import logging

import torch
from diffusers import StableDiffusionPipeline

logger = logging.getLogger(__name__)

def generate_image(prompt: str) -> str:
    """
    Generates an image based on the given prompt using a Stable Diffusion model.

    Args:
        prompt: The text prompt to generate the image from.

    Returns:
        A string containing the path to the generated image.
    """
    try:
        model_id = "runwayml/stable-diffusion-v1-5"
        pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
        pipe = pipe.to("cuda")

        image = pipe(prompt).images[0]

        # Save the image and return the path
        image_path = "generated_image.png"
        image.save(image_path)
        return image_path
    except Exception as e:
        logger.error(f"Error generating image: {str(e)}")
        return f"Error generating image: {str(e)}"

This function is like giving your AI a set of digital paintbrushes. You provide the prompt, and it goes to town creating a masterpiece. Or at least something that vaguely resembles what you asked for (let’s be real, AI art can be… interesting).
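One caveat before you go wild: as written, the function reloads the entire pipeline on every call, which is painfully slow. A common fix (a sketch, not part of the original code) is to cache the pipeline at module level and load it once:

    import torch
    from diffusers import StableDiffusionPipeline

    _pipe = None  # lazily initialized, shared across calls

    def _get_pipe():
        """Load the Stable Diffusion pipeline once and reuse it afterwards."""
        global _pipe
        if _pipe is None:
            _pipe = StableDiffusionPipeline.from_pretrained(
                "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
            ).to("cuda")
        return _pipe

Swap `_get_pipe()` into `generate_image` and only the first request pays the loading cost; every later one just paints.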

Putting It All Together

Now that we’ve got all the pieces, it’s time to assemble our AI Voltron! Here’s how to bring this digital dream to life:

1. Here is the link to the files you’ll need for this. Set up your environment (it’s like preparing a cozy home for your AI):

conda env create -f requirements.yaml
conda activate your_env_name  # replace with the environment name defined in requirements.yaml
conda install pytorch torchvision torchaudio pytorch-cuda=12.3 -c pytorch -c nvidia

Don’t forget to feed your conda! A well-fed conda is a happy conda.
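Since the image-generation tool calls `pipe.to("cuda")`, it’s worth confirming PyTorch can actually see your GPU before going further:

    import torch

    # If this prints False, the pipe.to("cuda") call later will fail.
    print(torch.cuda.is_available())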

2. Load up the EdgeRunner Command model (it’s like choosing your starter Pokémon, but nerdier):

from edgerunner import Model

model = Model(
    "edgerunner-ai/EdgeRunner-Command-7B",
    {
        "search": search,
        "media_search": media_search,
        "generate_image": generate_image,
    },
)

This is where the magic happens! We’re summoning the EdgeRunner-Command-7B model and equipping it with our custom tools. It’s like giving Gandalf a smartphone — powerful and slightly ridiculous.
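With the tools registered, a single call should be enough to make the model reach for one on its own (a sketch; the prompt and the reply are illustrative):

    # The model decides by itself whether to answer in prose or call a tool.
    response = model.chat("Generate an image of a sunset over the mountains")
    print(response)  # e.g. a reply pointing at the saved generated_image.png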

3. Create a Gradio app (because every AI deserves a pretty face):


import gradio as gr

def chat(message, history):
    response = model.chat(message)
    return response

demo = gr.ChatInterface(chat)
demo.launch()

This sets up a simple chat interface. It’s like giving your AI its own talk show, minus the uncomfortable celebrity interviews.
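And if you want to parade your AI’s talk show beyond localhost, Gradio can tunnel a temporary public URL for you:

    # share=True serves the app through a temporary public *.gradio.live link.
    demo.launch(share=True)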

4. Run your script and watch the magic unfold:

python your_script_name.py

Now, users can input prompts and watch in awe as your AI generates both witty responses AND images. It’s like having a conversation with a particularly artistic magic 8-ball!

The Grand Finale

And there you have it, folks! You’ve just turned your humble open-source LLM into a multitasking marvel. It’s like upgrading from a flip phone to a smartphone, but for AI!

With the EdgeRunner Command model at your fingertips, you’re not just talking to an AI — you’re orchestrating a whole digital symphony. Need a web search? Bam! Want some media recommendations? Pow! Craving a generated image of a cat wearing a sombrero while riding a unicycle? …Okay, maybe let’s not push it too far.

Remember, with great power comes great responsibility. Use your newfound AI abilities wisely. Don’t generate too many images of cats wearing sunglasses… or do, I’m not your boss.

Stay tuned for our next tutorial: “Teaching Your AI to Do Your Taxes (Legal Disclaimer: Please Don’t Actually Do This)”!

Happy coding, and may your functions always call and your images always generate! 🚀🎨


Aman Sharma

A data science and machine learning enthusiast. I love working on robotics, machine perception, and cognition.