Intro to function calling with Vertex AI and Gemini
Code Repo for this blog post: https://github.com/dianaartiom/function-calling-with-vertex-ai
Hey there, tech enthusiasts! In this blog post I’m exploring function calling with Gemini on Google Cloud — it’s really cool to try the latest AI technologies to solve real-world problems. Today, we’re diving into Vertex AI and how you can set up everything you need to start calling functions for GenAI Models like a pro.
So what is function calling? Function calling is a fundamental concept where specific tasks are executed by invoking predefined functions. This allows us to break down complex processes into manageable actions. Building on this, agents are intelligent entities that autonomously utilize these functions to perform intricate operations. By equipping agents with a suite of tools, or predefined functions, we enable them to handle diverse tasks efficiently, from retrieving product details to placing orders. This combination of function calling and agent-driven execution not only enhances the capabilities of AI systems but also simplifies our interaction with technology, making it more intuitive and powerful. Now, let’s dive into some code to see how this works in practice.
But before we get into the cool stuff, let’s make sure we’ve got our foundation solid. You wouldn’t build a rocket without checking your fuel, right? So, we’re going to set up our Google Cloud environment, install the necessary tools, and get everything ready to launch. Trust me, the setup might seem tedious, but it’s like stretching before a workout — essential for optimal performance.
So, let’s roll up our sleeves and get started with the prerequisites. After that, we’ll go into how you can use Gemini for function calling, and by the end of this post, you’ll be all set to use Gemini AI models with function tools and see them in action. Ready? Let’s go!
Prerequisites
Create a Google Cloud Project
(Feel free to skip this introductory section if you’re already knowledgeable in GCP and GCP Projects.)
First things first, we need a Google Cloud Platform (GCP) project. If you haven’t created one yet, follow the instructions here to set it up. This will be our launchpad. In the console, navigate to https://console.cloud.google.com/projectcreate, and you’ll see an interface like the one below:
Note the Project ID under the Project name — this is the unique identifier for your project, and you’ll need it later when configuring other resources.
Setting Up a Python Virtual Environment
To keep our dependencies clean and manageable, we’ll use a Python virtual environment. Run the following commands to create and activate one (the activation command shown is for macOS/Linux):
$ python3 -m venv venv
$ source venv/bin/activate
Installing Necessary Tools
Install Google Cloud SDK
The Google Cloud SDK is our toolkit for managing cloud resources. Follow the Google Cloud CLI Documentation to get it installed.
Initialize the Google Cloud SDK
Once you’ve got the SDK, initialize it with:
$ gcloud init
This command walks you through authorizing the CLI with your Google account and choosing a default configuration, including a default project.
Set the Default Project
Now, let’s set our default project. Remember the Project ID I told you about earlier? You’ll need it here. You can also find this ID when you click on the project name in the top left(ish) corner:
$ gcloud config set project [PROJECT_ID]
Install the Vertex AI SDK for Python
To interact with Vertex AI services, we’ll need the Vertex AI SDK. Install it with pip:
$ pip3 install google-cloud-aiplatform
Configuring Billing
Create a Billing Account
We need an active billing account to use Google Cloud services. Create one here.
Enable Billing for Your Project
Enable the Cloud Billing API:
$ gcloud services enable cloudbilling.googleapis.com
Link your billing account to your project:
$ gcloud billing projects link [PROJECT_ID] --billing-account [BILLING_ACCOUNT_ID]
If you don’t know your BILLING_ACCOUNT_ID, get the info from the following command:
$ gcloud billing accounts list
Going onto the cool stuff — Function calling
Function calling gives LLMs extra capabilities and makes them more powerful. Let’s take a look at how function calling works with Vertex AI. For that, let us consider a hypothetical use case: imagine having a personal assistant that can handle all your restaurant reservations seamlessly. Whether you’re planning a dinner date, a family gathering, or a casual meal with friends, this assistant can check table availability and even place a reservation for you. By leveraging AI agents and tools, we can create a sophisticated restaurant reservation system that simplifies the entire process. Let’s explore how this can be implemented through some code.
First, let’s import the necessary client libraries we need:
import requests
from vertexai.generative_models import (
    Content,
    FunctionDeclaration,
    GenerationConfig,
    GenerativeModel,
    Part,
    Tool,
)
We start by importing the vertexai library, which we will use to initialize and interact with Vertex AI. The library provides the necessary tools and functions to work with generative models and function calls. We also import requests, which could be useful if our functions need to fetch or send data over the web. The from vertexai.generative_models import statement brings in several essential classes, including Content, FunctionDeclaration, GenerationConfig, GenerativeModel, Part, and Tool. These classes will help us define functions, configure our generative model, and create a chat instance.
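One step the snippets in this post don’t show explicitly: before using the SDK, Vertex AI needs to be initialized with your project and a region. A minimal sketch — the project ID and region below are placeholders for your own values:

```python
import vertexai

# Placeholders: substitute your own Project ID and a supported region.
vertexai.init(project="your-project-id", location="us-central1")
```
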
Next, we’ll define the functions that our AI agent will use to handle restaurant reservations:
# Declaring a function for checking table availability
check_table_availability = FunctionDeclaration(
    name="check_table_availability",
    description="Check the availability of tables for a given date and time",
    parameters={
        "type": "object",
        "properties": {
            "restaurant_name": {"type": "string", "description": "Name of the restaurant"},
            "date": {"type": "string", "description": "Date for the reservation"},
            "time": {"type": "string", "description": "Time for the reservation"},
            "party_size": {"type": "integer", "description": "Number of people"},
        },
    },
)

# Declaring a function for placing a reservation
place_reservation = FunctionDeclaration(
    name="place_reservation",
    description="Place a reservation at a restaurant",
    parameters={
        "type": "object",
        "properties": {
            "restaurant_name": {"type": "string", "description": "Name of the restaurant"},
            "date": {"type": "string", "description": "Date for the reservation"},
            "time": {"type": "string", "description": "Time for the reservation"},
            "party_size": {"type": "integer", "description": "Number of people"},
            "contact_info": {"type": "string", "description": "Contact information for the reservation"},
        },
    },
)
Here, we define two functions that the AI will use to interact with the restaurant reservation system. Each function is declared using the FunctionDeclaration class, which outlines the function’s name, description, and parameters.
- check_table_availability: This function checks the availability of tables for a given date and time at a specified restaurant. It takes parameters like restaurant_name, date, time, and party_size to perform its task.
- place_reservation: This function places a reservation at a specified restaurant. It takes parameters like restaurant_name, date, time, party_size, and contact_info to complete the reservation process.
These function declarations are crucial as they define the operations our AI agent can perform and the data it needs to execute these operations.
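The parameters blocks above are ordinary OpenAPI-style JSON schemas. As a quick illustration of what they describe — matches_schema here is a hypothetical helper, not part of the SDK — we can sanity-check a sample argument dict against the schema with plain Python:

```python
# OpenAPI-style schema, identical in shape to the FunctionDeclaration above.
schema = {
    "type": "object",
    "properties": {
        "restaurant_name": {"type": "string"},
        "date": {"type": "string"},
        "time": {"type": "string"},
        "party_size": {"type": "integer"},
    },
}

# Map JSON schema type names to Python types.
PY_TYPES = {"string": str, "integer": int}

def matches_schema(args: dict, schema: dict) -> bool:
    """Return True if every argument is declared and has the declared type."""
    props = schema["properties"]
    return all(
        key in props and isinstance(value, PY_TYPES[props[key]["type"]])
        for key, value in args.items()
    )

print(matches_schema(
    {"restaurant_name": "La Piazza", "date": "June 15th", "time": "7 PM", "party_size": 4},
    schema,
))  # → True
```

This is the contract the model works against: when it decides to call a function, it fills in arguments matching exactly these names and types.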
We need to create a Tool instance and add the declared functions to it. This allows our model to use these functions during interactions.
# Creating a Tool instance and adding the declared functions to it
restaurant_tool = Tool(
    function_declarations=[
        check_table_availability,
        place_reservation,
    ],
)
The Tool class is used to bundle the declared functions into a single tool that the generative model can use. By creating an instance of Tool and passing the list of function declarations, we provide our model with the necessary tools to perform restaurant reservation-related tasks. This step ensures that our AI agent has access to the specific functions it needs to check table availability and place reservations.
Now, let’s initialize a GenerativeModel with a specified model name and generation configuration. This step sets up our model with the necessary parameters and tools.
# Initializing a GenerativeModel with a specified model name and generation configuration
model = GenerativeModel(
    "gemini-1.5-pro-001",
    generation_config=GenerationConfig(temperature=0),
    tools=[restaurant_tool],
)
Here, we initialize the generative model by creating an instance of the GenerativeModel class. We specify the model name ("gemini-1.5-pro-001") and the generation configuration. The GenerationConfig class allows us to configure various parameters for the generative model, such as temperature, which controls the randomness of the model’s output. In this case, we set the temperature to 0 for more deterministic responses. We also pass the restaurant_tool we created earlier, giving the model access to the defined functions.
Next, we start a chat session with the initialized model. This session allows us to interact with the model using natural language prompts.
# Starting a chat session with the initialized model
chat = model.start_chat()
# Defining the prompt to be sent to the chat model
prompt = """
Is there a table available at La Piazza for 4 people on June 15th at 7 PM?
"""
# Sending the prompt message to the chat model and getting the response
response = chat.send_message(prompt)
# Printing the first part of the response content
print(response.candidates[0].content.parts[0])
The start_chat method initializes a session where we can send prompts and receive responses. We define a prompt asking about table availability at a specific restaurant and send it to the model using the send_message method. The model processes the prompt, determines which of the declared functions to call and with which arguments, and responds accordingly. We then print the first part of the response content to see the model’s output. Here’s what the output looks like when I run $ python main.py:
function_call {
  name: "check_table_availability"
  args {
    fields {
      key: "time"
      value {
        string_value: "7 PM"
      }
    }
    fields {
      key: "restaurant_name"
      value {
        string_value: "La Piazza"
      }
    }
    fields {
      key: "party_size"
      value {
        number_value: 4
      }
    }
    fields {
      key: "date"
      value {
        string_value: "June 15th"
      }
    }
  }
}
The output indicates that the AI model correctly interpreted the prompt and generated a function call to check table availability. Here’s a breakdown:
- function_call: The AI generated a function call.
- name: “check_table_availability”: The function being called is check_table_availability.
- args: Contains the parameters for the function:
  - time: “7 PM”
  - restaurant_name: “La Piazza”
  - party_size: 4
  - date: “June 15th”
It detected the correct function, identified parameters accurately and structured the output — just what the Chef ordered! 🤩
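Before wiring up the mock API, it’s worth seeing how a function call like the one above becomes a plain Python call: the args field behaves like a mapping, so it can be converted to a dict and unpacked as keyword arguments. A small sketch, with an ordinary dict standing in for the proto args map and a hypothetical local function on the receiving end:

```python
# A plain dict stands in for the proto `args` map returned by the model.
args = {
    "restaurant_name": "La Piazza",
    "date": "June 15th",
    "time": "7 PM",
    "party_size": 4,
}

def check_table_availability(restaurant_name, date, time, party_size):
    # Hypothetical local implementation the call gets routed to.
    return f"Checking {restaurant_name} on {date} at {time} for {party_size} people"

# The same conversion the handler below uses: dict(...) then ** unpacking.
kwargs = dict(args.items())
print(check_table_availability(**kwargs))
# → Checking La Piazza on June 15th at 7 PM for 4 people
```
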
Now, let’s make some fake API Calls ^^ For this, we’ll create a mock API to simulate the responses from a real restaurant reservation system.
First, we’ll create a MockRestaurantAPI class to handle our mock API calls. This class will have methods to check table availability and place a reservation. Note: the functionality here is just a dummy implementation; in a real system, you’d write your actual business logic in these methods.
class MockRestaurantAPI:
    @staticmethod
    def check_table_availability(restaurant_name, date, time, party_size):
        # Mock response to simulate table availability
        if restaurant_name == "La Piazza" and date == "June 15th" and time == "7 PM" and party_size == 4:
            return True, "Table is available"
        else:
            return False, "Table is not available"

    @staticmethod
    def place_reservation(restaurant_name, date, time, party_size, contact_info):
        # Mock response to simulate reservation placement
        if restaurant_name == "La Piazza" and date == "June 15th" and time == "7 PM" and party_size == 4:
            return "Reservation confirmed with ID: 12345"
        else:
            return "Reservation failed"
# Importing necessary utility functions
from utils import is_function_calling
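The utils module isn’t shown in this post, so here is a hedged guess at what is_function_calling might look like: it simply checks whether the first part of the model’s response carries a named function_call. The stand-in objects at the bottom exist only to demonstrate the response shape it expects:

```python
from types import SimpleNamespace

def is_function_calling(response) -> bool:
    """Return True when the response's first part contains a named function call."""
    part = response.candidates[0].content.parts[0]
    function_call = getattr(part, "function_call", None)
    return function_call is not None and bool(function_call.name)

# Tiny stand-ins mimicking the SDK's response structure, for demonstration only.
def fake_response(part):
    return SimpleNamespace(candidates=[SimpleNamespace(content=SimpleNamespace(parts=[part]))])

calling = fake_response(SimpleNamespace(function_call=SimpleNamespace(name="check_table_availability")))
texting = fake_response(SimpleNamespace(function_call=None, text="Yes, a table is available."))
print(is_function_calling(calling), is_function_calling(texting))  # → True False
```
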
Next, we’ll define helper functions to handle the function calls and user messages. These functions will interact with the MockRestaurantAPI to simulate real API behavior.
import json

# Handle function call
def handle_function_call(response):
    if is_function_calling(response):
        function_call = response.candidates[0].content.parts[0].function_call
        kwargs = dict(function_call.args.items())
        if function_call.name == "check_table_availability":
            available, msg = MockRestaurantAPI.check_table_availability(**kwargs)
            response_content = json.dumps({"available": available, "message": msg})
        elif function_call.name == "place_reservation":
            msg = MockRestaurantAPI.place_reservation(**kwargs)
            response_content = json.dumps({"message": msg})
        # Send the function result back to the model as a function response
        response = chat.send_message(
            Part.from_function_response(
                name=function_call.name,
                response={
                    "content": response_content,
                },
            ),
        )
    return response
# Handle user message
def handle_user_message(response):
    # Show the model's text to the user and forward their reply
    user_input = input(response.candidates[0].content.parts[0].text)
    return chat.send_message(user_input)

# Complete the process by checking if the reservation is confirmed
def complete_process(response):
    reservation_text = response.candidates[0].content.parts[0].text
    if "Reservation confirmed" in reservation_text:
        print(reservation_text)
        return True
    return False
Now we tie everything together in the main process loop. This loop will handle the interaction flow, making function calls as needed and handling user input.
# Define the initial prompt
prompt = """
Is there a table available at La Piazza for 4 people on June 15th at 7 PM?
"""

# Send the initial message to the chat model
response = chat.send_message(prompt)

# Main process loop
while True:
    if is_function_calling(response):
        response = handle_function_call(response)
    else:
        if complete_process(response):
            break
        else:
            response = handle_user_message(response)
This loop sends the initial prompt to the chat model, then continuously checks whether the model is calling a function or needs further user input. It handles function calls by invoking the appropriate methods from MockRestaurantAPI and sending the results back to the chat model. If the reservation is confirmed, it prints the confirmation and exits the loop.
Here’s what happens when I run the code:
Summary
By following this blog post, you’ve set up a Google Cloud environment, configured Vertex AI, and implemented function calling to handle restaurant reservations. This showcases how powerful AI models can be when integrated with external functionalities, allowing for seamless and intelligent task automation.
I hope you enjoyed this journey into function calling with Vertex AI and Gemini on Google Cloud. Stay tuned for more exciting tech posts!
Code Repo for this blog post: https://github.com/dianaartiom/function-calling-with-vertex-ai