Google Cloud - Community

A collection of technical articles and blogs published or curated by Google Cloud Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.

Function calling with Gemma3 using Ollama


Function calling allows a model to act as a bridge between natural language and real-world actions and data.

Introduction

In this article, we'll explore how to enable function calling with Gemma 3, Google's powerful open-source LLM, using Ollama to perform real-time search. We'll walk through a hands-on example to demonstrate how a local LLM can interact with external tools such as APIs or Python functions.

Image by Author

What is Function Calling?

Function calling enables the model to go beyond generating text — it allows the model to interact with external tools, APIs, and services, transforming natural language into real-world actions.

Primary use cases of function calling:

  1. Data Retrieval — Fetch information from APIs, databases, or other sources dynamically (e.g., weather, stock prices, documentation).
  2. Tool Execution — Trigger Python functions or scripts to perform calculations, automate tasks, or control devices.
  3. Multi-Agent Collaboration — Enable agent systems to delegate tasks among themselves and call specialized tools when needed.

Implementation Guide

Let's go through a step-by-step breakdown of enabling function calling with Gemma 3.

Step-by-Step Breakdown

The decision flow below illustrates how Gemma 3, running via Ollama, decides whether to respond directly to a user's prompt or trigger a function call (e.g., an external API like Serper).

Image by Author

Prerequisites

Before diving into the code, please ensure the following prerequisites are met on your local machine:

  1. Python 3.8+
  2. Ollama installed (Gemma 3 is available in 1B, 4B, 12B, and 27B variants); I am using the 27B model for this demo
  3. Serper.dev API key for Google Search

Step 1: Setting up the environment

First, let's create a virtual environment:

python -m venv venv
source venv/bin/activate

Install the required Python packages:

pip install gradio ollama requests pydantic python-dotenv

Then add the imports at the top of your Python script (e.g., function-calling-gemma.py):

import gradio as gr
import ollama
import requests
import json
import os
from dotenv import load_dotenv
from pydantic import BaseModel, Field
from typing import Optional, Dict, Any, List

Step 2: Set up Environment Variables

For SERPER_API_KEY, create an API key from the Serper.dev dashboard at https://serper.dev (a free search quota is included) and add it to a .env file in your project root:

SERPER_API_KEY=your_serper_api_key_here
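
The imports from Step 1 already include python-dotenv, but the loading step isn't shown in the article, so here is a minimal sketch (assuming the .env file sits next to your script):

# Load environment variables from .env and read the Serper API key
load_dotenv()
SERPER_API_KEY = os.getenv("SERPER_API_KEY")

if not SERPER_API_KEY:
    raise ValueError("SERPER_API_KEY is not set; add it to your .env file")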

Step 3: Set up Ollama with Gemma 3

  1. Install Ollama from https://ollama.ai/
  2. Pull the Gemma 3 model:

ollama pull gemma3:27b
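
To confirm the model is available locally before building the app, a quick sanity check with the ollama Python client can help (the prompt text here is just an example):

# Sanity check: ask the local Gemma 3 model for a short completion
import ollama

reply = ollama.chat(
    model="gemma3:27b",
    messages=[{"role": "user", "content": "Reply with one short sentence."}],
)
print(reply["message"]["content"])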

Step 4: Define Data Models with Pydantic

Use Pydantic to define structured inputs and outputs for search and function calls:

from pydantic import BaseModel, Field
from typing import Optional, Dict, Any

class SearchParameters(BaseModel):
    query: str = Field(..., description="Search term to look up")

class FunctionCall(BaseModel):
    name: str
    parameters: Dict[str, Any]

class SearchResult(BaseModel):
    title: str
    link: str
    snippet: str

    def to_string(self) -> str:
        return f"Title: {self.title}\nLink: {self.link}\nSnippet: {self.snippet}"
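
To see how these models fit together, here is a small illustrative check; the JSON string is a made-up example of the function-call format Gemma 3 is asked to emit later in Step 6:

# Validate a model-emitted function call against the Pydantic models
raw = '{"name": "google_search", "parameters": {"query": "Google Cloud Next 2025 dates"}}'
call = FunctionCall(**json.loads(raw))
params = SearchParameters(**call.parameters)
print(call.name, "->", params.query)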

Step 5: Build the Google Search Function with the Serper API

When a function call is triggered by Gemma 3, the chatbot passes the query to the google_search function. This function uses the Serper API to fetch real-time search results and returns the first result in a structured format.

def google_search(query: str) -> SearchResult:
    url = "https://google.serper.dev/search"
    headers = {
        'X-API-KEY': SERPER_API_KEY,
        'Content-Type': 'application/json'
    }
    payload = json.dumps({"q": query})
    response = requests.post(url, headers=headers, data=payload)
    results = response.json()

    if not results.get('organic'):
        raise ValueError("No search results found.")

    first = results['organic'][0]
    return SearchResult(
        title=first.get("title", "No title"),
        link=first.get("link", "No link"),
        snippet=first.get("snippet", "No snippet")
    )
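
You can sanity-check the search helper on its own before wiring it into the chat flow (this call hits the Serper API, so it consumes one search credit; the query string is arbitrary):

# Quick manual test of the Serper-backed search helper
result = google_search("Google Cloud Next 2025 dates")
print(result.to_string())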

Step 6: Define the System Prompt, Function-Call Format, and Search Function

SYSTEM_MESSAGE acts as the instructional guide for Gemma 3: a set of rules for the model's decision-making process, telling it whether to answer directly or trigger a function call (like a search).

This system prompt helps the model:

  • Decide when to rely on its internal knowledge.
  • Know when to call an external search function.
  • Format the function call output in a precise JSON structure.
# System message for the model
SYSTEM_MESSAGE = """You are an AI assistant with training data up to 2023. Answer questions directly when possible, and use search when necessary.

DECISION PROCESS:
1. For historical events (pre-2023):
→ Answer directly from your training data

2. For 2023 events:
→ If you have clear knowledge → Answer directly
→ If uncertain about details → Use search

3. For current events (post-2023):
→ Always use search

4. For timeless information (scientific facts, concepts, etc.):
→ Answer directly from your training data


FUNCTION CALL FORMAT:
When you need to search, respond WITH ONLY THE JSON OBJECT, no other text, no backticks:
{
"name": "google_search",
"parameters": {
"query": "your search query"
}
}

SEARCH FUNCTION:
{
"name": "google_search",
"description": "Search for real-time information",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search term"
}
},
"required": ["query"]
}
}
"""

Step 7: Chat Flow from the Gradio UI to the LLM

  • User input
  • Gemma’s decision to respond or call a function
  • Search execution and response generation

# Model name
MODEL_NAME = "gemma3:27b"

def process_message(user_input):
    """Process user message and return response"""
    try:
        response = ollama.chat(
            model=MODEL_NAME,
            messages=[
                {"role": "system", "content": SYSTEM_MESSAGE},
                {"role": "user", "content": user_input}
            ]
        )

        # Get the model's response
        model_response = response['message']['content']

        # Try to parse the response as a function call
        function_call = parse_function_call(model_response)

        if function_call and function_call.name == "google_search":
            search_params = SearchParameters(**function_call.parameters)
            search_query = search_params.query
            # ... (search handling and final response generation elided; see the full source on GitHub)
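
The parse_function_call helper and the remainder of the search branch are elided above; the full version lives in the GitHub repo linked below. A minimal sketch of the missing pieces, assuming the model returns either plain text or the bare JSON object described by the system prompt, might look like this:

def parse_function_call(model_response: str) -> Optional[FunctionCall]:
    """Return a FunctionCall if the model emitted the JSON function-call format, else None."""
    try:
        data = json.loads(model_response.strip())
        return FunctionCall(**data)
    except (json.JSONDecodeError, TypeError, ValueError):
        return None

# Inside process_message, after extracting search_query, the flow continues roughly like:
#   result = google_search(search_query)
#   follow_up = ollama.chat(
#       model=MODEL_NAME,
#       messages=[
#           {"role": "system", "content": SYSTEM_MESSAGE},
#           {"role": "user", "content": user_input},
#           {"role": "assistant", "content": model_response},
#           {"role": "user", "content": f"Search results:\n{result.to_string()}\nAnswer the original question."}
#       ]
#   )
#   return follow_up['message']['content']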

Step 8: Start the Gradio UI

python function-calling-gemma.py
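
The Gradio wiring inside function-calling-gemma.py is not shown above; a minimal version that reuses process_message might look like this sketch (the actual UI in the repo may differ):

# Minimal Gradio interface around the chat handler
demo = gr.Interface(
    fn=process_message,
    inputs=gr.Textbox(label="Ask a question"),
    outputs=gr.Textbox(label="Response"),
    title="Function Calling with Gemma 3 + Ollama",
)

if __name__ == "__main__":
    demo.launch()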

Step 9: Demo

1. User enters the query "What are the dates for Google Cloud Next 2025?" → Gemma 3 triggers a function call → Google Search via Serper.

2. User enters the query "Who won the Super Bowl in 2019?" → Gemma 3 responds from its training data (no function call).

Function Calling with Gemma 3 — Demo

GitHub Repository

You can find the complete source code on GitHub: Function-Calling-Gemma3

Conclusion

In this article, we demonstrated how to use Gemma 3, Google's open-source LLM, with function calling using Ollama, Gradio, and the Serper API. The same pattern can be extended to file summarization, code execution, and task automation, and taken further with multi-agent frameworks like CrewAI or AutoGen.

Written by Arjun Prabhulal

I explore AI/ML and open-source tools and break them down into simple, practical guides that anyone can follow.
