Geospatial Function Calling with ChatGPT

John Lister
11 min readFeb 10, 2024

Most of the code in this example is based directly on this cookbook and has been adapted to work as a Python terminal assistant that executes basic geospatial functions for the sake of illustration.

Function calling with ChatGPT lets us write custom software and methods, then have a large language model determine the appropriate function to call and the parameters it needs based on a user's natural language input.

So here is the basic idea for this tutorial: we want to build a geospatial chat bot that can repair the geometry of a GeoDataFrame and perform some simple geospatial operations (buffer by a distance, and return the bounding box of the GeoDataFrame). This is pretty simple, purely to demonstrate the concept of function calling. Try to imagine calling a series of geospatial APIs that perform complex algorithms, all through the flow of natural language!

CLONE THE REPO HERE

Basic Requirements

The basic requirement here is that you have an OpenAI developer key. HERE is a great article showing you how to get set up, and I also have the full code for this project, which you can clone and pip install -r requirements.txt to get all the dependencies you'll need.

You will need to update the geo_chat.py file and replace openai.api_key = "YOUR_API_KEY" with your own OpenAI key.
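If you would rather not hard-code the key in the file, one common alternative (not part of the repo, just a sketch) is to read it from an environment variable:

import os
import openai

# Read the key from an environment variable instead of hard coding it.
# Assumes you have exported OPENAI_API_KEY in your shell before running geo_chat.py.
openai.api_key = os.environ.get("OPENAI_API_KEY", "YOUR_API_KEY")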

Let’s look at our spatial tools…

In our spatial_functions.py file we have our functions that perform our fun geospatial work:

  • check_geom — used in check_geodataframe_geom to evaluate validity of individual geometries.
  • check_geodataframe_geom — used in repair_geodataframe_geom to evaluate whether there are any invalid geometries in the geodataframe.
  • repair_geodataframe_geom — repairs the geodataframe if there are any issues.
  • buffer_gdf — as the name suggests this function buffers the geodataframe by a specified distance (we aren’t doing any fancy unit conversions for simplicity).
  • bounding_box_of_gdf — generates a geospatial bounding box for the geodataframe.

If we run this file, we can see it execute the logic against a hard-coded geodataframe that has basic geometry errors: we repair it, buffer it, and return its bounding box. You can spice these up however you wish, but I am keeping things very simple for this article.

from shapely.geometry import Polygon, LineString
import geopandas as gpd

# Define the initial set of invalid geometries as a GeoDataFrame
invalid_geometries = gpd.GeoSeries([
    Polygon([(0, 0), (0, 2), (1, 1), (2, 2), (2, 0), (1, 1), (0, 0)]),
    Polygon([(0, 2), (0, 1), (2, 0), (0, 0), (0, 2)]),
    LineString([(0, 0), (1, 1), (1, 0)]),
], crs='EPSG:3857')
invalid_polygons_gdf = gpd.GeoDataFrame(geometry=invalid_geometries)

# Function to check geometry validity
def check_geom(geom):
    return geom.is_valid

# Function to check all geometries in a GeoDataFrame
def check_geodataframe_geom(geodataframe: gpd.GeoDataFrame) -> bool:
    valid_check = geodataframe.geometry.apply(check_geom)
    return valid_check.all()

# Function to repair geometries in a GeoDataFrame
def repair_geodataframe_geom(geodataframe: gpd.GeoDataFrame) -> dict:
    if not check_geodataframe_geom(geodataframe):
        print('Invalid geometries found, repairing...')
        geodataframe = geodataframe.copy()
        geodataframe['geometry'] = geodataframe.geometry.make_valid()
    return {"repaired": True, "gdf": geodataframe}

# Function to buffer all geometries in a GeoDataFrame
def buffer_gdf(geodataframe: gpd.GeoDataFrame, distance: float) -> dict:
    print(f"Buffering geometries by {distance}...")
    # Check type of distance
    if not isinstance(distance, (int, float)):
        raise TypeError("Distance must be a number")

    # Applying a buffer to each geometry in the GeoDataFrame
    buffered_gdf = geodataframe.copy()
    buffered_gdf['geometry'] = buffered_gdf.geometry.buffer(distance)
    return {"message": "Geometries buffered successfully", "gdf": buffered_gdf}

# Function to get the bounding box of all geometries in a GeoDataFrame
def bounding_box_of_gdf(geodataframe: gpd.GeoDataFrame) -> dict:
    # get bounding box of geodataframe
    bbox = geodataframe.total_bounds
    return {"message": "Bounding box obtained successfully", "bbox": bbox}

# Main execution block
if __name__ == '__main__':
    print("Checking and repairing geometries...")
    # repair_geodataframe_geom returns a dict, so pull the GeoDataFrame out of it
    repaired_polygons_gdf = repair_geodataframe_geom(invalid_polygons_gdf)["gdf"]

    all_geometries_valid = check_geodataframe_geom(repaired_polygons_gdf)
    print(f"All geometries valid: {all_geometries_valid}")

    # Example of buffering the geometries
    buffered_polygons_gdf = buffer_gdf(repaired_polygons_gdf, 0.1)["gdf"]

    # Getting the bounding box of the geometries
    bbox = bounding_box_of_gdf(buffered_polygons_gdf)["bbox"]
    print(f"Bounding box: {bbox}")

But now I want a user to be able to use these functions easily, simply by telling a chat bot which actions they want performed.

Building our Chat Bot

This is everything we'll need to make a bare-bones chat bot capable of calling our geospatial functions. The hardest part of this is conversation flow and handling; we could spend a whole heap of time optimizing the way the conversation flows, but we are just doing the bare minimum here.

First things first, let's import and declare everything we need to make this work. Bear in mind we are using gpt-3.5-turbo-0613 for this tutorial:

import json
import openai
import requests
from tenacity import retry, wait_random_exponential, stop_after_attempt
from termcolor import colored

# Import our spatial functions
from spatial_functions import *

# Just Hard coding our example data for this demo from our spatial_functions.py
DEMO_GEODATAFRAME = invalid_polygons_gdf

# GPT Variables
GPT_MODEL = "gpt-3.5-turbo-0613"
openai.api_key = "YOUR_API_KEY"

Now let’s declare some functions that make handling the conversation a bit easier:

chat_completion_request

A Python function that uses the @retry decorator to automatically retry a request to the OpenAI Chat API, stopping after 3 attempts. This function primarily handles how we communicate with the chat API and how we send and receive our data for the conversation.

@retry(wait=wait_random_exponential(multiplier=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, tools=None, tool_choice=None, model=GPT_MODEL):
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + openai.api_key,
    }
    json_data = {"model": model, "messages": messages}
    if tools is not None:
        json_data.update({"tools": tools})
    if tool_choice is not None:
        json_data.update({"tool_choice": tool_choice})
    try:
        response = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=json_data,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        return e

pretty_print_conversation

The primary purpose of this function is to make our chat conversation look good in our terminal. It colors the messages based on the role each message has been assigned (basically, who said what: you, the chat bot, or a generic system message), making the conversation easy to follow:

def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
    }

    for message in messages:
        if message["role"] == "system":
            print(colored(f"system: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "user":
            print(colored(f"user: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and message.get("function_call"):
            print(colored(f"assistant: {message['function_call']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and not message.get("function_call"):
            print(colored(f"assistant: {message['content']}\n", role_to_color[message["role"]]))

Okay, amazing, but how do we get ChatGPT to understand our functions?

We need to instruct ChatGPT on what it needs in order to execute the geospatial functions based on the conversation with the user. There is a nifty thing called tools: by setting this up we basically tell ChatGPT which functions we want users to be able to execute, the context in which they exist, and any parameters they may have that we want the large language model to derive from the user's text.

tools = [
    # Our Repair Function
    {
        "type": "function",
        "function": {
            "name": "repair_geodataframe_geom",
            "description": "Repair invalid geometries in a GeoDataFrame",
            "parameters": {},
        }
    },
    # Our Bounding Box Function
    {
        "type": "function",
        "function": {
            "name": "bounding_box_of_gdf",
            "description": "Get the geospatial bounding box of a GeoDataFrame",
            "parameters": {},
        }
    },
    # Our Buffer Function
    {
        "type": "function",
        "function": {
            "name": "buffer_gdf",
            "description": "Buffer a GeoDataFrame by a specified distance",
            "parameters": {
                "type": "object",
                "properties": {
                    "distance": {
                        "type": "string",
                        "description": "A specific distance as a number that will be used to buffer the GeoDataFrame",
                    },
                },
                "required": ["distance"],
            },
        }
    },
]

Okay, so looking at the tools we have declared in order: they are obviously all functions, right? We have to specify the name of each function so that, if the model identifies the context of the function in the user's messaging, it knows "this is the name of the function I need to call" and can look up what it needs to run.

Description Gives Context:

The description essentially gives the model context on what the user would ask. So if we look at repair_geodataframe_geom, we will see that the description is "Repair invalid geometries in a GeoDataFrame". When a user asks "can you repair my geodataframe?", the model will look at the tools it's been given and go "okay, great, I have one that matches that context".

Parameters but how:

For the sake of simplicity, with repair_geodataframe_geom and bounding_box_of_gdf we aren't deriving parameters from the user (we can if we want, but didn't need to for this demo). But for illustration, our buffer function requires a distance: it needs to know from the user what radius to buffer by, so we set parameters for our tool:

"parameters": {
"type": "object",
"properties": {
"distance": {
"type": "string",
"description": "A specific distance as a number that will be used to buffer the GeoDataFrame",
},
},
"required": ["distance"],
},

The description again gives context, so if a user says "I want to buffer my geodataframe by a distance of 20" the model will understand that the number 20 belongs to the distance parameter, extract it, and assign it as the distance. The distance is required, and obviously we don't want the function to execute without getting everything it needs, so we'll make sure our instructions have enough detail to give ChatGPT what it needs.
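To make that concrete, here is a rough sketch of how the arguments on a returned tool call get parsed (the example payload is illustrative, not captured from a real response):

import json

# Hypothetical arguments string as it might appear on a tool call
# for buffer_gdf when the user asked to buffer by 20.
example_arguments = '{"distance": "20"}'

# The API returns the arguments as a JSON string, so we parse it first
parsed = json.loads(example_arguments)
distance = parsed["distance"]  # "20" (a string, because our schema declared it as one)
print(int(distance))           # 20, cast to a number before passing it to buffer_gdf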

Building a Conversation

Okay, so as I mentioned, conversation flow and message handling could be a series of articles on its own, so I'll just say trust me on how I have set up the flow of code here. It's very simple, and there are many ways we could handle the conversation, but we'll focus only on the specific logic for getting our model to execute our functions.

So here is our conversation (it's a big code block; we'll discuss the key things below):

if __name__ == "__main__":

messages = []
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})

while True:
user_input = input("You: ")
if user_input.lower() == "exit":
break

messages.append({"role": "user", "content": user_input})

chat_response = chat_completion_request(messages, tools=tools)

if chat_response.status_code == 200:
response_data = chat_response.json()
assistant_message_content = response_data["choices"][0].get("message", {}).get("content")

# Ensure assistant_message_content is not None before appending
if assistant_message_content:
messages.append({"role": "assistant", "content": assistant_message_content})
else:
messages.append({"role": "assistant", "content": "Of course I can!"})

tool_calls = response_data["choices"][0]["message"].get("tool_calls", [])

for tool_call in tool_calls:
function_call = tool_call["function"]
tool_name = function_call["name"]

if tool_name == "repair_geodataframe_geom":
repair_result = repair_geodataframe_geom(DEMO_GEODATAFRAME)
tool_message = "GeoDataFrame repair completed."

elif tool_name == "bounding_box_of_gdf":
response = bounding_box_of_gdf(DEMO_GEODATAFRAME)
message = response["message"]
bbox = response["bbox"]
tool_message = f"{message} {bbox}"

elif tool_name == "buffer_gdf":
function_arguments = json.loads(function_call["arguments"])
distance = function_arguments["distance"]
response = buffer_gdf(DEMO_GEODATAFRAME, int(distance))
DEMO_GEODATAFRAME = response["gdf"]
tool_message = f"The GeoDataFrame has been buffered by {distance}."

else:
tool_message = f"Tool {tool_name} not recognized or not implemented."
messages.append({"role": "assistant", "content": tool_message})

# Print the conversation with the assistant
pretty_print_conversation(messages)

else:
print(f"Failed to get a response from the chat service. Status Code: {chat_response.status_code}")
try:
error_details = chat_response.json()
print("Response error details:", error_details.get("error", {}).get("message"))
except Exception as e:
print(f"Error parsing the error response: {e}")

print("\nConversation ended.")

Note that before anything else we append this message to our conversation: messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."}). When our conversation with ChatGPT starts, this is the first prompt it will process.

This is what we are classifying as a system message, and it instructs ChatGPT to make sure a user has provided all the details it needs regarding function parameters. So imagine a user says "I want to buffer my geodataframe": our model will know that it needs to call buffer_gdf, but it does not have the distance, so ChatGPT needs to clarify that with the user. We'll show that in the demonstration, so keep this in mind.
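As a rough illustration (the exact assistant wording will vary from run to run), the messages list might look something like this after that clarification round:

messages = [
    {"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."},
    {"role": "user", "content": "I want to buffer my geodataframe"},
    # Hypothetical clarifying reply from the model, since no distance was given
    {"role": "assistant", "content": "Sure! What distance would you like to buffer the GeoDataFrame by?"},
    {"role": "user", "content": "Buffer it by 20"},
    # On the next request the model responds with a tool call for buffer_gdf
]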

The Conversation starts with YOU

while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        break

    messages.append({"role": "user", "content": user_input})

Our application is a terminal chat bot, so basically we wait for the user to type a message. When they write one, we call our ChatGPT model, passing the messages of the chat and the tools we declared, so that the model knows "I am having a conversation and here are the tools I need to be mindful of and watch for in the conversation":

chat_response = chat_completion_request(messages, tools=tools)

This next part handles the response from the ChatGPT model and dictates how the conversation needs to proceed:

if chat_response.status_code == 200:
    response_data = chat_response.json()
    assistant_message_content = response_data["choices"][0].get("message", {}).get("content")

    # Ensure assistant_message_content is not None before appending
    if assistant_message_content:
        messages.append({"role": "assistant", "content": assistant_message_content})
    else:
        messages.append({"role": "assistant", "content": "Of course I can!"})

    tool_calls = response_data["choices"][0]["message"].get("tool_calls", [])
    ...

else:
    print(f"Failed to get a response from the chat service. Status Code: {chat_response.status_code}")
    try:
        error_details = chat_response.json()
        print("Response error details:", error_details.get("error", {}).get("message"))
    except Exception as e:
        print(f"Error parsing the error response: {e}")

If the response is good, we pull the chat data out of it, identify the message from the ChatGPT model, and check whether the model has decided that one of our tools should be called. If the response has a problem of any kind, we just do some error handling.

So if I say "hello", there are no tool calls in the response and I will get back a normal message from ChatGPT.

Simple response

If there are tool calls in the response, it means the ChatGPT model has identified that the user's message matches the context it was given for one of our tools:
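For comparison, here is a rough, illustrative shape of the assistant message in the two cases (the field values are hypothetical, but the keys match what the Chat Completions API returns):

# Plain reply, no tool calls (e.g. the user just said "hello")
plain_message = {
    "role": "assistant",
    "content": "Hello! How can I help you with your GeoDataFrame today?",
}

# Reply where the model decided to call buffer_gdf with a distance of 20
tool_call_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",  # hypothetical id
            "type": "function",
            "function": {
                "name": "buffer_gdf",
                "arguments": '{"distance": "20"}',
            },
        }
    ],
}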

This block of code goes through the response and if there are tool calls it identifies which tool the ChatGPT model identified and then executes it accordingly:

for tool_call in tool_calls:
    function_call = tool_call["function"]
    tool_name = function_call["name"]

    if tool_name == "repair_geodataframe_geom":
        repair_result = repair_geodataframe_geom(DEMO_GEODATAFRAME)
        tool_message = "GeoDataFrame repair completed."

    elif tool_name == "bounding_box_of_gdf":
        response = bounding_box_of_gdf(DEMO_GEODATAFRAME)
        message = response["message"]
        bbox = response["bbox"]
        tool_message = f"{message} {bbox}"

    elif tool_name == "buffer_gdf":
        function_arguments = json.loads(function_call["arguments"])
        distance = function_arguments["distance"]
        response = buffer_gdf(DEMO_GEODATAFRAME, int(distance))
        DEMO_GEODATAFRAME = response["gdf"]
        tool_message = f"The GeoDataFrame has been buffered by {distance}."

    else:
        tool_message = f"Tool {tool_name} not recognized or not implemented."

    messages.append({"role": "assistant", "content": tool_message})

So let's ask the model to buffer our geodataframe; by default our code knows to simply use our DEMO_GEODATAFRAME.

Notice how our initial request is ambiguous (we neglected to specify a distance), so ChatGPT needs us to clarify. We have to specify the distance, 20 in this case, and then, with the information it needs, ChatGPT executes the function and performs a buffer of 20 on the DEMO_GEODATAFRAME.

It's like magic, right? Let's look at a full conversation with our new geospatial chat bot:

Full Geo Chat GPT Conversation

To Conclude:

The evolution of AI tools has certainly opened the door for continuous innovation. This is just a small demonstration of an extremely potent feature developed by the OpenAI team, one with a whole range of potential applications. I am currently working on a larger project leveraging this functionality, so stay tuned for that.

I hope articles like this can inspire you to leverage these amazing toolsets to expand your work and continue to develop innovative solutions for the Geospatial world.

If you have any questions please feel free to reach out, as always thanks everyone!

CLONE THE REPO HERE

Socials:

Please Check out my new Podcast GeoData:

GeoData Podcast

Special Thanks:

https://elephas.app/blog/how-to-get-chatgpt-api-key-clh93ii2e1642073tpacu6w934j

https://github.com/openai/openai-cookbook/blob/main/examples/How_to_call_functions_with_chat_models.ipynb

John Lister

GeoScrub Founder https://www.geoscrub.org/ . Software Engineer with a GIS Background. Passionate about mapping & sustainability!