Empowering Human-Led Decision Flows with Open-Source Model Function Calling in LangGraph

Discussion Paper by Daisha Drayton and Simon Marius Galyan

Mar 7, 2024

Introduction

LangChain recently introduced LangGraph, an innovative agent library designed to empower developers in crafting complex, multi-step applications powered by large language models (LLMs). LangGraph transcends traditional LLM applications, enabling the construction of sophisticated AI agents capable of intricate interactions. This article delves into a novel approach within LangGraph: function calling with Ollama running open-source models. This approach fosters increased safety and accessibility in the domain of language modeling.

Human-Guided Decision Flows:

LangGraph facilitates a unique paradigm: human-led decision flows. This empowers users to guide and influence the actions of LangGraph’s AI agents throughout their operation. This is achieved through a four-step process:

  1. User Input: Users initiate the process by providing input, typically via text messages or commands.
  2. Input Processing: The chat executor system meticulously analyzes the provided input to grasp its context, intent, and any specific instructions or requests.
  3. Execution: Based on the processed input, the system determines the appropriate action or response. This could involve task execution, response generation, or actions tailored to the user’s input.
  4. Output: Finally, the system delivers its output or response back to the user. This output can manifest in various forms, such as text messages, information, completed tasks, or any relevant feedback based on the user’s input and the system’s actions.

Benefits of Utilizing Open-Source LLMs with Ollama:

While LangGraph is currently designed to work with OpenAI, integrating function calling with open-source models unlocks several noteworthy advantages for building complex, multi-step applications. This approach allows you to leverage the capabilities of open-source LLMs within your LangGraph projects, fostering greater flexibility and control over your AI application development by:

  • Customization and Control: Ollama allows for local deployment of open-source LLM models, granting you more control over model configurations and potentially tailoring them to your specific project needs.
  • Enhanced Privacy and Security: Local deployment with Ollama can offer increased control over data privacy and security compared to cloud-based LLM access, especially crucial for sensitive applications.
  • Cost-Effectiveness: Depending on your usage patterns and specific requirements, Ollama might be a more cost-efficient solution for frequent use cases compared to cloud-based LLM access.
  • Seamless Integration: Ollama functions might integrate well with your existing infrastructure and tools, streamlining development workflows.
  • Specialized Features: Open-source LLMs often cater to specific functionalities or research areas. Ollama allows you to leverage these specialized features within your LangGraph project, potentially exceeding the capabilities of generic LLM offerings.
  • Regulatory Compliance: Local deployment with Ollama can potentially facilitate compliance with relevant regulations related to data residency for your project.
  • Local Deployment Options: Running models locally with Ollama offers greater control over data residency, potentially lower latency, and closer alignment with specific project requirements.

Step 1: Setting Up Ollama

The first step in integrating open-source LLMs with LangGraph involves setting up Ollama, a tool for downloading and running LLMs locally. Ollama provides a convenient environment for hosting and interacting with various language models.

To begin, visit https://ollama.com and follow the instructions to download and install Ollama for your operating system. Once it is installed, you can use the following command to download and run an LLM model:

```
ollama run neural-chat:7b
```

This command downloads the model (roughly 4 GB) if it isn't already on your machine and starts it, allowing you to chat with it directly from your terminal. It's essential to ensure that your system has enough free memory to run a 7B model smoothly.

For users with less RAM, a q2 (2-bit quantized) build is a good fit. More about the q2 variant here:

https://ollama.com/library/openhermes:7b-mistral-v2-q2_K

Step 2: Installing Langchain and LangGraph

With Ollama set up, the next step involves installing LangChain and LangGraph, the Python libraries needed to work with LLMs inside the LangGraph framework.

Open your terminal and execute the following commands:

```
pip install langchain
pip install langgraph
```

These commands will install the necessary libraries, enabling seamless integration of LLMs into LangGraph.

Step 3: Fork the Notebook

Begin by forking the notebook available at:

https://github.com/Daisha22d/ollama-loop

This notebook demonstrates one way to integrate Ollama-hosted models into LangGraph.

Once you have the notebook open on your device, run the following command in your terminal:

```
ollama pull openhermes:7b-mistral-v2.5-q2_K
```

This command tells the ollama tool to download the specified version of the openhermes model from the Ollama library. The tag 7b-mistral-v2.5-q2_K identifies a specific build: the 7B OpenHermes 2.5 fine-tune of Mistral with 2-bit (q2_K) quantization. For this notebook we are using this build because it can run on devices with less RAM. If you have more RAM, you can use a 4-bit quantized Mistral 7B version instead.

Delving into the Code: Putting Ollama and LangGraph in Action

As discussed, leveraging open-source LLMs through Ollama offers several advantages for building LangGraph applications. Now, let’s delve into the code behind this integration, using a specific example from the notebook.

This example demonstrates how to utilize the ChatOllama class within LangGraph to interact with an open-source LLM model hosted on Ollama. We’ll focus on a specific code snippet to illustrate the key functionalities:
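The snippet below is a minimal sketch of that code, reconstructed to match the notebook's description rather than copied from it verbatim. Exact import paths vary between LangChain releases, and it assumes the langchain-community, langchainhub, and duckduckgo-search packages are installed alongside langchain and langgraph:

```
from langchain import hub
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import ReActJsonSingleInputOutputParser
from langchain.tools.render import render_text_description
from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun

# 1. The open-source LLM hosted locally by Ollama
local_llm = "openhermes:7b-mistral-v2-q2_K"

# 2. ChatOllama instance; temperature 0 keeps responses deterministic
chat_model = ChatOllama(model=local_llm, temperature=0)

# The tool the agent may call: DuckDuckGo web search
web_search = DuckDuckGoSearchRun()
tools = [web_search]

# 3. Pull the ReAct JSON prompt from the hub and fill in the tool details
prompt = hub.pull("hwchase17/react-json")
prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join(t.name for t in tools),
)

# 4. Build the agent by chaining components with the | operator
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
    }
    | prompt
    | chat_model.bind(stop=["\nObservation"])
    | ReActJsonSingleInputOutputParser()
)
```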

Explanation:

  1. Local LLM Model: We define the local_llm variable containing the identifier of the open-source LLM model hosted on Ollama (“openhermes:7b-mistral-v2-q2_K” in this example).
  2. ChatOllama Instance: We create a ChatOllama instance named chat_model. This instance connects to the specified LLM model and allows interaction with it through LangGraph. The temperature parameter is set to 0, controlling the randomness of the model’s responses.
  3. Prompt and Tools: We define a prompt using the hwchase17/react-json hub, specifying the initial conversation format. This prompt is then customized using the partial method to include the duckduckgo_search tool, giving the agent access to DuckDuckGo search. The hwchase17/react-json template implements the ReAct framework introduced by Yao et al. (2022), in which an LLM generates interleaved reasoning traces and actions, allowing it to induce, track, and update action plans, and even handle exceptions. Read more here: https://arxiv.org/abs/2210.03629.
  4. Building the Agent: We define the agent using the composition operator ( | ). This operator allows us to chain together different components to build the agent’s functionality. Here’s what each step does:
  • The first part extracts the user’s input from the data.
  • The second part processes the conversation history using format_log_to_str.
  • The third part applies the defined prompt format.
  • The fourth part interacts with the LLM through the chat_model instance; .bind(stop=["\nObservation"]) tells the model to stop generating as soon as it emits a newline followed by "Observation", so the agent can execute the requested tool and supply the real observation.
  • The final part parses the LLM output into a specific format using ReActJsonSingleInputOutputParser.

Example:

Imagine a user asks “What is the weather in San Francisco?”. The code snippet would:

  1. Extract the user’s question (“What is the weather in San Francisco?”).
  2. Apply the prompt format, potentially including instructions for the LLM and the duckduckgo_search tool.
  3. Send the processed input to the LLM through the ChatOllama instance.
  4. Receive the LLM response and parse it into the desired format.
  5. Potentially use the duckduckgo_search tool to retrieve additional information based on the LLM output or user’s request.

This is a basic demonstration of how Ollama and LangGraph work together to allow interaction with open-source LLMs and integrate them into complex conversation flows within your LangGraph applications.

Setting the Stage for LLM Interaction:

The code begins by establishing communication with the chosen open-source LLM model. OpenHermes is a 7B model fine-tuned by Teknium on Mistral with fully open datasets.

Building the Brain of the Agent:

Next, the code constructs the agent by defining a “prompt,” the initial conversation format, leveraging the “hwchase17/react-json” hub element. This initial prompt functions like a director setting the scene for the user’s interaction with the LLM, establishing the initial context and parameters for the conversation. The code then customizes this prompt using the partial method to integrate the duckduckgo_search tool. This empowers the agent to not only interact with the LLM but also access and potentially utilize information retrieved from web searches.

Ensuring Smooth Communication:

The code seamlessly integrates the ChatOllama instance, named chat_model, into the agent's workflow. This instance acts as a bridge, facilitating communication between the agent and the LLM. To keep the conversation structured, the code calls .bind(stop=["\nObservation"]) on chat_model. This stop sequence makes the model halt its generation as soon as it emits the "\nObservation" marker, so the framework, rather than the model, supplies the actual observation from the tool. This mechanism effectively prevents the model from inventing observations or rambling past its turn.

Parsing the LLM’s Response:

The code doesn’t stop at just receiving the LLM’s response; it also ensures the information is interpreted and presented in a structured format. This is achieved using the ReActJsonSingleInputOutputParser. Imagine this as a translator, taking the raw output from the LLM and converting it into a language the agent (and potentially the user) can comprehend.

Equipping the Agent with Tools:

The code equips the agent with a valuable tool: the DuckDuckGoSearchRun instance named web_search. This tool empowers the agent to access and process information from the web, potentially enriching its responses and decision-making capabilities.
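If you want to sanity-check the tool on its own, you can call it directly, outside the agent loop. This is a quick sketch and assumes the duckduckgo-search package is installed:

```
from langchain_community.tools import DuckDuckGoSearchRun

# Direct call to the search tool, independent of the agent
web_search = DuckDuckGoSearchRun()
print(web_search.run("current weather in San Francisco"))
```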

The Agent in Action:

The code defines two crucial functions: run_agent and execute_tools. As the names suggest, run_agent handles the core interaction with the LLM, including processing user input and executing the agent’s logic. The execute_tools function, on the other hand, takes charge of managing the utilization of tools like the web_search instance.
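A sketch of what these two functions can look like, continuing the agent snippet above. The state keys input, agent_outcome, and intermediate_steps are assumptions borrowed from the common LangGraph agent-executor pattern, not a verbatim copy of the notebook:

```
def run_agent(data):
    # Run the agent chain on the current state and record its decision
    agent_outcome = agent.invoke(data)
    return {"agent_outcome": agent_outcome}

def execute_tools(data):
    # Execute the tool the agent asked for and store the observation
    agent_action = data["agent_outcome"]
    output = web_search.run(agent_action.tool_input)
    return {"intermediate_steps": [(agent_action, str(output))]}
```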

Decision Time: Charting the Conversation Flow:

The code implements a crucial decision-making process using the should_continue function. This function, as its name implies, determines the next step in the conversation based on the agent’s outcome. If the outcome indicates the conversation has reached a natural conclusion (AgentFinish), the function signals the end of the flow. Otherwise, it signals the need to continue processing and potentially interact with the LLM or utilize tools again.
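In code, that check reduces to a branch on the type of the agent's latest outcome (a sketch, continuing the functions above):

```
from langchain_core.agents import AgentFinish

def should_continue(data):
    # Stop once the agent has produced a final answer; otherwise run a tool
    if isinstance(data["agent_outcome"], AgentFinish):
        return "end"
    return "continue"
```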

The Conversation Workflow:

The code takes all the previously defined components and orchestrates them into a cohesive conversation flow using a state graph. This graph, functioning as a roadmap, defines the sequence of steps the agent takes during the interaction. The graph encompasses two key nodes: agent and action. The agent node handles interacting with the LLM, while the action node manages the execution of any available tools. The code utilizes conditional edges, guided by the should_continue function, to navigate the graph and determine the next step based on the conversation’s progress.
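A sketch of that wiring, with an AgentState definition assumed from the common LangGraph agent-executor pattern:

```
import operator
from typing import Annotated, TypedDict, Union

from langchain_core.agents import AgentAction, AgentFinish
from langgraph.graph import END, StateGraph

class AgentState(TypedDict):
    # The user's question, the agent's latest decision, and the accumulated
    # (action, observation) pairs produced by tool calls
    input: str
    agent_outcome: Union[AgentAction, AgentFinish, None]
    intermediate_steps: Annotated[list[tuple[AgentAction, str]], operator.add]

workflow = StateGraph(AgentState)
workflow.add_node("agent", run_agent)       # talk to the LLM
workflow.add_node("action", execute_tools)  # run the chosen tool
workflow.set_entry_point("agent")

# After the agent node, either finish or go run a tool
workflow.add_conditional_edges(
    "agent", should_continue, {"continue": "action", "end": END}
)
# After a tool runs, hand control back to the agent
workflow.add_edge("action", "agent")
```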

Bringing it All Together:

Finally, the code transforms the state graph into a LangChain runnable, essentially packaging all the components and logic into a readily usable format. This runnable empowers developers to easily integrate this functionality into their LangGraph applications.
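Compiling and invoking the graph might look like this, continuing the sketches above; the exact shape of the final result depends on how the notebook stores the agent's answer:

```
app = workflow.compile()

result = app.invoke(
    {"input": "What is the weather in San Francisco?", "intermediate_steps": []}
)
# The agent's final answer sits in the AgentFinish outcome's return values
print(result["agent_outcome"].return_values["output"])
```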

By delving into the code and using the comments as a guide, we’ve unraveled the inner workings of this LangGraph agent empowered by Ollama. This exploration not only provides valuable insights into the project’s specific implementation but also showcases the potential of LangGraph and Ollama in crafting innovative, human-guided decision flows.

Resources for Further Learning:

LangChain Documentation:

The LangChain website (https://python.langchain.com) provides comprehensive documentation, including tutorials, examples, and API references.

Ollama Documentation:

The Ollama website (https://ollama.com) offers documentation outlining its functionality, installation instructions, and available LLM models.

LangChain Examples:

The LangChain GitHub repository (https://github.com/langchain-ai/langchain) provides various examples showcasing the usage of LangChain in different scenarios. These can be valuable resources for further exploration and learning.

Connect with us:

Simon’s medium: https://medium.com/@simon.marius.galyan

Daisha’s X (Twitter): https://twitter.com/daishadeniz.com
