Part 1.2: LangChain w/ HuggingFace model on AWS

Mine Kaya
Published in BosphorusISS
Dec 6, 2023 · 6 min read

Hi, in case you missed Part 1, here is a brief overview of what we've done so far. Recently, I was working on a project to build a question-answering model that provides answers based on data from one of our internal projects, which we use as a project management tool. We explored how AWS Canvas works, created a forecast model in it, and deployed both that model and a HuggingFace model.

Since then, I have changed my approach to the solution, so I will dive directly into the topic.

In the first part, I used Canvas and SageMaker Studio; now I will switch to a notebook instance within SageMaker. You can create your own from the SageMaker dashboard > Notebook instances.

What is LangChain?

LangChain is a framework for developing applications powered by language models. It is designed to simplify the process of building these applications by providing a set of tools and abstractions that make it easier to connect language models to other data sources. It is written in Python and JavaScript, and it supports a variety of language models, including OpenAI models and HuggingFace models.

LangChain provides standard, extendable interfaces and external integrations for the following modules, listed from least to most complex (a short sketch follows the list):

  • Model I/O : Interface with language models.
  • Retrieval : Interface with application-specific data.
  • Chains : Construct sequences of calls.
  • Agents : Let chains choose which tools to use given high-level directives.
  • Memory : Persist application state between runs of a chain.
  • Callbacks : Log and stream intermediate steps of any chain.
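To make the first two modules concrete, here is a minimal sketch of Model I/O wired into a chain. This is my own illustration, not code from the project; it assumes an OPENAI_API_KEY is set in the environment, and the prompt text is made up:

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Model I/O: a prompt template plus an LLM, wired into a simple chain
llm = OpenAI(temperature=0)  # assumes OPENAI_API_KEY is set
prompt = PromptTemplate.from_template("Summarize in one sentence: {text}")
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(text="LangChain connects language models to data sources and tools."))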

For my case, the Retrieval module was looking pretty promising, but when I started reading the documentation, I saw that it works with documents and structured data (a DB or CSV, say). You need to perform text embedding or create a vector database. I didn't know anything about that; no lies, it seemed a bit scary and confusing. I was like, OK, no retrieval for me.
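For context, the embedding step I skipped looks roughly like this. It's a sketch with made-up example texts, and it assumes the faiss-cpu and sentence-transformers packages are installed:

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# embed a few made-up texts and search them by meaning
texts = ["Project Alpha is due in March.", "Project Beta belongs to the data team."]
db = FAISS.from_texts(texts, HuggingFaceEmbeddings())
print(db.similarity_search("Who owns Project Beta?", k=1))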

Agents

Agents are our middleman between our data and the LLM, using various tools to take actions over the data and iterating until reaching an observation. There are pre-built agents that you can use, or you can create a custom agent with tools of your choice. I will go with the CSV agent, but before stepping into agents, what are 'the tools'?

Tools are functions that an agent calls. A toolkit is a set of tools to use in an agent. Tools are defined like this:

from langchain.agents import Tool
from langchain.utilities import SerpAPIWrapper

# SerpApi is a real-time API to access Google search results
search = SerpAPIWrapper()
tools = [
    Tool.from_function(
        func=search.run,
        name="Search",
        description="useful for when you need to answer questions about current events",
    ),
]

And when you want to use them in an agent:

from langchain.chat_models import ChatOpenAI
from langchain.agents import AgentType, initialize_agent

llm = ChatOpenAI(temperature=0)
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

You can see the list of agent types here. For integrations, you can check numerous use cases of agents and toolkits here.

There are several agent types for declaring agents. I can use two of them with csv_agent; I will talk about their differences below.

I will continue with the CSV agent as I mentioned. I will create the files from the database we have; I didn't want to connect directly to the DB in the beginning.

CSV Agent

We will create a csv agent to interact with data in CSV format.

There are two agent types to initialize csv_agent: one with ZERO_SHOT_REACT_DESCRIPTION, the other with OPENAI_FUNCTIONS.

- ZERO_SHOT_REACT_DESCRIPTION : an agent type that implements the ReAct logic.

- OPENAI_FUNCTIONS : an agent type, as you can guess from its name, for OpenAI function-calling models.

Here is the official doc for csv_agent.

Our model will be from HuggingFace: La-Mini. Now we should use the model that we deployed behind a SageMaker endpoint. (I showed how we deploy a HuggingFace model on AWS SageMaker in Part 1.)

SageMakerEndpoint

SageMaker endpoints let us serve models for any use case with fully managed infrastructure, tools, and workflows. We will connect to the model endpoint via LangChain's SagemakerEndpoint class (which needs boto3 to make that happen; Boto3 is the AWS SDK for Python, find the docs here).

pip install langchain
pip install boto3
pip install tabulate
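To see what SagemakerEndpoint will do for us under the hood, here is one raw invocation with plain boto3. This is a sketch; the endpoint name and region are placeholders, not the project's real values:

import json
import boto3

# one raw call to the SageMaker runtime API
runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")  # placeholder region
response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps({"inputs": "Hello, world"}),
)
print(json.loads(response["Body"].read()))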

import json

import boto3
from langchain import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler

runtime = boto3.client('runtime.sagemaker')


class HFContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
        # serialize the prompt and generation parameters for the endpoint
        self.len_prompt = len(prompt)
        input_dict = {
            "inputs": prompt,
            "parameters": model_kwargs,
        }
        input_str = json.dumps(input_dict)
        print(input_str)
        return input_str.encode('utf-8')

    def transform_output(self, output: bytes) -> str:
        response_json = output.read()
        res = json.loads(response_json)
        print(res)

        # take the generated text from the returned response
        # (len_prompt could be used to strip an echoed prompt if needed)
        ans = res[0]['generated_text']
        return ans


# connect to the model endpoint via SagemakerEndpoint
llm = SagemakerEndpoint(
    endpoint_name=f"{endpoint_name}",  # the endpoint we deployed in Part 1
    region_name=f"{region}",
    model_kwargs={},
    content_handler=HFContentHandler(),
)
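Before wiring this into an agent, a quick sanity check confirms the endpoint responds (the prompt here is just an example of mine):

# sanity check: the SagemakerEndpoint LLM is directly callable
print(llm("How many days are there in a week?"))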
Now we can create the CSV agent and point it at our file:

from langchain.agents import create_csv_agent
from langchain.agents.agent_types import AgentType

file_path = "<your_file_path>"

csv_agent = create_csv_agent(
    llm,
    file_path,
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
)

csv_agent.run("How many different project_id values are there?")

I store the CSV files on S3 and replace file_path with my file URLs, but you can give local file paths too.

I will put a few answers from different models below; you can check the [{'generated_text'}] field to see the answer we got from each model. (Sorry for not extracting them; I took these screenshots earlier.)

La-Mini:

I should mention that the answer La-Mini gave was wrong. So I changed the model to Falcon and Flan-Base; let's see what their answers were.

Falcon:

(Falcon got stuck in a question-answering loop with the same code base.)

Flan-Base:

And all of them were wrong. I blamed the models back then, but the problem was using the agent with the ZERO_SHOT_REACT_DESCRIPTION type. It ships with only a very basic set of tools, and for a HuggingFace or any other open-source model you have to create custom tools and descriptions that fit your model's requirements. It follows the ReAct logic: you define which tool to use to take an action and record the observation, until your chain finishes.
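For illustration, a custom tool might look like the sketch below. The pandas helper and the projects.csv file are hypothetical stand-ins of mine, and llm is the SagemakerEndpoint defined earlier:

import pandas as pd
from langchain.agents import AgentType, Tool, initialize_agent

df = pd.read_csv("projects.csv")  # hypothetical file standing in for our export

def count_unique(column: str) -> str:
    # hypothetical helper: count distinct values in a named column
    return str(df[column.strip()].nunique())

tools = [
    Tool.from_function(
        func=count_unique,
        name="count_unique_values",
        description="Counts distinct values in a CSV column, e.g. 'project_id'.",
    ),
]
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("How many different project_id values are there?")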

Let's take a quick look at the screenshots above again. I wanted you to look at the generated_text field, but what's up with everything above it? If you read the inputs field a bit, you will see how the agent does its job. The AgentExecutor chain mostly runs in this cycle:

Action > Action Input > Observation > Thought > Action > … > Thought > Final Answer.

Each action needs a tool to do its job, so you have to have a really good toolkit when you work with open-source models. OpenAI, on the other hand, has the OPENAI_FUNCTIONS agent type, which simplifies prompts and also saves tokens: unlike ZERO_SHOT_REACT_DESCRIPTION, there is no need to describe to the LLM what tools it has at its disposal. Because no one can know ChatGPT better than OpenAI, right?
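For comparison, switching the same CSV agent to OPENAI_FUNCTIONS would look roughly like this; the model name is my assumption, and it needs an OPENAI_API_KEY set:

from langchain.chat_models import ChatOpenAI
from langchain.agents import create_csv_agent
from langchain.agents.agent_types import AgentType

# OPENAI_FUNCTIONS needs a function-calling chat model
chat = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")
csv_agent = create_csv_agent(
    chat,
    file_path,  # the same file as before
    verbose=True,
    agent_type=AgentType.OPENAI_FUNCTIONS,
)
csv_agent.run("How many different project_id values are there?")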

We encountered a milestone decision once again. This one was out of my control, but I'm glad we made the switch: we decided to use ChatGPT as our model from now on. Although I was highly motivated to implement this with an open-source model, when comparing the costs of an OpenAI subscription and SageMaker, I must say that OpenAI is far more cost-effective than running your own machine learning infrastructure on AWS.
