LangChains for NLP problems

vTeam.ai
Data Science in your pocket
4 min readAug 28, 2023


We have already covered a beginner-friendly tutorial on creating different apps with the LangChain framework over LLMs. If you missed it, we have got you covered:

https://vteam.ai/blog/posts/beginners-guide-for-lang-chain

As you may know, LangChain is a powerful framework built around Large Language Models (LLMs), designed for tasks such as chatbots, generative question-answering, summarization, and more. The core idea of LangChain is to “chain” together different components, creating sequences of components or sub-chains to achieve specific tasks. These components include prompt templates, language models, and output parsers, working harmoniously to handle user input, generate responses, and process outputs. LangChain simplifies the customization of models like GPT-3 by providing an API for prompt engineering, making LLMs more approachable for various applications. It also streamlines integration with different types of models and interfaces, allowing LLMs to take strings as input and produce strings as output, which eases the development of applications powered by large language models.
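To make the “chaining” idea concrete, here is a minimal pure-Python sketch of a chain that pipes user input through a prompt template, a model, and an output parser. This is not LangChain's actual API; the `fake_model` function is a hypothetical stand-in for a real LLM call.

```python
# Minimal sketch of the "chain" idea: prompt template -> model -> output parser
def prompt_template(user_input):
    # Wrap the raw user input in a task-specific prompt
    return f"Translate to French: {user_input}"

def fake_model(prompt):
    # Hypothetical stand-in for a real LLM call
    return f"[LLM response to: {prompt}]"

def output_parser(raw_output):
    # Post-process the model's raw string output
    return raw_output.strip()

def run_chain(user_input, steps):
    # Each component's output becomes the next component's input
    result = user_input
    for step in steps:
        result = step(result)
    return result

run_chain("hello", [prompt_template, fake_model, output_parser])
# -> '[LLM response to: Translate to French: hello]'
```

LangChain wires up real prompt templates, chat models, and parsers in the same spirit, so swapping one component (say, the model) does not disturb the rest of the chain.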

In the last tutorial, we built 3 different apps, i.e. a Grammar Checker, a Tone Changer, and a Language Translator, for which the code is available in the blog post mentioned above. To make things a little more challenging, we will now jump into some NLP-related problem statements and see how to implement them using LangChain & LLMs. But before jumping into the code, let’s understand a few fundamental concepts required for this tutorial.

Named Entity Recognition (NER): NER involves identifying and classifying named entities (such as names of people, places, organizations, dates, etc.) in text. Example: “Johnny lives in Florida” -> NER identifies “Johnny” as a PERSON and “Florida” as a GPE (Geopolitical Entity).
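The example above can be mimicked with a toy gazetteer lookup. Note this is only an illustration of what NER output looks like; real NER systems (and the LLM-based extraction we build below) use statistical models and context, not a fixed word list. The `GAZETTEER` dictionary is invented for this sketch.

```python
# Toy NER sketch: a gazetteer (word -> label) lookup, purely for illustration
GAZETTEER = {"Johnny": "PERSON", "Florida": "GPE", "Salesforce": "ORG"}

def toy_ner(text):
    # Tag any token found in the gazetteer; real NER uses context, not lookup
    return [(tok, GAZETTEER[tok]) for tok in text.split() if tok in GAZETTEER]

toy_ner("Johnny lives in Florida")
# -> [('Johnny', 'PERSON'), ('Florida', 'GPE')]
```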

Text tagging: Text tagging in Natural Language Processing (NLP) involves assigning specific labels or tags to words in a text corpus to indicate their grammatical or semantic properties. These tags provide information about the part of speech (noun, verb, adjective, etc.) or other linguistic characteristics of each word.

LLMs: Large Language Models (LLMs) are a type of artificial intelligence that is characterized by their large size. They have the ability to process and generate human-like responses to natural language queries. LLMs are trained on vast amounts of text data, often scraped from the Internet, using AI accelerators. They are used in various natural language processing tasks and have the capability to generate coherent and contextually relevant text. Popular examples of LLMs include BERT, GPT-3, and T5. These models have revolutionized natural language processing by demonstrating the ability to generate human-like text and comprehend context, leading to applications in chatbots, content generation, translation, and more.

LangChains: LangChain is a framework designed to simplify the creation of applications using large language models (LLMs). It allows chaining together different components to create advanced use cases around LLMs, such as chatbots, generative question-answering, summarization, and more. LangChain offers a standard interface for building chains of models and integrates with various tools. It facilitates interactions between chains and external data sources for data-augmented generation.

Tutorial 1: Named Entity Recognition

In this short demo, we will extract the name of a person, organization, and city. Let’s get started with the required imports

#!pip install openai langchain
from langchain.chat_models import ChatOpenAI
from langchain.chains import create_extraction_chain

# Schema: the entities we wish to extract
schema = {
    "properties": {
        "name": {"type": "string"},
        "city": {"type": "string"},
        "organization": {"type": "string"}
    }
}

Here, Schema refers to the entities we wish to extract in the form of a dictionary.

# Run chain
llm = ChatOpenAI(openai_api_key=api_key)
chain = create_extraction_chain(schema, llm)

Next, we load the LLM object. Remember, we need an api_key at this point. Pass the schema and the LLM as parameters to create_extraction_chain().

Time for some testing

inp = """While working at Salesforce, Bengaluru, Ravi has earned a fortune for himself"""
print(chain.run(inp))
inp = 'Diksha heads the Marketing department at DBS, Singapore'
print(chain.run(inp))
inp = 'How is life in Hyderabad, Mehul?'
print(chain.run(inp))

As you can observe, we have successfully retrieved all the entities present in the text. If you want other entities as well, mention them in the schema.
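For instance, adding an extra property to the schema dictionary is all it takes to ask the chain for another entity. The "date" entity below is a hypothetical example, not something from the original demo; the chain itself is built exactly as before.

```python
# Extended schema sketch: a hypothetical "date" entity added to the originals
schema = {
    "properties": {
        "name": {"type": "string"},
        "city": {"type": "string"},
        "organization": {"type": "string"},
        "date": {"type": "string"},  # hypothetical extra entity
    }
}
# Rebuild the chain with the new schema as before:
# chain = create_extraction_chain(schema, llm)
```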

Kindly read the full blog for tutorials on Text Tagging and Summarization !!
