Named Entity Recognition with LLMs — Extract Conversation Metadata

Isidoro Grisanti
5 min read · Oct 20, 2023


Introduction to NER

Named Entity Recognition (NER), a fundamental task in natural language processing (NLP), plays a pivotal role in various language-related applications, ranging from information retrieval to question answering systems, and from sentiment analysis to machine translation.

The task of NER involves identifying and categorising named entities within a text, such as names of people, organisations, locations, dates, and more. These identified entities serve as the building blocks for numerous downstream NLP tasks, contributing to the extraction of structured information from unstructured text data.

[Figure: example of NER visualisation from spaCy.io]
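To make the output shape of the task concrete, here is a minimal toy tagger. It uses a hand-built gazetteer (a fixed lookup table, purely illustrative) rather than a trained model, but it produces the same kind of (span, label) pairs that a real NER system such as spaCy or an LLM would return:

```python
# Toy gazetteer-based tagger: illustrates the (text, label) output of NER.
# Real systems infer entities and labels from context instead of a lookup.
GAZETTEER = {
    "Apple": "ORG",
    "Tim Cook": "PERSON",
    "Cupertino": "LOC",
}

def toy_ner(text):
    """Return (entity, label) pairs for known entities found in text."""
    return [(name, label) for name, label in GAZETTEER.items() if name in text]

entities = toy_ner("Tim Cook announced that Apple will expand in Cupertino.")
print(entities)
# [('Apple', 'ORG'), ('Tim Cook', 'PERSON'), ('Cupertino', 'LOC')]
```

Of course, a lookup table cannot handle unseen names or ambiguous mentions, which is precisely what the learned approaches below address.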

In recent years, NER has witnessed remarkable advancements, owing to the advent of deep learning techniques, the availability of large-scale annotated datasets, and increased computational resources.

This article aims to provide a comprehensive overview of how to solve the Named Entity Recognition task by leveraging the latest trends and breakthroughs based on Large Language Models (LLMs).

Advanced Named Entity Recognition (NER) with Large Language Models (LLMs)

In recent months, Large Language Models have revolutionised the field of NLP by demonstrating remarkable capabilities in understanding and generating natural language text. These models, trained on vast corpora of text data, have learned to capture intricate patterns and semantic relationships within language, making them ideal for a wide range of NLP tasks.

In the context of NER, some LLMs excel at recognising and categorising named entities. This cutting-edge approach leverages the power of state-of-the-art LLMs, such as GPT-3.5 or GPT-4, to perform NER tasks with a high degree of accuracy and efficiency.

The advantages of using LLMs for NER are manifold:

  • they can handle a broad spectrum of entity types;
  • they are highly adaptable to various domains and languages;
  • their performance often surpasses that of traditional rule-based (e.g. regular expressions) or feature-based NER systems;
  • they can capture contextual information and context dependencies more effectively (e.g. sentiment analysis or intent detection);
  • LLMs are capable of transfer learning, meaning they can be pre-trained on a general language corpus and fine-tuned for specific NER tasks, thus requiring fewer annotated data points for training.
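For contrast, the traditional rule-based approach mentioned above can be sketched with a regular expression that extracts one entity type, dates in two fixed numeric formats (the pattern below is an illustrative assumption, not from any library). It is precise on the patterns it encodes but brittle outside them, which is exactly the gap that context-aware LLM-based NER closes:

```python
import re

# Rule-based "NER" for a single entity type: dates written as
# DD/MM/YYYY or YYYY-MM-DD. Anything else is invisible to the rule.
DATE_PATTERN = re.compile(r"\b(?:\d{2}/\d{2}/\d{4}|\d{4}-\d{2}-\d{2})\b")

def extract_dates(text):
    """Return all date-like substrings matching the two known formats."""
    return DATE_PATTERN.findall(text)

print(extract_dates("The order placed on 20/10/2023 ships by 2023-10-25."))
# ['20/10/2023', '2023-10-25']

# Brittle: a spelled-out date slips through, however obvious to a human.
print(extract_dates("The order placed on October 20th, 2023."))
# []
```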

However, working with LLMs for NER is not without challenges:

  • LLMs may raise concerns about model bias, model interpretability and ethical considerations, which require careful attention;
  • LLM responses may contain “hallucinations” that can lead to the spread of misinformation;
  • Fine-tuning requires designing appropriate training data, carefully selecting hyper-parameters, and often involves substantial computational resources.

Extract conversation metadata with OpenAI LLMs

This section provides an example Python solution for retrieving useful entities from a chat conversation.
Suppose you have a chat between a user and an assistant working for a pizza restaurant. Let's extract some sample entities (e.g. intent, sentiment and the number of pizzas to order) from it, leveraging OpenAI LLMs and the LangChain framework:

1. Let's first import the needed Python libraries and set up the OpenAI API key to get access to the OpenAI APIs:

from langchain.prompts import PromptTemplate
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser
import os
import openai

os.environ['OPENAI_API_KEY'] = ""  # insert your OpenAI API key here
openai.api_key = os.environ["OPENAI_API_KEY"]

# function to call the OpenAI chat completions API
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

Note that you also need to install the openai and langchain libraries in your virtual environment (e.g. pip install openai langchain).

2. Define list of entities to retrieve:

# intent
intent_name_field = ResponseSchema(
    name="intent",
    description="Based on the latest user message, extract the user message intent. "
                "Here are some possible labels: 'greetings', 'booking', 'complaint' or 'other'"
)
# user need
user_need_field = ResponseSchema(
    name="user_need",
    description="Rephrase the latest user request and make it a meaningful question "
                "without missing any details. Use '' if it is not available"
)
# user sentiment
sentiment_field = ResponseSchema(
    name="sentiment",
    description="Based on the latest user message, extract the user sentiment. "
                "Here are some possible labels: 'positive', 'neutral', 'negative', 'mixed' or 'other'"
)
# number of pizzas to be ordered
n_pizzas_field = ResponseSchema(
    name="n_pizzas",
    description="Based on the user need, extract the number of pizzas to be made. "
                "Use '' if it is not available"
)

# schema with all entities (fields) to be extracted
conversation_metadata_output_schema_parser = StructuredOutputParser.from_response_schemas(
    [
        intent_name_field,  # user intent
        user_need_field,    # user need
        sentiment_field,    # user sentiment
        n_pizzas_field,     # number of ordered pizzas
        # other extra fields to be extracted
        # ...
    ]
)
conversation_metadata_output_schema = conversation_metadata_output_schema_parser.get_format_instructions()
print(conversation_metadata_output_schema)

Note that for each entity to retrieve you only need to define its field name and provide a clear description in natural language.

3. Define LLM system prompt:

conversation_metadata_prompt_template_str = """
Given in input a full chat history between a user and a customer service assistant, \
extract the following metadata according to the format instructions below.

<< FORMATTING >>
{format_instructions}

<< INPUT >>
{chat_history}

<< OUTPUT (remember to include the ```json)>>"""

conversation_metadata_prompt_template = PromptTemplate.from_template(template=conversation_metadata_prompt_template_str)

Note that the system prompt contains all instructions that the LLM must follow.

4. Extract entities from “Greetings conversation” example:

# example of Greetings conversation
messages = [
    {'role': 'assistant', 'content': 'Hello! I am Isi, your digital assistant. \n How may I help you today?'},
    {'role': 'user', 'content': 'Hi! my name is Isa!!'}
]

# fill the prompt template with the chat history and format instructions
conversation_metadata_recognition_prompt = conversation_metadata_prompt_template.format(
    chat_history=messages,
    format_instructions=conversation_metadata_output_schema
)

# call openAI API to detect the conversation metadata (e.g. intent, user_need, entities, etc.)
conversation_metadata_detected_str = get_completion(conversation_metadata_recognition_prompt)

# conversion from string to python dict
conversation_metadata_detected = conversation_metadata_output_schema_parser.parse(conversation_metadata_detected_str)
print(conversation_metadata_detected)
# {'intent': 'greetings',
# 'user_need': '',
# 'sentiment': 'positive',
# 'n_pizzas': ''}

5. Extract entities from “Pizza order conversation” example:

# example of pizza order conversation
messages = [
    {'role': 'assistant', 'content': 'Hello! I am Isi, your digital assistant. \n How may I help you today?'},
    {'role': 'user', 'content': 'Hi, my name is Isa!'},
    {'role': 'assistant', 'content': "Hi Isa! It's nice to meet you. Is there anything I can help you with today?"},
    {'role': 'user', 'content': "Yes, I'd like to make an order. I'd like to order 4 pizzas and 10 beers. Could you help me with that?"}
]

# fill the prompt template with the chat history and format instructions
conversation_metadata_recognition_prompt = conversation_metadata_prompt_template.format(
    chat_history=messages,
    format_instructions=conversation_metadata_output_schema
)

# call openAI API to detect the conversation metadata (e.g. intent, user_need, entities, etc.)
conversation_metadata_detected_str = get_completion(conversation_metadata_recognition_prompt)

# conversion from string to python dict
conversation_metadata_detected = conversation_metadata_output_schema_parser.parse(conversation_metadata_detected_str)
print(conversation_metadata_detected)
# {'intent': 'booking',
# 'user_need': 'Could you help me make an order for 4 pizzas and 10 beers?',
# 'sentiment': 'positive',
# 'n_pizzas': '4'}
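The parse step above assumes the model returned a well-formed ```json block, but LLM output occasionally deviates from the requested format. A defensive fallback can recover from minor drift instead of crashing; the helper below is a simplified, standard-library-only stand-in (an assumption, not LangChain's actual implementation) for what StructuredOutputParser does internally:

```python
import json
import re

def safe_parse_metadata(response_text, default=None):
    """Extract the first ```json block (or bare JSON object) and parse it.

    Returns `default` instead of raising when no valid JSON is found,
    so a single malformed LLM reply does not break the pipeline.
    """
    match = re.search(r"```json\s*(\{.*?\})\s*```", response_text, re.DOTALL)
    candidate = match.group(1) if match else response_text.strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return default

good = '```json\n{"intent": "booking", "n_pizzas": "4"}\n```'
print(safe_parse_metadata(good))
# {'intent': 'booking', 'n_pizzas': '4'}

# a free-text refusal falls back to the default instead of raising
print(safe_parse_metadata("Sorry, I cannot answer that.", default={}))
# {}
```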

Wrap-up

This article provided a comprehensive overview of how to use Large Language Models (LLMs) to solve NLP tasks such as Named Entity Recognition. An implementation was also provided, leveraging OpenAI models and LangChain, a framework for developing applications powered by Large Language Models (LLMs).

I hope you enjoyed this article and found it informative and engaging. Thanks for reading 🙏
Keep on learning & sharing 🤝
