Controlling Creativity: How to Get Reproducible Outcomes from LLMs
Why am I getting different results for the same prompt with the same model?
Why are Large Language Models (LLMs) random in nature?
What should I do to get predictable, repeatable results from LLMs?
How can I reproduce the same output over and over again for demonstration purposes?
These are some of the most common questions that come to mind while working with Large Language Models.
I have been working with LLMs for almost 6 months and have generally been able to get a good set of results from them. Recently I faced a problem where I had to extract structured data from unstructured text, such as PDFs. The content in the PDF looks like a table with a fixed set of columns across all the pages, but when the PDF is read as a string that structure is lost, and I needed to reproduce it using an LLM. There are many online tools to extract structured output from PDFs, e.g. table extractors, but those tools don’t understand the context of the document; they extract data based only on the layout. As an example, if a person’s name is spread across two rows, an LLM will understand the context but the automation tools will not. I hope you have got the gist of the problem I am trying to solve.
In this article we will discuss some of the core concepts of LLMs:
- Inherent or default nature of LLMs
- What happens while predicting the next word?
- Understanding some key parameters for predicting the next word
- Playing with the key parameters
- Demo: A Jupyter notebook explaining how to solve the problem
Inherent or default nature of LLMs
Large Language Models are designed to understand and generate human-like text by processing vast amounts of textual data. Most of the time, LLMs are used for creative tasks, e.g. summarizing text, writing emails, answering questions, etc. In these areas we want more creativity, adapted to the situation or context of the input. Sometimes, however, we don’t want any creativity at all, for example when extracting the same content out of a document, or when processing financial or legal data.
What happens while predicting the next word?
As per my understanding, an LLM maintains a list of candidate words with associated probabilities in order to make a prediction. Every time we ask the same question, the LLM samples the next word at random, favoring words with higher probabilities.
Candidates = [Word1, Word2, Word3, …, WordN]
Probabilities = [P1, P2, P3, …, PN]
where P1 > P2 > … > PN
Depending on the context of the question, the probabilities might vary, or the probabilities of each word might be very close to each other, in which case LLMs rely on randomness to make the output more creative.
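To make this concrete, here is a toy sketch in Python (illustrative only; real models sample from tens of thousands of tokens, and the words and probabilities below are made up):

import random

# Hypothetical next-word candidates and their probabilities (P1 > P2 > P3)
next_words = ["bright", "cloudy", "uncertain"]
probabilities = [0.5, 0.3, 0.2]

# Weighted random sampling: higher-probability words are chosen more often,
# but repeated calls can still return different words.
for _ in range(3):
    print(random.choices(next_words, weights=probabilities, k=1)[0])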
To understand more on this topic, I would recommend watching this video: What is Temperature, Top P, Top K in LLM? (From Concepts to Code) (youtube.com)
Understanding some key parameters for predicting the next word
Some of the key parameters for controlling an LLM’s output are below; a short code sketch follows the list.
- Temperature — Temperature sets the creativity level of the LLM. It generally ranges from 0–2, where lower values produce less creative (more predictable) output and higher values produce more creative output.
- Top P — Top P (also called nucleus sampling) keeps the smallest set of most-likely words whose cumulative probability reaches the given value, e.g. P(Word1) + P(Word2) ≥ TopP. Only words inside this shortlist can be sampled.
- Top K — Top K limits how many of the most likely words are considered, e.g. 1, 2, 3, …, K. If K = 1, the LLM always picks the single most probable token, which results in more predictable output since there is no randomness involved.
- Seed — A seed, often referred to as a random seed or seed value, is an initial value used by a pseudorandom number generator (PRNG) to start generating a sequence of numbers with the properties of randomness. If you use the same seed and the same sequence of random choices, you will get the same output each time, ensuring reproducibility.
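To see how these parameters interact, here is a minimal, self-contained sketch (a toy decoder over made-up candidates, not any model’s actual implementation):

import math
import random

def sample_next_word(words, logits, temperature=1.0, top_k=None, top_p=None):
    """Toy decoder showing how Temperature, Top K, and Top P interact."""
    # Temperature scaling: lower values sharpen the distribution
    # (more predictable output), higher values flatten it (more creative).
    scaled = [l / max(temperature, 1e-6) for l in logits]
    total = sum(math.exp(s) for s in scaled)
    probs = [math.exp(s) / total for s in scaled]

    # Rank candidates from most to least likely.
    ranked = sorted(zip(words, probs), key=lambda pair: pair[1], reverse=True)

    # Top K: keep only the K most likely words.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability >= top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for word, p in ranked:
            kept.append((word, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept

    candidates, weights = zip(*ranked)
    return random.choices(candidates, weights=weights, k=1)[0]

# With top_k=1 there is no randomness: the best token is always chosen.
print(sample_next_word(["bright", "cloudy", "uncertain"], [2.0, 1.0, 0.5], top_k=1))

# Seed: fixing the PRNG seed reproduces the same random choices every run.
random.seed(100)
first = sample_next_word(["bright", "cloudy", "uncertain"], [2.0, 1.0, 0.5])
random.seed(100)
second = sample_next_word(["bright", "cloudy", "uncertain"], [2.0, 1.0, 0.5])
print(first == second)  # True: same seed, same output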
Playing with the key parameters
While working with ChatGPT models I was able to experiment with temperature and seed only; other parameters like Top P and Top K weren’t exposed in my setup. I searched many sources for a way to configure Top P and Top K for ChatGPT, but I couldn’t find any examples.
Defaults
The default value for temperature is 0.7 for ChatGPT (Azure, via LangChain), so when we run the same prompt again and again, we get different results.
import os

from langchain.chat_models import AzureChatOpenAI

# Creating the LLM model with the default configuration
llm = AzureChatOpenAI(
    deployment_name=os.environ["AZURE_DEPLOYMENT_NAME"],
    openai_api_version=os.environ["OPENAI_API_VERSION"],
    openai_api_base=os.environ["OPENAI_API_BASE"],
    openai_api_key=os.environ["OPENAI_API_KEY"],
    openai_api_type=os.environ["OPENAI_API_TYPE"],
)
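For reference, here is how the model was invoked; the prompt below is a sketch inferred from the outputs, not necessarily the verbatim original:

from langchain.schema import HumanMessage

# Illustrative prompt, inferred from the outputs shown below
response = llm([HumanMessage(content="How do I get deterministic output from LLMs?")])
print(response.content)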
'''
Iteration - 1 Output
To get deterministic output from Language Models (LLMs),
you need to set the random seed before generating the output.
By setting a specific seed value, you can ensure that the same
sequence of random numbers is used each time you run the LLM.
This will result in deterministic output as the generated text will be the
same for a given seed. However, note that any changes in the model
architecture or input data can still produce different outputs even
with the same seed.
'''
# Running the same prompt again
'''
Iteration - 2 Output
To get deterministic output from Language Models (LLMs),
you need to set the random seed to a fixed value before generating text.
This ensures that the same sequence of random numbers is generated each time.
In most programming languages, you can use a random seed function to set
the seed value. By using the same seed, you will get the same output from
the LLM for a given input. However, keep in mind that the output may still
vary if the LLM has any sources of non-determinism, such as random sampling
during decoding or the presence of randomness in the underlying model.
'''
Temperature = 0
When the temperature was set to 0, the LLM produced the same output again and again, which was great.
# Creating the LLM model with temperature set to 0
llm = AzureChatOpenAI(
    deployment_name=os.environ["AZURE_DEPLOYMENT_NAME"],
    openai_api_version=os.environ["OPENAI_API_VERSION"],
    openai_api_base=os.environ["OPENAI_API_BASE"],
    openai_api_key=os.environ["OPENAI_API_KEY"],
    openai_api_type=os.environ["OPENAI_API_TYPE"],
    temperature=0,
)
'''
Iteration - 1 Output
To get deterministic output from Language Models (LLMs),
you need to set the random seed before generating text.
This ensures that the same sequence of random numbers is used each time,
resulting in consistent output. Additionally, you should disable any
sources of randomness within the model, such as dropout or sampling.
By controlling the input prompt and model configuration, you can obtain
deterministic results from LLMs. However, it's important to note that
deterministic output may limit the creativity and diversity of the
generated text.
'''
# Running the same prompt again
'''
Iteration - 2 Output
To get deterministic output from Language Models (LLMs),
you need to set the random seed before generating text.
This ensures that the same sequence of random numbers is used each time,
resulting in consistent output. Additionally, you should disable any
sources of randomness within the model, such as dropout or sampling.
By controlling the input prompt and model configuration, you can obtain
deterministic results from LLMs. However, it's important to note that
deterministic output may limit the creativity and diversity of the
generated text.
'''
Seed
When the seed is set to a constant value, e.g. 100, the LLM was able to reproduce the same output.
# Creating the LLM model with seed set to 100
llm = AzureChatOpenAI(
    deployment_name=os.environ["AZURE_DEPLOYMENT_NAME"],
    openai_api_version=os.environ["OPENAI_API_VERSION"],
    openai_api_base=os.environ["OPENAI_API_BASE"],
    openai_api_key=os.environ["OPENAI_API_KEY"],
    openai_api_type=os.environ["OPENAI_API_TYPE"],
    model_kwargs={"seed": 100},
)
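A quick sanity check (a sketch, using the same illustrative prompt as above) is to call the model twice and compare the outputs:

from langchain.schema import HumanMessage

prompt = [HumanMessage(content="How do I get deterministic output from LLMs?")]
first = llm(prompt).content
second = llm(prompt).content
print(first == second)  # Expected True with a fixed seed, though determinism is best effort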
Temperature + Seed
I tried setting both parameters simultaneously, but I noticed that sometimes it works and sometimes it doesn’t, and I’m not sure why. One likely explanation is that OpenAI documents seed-based determinism as best effort: backend changes (surfaced through the response’s system_fingerprint field) can alter outputs even with the same seed and temperature. I will try to dig into this further in the future.
# Creating the LLM model with temperature 0 and seed 100
llm = AzureChatOpenAI(
    deployment_name=os.environ["AZURE_DEPLOYMENT_NAME"],
    openai_api_version=os.environ["OPENAI_API_VERSION"],
    openai_api_base=os.environ["OPENAI_API_BASE"],
    openai_api_key=os.environ["OPENAI_API_KEY"],
    openai_api_type=os.environ["OPENAI_API_TYPE"],
    temperature=0,
    model_kwargs={"seed": 100},
)
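To investigate the intermittent behavior, one simple experiment (again a sketch with an illustrative prompt) is to run the same prompt several times and count the distinct outputs; a count of 1 means the runs were fully reproducible:

from langchain.schema import HumanMessage

prompt = [HumanMessage(content="How do I get deterministic output from LLMs?")]
outputs = {llm(prompt).content for _ in range(5)}
print(len(outputs))  # 1 => reproducible across these five runs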
A full version of the code is available in my GitHub gist.
In conclusion, achieving deterministic output from LLMs is a nuanced process that requires a deep understanding of the model’s parameters and the nature of randomness in language generation. By carefully adjusting factors such as temperature, top P, top K, and seed, we can steer the model towards more predictable results when necessary. However, it’s crucial to balance the need for consistency with the creative potential of LLMs. As we continue to explore and refine these models, the potential for both structured data extraction and creative content generation remains vast. Embracing the inherent randomness of LLMs while harnessing their power for deterministic tasks will be key to unlocking new frontiers in AI-assisted workflows.
Remember, the journey with LLMs is as much about the destination as it is about the exploration along the way.
Demo: A Jupyter notebook explaining how to solve the problem (see the GitHub gist linked above)
What is Temperature, Top P, Top K in LLM? (From Concepts to Code) https://www.youtube.com/watch?v=lH9YPeSq6IA&t=1044s