Gen-AI in Action: Using an LLM to Automate Coca-Cola Title Normalization

Sameer
12 min read · Jul 21, 2024


Introduction

Ever found yourself drowning in a sea of oddly named Coca-Cola titles from various restaurants? Well, that’s my typical once-a-month chore! Our company constantly struggles with normalizing a massive list of Coca-Cola products: meal deals, solo drinks, and crazy combos. To rescue us from this constant headache, I built an LLM application using OpenAI’s GPT-3.5 API — our tech superhero swooping in to save the day!

Working with the API was quite the roller-coaster. Feeding it examples felt like training a puppy with treats. Our main goal was simple. We wanted our model to normalize those wacky product titles to business standards with minimal prompting. Speed wasn’t a priority since this application runs periodically, not in real time.

And oh, let’s not forget the funniest part of this journey! Imagine my surprise when I realized that overloading the LLM with too many detailed instructions made it cranky. Yes, our dear LLM can “overfit” just like other machine learning models, latching onto a few instructions and forgetting everything else — like a computer experiencing a caffeine crash! ☕💤

Stay tuned for more of this tech adventure; trust me, it gets even better!

Overview of the Technology Stack

For this normalization adventure, I chose OpenAI’s GPT-3.5 API as the backbone of our project. This decision was primarily driven by budget constraints, but it turned out to be quite capable. The API allowed us to create prompts that the model could learn from (like training a puppy with treats, remember? 🐶). Besides the API, the usual suspects like Python for scripting and some data pre-processing tools were part of my toolkit. Simple yet effective!
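To give a feel for that toolkit, here is a minimal sketch of the pre- and post-processing glue; the file layout and column names are hypothetical, not our production pipeline:

import csv

def load_titles(path: str) -> list[str]:
    """Read raw product titles from a one-column CSV export."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row["raw_title"] for row in csv.DictReader(f)]

def save_results(path: str, rows: list[dict]) -> None:
    """Write raw/normalized pairs back out for a quick manual spot check."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["raw_title", "normalized_title"])
        writer.writeheader()
        writer.writerows(rows)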

Prompt Engineering Techniques

In crafting the prompt for the LLM application, several techniques were used. These can be categorized into General Techniques and Requirement-Based Techniques to ensure effective and standardized normalization of Coca-Cola product titles. Different delimiters, such as triple quotes (''') and triple angle brackets (<<<>>>), are used in the prompt to help the LLM easily identify its different sections.

Following is the final prompt of my LLM application:

system_message_content = """
You are an advanced large language model tasked to follow the user instructions strictly.
To perform the instructed task, learn the patterns from examples delimited by <<<>>>.
<<<>>>
Example 1: User input the title = 'Coca-Cola Zero Sugar Zero Koffeinfri 1,5 l'.
The approach I used to normalize the input title for example 1 is based on the following steps:
First, I translated the whole title to make sure that I could easily understand each word.
Second, I used the translated title and looked into the ground_truth_flavors, which are enclosed in triple quotes ''', and tried to match the best
possible flavor to make sure that it meets my organization's standard.
Third, I removed and added words in the translated title to exactly match the ground_truth_flavor and kept the other words as they were in the translated title.
Fourth, after modifying the title in the third step, I realized that the volume of the drink didn't look appropriate for display, so I made some modifications.
Finally, my normalized title became => Coca-Cola Zero Koffeinfri 1.5 l.

Example 2: User input the title = 'Jag vill ha Coca-Cola på köpet'.
The approach I used to normalize the input title for example 2 is based on the following steps:
First, I translated the whole title to make sure that I could easily understand each word.
Second, I used the translated title and tried to figure out whether it was appropriate for display, and found that it sounded like a sentence, which
is not appropriate for a Coca-Cola product title display, so I removed the extra words and kept only Coca-Cola. There was no flavor or volume;
otherwise I would have kept them as well. The extracted version in this case was => Coca-Cola
Third, I used the extracted version from the second step and looked into the ground_truth_flavors, which are enclosed in triple quotes ''',
and tried to match the best possible flavor to make sure that it meets my organization's standard.
Fourth, I modified the extracted version with the matched flavor from the third step; my modified version was => Coca-Cola Original Taste
Finally, I again carefully examined this modified version for display purposes and realized that it was missing a volume, so I added 33 cl to the
modified version. One thing I always kept in mind: if the volume had been present, I wouldn't have added 33 cl to this title.
In this way, my final normalized version looked like this => Coca-Cola Original Taste 33 cl

Example 3: User input the title = '1 Nduja och 1 Coca Cola'.
The approach I used to normalize the input title for example 3 is based on the following steps:
First, I translated the whole title to make sure that I could easily understand each word.
Second, I used the translated title and tried to figure out whether it was appropriate for display, and found that it sounded like a meal deal, which
is totally appropriate to display because a restaurant could sell the Coca-Cola in a deal, so I decided to just modify the Coca-Cola title while keeping
the rest of the words the same.
Third, I used the translated title from the first step and proceeded with my decision from the second step. To modify the Coca-Cola title, I looked into the
ground_truth_flavors, which are enclosed in triple quotes ''', and tried to match the best possible flavor to make sure that it meets my organization's
standard.
Fourth, I modified the translated title with the best matched flavor from the third step; my modified version was => 1 Nduja and 1 Coca-Cola Original Taste
Finally, I again carefully examined this modified version for display purposes and realized that it was missing a volume, so I added 33 cl to the
modified version. One thing I always kept in mind: if the volume had been present, I wouldn't have added 33 cl to this title.
In this way, my final normalized version looked like this => 1 Nduja and 1 Coca-Cola Original Taste 33 cl

Example 4: User input the title = '1 Margherita + 1 Coca Cola/Zero'.
The approach I used to normalize the input title for example 4 is based on the following steps:
First, I translated the whole title to make sure that I could easily understand each word.
Second, I used the translated title and tried to figure out whether it was appropriate for display, and found that it sounded like a meal deal, which
is totally appropriate to display because a restaurant could sell the Coca-Cola in a deal. I realized that the Coca-Cola in this case has a forward
slash '/', which means that two versions of Coca-Cola are available in this deal, so I decided to modify both versions of Coca-Cola while keeping
the rest of the words the same.
Third, I used the translated title from the first step and proceeded with my decision from the second step. To modify both versions of Coca-Cola, I
looked into the ground_truth_flavors, which are enclosed in triple quotes ''', and tried to match the best possible flavor to each version to make
sure that it meets my organization's standard.
Fourth, I modified the translated title with the best matched flavors from the third step; my modified version
was => 1 Margherita + 1 Coca-Cola Original Taste/ Coca-Cola Zero
Finally, I again carefully examined this modified version for display purposes and realized that, since there are in fact two versions of Coca-Cola, a
volume should be present to clearly inform the customer, so I added 33 cl to the modified version. One thing I always kept in mind: if the
volume had been present, I wouldn't have added 33 cl to this title.
In this way, my final normalized version looked like this => 1 Margherita + 1 Coca-Cola Original Taste/ Coca-Cola Zero 33 cl

Example 5: User input the title = 'Coca-Cola glasflaska 0,5 liters'.
The approach I used to normalize the input title for example 5 is based on the following steps:
First, I translated the whole title to make sure that I could easily understand each word.
Second, I used the translated title and tried to figure out whether it was appropriate for display, and found that it sounded like a title with some
unnecessary words that don't make sense in a display because they are obvious, so I decided to remove the unnecessary words from the translated title. In
this case, the title after removing the extra words looked like this: 'Coca-Cola 0,5 liters'.
Third, I used the modified version from the second step and looked into the ground_truth_flavors, which are enclosed in triple quotes ''', and tried
to match the best possible flavor to make sure that it meets my organization's standard. As there was no flavor present in the modified version,
I chose 'Coca-Cola Original Taste' from the ground_truth_flavors.
Fourth, I removed and added words in the modified version to exactly match the ground_truth_flavor and kept the other words the same as they were in the modified version.
Finally, I carefully examined my modified version from the fourth step and realized that the volume of the drink didn't look appropriate, so I decided to convert it;
my final normalized title became => Coca-Cola Original Taste 50 cl.

<<<>>>

'''
Ground_Truth_Flavors:
{ground_truth}
'''

NOTE: Modify your normalized title by observing patterns from the following examples, which show user input titles on the left and their final normalized titles on the right, separated by '---'.
a. Coca-Cola / Pepsi --- Coca-Cola Original Taste/ Pepsi 33 cl (In this example, the volume of the drink was missing and there was no explicit word that could be interpreted as a volume, therefore we added the base volume, which is 33 cl)
b. Coca-Cola Original Taste 1,5 liter --- Coca-Cola Original Taste 1.5L
c. Coca cola sockerfri 0,5 liter --- Coca-Cola Zero 50 cl
d. Coca-Cola Zero Sugar 1.5L --- Coca-Cola Zero 1.5L
e. Stor Coca Cola Zero --- Coca-Cola Zero 2L (In this example, 'Stor' means Large, so we interpret Large as the 2L volume of Coca-Cola)
f. Medium Coca Cola Zero --- Coca-Cola Zero 50 cl (In this example, Medium is interpreted as 50 cl)
g. Coca-Cola Zero Sugar 0kr --- Coca-Cola Zero 33cl (In this example, we removed '0kr' as an unnecessary word, and the volume was missing, so we added the base volume, which is 33 cl)
h. Choose between Coca Cola/Zero/Sprite --- Coca Cola Original Taste/ Coca-Cola Zero/ Sprite (In this example, we removed the extra words 'Choose' and 'between' because, with the slashes (/), the consumer can easily see that it is a choice)
i. Coca-Cola 1/5 l --- Coca-Cola Original Taste 1.5L
j. Coca Cola --- Coca-Cola Original Taste 33cl (In this example, no specific flavor is given, so I proceeded with the base flavor, which is 'Original Taste'; the volume of the drink was also missing and there was no explicit word that could be interpreted as a volume, therefore we added the base volume, which is 33 cl)
k. 70 Deluxe 4 Coca-Cola zero 2 Edamame --- 70 Deluxe 4 Coca-Cola Zero 33cl 2 Edamame (In this example, the volume of the drink was missing and there was no explicit word that could be interpreted as a volume, therefore we added the base volume, which is 33 cl)
l. Dryck 0,4cl - Coca Cola Zero --- Coca Cola Zero 40 cl (In this example, the 'Dryck' keyword was unnecessary and the volume '0,4cl' was written at the start, so I reordered the title: first the Coca-Cola name, then its flavor, then its volume)
m. Coca Cola Stor 2L --- Coca-Cola Original Taste 2L (In this example, the base flavor is chosen as there was no explicit mention of a flavor; 'Stor' and '2L' both represent the volume of this drink, so I decided to proceed with the numerical volume 2L, as there is no need to write both)

Expected Output: Your output should be in JSON format: {response_format}
"""

user_message = f"Normalize the title: {input_title}"
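For context, here is what the prompt's two placeholders might hold; the flavor list and output schema below are illustrative stand-ins, not the exact production values:

# Illustrative stand-ins for the prompt's two placeholders.
correct_titles = """Coca-Cola Original Taste
Coca-Cola Zero
Coca-Cola Zero Koffeinfri
Coca-Cola Light
Coca-Cola Vanilla"""

response_for = '{"normalized_title": "<the final normalized title>"}'

# The only braces in the prompt are the two placeholders, so .format() is safe.
system_message = system_message_content.format(
    ground_truth=correct_titles, response_format=response_for
)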

Let’s understand the techniques used in the above prompt.

General Techniques

System Instruction:

Sets the stage for the LLM, making sure it follows user instructions strictly.

system_message_content = """
You are an advanced large language model tasked to follow the user instructions strictly. …
"""

Example-based Learning:

Helps the LLM recognize patterns and understand the kind of output expected.

<<<>>>
Example 1: User input the title = 'Coca-Cola Zero Sugar Zero Koffeinfri 1,5 l'.
The approach I used to normalize the input title...
<<<>>>

Requirement-Based Techniques

Translation for Clarity:

Allows the model to work effectively across different languages and formats.

First, I translated the whole title to make sure that I could easily understand each word.

Pattern Recognition:

Guides the LLM in modifying titles to fit business standards.

Third, I removed and added words in the translated title to exactly match the ground_truth_flavor...

Iterative Refinement:

Ensures the model methodically improves the title through each step.

Finally, my normalized title became => Coca-Cola Zero Koffeinfri 1.5 l.

Conditional Formatting:

Addresses edge cases to ensure a polished final output.

a. Coca-Cola / Pepsi --- Coca-Cola Original Taste/ Pepsi 33 cl (In this example, the volume of the drink was missing...

Final Format Check:

Ensures the output is in the desired format for easy business integration.

Expected Output: Your output should be in JSON format: {response_format}

By leveraging both general and requirement-specific techniques, I guided the LLM to consistently produce normalized product titles that meet the business standards and are ready for display.
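With JSON mode switched on, a successful reply parses straight into a Python dict; the reply shown here is a hypothetical example:

import json

reply = '{"normalized_title": "Coca-Cola Zero 50 cl"}'  # hypothetical model output
parsed = json.loads(reply)
print(parsed["normalized_title"])  # -> Coca-Cola Zero 50 cl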

Testing and Validation

After training the model, one crucial step is testing. This is the moment where we find out if our model is a genius or just good at bluffing! For this, I crafted a new prompt that compared the generated normalized titles with ground truth labels. By having the model report an accuracy percentage out of 100, I could gauge the precision of the generated outputs. This process was like a lie detector test for our LLM — making sure it wasn’t cutting corners!

Here’s the prompt I used for accuracy calculation:

accuracy_check_prompt = """
You are a detail-oriented large language model tasked to compare generated titles with ground truth labels.
The generated title: {generated_title}.
The ground truth label: {ground_truth}.

Expected Output: Your output should be in JSON format: {response_format}
"""
accuracy_user_message = "Compare the <generated title> with <ground truth label> and calculate the accuracy percentage out of 100."

Running this prompt was like sending our LLM through an exam — no cheating allowed! It systematically compared the generated titles against our gold standard, giving us an accuracy score that told us if our LLM was top of the class or needed a bit more homework.

By doing this, we could easily identify titles that didn’t hit the 100% mark, allowing us to manually step in and correct only those specific cases for production. Because let’s face it, even superheroes need a sidekick sometimes!
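Concretely, the validation loop might look something like the sketch below; check_accuracy, flag_for_review, and the accuracy JSON schema are my illustrative names, not the production code:

import json

from openai import OpenAI

def check_accuracy(client: OpenAI, generated_title: str, gold_label: str) -> float:
    """Ask the model to score one generated title against its gold label."""
    system_message = accuracy_check_prompt.format(
        generated_title=generated_title,
        ground_truth=gold_label,
        response_format='{"accuracy_percentage": <a number between 0 and 100>}',
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": accuracy_user_message},
        ],
        temperature=0,
        response_format={"type": "json_object"},
    )
    return float(json.loads(response.choices[0].message.content)["accuracy_percentage"])

def flag_for_review(client: OpenAI, triples):
    """triples: iterable of (raw_title, generated_title, gold_label) tuples."""
    return [(raw, gen) for raw, gen, gold in triples
            if check_accuracy(client, gen, gold) < 100.0]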

Challenges Faced

Working with GPT-3.5 was like dealing with a smart but slightly overconfident student. One major challenge I faced was when I tried to incorporate a scoring system into the prompt. The idea was to assign points for each step followed correctly, totaling up to 10 points. The plan was to calculate the accuracy of the generated output out of 10 within the same prompt.

But guess what? Our dear LLM started handing out 10 or 9 points to itself every time, even when it was completely off-track! 🥴 It was like it had decided to be its own favorite teacher, always giving full marks for effort regardless of accuracy. In reality, it wasn’t even close!

To tackle this quirky challenge, I had to go back to the drawing board and write a new prompt specifically for testing and accuracy measurement. Because let’s face it, even the best models need a reality check sometimes!

Conclusion

No doubt, LLM applications are revolutionizing every industry they touch. If I had tried to solve this problem with a traditional machine learning solution focused on NLP, it would have required significant time to train from scratch, not to mention the hurdles of deployment and maintenance. But with an LLM, this problem was resolved in just a week!

Now, while GPT-3.5 might be the weakest link in the OpenAI family compared to models like GPT-4 and GPT-4 Turbo, it’s also the fastest. This meant I had to draft detailed prompts to get it to perform optimally, whereas GPT-4 was surprisingly adept with just one example.

To write a comprehensive prompt, here are four basic rules to follow (a small skeleton illustrating them appears after the list):

  1. Clear Instructions: Define the task and the role of the LLM explicitly.
  2. Relevant Examples: Provide several examples to teach the model the patterns and expected outputs.
  3. Iterative Refinements: Break down the task into structured steps for the LLM to follow.
  4. Edge Case Handling: Include special cases to ensure robustness and handle exceptions gracefully.
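Put together, the rules suggest a reusable skeleton like this one; build_prompt and its inputs are illustrative, not my exact production prompt:

def build_prompt(worked_examples, edge_case_pairs, response_schema):
    # 1. Clear instructions: define the task and the model's role.
    role = ("You are an advanced large language model tasked to follow "
            "the user instructions strictly.")
    # 2./3. Relevant examples, each written as step-by-step (iterative) reasoning.
    examples = "<<<>>>\n" + "\n\n".join(worked_examples) + "\n<<<>>>"
    # 4. Edge cases: 'input --- expected output' pairs, each with a short note.
    edge_cases = "NOTE:\n" + "\n".join(edge_case_pairs)
    output_spec = "Expected Output: Your output should be in JSON format: " + response_schema
    return "\n\n".join([role, examples, edge_cases, output_spec])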

Remember, even the most advanced models need a little hand-holding to start with. And just like any good relationship, communication is key!

Cheers to smoother workflows and fewer manual headaches — thanks to the magic of LLMs!

Appendix

Following is the main function of our superhero:

import json

from openai import OpenAI

def main(api_key, input_title, correct_titles, response_for, user_mes):
    client = OpenAI(api_key=api_key)

    # Fill the prompt's two placeholders with the flavor list and the output schema.
    system_message = system_message_content.format(
        ground_truth=correct_titles, response_format=response_for
    )
    user_message = user_mes

    # Temperature 0 plus a fixed seed keeps the runs reproducible, and
    # JSON mode guarantees a parseable reply.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
        temperature=0,
        seed=1234,
        response_format={"type": "json_object"},
    )

    best_match = response.choices[0].message.content.strip()
    return json.loads(best_match)
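
And a quick usage example; the flavor list, schema, and input title are illustrative stand-ins:

if __name__ == "__main__":
    result = main(
        api_key="sk-...",  # your OpenAI API key
        input_title="Coca Cola sockerfri 0,5 liter",
        correct_titles="Coca-Cola Original Taste\nCoca-Cola Zero\nCoca-Cola Light",
        response_for='{"normalized_title": "<the final normalized title>"}',
        user_mes="Normalize the title: Coca Cola sockerfri 0,5 liter",
    )
    print(result["normalized_title"])  # e.g. Coca-Cola Zero 50 cl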
