Building a GenAI Solution — A Hackathon Success Story!

Monique Grinstein
Engineers @ The LEGO Group
7 min read · Nov 24, 2023

I always get very hyped to attend tech events. MeetUps, Summits, Hackathons, I’m in! I like the element of discovery — you’re not building your own roadmap for learning and upskilling; instead, you’re presented with new tools and technologies you might not even have thought of exploring before.

Therefore, I naturally signed up for the Generative AI Hackathon hosted at the LEGO® Group, which occurred in early October 2023. We had teams joining from the LEGO Campus in Billund, the LEGO Digital office in Copenhagen, and the LEGO London Hub — all eager to learn, with great ideas in mind.

I hadn’t explored GenAI much before, beyond a course on what it is and how it differs from traditional Machine Learning methods, so it was bound to be interesting. I decided to call my team “pAIrate”, as I am from the Pirates squad and I love a bad pun.

As the use of the singular noun suggests, I ended up being the solo pirate on the team. I did not abandon ship, though, as I already had an idea in mind to build a solution for generating LEGO set descriptions, and I was really excited about the opportunity to make it happen.

In this article, I will share my learnings from the hackathon and give a high-level overview of the service I built.

Generative AI 101

Most of us in the hackathon had already tried out ChatGPT and other Generative AI tools, as you might have, too. Now, switching hats from user to developer, our first step was diving deeper into the topic and understanding better how it compares to other AI domains and how to interact with the models.

GenAI Models

In essence, what we learned is that those models are capable of generating new content that resembles and imitates human-created data. Unlike traditional Machine Learning, which primarily deals with making predictions or classifications based on existing data, Generative AI aims to produce original outputs such as images, texts, music, or even entire scenarios that mimic the patterns and styles inherent in the training data.

“Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.”

- Generative artificial intelligence on Wikipedia

For instance, if I provide the GenAI model with the contents of an email and ask it to generate a reply, it will:

  • First, comprehend the structure, tone, and content of the received email through the text analysis it has been trained on;
  • Based on this understanding, generate a contextually relevant response, matching the tone and style of the received email (or applying different levels of formality if we’ve specified a tone ourselves).
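To make this concrete, here is a minimal sketch of that email-reply flow. It assumes Amazon Bedrock with an Anthropic Claude model purely for illustration — any LLM provider would do; the model ID and request body below follow Bedrock’s 2023-era text-completion format:

```python
import json
import boto3

# Bedrock runtime client (assumes AWS credentials are configured).
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

received_email = (
    "Hi team, could you send over the Q3 report by Friday? Thanks, Alex"
)

# The instruction asks the model to analyze the email and match its tone.
prompt = (
    f"\n\nHuman: Here is an email I received:\n{received_email}\n\n"
    "Write a brief, polite reply that matches its tone.\n\nAssistant:"
)

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",
    body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 300}),
)
print(json.loads(response["body"].read())["completion"])
```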

Fun fact: I asked ChatGPT what it thought of us leveraging GenAI models in the hackathon and it told me this:

“Generative AI is like having a robot artist on your team”.

Is this GenAI? — Photo by Possessed Photography on Unsplash

An interesting (low-key polemic? 🤔) AI-generated answer. What do you think? Please let me know in the comments!

Large Language Models

For my application, I was looking for a model that deals with text. That’s where the concept of a Large Language Model, or LLM for short, comes in. Within the realm of Generative AI models, LLMs are models built on sophisticated Neural Network architectures designed to process and generate human-like text.

The term “large” in “Large Language Model” denotes the sheer scale of these Neural Networks in terms of parameters, layers, and computational resources required for training. These models consist of numerous interconnected nodes, allowing them to capture complex relationships within the input data and produce highly detailed outputs.

Prompt Engineering

I was particularly interested in Prompt Engineering concepts and how to leverage best practices to get good answers from the LLM. I really like the following explanation of what prompting is all about:

“Even though generative AI attempts to mimic humans, it requires detailed instructions to create high-quality and relevant output.

In prompt engineering, you choose the most appropriate formats, phrases, words, and symbols that guide the AI to interact with your users more meaningfully. Prompt engineers use creativity plus trial and error to create a collection of input texts, so an application’s generative AI works as expected.”

- Prompt Engineering on AWS’ Cloud Computing Concepts Hub

Two techniques in this domain were particularly interesting to me:

  • Prompt Templates: through a template, you can easily create your prompt from a generic base instruction (used for every call to your service), with placeholders that take in the variables of each request. It is one of the main elements that help pivot your service from a generic Conversational Agent that passes any instruction to the model, to a purpose-specific implementation (see the sketch after this list).
  • Prompt Chaining: this means providing a sequence of prompts to guide the model’s understanding so that it generates more contextually relevant responses. To achieve this, we include the output of one prompt in the input of the next.
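As an illustration of the first technique, here is a minimal Prompt Template sketch in plain Python — the field names are hypothetical, chosen to match the set-description use case:

```python
from string import Template

# Generic base instruction used for every call to the service; the
# placeholders are filled in with the variables of each request.
SET_DESCRIPTION_TEMPLATE = Template(
    "You are a copywriter for LEGO set descriptions.\n"
    "Write an engaging product description for a new set.\n"
    "Theme: $theme\nRecommended age: $age\n"
    "Elements in the set: $labels\n"
    "Brand guidelines: $guidelines"
)

prompt = SET_DESCRIPTION_TEMPLATE.substitute(
    theme="City",
    age="6+",
    labels="fire truck, extendable ladder, two minifigures",
    guidelines="keep the tone playful and family-friendly",
)
print(prompt)
```

Libraries such as LangChain offer a PromptTemplate abstraction for the same idea, but even a plain template shows the pivot from a generic agent to a purpose-specific service.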

AI Description Generation Service

With those concepts in hand, I was able to understand what I could accomplish with the LLMs and how to better interact with them. If I were to feed my model some descriptive metadata and ask for a new LEGO set description, it would then use this input and mimic the style and structure of existing descriptions that are present in its training data, crafting a new one so on-point that even our stakeholders at the company might nod in approval (don’t quote me on this though, I promise I haven’t tried it 👀).

From there, I built my solution to generate LEGO set descriptions, also leveraging other AI services outside the GenAI scope: processing the user inputs (text and image) through a Machine Learning platform, creating instructions for the Large Language Model, and finally passing the response back to the user. The flow works as follows (a code sketch follows the steps):

Architecture overview — image by me
  1. The user inputs the image of the new LEGO set that needs a description and basic set information, such as theme, age, and any brand guidelines. This is taken in by our instance of the Machine Learning service;
  2. From the image input, we would like to detect labels (objects and concepts) so we can add more information to the instructions that will go to the model. An Image Label Detector service returns what has been detected along with a confidence level (which can be compared against our desired threshold);
  3. With the original text inputs and the labels from the image input, we can add them to a pre-built Prompt Template with general instructions for the Large Language Model, and send this prompt through. The model replies with the generated text;
  4. The generated text is transmitted back to the user.
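The architecture is deliberately tool-agnostic (more on that below), but to make the flow tangible, here is a hypothetical sketch wiring the four steps together. It assumes Amazon Rekognition for label detection and the same Bedrock/Claude call as before — any equivalent services would work:

```python
import json
import boto3

rekognition = boto3.client("rekognition")
bedrock = boto3.client("bedrock-runtime")

def generate_description(image_bytes: bytes, theme: str, age: str,
                         guidelines: str) -> str:
    # Step 2: detect labels in the set image, keeping only confident ones
    # (MinConfidence plays the role of the desired threshold).
    detected = rekognition.detect_labels(
        Image={"Bytes": image_bytes}, MinConfidence=80
    )
    labels = ", ".join(label["Name"] for label in detected["Labels"])

    # Step 3: combine the original text inputs and the detected labels
    # into the pre-built prompt template and send it to the LLM.
    prompt = (
        "\n\nHuman: Write a LEGO set product description.\n"
        f"Theme: {theme}\nRecommended age: {age}\n"
        f"Brand guidelines: {guidelines}\n"
        f"Elements visible in the set: {labels}\n\nAssistant:"
    )
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v2",
        body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 500}),
    )

    # Step 4: the generated text goes back to the user.
    return json.loads(response["body"].read())["completion"]
```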

Hey, we’ve got content! 📬

Photo by The LEGO Group on LEGO.com

Bonus Features

A couple of extra features I thought could be applied to this service are:

  1. Translating the descriptions to other languages
    To serve different locales and translate the descriptions into their corresponding languages, I experimented with both embedding a request for translation directly in the LLM prompt and instantiating a dedicated translation service (a comparison snippet follows this list).
    The LLM provided better results, producing more fluent translations with adaptations to each language, rather than word-by-word output.
  2. Prompt Chaining for a richer set description
    For big LEGO sets, different scenes and features should be captured in the scope of the description. One way to do so would be to adapt the service from describing one picture to having multiple smaller descriptions of parts of the set.
    With those in hand, we could leverage the Prompt Chaining technique and ask the model to come up with a final edit that has the context of all the descriptions it has produced so far (see the sketch below).
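For the first bonus feature, a dedicated translation service call might look like this — assuming Amazon Translate, again just one possible choice:

```python
import boto3

translate = boto3.client("translate")

generated_description = "Race to the rescue with the ultimate fire station!"

result = translate.translate_text(
    Text=generated_description,
    SourceLanguageCode="en",
    TargetLanguageCode="da",  # e.g. Danish
)
print(result["TranslatedText"])
```

And here is a minimal sketch of the second bonus feature, Prompt Chaining, reusing the hypothetical Bedrock call from the earlier sketches — only the chaining structure matters here:

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def invoke_llm(prompt: str) -> str:
    # Same hypothetical Bedrock/Claude call as in the earlier sketches.
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v2",
        body=json.dumps({
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": 400,
        }),
    )
    return json.loads(response["body"].read())["completion"]

def describe_large_set(scene_labels: list[list[str]]) -> str:
    # First round of prompts: one short description per scene of the set.
    partials = [
        invoke_llm("Describe this part of a LEGO set: " + ", ".join(labels))
        for labels in scene_labels
    ]
    # Chained prompt: the earlier outputs become input to the final edit.
    joined = "\n".join(f"- {p}" for p in partials)
    return invoke_llm(
        "Here are descriptions of different scenes from one LEGO set:\n"
        f"{joined}\n"
        "Combine them into a single, cohesive product description."
    )
```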

Developing the solution

The architecture presented here for this solution is agnostic on purpose — so one could use any set of tools and services to build it. Stay tuned for part 2 of this article, in which I will include a guide on how to leverage the corresponding AWS AI services via the console and build all necessary functionality so you can run your very own description generation service.

As mentioned in the “Bonus Features” section, there are also many different ways to adapt, customize, and add value to this service according to requirements.

A success story

I am happy to report that my solution won 1st place in the hackathon! 🎉 This was my amazing prize:

Photo by The LEGO Group on LEGO.com

Of course, I wasn’t participating for the prize (wink wink 😉), and the true reward for me was learning so much about GenAI. I hope you’ve enjoyed the ride, too!

“There’s that word again! GenAI. Why are things so full of GenAI in the future?”

— Dr. Emmett Brown (adapted)
