Quizaic — A Generative AI Case Study

Part 3 — Prompting and Image Generation

Marc Cohen
Google Cloud - Community
8 min read · Jun 21, 2024


This is the third in a series of articles about a demo application I created called Quizaic (rhymes with mosaic), which uses generative AI to create and play high quality trivia quizzes.

In the previous article in this series, we defined an application architecture and discussed the technology choices for our implementation. In this article, we’ll dive deeper into the newest element of our design: the generative AI component.

We need to solve two problems with genAI: quiz generation and image generation. We’ll examine each of these tasks separately.

Quiz Generation

Let’s start by experimenting with different models to see whether we can use them for quiz generation. In the Cloud Console, we navigate to the Vertex AI Language page and select “TEXT PROMPT” in the “Generate text” card. This takes us to Vertex AI Studio, an interactive playground for prompt engineering.

We can use this interface to experiment with prompts in an interactive, iterative fashion. Let’s start with a very simple prompt:

Prompt 1: Generate a multiple choice trivia quiz with five questions on the subject of popular music.

The response looks like this:

Pop Music Trivia Quiz

Instructions: Choose the best answer for each question.

1. Which iconic singer was known as the “Queen of Soul”? a) Aretha Franklin b) Tina Turner c) Whitney Houston d) Mariah Carey

2. What band’s “Thriller” album is the best-selling album of all time? a) The Beatles b) Michael Jackson c) Queen d) Pink Floyd

3. Which of these songs was NOT written by Taylor Swift? a) “Shake It Off” b) “Bad Blood” c) “Blank Space” d) “Roar”

4. Which artist is known for popularizing the genre of “reggae”? a) Bob Marley b) Elvis Presley c) Jimi Hendrix d) David Bowie

5. What popular music festival takes place annually in Coachella Valley, California? a) Glastonbury b) Coachella c) Bonnaroo d) Tomorrowland

Answers:

a) Aretha Franklin

b) Michael Jackson

d) “Roar”

a) Bob Marley

b) Coachella

This is a perfectly good trivia quiz, but it’s formatted for human consumption. Using it in our application would require significant effort to parse the human-readable format, and such code would be brittle because it depends on the specific formatting we happened to get here. A different model, or even the same model over time, might make different formatting decisions.
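To make the brittleness concrete, here’s a hypothetical parsing sketch (mine, not Quizaic’s) for one line of that output. The regular expressions hard-code the exact numbering and lettering conventions the model happened to use, which is precisely the problem:

import re

# One question line from the human-readable quiz above
line = '1. Which iconic singer was known as the "Queen of Soul"? a) Aretha Franklin b) Tina Turner c) Whitney Houston d) Mariah Carey'

# Both patterns assume "1." question numbering and "a)" option lettering;
# any change in the model's formatting silently breaks them.
question = re.match(r"\d+\.\s+(.*?)\s+a\)", line).group(1)
options = re.findall(r"[a-d]\)\s+([^)]+?)(?=\s+[a-d]\)|$)", line)

print(question)  # Which iconic singer was known as the "Queen of Soul"?
print(options)   # ['Aretha Franklin', 'Tina Turner', 'Whitney Houston', 'Mariah Carey']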

We’d really like to specify a format that is expressive yet easily and reliably parsed by software. JSON is popular and widely supported and solves this problem nicely, so let’s take a look at the quiz in JSON format:

Prompt 2: Generate a multiple choice trivia quiz in JSON format with five questions on the subject of popular music.

This prompt results in the following response:

{
"questions": [
{
"question": "Who is the artist behind the hit song 'Bohemian Rhapsody'?",
"options": [
"Queen",
"The Beatles",
"Led Zeppelin",
"Pink Floyd"
],
"answer": "Queen"
},
{
"question": "Which iconic band released the album 'Thriller'?",
"options": [
"The Rolling Stones",
"Michael Jackson",
"The Jackson 5",
"Queen"
],
"answer": "Michael Jackson"
},
{
"question": "What is the name of the singer who rose to fame with the song 'Baby One More Time'?",
"options": [
"Britney Spears",
"Christina Aguilera",
"Beyoncé",
"Miley Cyrus"
],
"answer": "Britney Spears"
},
{
"question": "Which group is known for the hit songs 'I Want It That Way' and 'Show Me the Meaning of Being Lonely'?",
"options": [
"Backstreet Boys",
"NSYNC",
"Take That",
"New Kids on the Block"
],
"answer": "Backstreet Boys"
},
{
"question": "Who is the artist who holds the record for the most number one hits on the Billboard Hot 100 chart?",
"options": [
"Madonna",
"The Beatles",
"Mariah Carey",
"Elvis Presley"
],
"answer": "Mariah Carey"
}
]
}

Now we have something we can work with! Here’s a little bit of Python code we could imagine writing to parse this result:

import json

# generate_quiz() stands in for a call to the model that returns
# the raw JSON text shown above
raw_quiz = generate_quiz()

# parse quiz
parsed_quiz = json.loads(raw_quiz)

# and display the results...
for question in parsed_quiz["questions"]:
    print(f"question: {question['question']}")
    for i, option in enumerate(question["options"], start=1):
        print(f"response {i}: {option}")
    print(f"answer: {question['answer']}")

This code works fine with the response provided above, but what happens if our large language model suddenly decides to use the key responses instead of options, or correct instead of answer? Those terms are just as reasonable, and we’ve left it up to the LLM to decide which ones to use. That makes our code brittle. Let’s iterate on our prompt to stipulate the key names and the general structure we want to see in the output JSON:

Prompt 3: Generate a multiple choice trivia quiz in JSON format with five questions on the subject of popular music. The output should be a JSON array of questions, each of which is an object containing quoted keys “question”, “responses”, and “correct”.

This yields the following response:

[
{
"question": "Which band released the album \"Thriller\"?",
"responses": [
"The Beatles",
"Queen",
"Michael Jackson",
"The Rolling Stones"
],
"correct": 2
},
{
"question": "Who is the lead singer of the band Coldplay?",
"responses": [
"Chris Martin",
"Adam Levine",
" Bono",
"Dave Grohl"
],
"correct": 0
},
{
"question": "What year did the song \"Bohemian Rhapsody\" by Queen come out?",
"responses": [
"1975",
"1984",
"1969",
"1973"
],
"correct": 0
},
{
"question": "Who is known as the \"King of Pop\"?",
"responses": [
"Elvis Presley",
"Prince",
"Michael Jackson",
"David Bowie"
],
"correct": 2
},
{
"question": "Which band was originally called \"The Quarrymen\"?",
"responses": [
"The Beatles",
"The Rolling Stones",
"The Who",
"Led Zeppelin"
],
"correct": 0
}
]

Now we can rely much more confidently on the structure and key names in our JSON output because we’ve specified exactly what we want in the prompt. Note that the model chose to express “correct” as an index into the responses array rather than repeating the answer text; if we need a particular representation, we should stipulate that in the prompt as well.
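Even with the format pinned down in the prompt, a defensive application shouldn’t take the model’s word for it. Here’s a minimal validation sketch (the validate_quiz function is my own, not part of Quizaic) that checks the parsed output against the shape we asked for:

import json

def validate_quiz(quiz):
    # The prompt stipulates a JSON array of question objects with
    # exactly these three keys
    assert isinstance(quiz, list), "quiz must be a JSON array"
    for q in quiz:
        assert set(q) == {"question", "responses", "correct"}, f"unexpected keys: {set(q)}"
        # The response above expresses "correct" as an index into responses
        assert q["correct"] in range(len(q["responses"])), "correct must index responses"

quiz = json.loads(raw_quiz)  # assumes raw_quiz holds the model's response text
validate_quiz(quiz)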

We’d also like to specify various parameters about the quiz. In addition to the quiz topic and number of questions, both of which we’ve already included in the prompt, we might want to stipulate other things like:

  • the number of responses per question
  • the difficulty level
  • the desired (human) language

We can do this by adding a list of parameters to our prompt, like this:

Category: Pop Music
Quiz difficulty level: medium
Number of questions: 5
Number of responses per question: 4
Quiz language: English

We might also like to include some rules to ensure the responses adhere strictly to our needs. We’ve already specified one rule about the desired JSON format. We can include additional rules and constraints in a bullet list, like this:

RULES:

- Accuracy is critical.
- Each question must have exactly one correct response, selected from the responses array.
- Output should be limited to a JSON array of questions, each of which is an object containing quoted keys “question”, “responses”, and “correct”.
- Don’t return anything other than the JSON document.

Combining all of the above revisions into one prompt gives us our final prompt:

You are a trivia expert. Generate a set of multiple choice quiz questions.

Category: {topic}
Quiz difficulty level: {difficulty}
Number of questions: {num_questions}
Number of responses per question: {num_answers}
Quiz language: {language}

RULES:

- Accuracy is critical.
- Each question must have exactly one correct response, selected from the responses array.
- Output should be limited to a JSON array of questions, each of which is an object containing quoted keys “question”, “responses”, and “correct”.
- Don’t return anything other than the JSON document.

Note that we’ve incorporated two additional tricks into this final prompt:

  1. We’re conditioning the model, which means giving it a specific role within which to formulate its response, by telling it “You are a trivia expert”.
  2. We’ve templatized our parameters by changing the hard-coded values into variables surrounded by curly braces, e.g. we converted “Pop Music” to “{topic}”. This enables us to easily substitute user-provided parameters into the prompt.

We now have a reasonably functional and reliable prompt that we can integrate into our application. Here’s a snippet of Python code that reads this prompt from a file and uses it to generate a custom trivia quiz from the topic and other parameters provided by the end user:

# os and json are imported, and MODEL plus the generation-parameter
# constants are defined, at module level
file_path = os.path.join(os.path.dirname(__file__), "../prompt.txt")
with open(file_path, encoding="utf-8") as fp:
    self.prompt_template = fp.read()

# substitute the user-provided parameters into the template
prompt = self.prompt_template.format(
    topic=topic,
    num_questions=num_questions,
    num_answers=num_answers,
    language=language,
    difficulty=difficulty,
)
# send the prompt to the model and parse the JSON response
prediction = self.predict_llm(
    MODEL, prompt, temperature, MAX_OUTPUT_TOKENS, TOP_P, TOP_K
)
quiz = json.loads(prediction)
return quiz
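
The predict_llm helper isn’t shown in this snippet. As a rough sketch of what such a wrapper might look like (an assumption on my part, using the Vertex AI SDK’s TextGenerationModel; Quizaic’s actual wrapper may differ), including stripping the markdown code fences some models wrap around JSON output:

from vertexai.language_models import TextGenerationModel

def predict_llm(model_name, prompt, temperature, max_output_tokens, top_p, top_k):
    # Load the named foundation model and request a single completion
    model = TextGenerationModel.from_pretrained(model_name)
    response = model.predict(
        prompt,
        temperature=temperature,
        max_output_tokens=max_output_tokens,
        top_p=top_p,
        top_k=top_k,
    )
    # Models sometimes wrap JSON in ```json ... ``` fences; strip them
    text = response.text.strip()
    if text.startswith("```"):
        text = text.strip("`").removeprefix("json").strip()
    return text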

Image Generation

Most people are familiar with Large Language Models, which generate text, but similar technology can be used to generate imagery. In this app, we’ll use Google’s Imagen model to generate a unique, customized image for each quiz from a text prompt. These images are displayed on the quiz cards in our user interface.

This part is easy because the quiz topic gives us a little snippet of text to indicate the desired image. Here’s a simple prompt we can use, where the quiz topic is templatized:

Generate a photorealistic image about {topic}.

Here’s a snippet of Python code that uses this prompt with Google’s Imagen model to generate a custom image for a given quiz topic:

prompt = "Generate a photorealistic image about {topic}"
def generate_images(topic, number_of_images=1):
prompt = PROMPT_TEMPLATE.format(topic=topic)
print(f"Generating {number_of_images} image(s) with prompt: {prompt}")

model = ImageGenerationModel.from_pretrained(IMAGEGEN_MODEL)
images = model.generate_images(
prompt=prompt,
number_of_images=number_of_images,
negative_prompt=NEGATIVE_PROMPT,
)
return images

That’s all we need to get an AI-generated image for a specific quiz topic.
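
For completeness, here’s a hypothetical usage example (the file name is my own). The image objects returned by the SDK can be saved directly to disk:

# generate a single image for a pop music quiz and save it locally
images = generate_images("Pop Music")
images[0].save(location="quiz_image.png")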

In the next article in this series, we’ll explore how to make sure our quizzes are of sufficient quality, and how to minimize hallucinations.

Next Article: Part 4 — Assessing Quiz Quality
