Quizaic — A Generative AI Case Study

Part 2 — Architecture and Technology Choices

Published in Google Cloud - Community, Jun 14, 2024

This is the second in a series of articles about a demo application I created called Quizaic (rhymes with mosaic), which uses generative AI to create and play high-quality trivia quizzes.

Here’s a table of contents for the articles in this series:

Part 1 — Background and Demo
Part 2 — Architecture and Technology Choices
Part 3 — Prompting and Image Generation
Part 4 — Assessing Quiz Accuracy
Part 5 — Lessons Learned

In this article, we’ll define our application architecture and summarize the technology choices we’ve made to implement Quizaic.

Data Model

Let’s start with a data model. In order to manage information flow in this app, we want to enumerate a set of objects we’ll use to maintain the state of the system and the operations we’ll need to perform. Spending some time thinking about those considerations yielded the following data structures:

admins — models system administrators
generators — models the resources used to generate quizzes and images
quizzes — models the content of trivia quizzes
sessions — models a single, sequential quiz-playing experience
results — models the in-progress or completed results for a given quiz session
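The five data structures above could be sketched as Python dataclasses. This is only an illustration: the article names the structures but not their fields, so every field name below (email, topic, current_question, and so on) is an assumption rather than Quizaic's actual schema.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Admin:
    email: str  # hypothetical field


@dataclass
class Generator:
    name: str  # hypothetical field, e.g. a label shown in the UI


@dataclass
class Quiz:
    topic: str  # hypothetical field
    generator: str  # name of the generator that produced the quiz
    questions: list = field(default_factory=list)


@dataclass
class Session:
    quiz_id: str  # the quiz being played in this session
    current_question: int = -1  # -1 means the host has not started yet


@dataclass
class Result:
    session_id: str
    player: str
    answers: dict = field(default_factory=dict)  # question index -> answer


# One collection name per data structure, matching the list above.
COLLECTIONS = {
    "admins": Admin,
    "generators": Generator,
    "quizzes": Quiz,
    "sessions": Session,
    "results": Result,
}
```

Calling asdict() on any of these objects yields a plain dictionary, which is a convenient shape for sending over HTTP or writing to a document store.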

We also identify four key personas for this app:

admin — system administrators
creator — quiz creators
host — quiz hosts
player — quiz players

Access Model

Now we can diagram the transaction flow through our system by defining the relationships between the data structures and the personas:

As you can see:

admins can read and write the admin data and the generators (e.g. add other admins, and define new quiz generators).
creators and hosts can read the set of generators (needed to list the models available for quiz creation), can read quizzes, sessions, and results (all of which are needed for quiz hosting), and can write quizzes (for quiz creation) and sessions (for quiz hosting).
players can read quizzes and sessions (needed to participate in a given quiz) and write results (so their responses are recorded).
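The access rules above can be written down as a small permission table. The sketch below encodes exactly the reads and writes listed in the text; the names ACCESS and is_allowed are hypothetical and are not taken from the Quizaic code.

```python
READ, WRITE = "read", "write"

# persona -> collection -> allowed operations, following the list above
ACCESS = {
    "admin": {
        "admins": {READ, WRITE},
        "generators": {READ, WRITE},
    },
    "creator": {
        "generators": {READ},
        "quizzes": {READ, WRITE},
        "sessions": {READ, WRITE},
        "results": {READ},
    },
    "player": {
        "quizzes": {READ},
        "sessions": {READ},
        "results": {WRITE},
    },
}
# The text grants creators and hosts the same access.
ACCESS["host"] = ACCESS["creator"]


def is_allowed(persona: str, operation: str, collection: str) -> bool:
    """Return True if the persona may perform the operation on the collection."""
    return operation in ACCESS.get(persona, {}).get(collection, set())
```

For example, is_allowed("player", WRITE, "results") is True, while is_allowed("player", WRITE, "quizzes") is False. Unknown personas or collections are denied by default.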

API

This analysis leads us to the specification of a RESTful API for supporting the access paths noted above, along with associated access requirements:

System Architecture

We can now define a system architecture leveraging the above data structures, personas, data flow, and our RESTful API:

We have a web browser user interface on the left (which could also be a mobile app). This browser or app uses HTTP to fetch the artifacts required to render the user experience from a component called the UI Server.

The UI Server, in turn, uses the RESTful HTTP interface we defined above to perform the data management for quizzes, sessions, results, etc. Those requests are sent from the UI Server to another component called the API Server, which is responsible for implementing the RESTful API.

The API Server, in turn, makes calls to a storage engine (to create, replace, update, and delete data objects) and also sends requests to large language models to generate quizzes and images.

Note the top arrow from the storage module back to the web browser. This represents a special requirement for this app: we require an efficient way to directly and spontaneously notify (potentially many) players whenever certain state changes occur. For example, whenever the host advances the current question in a given quiz, all players must be notified with low latency. In order to support scaling to large numbers of concurrent players, we cannot tolerate the overhead associated with players polling for updates — asynchronous notification is critically important here.
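To make the difference between polling and notification concrete, here is a minimal in-memory sketch of the push pattern. SessionStore is a hypothetical stand-in for the storage module, not part of Quizaic; the point is only that each state change is delivered to subscribers as it happens, so no player has to ask repeatedly whether anything changed.

```python
from collections import defaultdict


class SessionStore:
    """In-memory stand-in for a store that pushes changes to subscribers."""

    def __init__(self):
        self._sessions = {}
        self._listeners = defaultdict(list)

    def subscribe(self, session_id, listener):
        """Register a callable that receives the session state on each change."""
        self._listeners[session_id].append(listener)

    def update(self, session_id, **fields):
        """Apply a change, then notify every subscriber of that session."""
        session = self._sessions.setdefault(session_id, {})
        session.update(fields)
        for listener in self._listeners[session_id]:
            listener(dict(session))  # pass a copy so listeners cannot mutate state


# Two players subscribe once; the host then advances the question.
store = SessionStore()
player_one, player_two = [], []
store.subscribe("session-1", player_one.append)
store.subscribe("session-1", player_two.append)
store.update("session-1", current_question=0)
store.update("session-1", current_question=1)
```

After the two updates, each player's list holds both states in order, and the cost of notifying players is paid only when something actually changes.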

Technology Choices

The next step is to choose a collection of implementation options for each of our system components. We’ll review those choices one at a time and explain our justifications for each.

User Interface

We chose to implement our client user experience using Flutter (flutter.io) for several reasons:

Flutter is widely used, highly regarded, and well supported.
Flutter gives us the ability to support numerous runtime platforms with one single codebase. This is very useful for our app because the typical creator/host will likely use a web browser, while the typical player will likely want to use a mobile app or mobile browser.
Flutter supports mobile-like behavior in web apps, which gives a native look and feel for both mobile and web environments. For example, Flutter makes it easy to perform and fine-tune native app-like transitions and animations.
Flutter provides strong support for Material Design, which is a mature and powerful framework developed by Google for designing user experience elements.
Flutter provides strong support for several Google Cloud components we’ll also be using, including Firestore and Vertex AI.

Computing

Google Cloud Run provides a perfect solution for hosting our UI and API servers for the following reasons:

It lets us encapsulate our services in Docker containers, which makes it easy to exercise the same code and configuration while testing locally, in a staging account, or running in production.
Cloud Run auto-scales our services, meaning it automatically provisions resources based on demand, so we’ll never have too few resources to serve all requests and we’ll never waste money paying for resources we don’t need.
Cloud Run is economical because we only pay for resources we actually use. Our monthly bill will be proportional to the amount of traffic we serve.
Cloud Run offers many useful and powerful support services, like logging, alerting, and analytics, which make it easy to monitor, manage, and debug your applications.
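One practical detail when containerizing a service for Cloud Run is that the container is expected to listen on the port given in the PORT environment variable (8080 by default). The sketch below shows that convention using only the Python standard library. HealthHandler and its plain "ok" response are hypothetical placeholders, not the actual Quizaic servers.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def get_port(default: int = 8080) -> int:
    """Return the port to listen on, taken from the PORT environment variable."""
    return int(os.environ.get("PORT", default))


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a plain "ok"."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    # Bind to all interfaces so the container accepts traffic from outside.
    HTTPServer(("0.0.0.0", get_port()), HealthHandler).serve_forever()
```

Calling main() starts the server. Because the port comes from the environment, the same container image can run unchanged on a laptop and in production, which is the portability benefit described above.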

Storage

Cloud Firestore works very well for our storage mechanism because:

Our data is hierarchical and fits nicely into Firestore’s data representation model. As you can see from the screenshot above, we can model our data very naturally in Firestore: we have one collection in Firestore corresponding to each of the key data structures in the abstract data model we defined above.
Firestore documents are sets of named fields that map readily onto JSON, which is a good match for the code we’ll be writing to convey requests via RESTful HTTP.
Firestore offers real-time listeners, so clients can be asynchronously notified whenever a given storage object changes. Recall that this is a critical requirement for our player UI, which needs to be able to respond to quiz hosting events (e.g. advancing to the next question) with low latency.
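As a rough illustration of the listener pattern, the sketch below builds a callback with the three-argument shape (snapshots, changes, read time) that the Firestore Python client passes to a function registered through a document reference's on_snapshot method. Quizaic's players presumably listen from the Flutter client instead, so this only shows the idea. The field name current_question and the FakeSnapshot class are assumptions made so the example runs without a Firestore project.

```python
def make_question_listener(on_question):
    """Build a callback with the (snapshots, changes, read_time) shape."""

    def callback(doc_snapshots, changes, read_time):
        for snapshot in doc_snapshots:
            data = snapshot.to_dict() or {}
            # "current_question" is a hypothetical field name.
            if "current_question" in data:
                on_question(data["current_question"])

    return callback


class FakeSnapshot:
    """Stand-in for a Firestore document snapshot, for local testing only."""

    def __init__(self, data):
        self._data = data

    def to_dict(self):
        return self._data


# Simulate the host advancing to question 3.
shown = []
listener = make_question_listener(shown.append)
listener([FakeSnapshot({"current_question": 3})], [], None)
```

With a real Firestore client, the same callback would be handed to the listener registration call, and the library would invoke it whenever the watched document changes.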

Generators

Generators are Python objects (not to be confused with the Python language feature of the same name) which encapsulate quiz generation and image generation functions. Under the hood, quiz generators and image generators use Vertex AI, Google’s platform for building modern AI applications.
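A generator object of this kind might look roughly like the sketch below. The class names, method names, and the shape of a question are all assumptions for illustration; the real Quizaic generators may differ. StubGenerator returns canned data so the example runs offline, whereas a real implementation would call Vertex AI inside these methods.

```python
from abc import ABC, abstractmethod


class QuizGenerator(ABC):
    """Hypothetical interface bundling quiz and image generation."""

    @abstractmethod
    def generate_quiz(self, topic: str, num_questions: int) -> list:
        """Return a list of question dictionaries for the topic."""

    @abstractmethod
    def generate_image(self, topic: str) -> str:
        """Return a reference (such as a URL) to an image for the topic."""


class StubGenerator(QuizGenerator):
    """Offline placeholder; a real generator would call a model here."""

    def generate_quiz(self, topic, num_questions):
        return [
            {
                "question": f"Placeholder question {i + 1} about {topic}",
                "responses": ["A", "B", "C", "D"],
                "correct": "A",
            }
            for i in range(num_questions)
        ]

    def generate_image(self, topic):
        return f"placeholder://{topic}"


quiz = StubGenerator().generate_quiz("space", 2)
```

Putting both functions behind one interface means the API Server can list and swap generators without knowing which model sits behind each one.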

We’re using Vertex AI because:

it provides access to Google’s foundation models, including modern state-of-the-art large language models.
it offers language bindings in many popular programming languages.
it gives us an interactive playground-style experience, which makes it easy to iterate on our prompts (we’ll see this in more detail in the next article in this series).
it has excellent throughput, scale, performance, and reliability, as do all Google Cloud service APIs.

In the next article in this series, we’ll dive deeper into the challenges of generative AI. We’ll explore prompting, how to generate a quiz according to our requirements, and how to get the most out of Vertex AI for quiz and image generation.

Next Article: Part 3 — Prompting and Image Generation