Generative-AI-based Application Architecture — 1

Published in AI monks.io · Sep 4, 2023 · 6 min read

Human-AI interaction is going through a paradigm shift of a kind that software development has not seen in the past two decades. As applications move from traditional rule-based systems to sophisticated generative AI (GAI) models, the way they decipher user intentions and deliver accurate, context-rich responses has become pivotal. In this series of articles, I will dive deep into an architecture that leverages GAI to reimagine the way applications interact with their users.

Generative AI (GAI) has a unique ability to craft content or responses that aren’t explicitly pre-programmed. This facilitates a chat-based interaction that’s intuitive and dynamic. It breaks away from the rigidity of traditional chatbots or application interfaces, offering a more organic and human-like conversation experience.

In the forthcoming articles, I will dissect the architecture that makes this possible, from understanding the user’s intent to delivering an apt AI-crafted response after performing the user-intended action.

Let’s start with the basics. In this first article, I discuss how GAI-driven applications have the potential to transform human-computer interaction. As examples, I describe two applications, a job board and a food delivery application, and show how each can benefit from GAI-based human-computer interaction.

In the upcoming articles, we will dive deep into the architecture for GAI-based applications powered by open-source and OpenAI Large Language Models (LLMs).

Let’s start:

What is Generative AI?

Generative AI is a subset of artificial intelligence that focuses on creating new content. Whether it’s an artwork, music, or human-like text, GAI models are designed to produce outputs that weren’t explicitly present in their training data. The potential of GAI is vast. It can be seen as a canvas where the boundaries are as expansive as our imagination.

Why Is It a Game Changer?

Unlike traditional models that make decisions or predictions based on inputs (like classifying an email as spam or not), generative models generate new content. This distinction has profound implications. From crafting realistic video game environments on the fly to assisting artists and musicians, and more pertinently, transforming user-application interactions, the applications are manifold.

The beauty of GAI lies in its unpredictability within controlled parameters. It does not just select a response; it crafts one, akin to how humans construct sentences in real-time during conversations.

GAI-based Human-Computer Interaction

Human-computer interactions are undergoing a significant transformation. As we transition from conventional rule-based systems to Generative AI (GAI) frameworks, the emphasis shifts to how well a system can unravel user intentions. These next-generation GAI systems aim not only to interpret user prompts in depth but also to carry out intent-aligned actions within the system. This shift promises to redefine the user experience by narrowing the gap between human and machine interaction.

GAI-driven applications will transform user interactions by:

Crafting Dynamic Responses: Moving beyond the confines of rule-based systems and their preset answers, GAI applications curate tailored responses in real-time based on the user’s context.
Supporting Continuous Learning: When user engagements are fed back into the system, for example through fine-tuning or an updated retrieval index, these applications can adapt over time, refining their knowledge base and enhancing response accuracy.
Providing Contextual Depth: GAI processes multifaceted user contexts, paving the way for deeper, more insightful dialogues.
Executing User-Intended Actions: Not just limited to dialogue, GAI discerns users’ intentions from conversations and efficiently undertakes the corresponding actions, bridging the gap between interaction and execution.
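
To make the last point concrete, here is a minimal sketch of how a classified intent could be mapped to an action. Everything in it is an assumption for illustration: the `Intent` class, the handler functions, and the intent names are hypothetical and are not part of the architecture described in this series.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Intent:
    """A classified user intent plus the details (slots) extracted from the chat."""
    name: str
    slots: Dict[str, str] = field(default_factory=dict)


# Hypothetical action handlers; a real application would call its own services here.
def search_jobs(slots: Dict[str, str]) -> str:
    return f"Searching {slots.get('role', 'any')} jobs in {slots.get('city', 'any city')}"


def place_order(slots: Dict[str, str]) -> str:
    return f"Placing an order for {slots.get('item', 'unknown item')}"


# Registry mapping intent names to the action that fulfils them.
ACTIONS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "search_jobs": search_jobs,
    "place_order": place_order,
}


def execute(intent: Intent) -> str:
    """Run the action registered for the intent, or fall back to plain dialogue."""
    handler = ACTIONS.get(intent.name)
    if handler is None:
        return "No action needed; continue the conversation."
    return handler(intent.slots)


print(execute(Intent("search_jobs", {"role": "software engineering", "city": "Dublin"})))
```

Keeping the registry separate from the conversation logic means a new action can be added without touching how intents are detected.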

Traditional vs. Modern Interaction Paradigms — Real-world examples

In traditional systems, applications relied heavily on fixed menus, forms, and rigid pathways for user interactions. A query would be matched against a pre-defined set of rules to generate a response. In stark contrast, our modern GAI-driven approach offers a fluid conversation and an enriched user experience, much like interacting with a well-informed human assistant. The model doesn’t merely react to the user’s input; it understands, anticipates, and crafts unique responses based on a blend of prior interactions, current context, and vast training data, and then performs the user’s intended actions.

Let’s take our first real-world example.

Job Board Website: A Real-world Illustration

Imagine a job board website, a domain that has typically been dominated by lists, filters, and forms. In the conventional setup, a user seeking employment would search for jobs using fixed criteria, browse through the listings, and fill out application forms — a static, often tedious process.

Enter our GAI-driven architecture.

A new user, Jane, visits our job board. Instead of confronting her with an overwhelming form or list, the application greets her with a friendly “Hello! How can I assist you today?” Jane responds, “Hi, I’m looking for a software engineering job in Dublin.” Immediately, the system discerns her intent, reviews available listings, and replies, “I found a Software Engineer position at Google in Dublin posted last week. Would you like more details or assistance applying?”

As the conversation progresses, Jane expresses her desire to learn more about the company culture. The GAI model, instead of just providing a generic company description, crafts a comprehensive answer based on recent reviews, company news, and more. This nuanced interaction continues as Jane updates her profile, inquires about other positions, and navigates the application process — all through intuitive, human-like dialogue.
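
Jane’s first message can be broken down into two steps: pulling the role and city out of her sentence, then filtering the listings. The sketch below is a deliberately naive stand-in, assuming a regular expression in place of the language model and a made-up `LISTINGS` table; the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Made-up listings for illustration only.
LISTINGS: List[Dict[str, str]] = [
    {"title": "Software Engineer", "company": "Example Corp", "city": "Dublin"},
    {"title": "Data Analyst", "company": "Sample Ltd", "city": "Dublin"},
    {"title": "Software Engineer", "company": "Demo Inc", "city": "Cork"},
]

# Naive pattern for "looking for a <role> job in <city>".
# An LLM or NLU model would replace this in a real system.
REQUEST = re.compile(
    r"looking for an? (?P<role>.+?) job in (?P<city>[A-Za-z ]+)", re.IGNORECASE
)


def extract_slots(message: str) -> Optional[Dict[str, str]]:
    """Pull the role and city out of the message, or return None if absent."""
    match = REQUEST.search(message)
    if match is None:
        return None
    return {"role": match.group("role").strip(), "city": match.group("city").strip()}


def find_jobs(slots: Dict[str, str]) -> List[Dict[str, str]]:
    """Filter listings by city and by the first word of the requested role."""
    keyword = slots["role"].split()[0].lower()
    city = slots["city"].lower()
    return [
        job for job in LISTINGS
        if job["city"].lower() == city and keyword in job["title"].lower()
    ]


slots = extract_slots("Hi, I’m looking for a software engineering job in Dublin.")
if slots is not None:
    for job in find_jobs(slots):
        print(f"{job['title']} at {job['company']} in {job['city']}")
```

The regular expression only handles this one phrasing, which is exactly the rigidity a GAI model is meant to remove.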

Let’s move on to our next example.

Food Delivery Platform: A Real-world Illustration

A new customer, Alex, logs onto our food delivery platform. Rather than bombarding him with an array of restaurant logos and menu items, the application cheerfully initiates, “Hello Alex! It’s almost lunchtime. Feeling like some fast food today?” Alex, a bit surprised by the personalized touch, responds, “Yeah, I’m thinking McDonald’s. What special offers do they have going on?” Without missing a beat, the system interprets his request, scans the current McDonald’s promotions, and responds, “McDonald’s is offering a Big Mac combo at a 20% discount today. Fancy that or need some other suggestions?”

Intrigued, Alex queries, “Sounds tempting! But I’ve got a meeting in an hour. How long would it take for it to get to me?” Instead of a generic reply, the GAI-powered attendant assures, “Considering your location and the current traffic, you’ll have it in about 30 minutes. That should give you plenty of time before your meeting. Shall I place the order for you?”

As Alex confirms, the system inquires, “Would you like any add-ons? Maybe a milkshake or some apple pie to round off the meal?” The conversation unfolds seamlessly, blending the efficiency of tech with the human touch of a diner interaction, making Alex’s ordering experience not just functional, but delightful.

User Interface
The innovative architecture of our GAI-powered application reimagines user interfaces by aligning them with the real-time chat conversation. As dialogues evolve, the left viewing pane dynamically reflects relevant information. For instance, while the user contemplates the Big Mac deal, details and potential add-ons appear in this space. Similarly, when a user expresses interest in a job on our job board, the pane promptly showcases pertinent details about that position.
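
One way to keep the viewing pane in step with the chat is to have each server response carry both the chat reply and a description of what the pane should show. The sketch below assumes a JSON message shape of my own invention; the `ChatTurn` and `ViewUpdate` classes and the `offer_details` view name are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class ViewUpdate:
    """What the viewing pane should render alongside the chat reply."""
    view: str               # hypothetical view name, e.g. "offer_details"
    data: Dict[str, Any]    # payload the pane needs to render that view


@dataclass
class ChatTurn:
    """One server response: the chat text and the matching view-pane update."""
    reply: str
    view_update: ViewUpdate


def to_wire(turn: ChatTurn) -> str:
    """Serialize the turn as JSON so the client can update both panes at once."""
    return json.dumps(asdict(turn), sort_keys=True)


turn = ChatTurn(
    reply="McDonald's is offering a Big Mac combo at a 20% discount today.",
    view_update=ViewUpdate(
        view="offer_details",
        data={
            "item": "Big Mac combo",
            "discount_percent": 20,
            "add_ons": ["milkshake", "apple pie"],
        },
    ),
)
payload = to_wire(turn)
print(payload)
```

Sending both parts in a single message avoids the pane lagging behind or running ahead of the conversation.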

Architecture Overview
Leveraging the modern single-page application (SPA) paradigm, the GAI-based application architecture is rooted in a microservices infrastructure. This design choice promotes modular scalability and fault isolation and optimizes resource utilization, all crucial for high-performance cloud-native applications. The user interface consists of two panes: the chat pane and the user view pane. This dual-pane configuration keeps the two in sync, displaying information in real time in parallel with the ongoing dialogue.
The chat component integrates the open-source LLM Llama 2, LlamaIndex, and ChatGPT.

Within the chat component, there are three integral sub-components:

Intent Classification: Based on Rasa NLU and a custom pre-trained intent classifier tailored to the application domain.
Prompt Selection Service: Determines the most relevant system prompt for the classified intent. The service uses LlamaIndex to query a prompt index for the most applicable prompt. These prompts are designed for the application domain.
Prompt Execution Service: Dynamically generates contextually appropriate responses or actions in line with the selected prompt.
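
The flow through these three sub-components can be sketched in a few lines. This is only a rough outline under stated assumptions: keyword matching stands in for Rasa NLU, a word-overlap lookup stands in for the LlamaIndex query, and `echo_llm` is a stub in place of Llama 2 or ChatGPT. All names and prompt texts are hypothetical.

```python
from typing import Callable, Dict, List, Set

# Stage 1: intent classification (keyword stand-in for a trained classifier).
INTENT_KEYWORDS: Dict[str, Set[str]] = {
    "search_jobs": {"job", "position", "hiring"},
    "order_food": {"order", "offers", "delivery", "hungry"},
}


def classify_intent(message: str) -> str:
    """Pick the intent whose keywords overlap most with the message."""
    cleaned = message.lower()
    for mark in "?.,!":
        cleaned = cleaned.replace(mark, " ")
    words = set(cleaned.split())
    scores = {name: len(words & kws) for name, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=lambda name: scores[name])
    return best if scores[best] > 0 else "small_talk"


# Stage 2: prompt selection (word-overlap stand-in for querying a prompt index).
PROMPT_INDEX: List[Dict[str, str]] = [
    {"about": "search jobs on the job board",
     "template": "You are a job board assistant. Help the user find roles.\nUser: {message}"},
    {"about": "order food from a restaurant",
     "template": "You are a food delivery assistant. Help the user order.\nUser: {message}"},
    {"about": "small talk and general questions",
     "template": "You are a friendly assistant.\nUser: {message}"},
]


def select_prompt(intent: str) -> str:
    """Return the template whose description best matches the intent name."""
    query = set(intent.split("_"))
    best = max(PROMPT_INDEX, key=lambda p: len(query & set(p["about"].split())))
    return best["template"]


# Stage 3: prompt execution (stub model so the sketch runs offline).
def echo_llm(prompt: str) -> str:
    return f"[model reply to: {prompt.splitlines()[-1]}]"


def handle(message: str, llm: Callable[[str], str] = echo_llm) -> Dict[str, str]:
    """Run one message through classification, prompt selection, and execution."""
    intent = classify_intent(message)
    prompt = select_prompt(intent).format(message=message)
    return {"intent": intent, "prompt": prompt, "reply": llm(prompt)}


result = handle("I'm looking for a software engineering job in Dublin")
print(result["intent"])
print(result["reply"])
```

Because the model is passed in as a plain function, the same flow can be tried against different LLMs without changing the first two stages.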

To provide a holistic view of the GAI-based application’s architectural foundation, I shall be detailing a reference blueprint in subsequent sections. This blueprint elucidates the structure and interaction patterns between various functional components and service modules. Further, it offers insights into data flow, component dependencies, and the technical intricacies that underpin the application’s robustness and efficiency.