Building an Intelligent, Contextual Help Service for SaaS using Generative AI

Jari Ikävalko
Skillwell
Oct 5, 2023 · 9 min read

In the era of digital transformation, software solutions are growing exponentially in complexity and capabilities. As this complexity increases, so does the challenge of providing intuitive, effective support resources to end-users. Traditional user manuals and FAQs often fall short, leaving users frustrated and increasing the burden on customer support teams. What if there was a smarter way to deliver pinpointed, context-relevant help at the click of a button? Enter Generative AI, a groundbreaking technology that has the potential to revolutionize how we approach contextual help in Software as a Service (SaaS) environments. In this blog post, we will delve into the mechanics of building an intelligent, context-aware help service using Generative AI and AWS technologies.

Let’s begin by painting a picture of what our ideal, intelligent, contextual help service would look like in a SaaS application.

Working Backwards: The Ideal Solution

Imagine a user navigating through a complex SaaS application for enterprise resource planning (ERP). They reach a dashboard filled with a myriad of options for generating various financial reports. Feeling overwhelmed, they click on a help icon. Instantly, a side panel appears on the screen, recognizing the user’s current context – ‘Financial Reporting Dashboard.’

[Image drawn by my daughter]

Rather than offering a generic list of FAQs or directing the user to an extensive manual, the help panel provides immediate, context-sensitive assistance tailored to the specific financial reports available on the dashboard in use. For example, when a user enters a query like ‘How to create a quarterly revenue report?’, the system leverages contextual information and machine learning to present a step-by-step guide relevant to their current dashboard.

Components of the Ideal State

To realize this ideal state, several key components must come into play:

  1. Context Awareness: The ability to understand the user’s current location within the application and the associated tasks.
  2. Intelligent Query Processing: A system adept at interpreting user queries, even if they are not perfectly formulated, by leveraging machine learning and contextual information.
  3. Dynamic Content Generation: The ability to generate real-time, context-specific guidance, as opposed to relying on pre-written, static information.
  4. Seamless User Experience: The help should be integrated within the application’s UI, allowing users to receive guidance without disengaging from their current tasks.
  5. Scalability, Performance, and Cost-Effectiveness: The infrastructure must be adaptable to varying loads, capable of delivering prompt responses, and maintain cost-effectiveness — a set of challenges readily addressed by AWS services.

By envisioning the ideal solution, we’ve established a clear target. Now, let’s work backwards to explore the technologies and methodologies required to bring this vision to life, focusing on the use of Generative AI and AWS services.

Generative AI and AWS Services

Now that we have a firm grasp of our ideal solution, let’s reverse-engineer our way through the technologies and methodologies needed to bring it to fruition. In particular, we’ll delve into the role of Generative AI and AWS services in achieving each of the key components outlined in our ideal state.

Context Awareness

For the service to be truly contextual, it needs to recognize where a user is within the application and what tasks they might be engaged in. This involves a two-pronged approach:

  1. Identifying Context Parameters: The first step is to meticulously define the variables that encapsulate context within the application. This could include the user’s specific location within the software, the actions they have previously undertaken, and any modules or features currently accessed. The goal is to capture a holistic picture of the user’s interaction state.
  2. Context Capture and Transmission: Once these context parameters are defined, the next step is to capture this information in real time. Whenever a user engages with the help service or initiates a query, these context variables should be automatically captured and transmitted along with the query. This can be implemented via event-driven AWS Lambda functions that trigger on user actions, collecting the context data and passing it along to the content retrieval components of the help system, as sketched below.
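
To make this concrete, here is a minimal sketch of a Lambda handler receiving a help query together with its context payload. The field names (view, activeModule, recentActions) are hypothetical; the article does not prescribe a request schema.

```python
import json


def lambda_handler(event, context):
    # Parse the help request; the field names below are illustrative,
    # not an actual schema from the application.
    body = json.loads(event.get("body", "{}"))

    # Hypothetical context parameters captured client-side at query time
    user_context = {
        "view": body.get("view"),                   # e.g. "financial-reporting-dashboard"
        "active_module": body.get("activeModule"),  # module or feature in use
        "recent_actions": body.get("recentActions", []),
    }
    query = body.get("query", "")

    # Hand the query and context off to the retrieval and generation
    # pipeline sketched in the following sections.
    payload = {"query": query, "context": user_context}
    return {"statusCode": 200, "body": json.dumps(payload)}
```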

Intelligent Query Processing

The cornerstone of any effective, context-aware help system is its ability to not only capture user context but also to interpret user queries accurately.

Orchestrated within an AWS Lambda function, the intelligent query processing component serves this critical role by executing a series of key functions, sketched in code after the list below:

  1. Query Filtering: Serving as the first line of defense, the system employs Amazon Comprehend for its natural language processing capabilities to screen incoming queries. This includes sentiment analysis and key phrase extraction to identify queries that contain harmful elements or exhibit malicious intent. Flagged queries are returned with an informational message that guides the user back to meaningful interactions.
  2. Contextual Embedding: Leveraging machine learning algorithms, the system transforms the user’s query and the captured context parameters into embeddings. Specifically, the Titan Embeddings model housed within the Amazon Bedrock service is invoked to generate these numerical representations.
  3. Content Retrieval: These embeddings are then used to fetch relevant content segments from Amazon OpenSearch.
  4. Context to Model Transmission: Concluding the sequence of actions, the system transmits the retrieved content segments along with the transformed query to an appropriately selected Generative AI model for the generation of the contextually relevant response.

Embeddings are numerical vector representations of text or other data, designed to capture the semantic meaning of, and relationships between, data points. These vectors enable machine learning algorithms to perform tasks like comparison, clustering, and classification more effectively.
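
The sketch below shows how steps 1 and 2 might look inside the Lambda function, using boto3 calls to Amazon Comprehend and the Titan Embeddings model on Bedrock. The rejection thresholds and the way the query is concatenated with context tags are assumptions for illustration; steps 3 and 4 are sketched in the sections that follow.

```python
import json

import boto3

comprehend = boto3.client("comprehend")
bedrock = boto3.client("bedrock-runtime")


def screen_query(query: str) -> bool:
    """Step 1, query filtering: a rough screen with Amazon Comprehend.
    The rejection criteria here are illustrative, not production rules."""
    sentiment = comprehend.detect_sentiment(Text=query, LanguageCode="en")
    phrases = comprehend.detect_key_phrases(Text=query, LanguageCode="en")
    hostile = (
        sentiment["Sentiment"] == "NEGATIVE"
        and sentiment["SentimentScore"]["Negative"] > 0.9
    )
    return not hostile and len(phrases["KeyPhrases"]) > 0


def embed(text: str) -> list:
    """Step 2, contextual embedding via the Titan Embeddings model on Bedrock."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]


def process_help_query(query: str, user_context: dict) -> dict:
    if not screen_query(query):
        return {"answer": "Please rephrase your question about the application."}

    # Fold the context parameters into the text that gets embedded; how
    # query and context are combined is an assumption for illustration.
    context_tags = " ".join(str(v) for v in user_context.values() if v)
    vector = embed(f"{query} {context_tags}")

    # Steps 3 and 4, retrieval and generation, are sketched in the
    # sections that follow.
    return {"query_vector": vector}
```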

Dynamic Content Generation

One of the key aspects of our context-aware help system is the real-time creation of tailored, contextually appropriate responses. For this, we’ve employed Anthropic Claude V1, available via Amazon Bedrock. The model is responsible for interpreting document segments retrieved from the Amazon OpenSearch database, along with the user’s query, and generating responses that are both coherent and user-friendly.

Instead of building a custom pre-trained model based on our user manuals or fine-tuning existing models, we’ve opted for a Retrieval Augmented Generation (RAG) approach. This methodology effectively combines the advantages of both retrieval-based and generative systems to offer highly tailored solutions.
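
As an illustration, here is a minimal sketch of the generation step: the retrieved manual segments and the user's query are combined into a prompt for Claude v1 via the Bedrock runtime API. The prompt wording and token limit are assumptions, not the production prompt.

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime")


def generate_answer(query: str, segments: list) -> str:
    """RAG generation step: retrieved manual segments become the context
    Claude answers from. Prompt wording and token limit are illustrative."""
    excerpts = "\n\n".join(segments)
    prompt = (
        "\n\nHuman: You are a help assistant for a SaaS application. "
        "Using only the manual excerpts below, give the user clear, "
        "step-by-step guidance.\n\n"
        f"Manual excerpts:\n{excerpts}\n\n"
        f"Question: {query}"
        "\n\nAssistant:"
    )
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v1",
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 1024}),
    )
    return json.loads(response["body"].read())["completion"]
```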

Benefits of the chosen solution:

  1. Cost-Effectiveness: By utilizing a pre-trained model like Anthropic Claude V1 and employing a Retrieval Augmented Generation (RAG) approach, we significantly reduce the costs associated with training and fine-tuning a model.
  2. Ease of Updates: Storing the user manual segments in an external Amazon OpenSearch database allows for real-time updates to the content without the need to retrain the model. This not only keeps our system up-to-date but also reduces operational overhead.

Seamless User Experience

Delivering a seamless, intuitive user experience stands as a cornerstone in the implementation of advanced features like a context-aware help service. Even the most powerful backend technologies can be rendered ineffective if hindered by a clunky or counterintuitive user interface.

To mitigate such risks, we have engineered the contextual help system as a modular Web Component. This modularization facilitates straightforward, plug-and-play integration across varying parts of the application. In doing so, it ensures that users have ready access to the help system without being disrupted in their workflow.

For performance optimization and global accessibility, the Web Component is distributed through Amazon CloudFront. This ensures low-latency, high-speed access to the feature irrespective of the user’s geographical location, further contributing to a frictionless user experience.

Scalability, Performance, and Cost-Effectiveness

For a context-aware help system, it’s paramount that the architecture is not only robust but also flexible enough to adapt to varying loads. Scalability, performance, and cost-effectiveness are the three pillars that uphold the viability of any enterprise-level solution.

Serverless Architecture: A Strategic Choice
A cornerstone of this solution is its serverless architecture, enabled through AWS services. Utilizing serverless components such as AWS Lambda for orchestration, Amazon Comprehend for natural language understanding, Amazon OpenSearch Serverless for efficient database operations, and Amazon Bedrock for generative AI tasks, we eliminate the need for dedicated server provisioning and maintenance. This yields three key benefits:

  1. Elastic Scalability: The serverless components automatically scale in response to traffic, ensuring that the system can handle varying loads without manual intervention.
  2. Optimized Performance: Amazon OpenSearch Serverless efficiently manages the indexing and search capabilities, further enhancing system performance.
  3. Cost-Effectiveness: The pay-as-you-go pricing model of serverless architecture means you’re only billed for the compute time your functions actually consume, keeping costs proportional to usage and improving ROI.

The integration of these serverless services results in a harmonized, high-performing backend. From the moment the user’s context is captured to the point where context-sensitive help is rendered, all operations are seamlessly orchestrated within a single AWS Lambda function. This cohesive workflow not only optimizes resource utilization but also substantially reduces the administrative and operational burdens.

Preparing User Manuals for Contextual Help

Following the “Working Backwards” approach, one key step remains: preparing the user manuals to be fed into Amazon OpenSearch. This involves several steps to transform traditional user manuals into a machine-readable format, segmented, tagged, and indexed in a way that makes them easily retrievable based on user queries and context.

Let’s break down the process:

Text Transformation and Segmentation

  1. Text Extraction: Initially, user manuals, which are commonly in PDF format, are converted into plain text. For this task, we leverage a PDF document loader from the LangChain library (such as PyPDFLoader) to ensure accurate and efficient text extraction.
  2. Sectioning: The text is then programmatically divided into larger sections, usually based on headings or chapters.
  3. Segmentation: Each section is further divided into smaller, more manageable segments that cover specific topics or steps (see the sketch below).
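
A minimal sketch of these three steps using LangChain’s PyPDFLoader and a recursive text splitter; the file name and chunk sizes are placeholders, and character-based chunking stands in here for true heading-based sectioning.

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Step 1: extract plain text from the PDF manual, page by page
pages = PyPDFLoader("user-manual.pdf").load()

# Steps 2-3: split into manageable, topic-sized segments; splitting on
# actual headings would need manual-specific separators, so generic
# character-based chunking approximates it here.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
segments = splitter.split_documents(pages)

print(f"{len(pages)} pages -> {len(segments)} segments")
```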

Context Tagging and Embedding

  1. Metadata and Tags: The same tags that are used for context in the UI are applied to each segment, facilitating coherence between the UI and the back-end help system.
  2. Embedding Generation: Utilizing the Titan Embeddings model available through Amazon Bedrock, we convert each tagged text segment into its corresponding numerical vector representation, known as embeddings.

Storing in Amazon OpenSearch

  1. Data Structuring: Each segment, its corresponding tags, and the generated embeddings are combined into a single structured document per segment.
  2. Bulk Upload: This prepared data is then bulk-uploaded to Amazon OpenSearch, as shown in the sketch below.
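
Below is a sketch of both steps with the opensearch-py client against an OpenSearch Serverless collection. The endpoint, index name, and field names are placeholders; the knn_vector dimension of 1536 matches the output size of the Titan Embeddings model. The kNN query at the end shows how the retrieval step from the query-processing section reads this index back.

```python
import boto3
from opensearchpy import (
    AWSV4SignerAuth, OpenSearch, RequestsHttpConnection, helpers,
)

region = "eu-west-1"  # placeholder
auth = AWSV4SignerAuth(boto3.Session().get_credentials(), region, "aoss")
client = OpenSearch(
    hosts=[{"host": "example.eu-west-1.aoss.amazonaws.com", "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# Step 1, data structuring: one document per segment, combining the
# segment text, its context tags, and its embedding vector
client.indices.create("help-segments", body={
    "settings": {"index": {"knn": True}},
    "mappings": {"properties": {
        "content":   {"type": "text"},
        "tags":      {"type": "keyword"},
        "embedding": {"type": "knn_vector", "dimension": 1536},
    }},
})

segments = [  # illustrative stand-ins for the prepared manual segments
    {"content": "To create a quarterly revenue report, open the Reports tab...",
     "tags": ["financial-reporting-dashboard"],
     "embedding": [0.0] * 1536},  # real vectors come from Titan (see above)
]

# Step 2: bulk upload
helpers.bulk(client, [
    {"_index": "help-segments", "_source": seg} for seg in segments
])

# At query time, content retrieval is a kNN search against the same index:
hits = client.search(index="help-segments", body={
    "size": 3,
    "query": {"knn": {"embedding": {"vector": [0.0] * 1536, "k": 3}}},
})
```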

By rigorously following these steps, we prepare a dynamic, context-aware dataset. This data serves as the backbone for our intelligent help system, enabling real-time, context-sensitive assistance that enhances user experience and contributes to a more efficient workflow.

Conclusion

In this technical exploration, we have outlined a robust, context-aware help system that leverages cutting-edge technologies and AWS services. By making strategic choices in architecture and technology, such as opting for serverless components including AWS Lambda, Amazon Bedrock, and Amazon OpenSearch Serverless, we have achieved not only functionality and performance but also cost-effectiveness.

By working backwards from an idealized vision, we have built a framework that doesn’t just answer queries but does so in a contextually aware and user-friendly manner. As technology continues to evolve, so too will the capabilities of this system, offering even more precise and timely assistance to users navigating complex digital interfaces.

Key components of the selected solution included:

  1. Data Storage and Indexing: The user manual content is systematically segmented and then stored in an Amazon OpenSearch database. For efficient querying, each segment is accompanied by generated embeddings.
  2. Dynamic Query Processing: Upon receiving a query, the system generates embeddings from the combination of the query text and context. These embeddings are used to perform a rapid search in the OpenSearch database to identify the most relevant content segments.
  3. Contextual Response Generation: We leverage Anthropic Claude V1, a state-of-the-art model available through Amazon Bedrock, to generate human-friendly responses. The model takes the retrieved segments from OpenSearch as context and crafts a detailed, relevant answer.
  4. Seamless Orchestration: All these steps are cohesively integrated and executed within an AWS Lambda function, ensuring a streamlined operation with minimal latency.

Thank you for reading, and stay tuned for more in-depth articles on leveraging AWS services for your SaaS solutions.

Jari Ikävalko · Skillwell

Solutions Architect at Skillwell. AWS Ambassador. Specializing in SaaS and AWS integrations. Author on scalable, secure SaaS.