A Step-by-Step Guide to Building a Generative AI Application with AWS

Jan 9, 2024

Introduction: Welcome to a walkthrough on building a Generative AI application using Amazon Web Services (AWS). In this post, I will guide you through setting up a Generative AI application that uses AWS services such as Kendra and SageMaker JumpStart. By the end, you’ll have a working application that generates responses based on custom data sources.

Step 1: CloudFormation for Kendra Index

The first step uses AWS CloudFormation to create a Kendra index. Kendra crawls the data sources you specify and indexes their content so that it can be searched later. The provided CloudFormation template automates this setup, and its data source section can be customized to point at your own content.

CloudFormation template: https://github.com/sampathbasa/Gen-AI/blob/main/Build_Gen_AI_App/kendra-docs-index-cf.yaml
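
If you prefer to script this step rather than use the console, the stack creation can be sketched with boto3 roughly as follows. This is a minimal sketch and not part of the linked repository: the `CAPABILITY_NAMED_IAM` capability, the stack name, the local template path, and the output key you look up are all assumptions, so check the template's Outputs section for the real key.

```python
"""Sketch: create the Kendra CloudFormation stack and read an output value."""


def build_stack_request(stack_name, template_body):
    """Build keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # Assumption: the template creates IAM roles, so it needs this capability.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


def find_output(outputs, key):
    """Return the OutputValue for `key` from a describe_stacks Outputs list."""
    for item in outputs:
        if item.get("OutputKey") == key:
            return item.get("OutputValue")
    return None


def deploy_stack(stack_name, template_path, region):
    """Create the stack, wait until it completes, and return its outputs."""
    import boto3  # imported here so the helpers above work without boto3

    with open(template_path, encoding="utf-8") as handle:
        request = build_stack_request(stack_name, handle.read())

    cloudformation = boto3.client("cloudformation", region_name=region)
    cloudformation.create_stack(**request)
    # Blocks until the stack reaches CREATE_COMPLETE (or raises on failure).
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)

    stacks = cloudformation.describe_stacks(StackName=stack_name)["Stacks"]
    return stacks[0].get("Outputs", [])
```

Calling `deploy_stack(...)` with a stack name of your choice and a local copy of the template returns the stack outputs, and `find_output(outputs, key)` then picks out the Kendra index ID once you know which output key the template uses.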

Step 2: Deploying a SageMaker JumpStart Model Endpoint

With the Kendra index in place, the next step is to deploy a large language model endpoint using SageMaker JumpStart. This involves creating a SageMaker domain and setting up a VPC. The steps below cover the quick setup, waiting for the domain to be ready, and launching the JumpStart assets to deploy the chosen model endpoint, in this case the Text2text Generation model Flan-T5 XL from Hugging Face.

1. Access SageMaker Console:

Open the AWS Management Console.
Search for “SageMaker” and open it in a new tab.

2. Create a SageMaker Domain:

If you haven’t used SageMaker before, it will direct you to the Domains screen.
Scroll down and select “Create Domain.”
Choose the Quick Setup and select “Set Up”.

3. Wait for Domain Setup:

Wait for a few minutes until the domain setup is complete.

4. Open SageMaker Studio:

After the domain is ready, select “SageMaker Studio.”
If prompted to create a domain, refresh your browser and then select “Open Studio.”

5. Access SageMaker JumpStart:

In SageMaker Studio, select “SageMaker JumpStart” from the left menu.

6. Launch JumpStart Assets:

Under SageMaker JumpStart, select “Launched JumpStart Assets.”

7. Navigate to Model Endpoints:

Within JumpStart Assets, select “Model Endpoints.”

8. Create Model Endpoints:

Select “Create Model Endpoints” to initiate the process of deploying a model.

9. Choose a Pre-trained Model:

Browse through the available pre-trained models.
In this example, the Text2text Generation model “Flan-T5 XL” from Hugging Face is selected.

10. Deploy the Model:

Select “Deploy” to begin the deployment process.
It may take a few minutes to create an endpoint that the application will interact with.

11. Wait for Endpoint Creation:

Allow some time for the SageMaker endpoint to be created.

12. Retrieve Endpoint Name:

Once the endpoint is ready, note down the endpoint name displayed.

13. Application Integration:

The endpoint name obtained will be used in the Python application to interact with the deployed model.

14. Finalization:

The process is complete. The SageMaker JumpStart model endpoint is now ready for interaction with the Generative AI application.
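
To make the role of the endpoint name concrete, here is a hedged sketch of how a Python client might call the deployed endpoint. The request field `text_inputs` and the response field `generated_texts` are what JumpStart's Flan-T5 text2text containers commonly use, but treat them as assumptions and confirm them against the model's example notebook. The generation parameter values are illustrative only.

```python
"""Sketch: call a SageMaker JumpStart Flan-T5 endpoint from Python."""
import json


def build_payload(prompt, max_length=200, temperature=0.1):
    """Serialize a prompt into the JSON body sent to the endpoint."""
    body = {
        # Assumption: field names follow the JumpStart Flan-T5 text2text format.
        "text_inputs": prompt,
        "max_length": max_length,
        "temperature": temperature,
    }
    return json.dumps(body).encode("utf-8")


def parse_response(raw_body):
    """Extract the first generated text from the endpoint's JSON response."""
    data = json.loads(raw_body)
    texts = data.get("generated_texts", [])
    return texts[0] if texts else ""


def generate(endpoint_name, prompt, region):
    """Invoke the endpoint and return the generated text."""
    import boto3  # imported lazily so the helpers above work without boto3

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    # The response Body is a stream; read() returns the raw JSON bytes.
    return parse_response(response["Body"].read())
```

Keeping the payload and parsing logic in separate functions makes it easy to adjust the field names in one place if your deployed model expects a different format.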

Step 3: Creating a Python Application on Cloud9

Now it’s time to build a simple Python application that interacts with Kendra and the model endpoint. AWS Cloud9, a cloud-based integrated development environment, is used to configure and run the application. The steps below cover creating a Cloud9 environment, installing the required dependencies, and setting the environment variables for the AWS region, Kendra Index ID, and SageMaker model endpoint.

1. Open Cloud9:

In the AWS Console, search for “Cloud9” in the services search bar.
Click on “Cloud9” to open the Cloud9 IDE.

2. Create a New Environment:

Click on “Create Environment.”
Provide a name for your environment (e.g., myC9env).
Choose the default settings for EC2 instance and other configurations.
Click “Next” and review your settings.
Click “Create Environment.”

3. Wait for Environment Creation:

After creating the environment, wait for a few minutes for AWS to set up the Cloud9 environment.

4. Access Cloud9 IDE:

Once the environment is ready, select it using the radio button.
Click “Open Cloud9” to access the Cloud9 IDE.

5. Configure Python Version:

In the Cloud9 terminal, check the default Python version by typing: python --version.
If the version is lower than 3.8, install Python 3.8 (this command assumes an Amazon Linux 2 instance) by typing:

sudo amazon-linux-extras install python3.8 -y

6. Create a Virtual Environment:

Create a virtual environment to manage dependencies. Type:

/usr/bin/python3.8 -m venv myenv

7. Activate the Virtual Environment:

source myenv/bin/activate

8. Clone the Sample Repository:

Clone the GitHub repository containing the sample code:

git clone https://github.com/sampathbasa/Gen-AI.git

9. Install Application Dependencies:

Navigate to the Build_Gen_AI_App directory inside the cloned repository.
Install the required dependencies specified in requirements.txt:

pip install -r requirements.txt

10. Set Up Environment Variables:

Export necessary environment variables for the application to communicate with Kendra and the SageMaker model endpoint.
For example, set the AWS region:

export AWS_REGION=us-east-1

Obtain the Kendra Index ID from CloudFormation stack outputs and set it:

export KENDRA_INDEX_ID=[your_kendra_index_id]

Obtain the SageMaker model endpoint name and set it:

export SAGEMAKER_ENDPOINT_NAME=[your_sagemaker_endpoint_name]

11. Run the Python Application:

Ensure you are in the correct directory containing the Python application.
Run the application by typing:

python kendra_chat_flan_xl.py
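
The three variables exported in step 10 are the application's only configuration. How `kendra_chat_flan_xl.py` actually reads them is not shown here, so the following is just a sketch of one way to load and validate them up front. The `AppConfig` class and `load_config` function are hypothetical names introduced for illustration.

```python
"""Sketch: load the exported settings and fail fast if one is missing."""
import os
from dataclasses import dataclass

REQUIRED_VARS = ("AWS_REGION", "KENDRA_INDEX_ID", "SAGEMAKER_ENDPOINT_NAME")


@dataclass(frozen=True)
class AppConfig:
    region: str
    kendra_index_id: str
    endpoint_name: str


def load_config(env=None):
    """Build an AppConfig from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return AppConfig(
        region=env["AWS_REGION"],
        kendra_index_id=env["KENDRA_INDEX_ID"],
        endpoint_name=env["SAGEMAKER_ENDPOINT_NAME"],
    )
```

Failing at startup with the names of the missing variables is usually easier to debug than a later error from inside an AWS call.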

Step 4: Testing the Generative AI Application

Test the Python application by asking it questions. For each question, the application queries the Kendra index for relevant data, sends that data to the model endpoint together with the prompt, and returns the generated response.
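
The retrieve-then-generate flow described above can be sketched as follows. This is an illustration rather than the code in the repository: it assumes Kendra's Retrieve API, whose result items carry a `Content` field, and it uses a hypothetical prompt template.

```python
"""Sketch: turn Kendra results into a prompt for the model endpoint."""


def extract_passages(kendra_response, limit=3):
    """Pull up to `limit` non-empty passage texts out of a Retrieve response."""
    items = kendra_response.get("ResultItems", [])
    passages = [item.get("Content", "").strip() for item in items]
    return [text for text in passages if text][:limit]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n\n".join(passages)
    # Hypothetical template; the sample application may word this differently.
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say \"I don't know\".\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def retrieve_passages(index_id, question, region):
    """Query the Kendra index and return the passage texts."""
    import boto3  # imported lazily so the helpers above work without boto3

    kendra = boto3.client("kendra", region_name=region)
    response = kendra.retrieve(IndexId=index_id, QueryText=question)
    return extract_passages(response)
```

The prompt returned by `build_prompt` is what would be sent to the model endpoint. Limiting the number of passages keeps the prompt within the model's input length.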

1. Interact with the Generative AI:

The application will prompt you to provide a request or question.
Enter a question like “What is SageMaker?” and press Enter.
Observe the generated response based on the data sources indexed by Kendra.

2. Test Various Queries:

Ask different questions related to the provided data sources (e.g., “Tell me about Lex”).
Observe how the application leverages Kendra to provide contextually relevant responses.

3. Check Unknown Queries:

Test the application with questions that are not covered by the data sources (e.g., “What is Azure Resource Manager?”).
Confirm that the application gracefully handles unknown queries with an appropriate response.

4. Conclude Testing:

Test the application with various scenarios to ensure its robustness.
When satisfied with the testing, exit the application.
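
One way to get graceful handling of out-of-scope questions is to skip the model call when Kendra returns nothing relevant. Whether the sample application does this is not stated here, so the sketch below is a hypothetical guard, with `generate_fn` standing in for whatever function calls the model endpoint.

```python
"""Sketch: return a fixed fallback when no supporting passages are found."""

FALLBACK_ANSWER = "I don't know."


def answer_question(question, passages, generate_fn):
    """Answer from retrieved passages, or fall back when there are none."""
    if not passages:
        # Nothing was retrieved, so avoid asking the model to guess.
        return FALLBACK_ANSWER
    context = "\n".join(passages)
    answer = generate_fn(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    # An empty or whitespace-only generation is treated as unknown too.
    return answer.strip() or FALLBACK_ANSWER
```

Because `generate_fn` is passed in, the guard can be exercised with a stub function before any endpoint exists.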

Step 5: Cleanup and Resource Deletion

To avoid ongoing charges and reduce your security exposure, delete the resources once you are done with them. The steps below cover deleting the SageMaker endpoint, SageMaker domain, CloudFormation stack, and Cloud9 environment.

1. Delete SageMaker Endpoint

Navigate to the SageMaker console.
Find and select the model endpoint created.
Click on “Delete Endpoint” to remove the SageMaker endpoint.
Confirm the deletion when prompted.

2. Delete SageMaker Domain

In the SageMaker console, go to the “Domains” section.
Select the SageMaker domain.
Choose the default user, click on “Action,” and then select “Delete.”
Confirm the deletion when prompted.

3. Delete CloudFormation Stack

Go to the AWS CloudFormation console.
Find and select the CloudFormation stack.
Click on “Delete” to delete the CloudFormation stack.
Confirm the deletion when prompted.

4. Delete Cloud9 Environment

Open the AWS Cloud9 console.
Find and select the Cloud9 environment created.
Click on “Delete” to delete the Cloud9 environment.
Confirm the deletion when prompted.
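
The console cleanup above can also be scripted. The sketch below covers the endpoint, the stack, and the Cloud9 environment. The SageMaker domain is left to the console because its user profiles have to be removed before the domain itself can be deleted. All resource names and IDs are placeholders you would supply.

```python
"""Sketch: delete the endpoint, stack, and Cloud9 environment with boto3."""


def build_cleanup_plan(endpoint_name, stack_name, cloud9_environment_id):
    """List the delete calls as (service, operation, kwargs) tuples."""
    return [
        ("sagemaker", "delete_endpoint", {"EndpointName": endpoint_name}),
        ("cloudformation", "delete_stack", {"StackName": stack_name}),
        ("cloud9", "delete_environment", {"environmentId": cloud9_environment_id}),
    ]


def run_cleanup(plan, region):
    """Execute each planned delete call in order."""
    import boto3  # imported lazily so the planner works without boto3

    for service, operation, kwargs in plan:
        client = boto3.client(service, region_name=region)
        getattr(client, operation)(**kwargs)
        print(f"Requested {service}.{operation}")
```

Building the plan separately from running it lets you print and review the calls first. Run the script from outside the Cloud9 environment that it deletes.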

Conclusion:

Congratulations! You’ve successfully built a Generative AI application using AWS services. This walkthrough highlights the seamless integration of CloudFormation, Kendra, and SageMaker to create a powerful application capable of generating contextually relevant responses. As you explore the possibilities of Generative AI, always remember to manage your resources wisely to optimize costs and maintain a secure environment. Happy coding!