OpenAI Streaming from AWS Lambda

llm blogs
4 min read · Jun 16, 2023


Building Real-Time AI-Powered WebSocket APIs with AWS API Gateway, Lambda, and OpenAI

Introduction:
Welcome to a comprehensive guide on building real-time AI-powered WebSocket APIs using AWS services. In this tutorial, we will explore how to set up an AWS WebSocket API Gateway that integrates with OpenAI to provide real-time AI responses. We will cover the architecture, AWS SAM template, and even the setup of a CodePipeline for automated deployments. Let’s get started!

Prerequisites
Before diving into the implementation, make sure you have the following in place:
- An AWS account
- Basic knowledge of AWS services, including API Gateway, Lambda, and CloudFormation
- An OpenAI API key (sign up on the OpenAI website if you don’t have one)

Architecture Overview

To understand the components involved, let’s first look at the overall architecture:

Fig 1: Architecture
  • AWS WebSocket API Gateway: The entry point for WebSocket communication, allowing bidirectional, real-time communication between clients and the backend.
  • AWS Lambda Functions: Two Lambda functions will be implemented:
    1. Connect Lambda: Handles the initial connection request and establishes a WebSocket connection.
    2. Streaming OpenAI Lambda: Processes client requests, calls the OpenAI API, and sends the AI-powered response back to the client.
  • OpenAI API: The AI-powered service that provides responses to client queries.

Step 1: SAM Template and CloudFormation Implementation

To make the infrastructure deployment easier and repeatable, we will use the AWS Serverless Application Model (SAM) and CloudFormation. The SAM template provides a simplified way to define and deploy serverless applications.

1. Clone the GitHub repository containing the code and SAM template: [GitHub Repository Link]

2. Review the SAM template (`template.yaml`) to understand the resources and configurations defined.

3. Deploy the SAM template using the AWS CloudFormation service:
— Open the AWS Management Console and navigate to the CloudFormation service.
— Create a new stack and provide the necessary parameters, such as the OpenAI API key and AWS IAM role.
— Wait for the stack to be created. This will provision all the required AWS resources, including the API Gateway, Lambda functions, and permissions.

Note: You will need to create IAM roles and set up the necessary permissions for both the Lambda functions and the API Gateway.
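As a rough illustration, a minimal `template.yaml` for this architecture might look like the following. Treat it as a sketch: the resource names, runtime, and parameter name are assumptions, so adapt them to the repository's actual template.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Parameters:
  OpenAiApiKeyParamName:
    Type: String
    Description: Name of the encrypted SSM parameter holding the OpenAI API key

Resources:
  WebSocketApi:
    Type: AWS::ApiGatewayV2::Api
    Properties:
      Name: openai-streaming-api
      ProtocolType: WEBSOCKET
      RouteSelectionExpression: $request.body.action

  ConnectFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: connect_lambda.handler
      Runtime: python3.9
      CodeUri: src/

  StreamingOpenAiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: streaming_openai.handler
      Runtime: python3.9
      Timeout: 300          # streaming responses can take a while
      CodeUri: src/
      Environment:
        Variables:
          OPENAI_KEY_PARAM: !Ref OpenAiApiKeyParamName
```

On top of this you would still attach the `$connect` and message routes/integrations and the IAM permissions mentioned in the note above.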

Step 2: CodePipeline Setup for Continuous Deployment

To automate the deployment process, we will set up a CodePipeline that listens to changes in the code repository and triggers deployments accordingly.

1. Create an AWS CodeCommit repository or connect your existing repository to CodePipeline.

2. Configure the CodePipeline stages:
— Source: Connect the CodePipeline to your CodeCommit repository.
— Build: Set up a build stage using AWS CodeBuild to package the application code and SAM template.
— Deploy: Configure the CloudFormation deployment stage to deploy the packaged SAM template.

3. Save the CodePipeline configuration and let it run. From now on, any code changes pushed to the repository will automatically trigger a deployment.
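The Build stage typically runs the SAM CLI from a `buildspec.yml`. A minimal sketch, assuming the artifact bucket name is supplied via an `ARTIFACT_BUCKET` environment variable:

```yaml
version: 0.2

phases:
  install:
    runtime-versions:
      python: 3.9
  build:
    commands:
      - sam build
      - sam package --s3-bucket "$ARTIFACT_BUCKET" --output-template-file packaged.yaml

artifacts:
  files:
    - packaged.yaml
```

The `packaged.yaml` artifact is what the Deploy stage hands to CloudFormation.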

Step 3: Implementing the Connect Lambda Function

The Connect Lambda function handles the initial connection request and establishes the WebSocket connection. The `connect-lambda` sample code shows how to interact with the WebSocket API Gateway; make any modifications or customizations needed to suit your requirements.
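A minimal Connect handler only needs to acknowledge the connection: API Gateway completes the WebSocket handshake when the function returns a 200. A sketch (the logging and response body are illustrative, not the repository's exact code):

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    """Handle the WebSocket $connect route.

    API Gateway establishes the connection when we return a 200 status;
    returning any other status code rejects the handshake.
    """
    connection_id = event["requestContext"]["connectionId"]
    logger.info("New WebSocket connection: %s", connection_id)
    return {"statusCode": 200, "body": json.dumps({"connected": True})}
```

This is also the place to add any connection-time authorization checks your application needs.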

Step 4: Integrating OpenAI with the Streaming Lambda Function

In this step, we will explore how the `Streaming OpenAI` Lambda function integrates with the OpenAI API to provide AI-powered responses in real-time.

  • Inside the `InvokeOpenai` class, the `read_ssm_parameter` method securely retrieves the OpenAI API key from AWS Systems Manager Parameter Store (SSM). The key is stored as an encrypted parameter, ensuring its security.
  • The `call_openai` method sets the OpenAI API key by assigning it to `openai.api_key`. This ensures that the API calls made by the `openai` Python library are properly authenticated.
  • To generate AI-powered responses, the `call_openai` method uses the `openai.ChatCompletion.create()` method. This method sends the user’s request to the OpenAI API and receives responses in a streaming manner.
  • Within the `openai.ChatCompletion.create()` method, we provide a list of messages as input. It includes a system message that sets the context, stating that the assistant is helpful, and the user’s message containing the request passed to the Lambda function.
  • By setting `stream=True` and `stop=None`, we enable the streaming functionality of the OpenAI API. This means that instead of waiting for the complete response, we can start receiving responses as they become available.
  • Inside the `for` loop, we extract the generated text from `resp.choices[0]["delta"]["content"]`, and if it is not empty we send it back over the WebSocket connection using the `self.conn.post_to_connection()` method.
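Putting these pieces together, the streaming handler might look roughly like this. It is a sketch against the legacy `openai` v0.x Python SDK described above; the constructor arguments and model name are assumptions, so match them to the repository's code:

```python
def extract_content(chunk):
    """Pull the incremental text out of one streamed chunk, or '' if absent."""
    return chunk["choices"][0].get("delta", {}).get("content", "")

class InvokeOpenai:
    def __init__(self, endpoint_url, connection_id, key_param_name):
        # Imported here so the pure helper above is usable without the AWS SDK.
        import boto3
        # Client for pushing messages back over the WebSocket connection.
        self.conn = boto3.client("apigatewaymanagementapi", endpoint_url=endpoint_url)
        self.connection_id = connection_id
        self.key_param_name = key_param_name

    def read_ssm_parameter(self):
        import boto3
        # The API key is stored as an encrypted (SecureString) SSM parameter.
        ssm = boto3.client("ssm")
        resp = ssm.get_parameter(Name=self.key_param_name, WithDecryption=True)
        return resp["Parameter"]["Value"]

    def call_openai(self, user_message):
        import openai  # legacy 0.x SDK
        openai.api_key = self.read_ssm_parameter()
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": user_message},
            ],
            stream=True,  # receive tokens as they are generated
            stop=None,
        )
        # Forward each non-empty delta to the client as it arrives.
        for resp in response:
            text = extract_content(resp)
            if text:
                self.conn.post_to_connection(
                    ConnectionId=self.connection_id,
                    Data=text.encode("utf-8"),
                )
```

The `endpoint_url` for the management API is the WebSocket API's HTTPS callback URL (derived from the `domainName` and `stage` in the incoming event's request context).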

By following these steps, the `Streaming OpenAI` Lambda function seamlessly integrates with the OpenAI API and provides AI-powered responses to WebSocket clients in real-time.

Step 5: Client-Side Implementation

Now that the backend is set up, let’s create a client-side script that interacts with the WebSocket API Gateway.

1. Review the `client.py` code in the repository to understand how it establishes a WebSocket connection and sends requests.

2. Make any necessary modifications to the script, such as updating the WebSocket API Gateway URL or customizing the request payload.

3. Run the client script locally to test the WebSocket connection and observe the AI-powered responses.
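For reference, a bare-bones client along the lines of `client.py` could use the `websocket-client` package. The URL and payload shape here are placeholders; match them to your deployed API and its route selection expression:

```python
import json

# Placeholder URL: wss://<api-id>.execute-api.<region>.amazonaws.com/<stage>
API_URL = "wss://example.execute-api.us-east-1.amazonaws.com/dev"

def build_request(message, action="sendmessage"):
    """Serialize a request; 'action' must match the API's route selection expression."""
    return json.dumps({"action": action, "message": message})

def main():
    import websocket  # pip install websocket-client
    ws = websocket.create_connection(API_URL)
    ws.send(build_request("Tell me a joke about serverless."))
    try:
        # The streaming Lambda pushes partial responses as separate frames.
        while True:
            frame = ws.recv()
            if not frame:
                break
            print(frame, end="", flush=True)
    finally:
        ws.close()

if __name__ == "__main__":
    main()
```

Because the backend streams deltas as individual frames, the client prints tokens as they arrive rather than waiting for the full completion.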

Congratulations! You have successfully set up a real-time AI-powered WebSocket API using AWS services and OpenAI. Feel free to explore the GitHub repository for additional customization options or advanced features.

Conclusion:
In this tutorial, we explored how to build real-time AI-powered WebSocket APIs using AWS API Gateway, Lambda, and OpenAI. We covered the architecture, SAM template, CloudFormation implementation, and even set up a CodePipeline for continuous deployment. By following the steps outlined, you can create powerful WebSocket APIs that leverage AI capabilities to provide real-time responses to your clients. Happy coding!

References:
- AWS Serverless Application Model (SAM) Documentation
- OpenAI API Documentation
- GitHub Repository
