How to Build Serverless Generative AI App Using Amazon Bedrock, API Gateway, Lambda, S3 and Postman

The easiest way to build and scale generative AI applications with foundation models!!

Naman Gupta
6 min readJan 2, 2024

🤔 Why use Amazon Bedrock to build generative AI applications?

  • Amazon Bedrock is HIPAA eligible and GDPR compliant. In practice, this means your data and content are not used to improve the base models and are not shared with third-party model providers. With Bedrock, your data is always encrypted in transit and at rest.
  • Amazon Bedrock supports foundation models from industry-leading providers such as Cohere, Anthropic, and Meta, so we can choose the model best suited for our use case.

📚What are foundation models?

  • Foundation models are large, pre-trained deep learning neural networks, trained on massive amounts of unstructured and diverse data.
  • The key idea behind foundation models is to pre-train a model on a diverse dataset, exposing it to a wide range of language patterns, concepts, and styles. Once the model is trained, we can fine-tune it for specific tasks or applications, such as language translation, summarization, question answering, and image generation.

🎯Problem Statement

Let’s say you had a meeting that, for some reason, you were unable to attend. After the meeting, someone sent you the meeting transcript or notes, but you just don’t have time to read them. At this point, all you want is to know what the meeting was about, in a clear and concise manner.

Basically this is a text summarization problem.

🌍What you will learn in this article

  1. How to call the Amazon Bedrock API from a Lambda function.
  2. How to integrate the Lambda function with API Gateway.
  3. How to store the output in an S3 bucket.
  4. How to call the API Gateway endpoint using Postman.

✅️Prerequisites

To implement the solution in this article, you need an AWS account, access to Anthropic’s Claude model on Bedrock, and familiarity with Python, AWS, and LLM concepts.

Solution Implementation:

Download the data and code from my GitHub repository.

Solution Architecture

We are going to build the following serverless architecture.

Initial Setup

Request model access to the “Anthropic Claude” foundation model.
(Note: To use Bedrock, you must first request access to its foundation models, and not all models are available in every AWS region. To implement the solution below, select ‘us-east-1’ as your AWS region for the Bedrock service and then request access.)

  • Create the Lambda function.
  • Go to “General configuration” and change the default timeout to 4 minutes, because the model needs some time to generate its response.
  • Go to ‘Permissions’ and attach the ‘AdministratorAccess’ policy to the execution role (from a production standpoint, attaching the administrator access policy is not good practice, but here we are keeping things simple).
  • Go to the API Gateway service and choose “HTTP API” as the API type
    (just give the API a name and then click on review and create; no custom configuration is needed).
  • Create a new ‘POST’ route.
  • Go to the route details and attach the Lambda function created above by clicking the ‘Attach integration’ button.
  • Create a new stage called ‘dev’ and deploy the changes to that stage.
  • By now, your Lambda function should be attached to your API Gateway. To verify this, go back to your Lambda function; you should see API Gateway in the function overview diagram.
    (Note: Copy the API endpoint URL and save it, as we will later invoke this URL via Postman.)
  • Create a new S3 bucket, and replace the S3 bucket name in the “lambda_handler” function with your own bucket name.
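Before wiring everything up, it helps to know what the Lambda function will actually receive: with an HTTP API in front of Lambda, the request body arrives base64-encoded inside the event. Below is a minimal sketch of that event shape; the route name and sample body are made-up values for illustration.

```python
import base64

# Hypothetical sample of the event an HTTP API delivers to Lambda when a
# client POSTs a plain-text transcript. The raw body is base64-encoded.
sample_event = {
    "version": "2.0",
    "routeKey": "POST /summarize",  # assumed route name
    "isBase64Encoded": True,
    "body": base64.b64encode(b"Meeting notes: ship the release on Friday.").decode("utf-8"),
}

# This mirrors the first step of the lambda_handler below: decode the body.
decoded_body = base64.b64decode(sample_event["body"])
print(decoded_body.decode("utf-8"))
```

This is why the handler starts with `base64.b64decode(event['body'])` rather than reading the body directly.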

Now, paste the code below into the Lambda function.

(Note: do not forget to change the S3 bucket name)

# Importing packages

import json
import boto3
import base64
import botocore.config
from datetime import datetime
from email import message_from_bytes

#----------------------------------------------------------------------#


# This function extracts the text from the uploaded file.
def extract_text(data):
    message = message_from_bytes(data)

    text_content = ''

    if message.is_multipart():
        for part in message.walk():
            if part.get_content_type() == "text/plain":
                text_content += part.get_payload(decode=True).decode('utf-8') + "\n"
    else:
        if message.get_content_type() == "text/plain":
            text_content = message.get_payload(decode=True).decode('utf-8')

    return text_content.strip() if text_content else None


# This function invokes the Bedrock model to summarize the uploaded content.
def generate_summary(content: str) -> str:

    prompt_text = f"""Human: Please summarize the meeting transcript for me: {content}
Assistant:"""

    body = {
        "prompt": prompt_text,
        "max_tokens_to_sample": 5000,
        "temperature": 0.3,
        "top_k": 250,
        "top_p": 0.2,
        "stop_sequences": ["\n\nHuman:"]
    }

    # Calling the Bedrock API
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1",
                           config=botocore.config.Config(read_timeout=300, retries={'max_attempts': 3}))

    response = bedrock.invoke_model(body=json.dumps(body), modelId="anthropic.claude-v2")
    response_content = response.get('body').read().decode('utf-8')
    response_data = json.loads(response_content)
    summary = response_data["completion"].strip()

    return summary


# This function saves the model output to the S3 bucket.
def save_summary_to_s3(summary, s3_bucket, s3_key):

    s3 = boto3.client('s3')
    s3.put_object(Bucket=s3_bucket, Key=s3_key, Body=summary)
    print("Summary saved to S3")


# This is our main function; it calls all of the functions above.
def lambda_handler(event, context):

    decoded_body = base64.b64decode(event['body'])

    text_content = extract_text(decoded_body)

    if not text_content:
        return {
            'statusCode': 400,
            'body': json.dumps("Failed to extract content")
        }

    summary = generate_summary(text_content)

    if summary:
        current_time = datetime.now().strftime('%H%M%S')
        s3_key = f'summary-output/{current_time}.txt'

        # NOTE: Replace with your own S3 bucket name
        s3_bucket = 'bedrock-text-summarization-output'

        save_summary_to_s3(summary, s3_bucket, s3_key)

        return {
            'statusCode': 200,
            'body': json.dumps("Summary generation finished and output is saved to S3 bucket.")
        }
    else:
        print("No summary was generated")

        return {
            'statusCode': 400,
            'body': json.dumps("Failed to generate summary.")
        }
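You can sanity-check the extraction logic locally, without deploying anything, by feeding a hand-built MIME message through the same parsing code. A small sketch (the sample message content below is made up):

```python
from email import message_from_bytes
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Same extraction logic as in the Lambda code above.
def extract_text(data):
    message = message_from_bytes(data)
    text_content = ''
    if message.is_multipart():
        for part in message.walk():
            if part.get_content_type() == "text/plain":
                text_content += part.get_payload(decode=True).decode('utf-8') + "\n"
    else:
        if message.get_content_type() == "text/plain":
            text_content = message.get_payload(decode=True).decode('utf-8')
    return text_content.strip() if text_content else None

# Build a made-up multipart message, similar in shape to a form-data upload.
msg = MIMEMultipart()
msg.attach(MIMEText("Action items: review the design doc.", "plain"))
print(extract_text(msg.as_bytes()))
```

If the function prints your sample text, the parsing side of the handler works; only the Bedrock and S3 calls then need a deployed environment.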

At this point, our development work is complete. We have created a Lambda function, attached it to API Gateway, written the code to call the Bedrock model from the Lambda function, and added the functionality to save the model’s output to the S3 bucket. Now all we have to do is call the API Gateway endpoint using Postman.

Invoking our serverless endpoint using Postman

(Note: you can also use Python’s requests module)
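If you prefer scripting over Postman, the same call can be made from Python’s standard library. A minimal sketch, assuming a hypothetical endpoint URL and an inline sample transcript (replace both with your own values):

```python
import urllib.request

# Hypothetical values: substitute your API Gateway endpoint URL and data.
endpoint = "https://abc123.execute-api.us-east-1.amazonaws.com/dev/summarize"
transcript = b"Meeting notes: ship the release on Friday."

# We POST the raw bytes; API Gateway base64-encodes the body before
# handing it to the Lambda function.
req = urllib.request.Request(
    endpoint,
    data=transcript,
    headers={"Content-Type": "text/plain"},
    method="POST",
)

print(req.get_method(), req.full_url)

# To actually send the request (requires the deployed endpoint):
# with urllib.request.urlopen(req, timeout=300) as resp:
#     print(resp.read().decode("utf-8"))
```

The timeout is worth setting generously, since the Lambda function itself waits up to 4 minutes for the model to respond.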

That’s it: we have generated the text summary and stored it in the S3 bucket. To analyse the generated output, just go to your S3 bucket and download the generated file.

🎄Conclusion

In this article, we saw how to integrate different AWS services to solve a particular problem, and how to perform text summarization using foundation models. We used Anthropic’s Claude foundation model, but you can use any foundation model of your choice, such as Amazon Titan or Cohere’s Command model. These models are not perfect yet, so avoid using them for critical use cases like medical report summarization or financial document analysis.

Thanks for reading!!
