Building serverless AI applications with Amazon Bedrock

Published in AlamedaDev, Apr 26, 2024

With serverless computing, you don’t have to worry about the heavy lifting of running servers or managing resources, so you can focus on your app’s features. Amazon Bedrock and the other AWS services used here will help you apply powerful AI models to tasks like understanding what’s said in an audio file or responding to user queries, all without needing to be an expert in AI.

In this guide, we’ll go from the basics, such as setting up your project, to more advanced tasks, such as turning speech into text and summarizing conversations. By the end, you’ll know how to build apps that interact with users using Amazon’s cloud services.

What is Amazon Bedrock?

Amazon Bedrock lets you easily use large language models without worrying about the underlying infrastructure. In this tutorial, we will use Amazon Bedrock to interact with these models. First, we’ll set up a simple environment by creating a new directory for our project.

mkdir bedrock
cd bedrock/

Basic response for generating a paragraph

Here, you’ll learn how to generate a paragraph from a model by writing a Python script that uses Amazon Bedrock. We’ll start with a simple task: asking the model to summarize a topic in one sentence. This example will help you understand how to interact with the model and process its responses.

import boto3
import json


bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')
prompt = "Write a one sentence summary of Las Vegas."

kwargs = {
    "modelId": "amazon.titan-text-express-v1",
    "contentType": "application/json",
    "accept": "*/*",
    "body" : json.dumps(
        {
            "inputText": prompt,
            "textGenerationConfig": {
                "maxTokenCount": 100,
                "temperature": 0.7,
                "topP": 0.9
            }
        }
    )
}
response = bedrock_runtime.invoke_model(**kwargs)
response_body = json.loads(response.get('body').read())

generation = response_body['results'][0]['outputText']
print(generation)

#> Las Vegas is a famous city known for its gambling, entertainment, and nightlife. It is located in Nevada and is the largest city within the Mojave Desert. Las Vegas is home to several iconic landmarks, including the Las Vegas Strip, the Bellagio, and the MGM Grand. The city is a popular destination for tourists from around the world, who come to enjoy the luxurious casinos, hotels, restaurants, and shows. Las Vegas is also known for its world-class entertainment, including live
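The sample output above stops mid-sentence because generation reached the `maxTokenCount` limit of 100. Here is a minimal sketch of how that could be detected. It assumes the Titan Text response carries a `completionReason` field next to `outputText`, with `"LENGTH"` signalling a token-limit stop (check the model documentation for the exact values). `parse_titan_response` is a hypothetical helper name, not part of boto3.

```python
import json


def parse_titan_response(response_body: dict) -> tuple[str, bool]:
    """Return (text, truncated) from a decoded Titan Text response body.

    Assumes the shape {"results": [{"outputText": ..., "completionReason": ...}]}.
    """
    result = response_body["results"][0]
    text = result["outputText"].strip()
    # Assumption: "LENGTH" means the model stopped at maxTokenCount.
    truncated = result.get("completionReason") == "LENGTH"
    return text, truncated


# Offline example with a hand-written body instead of a real API call.
sample_body = json.loads(
    '{"results": [{"outputText": "\\nLas Vegas is a city in Nevada.",'
    ' "completionReason": "LENGTH"}]}'
)
text, truncated = parse_titan_response(sample_body)
print(text)
print("truncated:", truncated)
```

If `truncated` is true, raising `maxTokenCount` or tightening the prompt are the usual options.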

Summarizing the transcript file of a call

Now, we’ll take it a step further by summarizing the transcript of a conversation. This involves reading a text file, sending it to the model, and asking it to condense the conversation into key points. This part of the tutorial will show you how to process larger pieces of text and use models to identify and summarize important information.

with open('transcript.txt', "r") as file:
    dialogue_text = file.read()

prompt = f"""The text between the <transcript> XML tags is a transcript of a conversation. 
Write a short summary of the conversation.

<transcript>
{dialogue_text}
</transcript>

Here is a summary of the conversation in the transcript:"""


kwargs = {
    "modelId": "amazon.titan-text-lite-v1",
    "contentType": "application/json",
    "accept": "*/*",
    "body": json.dumps(
        {
            "inputText": prompt,
            "textGenerationConfig": {
                "maxTokenCount": 512,
                "temperature": 0,
                "topP": 0.9
            }
        }
    )
}

response = bedrock_runtime.invoke_model(**kwargs)
response_body = json.loads(response.get('body').read())
generation = response_body['results'][0]['outputText']
print(generation)

#> Alex is looking to book a room for his 10th wedding anniversary at the Crystal Heights Hotel in Singapore. The hotel offers several room types that offer stunning views of the city skyline and the fictional Sapphire Bay. The special diamond suite even comes with exclusive access to the moonlit pool and star deck. The package includes breakfast, complimentary access to the moonlit pool and star deck, a one-time spa treatment for two, and a special romantic dinner at the cloud nine restaurant. A preauthorization amount of $1000 will be held on the card, which will be released upon checkout. There is a 10% service charge and a 7% fantasy tax applied to the room rate.
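Both examples so far build the same request, so the call can be wrapped in a small function. The sketch below is one possible way to do that. `invoke_titan`, its default values, and `FakeBedrockClient` are hypothetical names introduced for illustration. The function accepts any object with an `invoke_model` method, so it can be exercised without AWS credentials.

```python
import io
import json


def invoke_titan(client, prompt, model_id="amazon.titan-text-lite-v1",
                 max_tokens=512, temperature=0.0, top_p=0.9):
    """Send a prompt to a Titan Text model and return the generated text."""
    body = json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": temperature,
            "topP": top_p,
        },
    })
    response = client.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=body,
    )
    response_body = json.loads(response["body"].read())
    return response_body["results"][0]["outputText"]


class FakeBedrockClient:
    """Stand-in for the bedrock-runtime client, for offline testing only."""

    def __init__(self):
        self.last_request = None

    def invoke_model(self, **kwargs):
        self.last_request = kwargs
        payload = {"results": [{"outputText": "A short summary."}]}
        return {"body": io.BytesIO(json.dumps(payload).encode("utf-8"))}


fake_client = FakeBedrockClient()
summary = invoke_titan(fake_client, "Summarize: hello")
print(summary)
```

With a real client, the call would be `invoke_titan(bedrock_runtime, prompt)`.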

Summarizing an audio file from a call

Next, we’ll work with audio files. You’ll learn how to upload an audio recording to Amazon S3, transcribe it into text using Amazon Transcribe, and then summarize the conversation using the model. This section covers the whole workflow, from handling the audio file through transcription to summarization, which is a common pattern when processing audio data.

For the next sample, we will use the following process:
1. Import packages and load the audio file.
2. Set up:
   a. S3 client.
   b. Transcribe client.
3. Upload the audio file to S3.
4. Create the unique job name.
5. Build the transcription response.
6. Access the needed parts of the transcript.
7. Set up the Bedrock runtime.
8. Create the prompt template.
9. Configure the model response.
10. Generate a summary of the audio transcript.

import os
import boto3
import uuid
import time
from IPython.display import Audio

bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')
audio = Audio(filename="dialog.mp3")
bucket_name = os.environ['AI_LEARN_BEDROCK_BUCKETNAME']

# Setup s3 and upload audio file
s3_client = boto3.client('s3', region_name='us-east-1')
file_name = 'dialog.mp3'
s3_client.upload_file(file_name, bucket_name, file_name)

# Setup transcribe and generate job id
transcribe_client = boto3.client('transcribe', region_name='us-east-1')
job_name = 'transcription-job-' + str(uuid.uuid4())
response = transcribe_client.start_transcription_job(
    TranscriptionJobName=job_name,
    Media={'MediaFileUri': f's3://{bucket_name}/{file_name}'},
    MediaFormat='mp3',
    LanguageCode='en-US',
    OutputBucketName=bucket_name,
    Settings={
        'ShowSpeakerLabels': True,
        'MaxSpeakerLabels': 2
    }
)
while True:
    status = transcribe_client.get_transcription_job(TranscriptionJobName=job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    time.sleep(2)
print(status['TranscriptionJob']['TranscriptionJobStatus'])

The code above uploads the audio file to S3 and builds a unique name for the transcription job. Transcription is an asynchronous process, so we need a job name to check its state and know when it is ready. We then poll the transcription job until it has either completed or failed. The poll returns only the status, not the actual transcription.
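The polling loop shown earlier runs forever if the job never leaves the in-progress state. A hedged sketch of a variant with a timeout and an explicit failure path follows. `wait_for_transcription`, its parameters, and `FakeTranscribeClient` are hypothetical names. The fake client only mimics the `TranscriptionJobStatus` field (and the optional `FailureReason` field) that the real `get_transcription_job` response contains.

```python
import time


def wait_for_transcription(client, job_name, poll_seconds=2.0,
                           timeout_seconds=600.0, sleep=time.sleep):
    """Poll a transcription job until it finishes or the timeout expires.

    Returns the final TranscriptionJob dict; raises on failure or timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = client.get_transcription_job(
            TranscriptionJobName=job_name
        )["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_name} still {status} after timeout")
        sleep(poll_seconds)


class FakeTranscribeClient:
    """Offline stand-in that reports IN_PROGRESS twice, then COMPLETED."""

    def __init__(self):
        self.statuses = ["IN_PROGRESS", "IN_PROGRESS", "COMPLETED"]
        self.calls = 0

    def get_transcription_job(self, TranscriptionJobName):
        status = self.statuses[min(self.calls, len(self.statuses) - 1)]
        self.calls += 1
        return {"TranscriptionJob": {
            "TranscriptionJobName": TranscriptionJobName,
            "TranscriptionJobStatus": status,
        }}


fake = FakeTranscribeClient()
job = wait_for_transcription(fake, "demo-job", sleep=lambda seconds: None)
print(job["TranscriptionJobStatus"], "after", fake.calls, "polls")
```

Passing `sleep` as a parameter is a design choice that keeps the helper testable without real waiting. With the real client, the default `time.sleep` applies.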

Access the needed parts of the transcript

In this next step, you will access the transcription result, extract the necessary parts of the conversation, and format it for summarization. This teaches you how to navigate and use the output from Amazon Transcribe, preparing it for further processing with Amazon Bedrock.

import json

if status['TranscriptionJob']['TranscriptionJobStatus'] == 'COMPLETED':
    # Load the transcript from S3.
    transcript_key = f"{job_name}.json"
    transcript_obj = s3_client.get_object(Bucket=bucket_name, Key=transcript_key)
    transcript_text = transcript_obj['Body'].read().decode('utf-8')
    transcript_json = json.loads(transcript_text)

    output_text = ""
    current_speaker = None

    items = transcript_json['results']['items']

    for item in items:

        speaker_label = item.get('speaker_label', None)
        content = item['alternatives'][0]['content']

        # Start the line with the speaker label:
        if speaker_label is not None and speaker_label != current_speaker:
            current_speaker = speaker_label
            output_text += f"\n{current_speaker}: "

        # Add the speech content:
        if item['type'] == 'punctuation':
            output_text = output_text.rstrip()

        output_text += f"{content} "

    # Save the transcript to a text file
    with open(f'{job_name}.txt', 'w') as f:
        f.write(output_text)
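To try the formatting logic without running a transcription job, the same idea can be written as a standalone function and fed a hand-made input. This is a sketch, not the code above verbatim: it collects tokens in lists rather than building one string, so the result has no leading newline or trailing spaces. `format_transcript` and `_item` are hypothetical names, and the sample items only imitate the fields the loop above reads (`type`, `speaker_label`, `alternatives[0].content`).

```python
def format_transcript(transcript_json: dict) -> str:
    """Turn Amazon Transcribe items into 'speaker: text' lines."""
    lines = []  # one list of tokens per speaker turn
    current_speaker = None
    for item in transcript_json["results"]["items"]:
        content = item["alternatives"][0]["content"]
        speaker = item.get("speaker_label")
        # Start a new line whenever the speaker changes.
        if speaker is not None and speaker != current_speaker:
            current_speaker = speaker
            lines.append([f"{speaker}:"])
        if not lines:
            lines.append([])  # content seen before any speaker label
        if item["type"] == "punctuation" and lines[-1]:
            lines[-1][-1] += content  # attach punctuation to the previous token
        else:
            lines[-1].append(content)
    return "\n".join(" ".join(tokens) for tokens in lines)


def _item(kind, speaker, content):
    """Build one hand-made item in the shape the function expects."""
    return {"type": kind, "speaker_label": speaker,
            "alternatives": [{"content": content}]}


sample = {"results": {"items": [
    _item("pronunciation", "spk_0", "Hi"),
    _item("punctuation", "spk_0", ","),
    _item("pronunciation", "spk_0", "there"),
    _item("punctuation", "spk_0", "."),
    _item("pronunciation", "spk_1", "Hello"),
    _item("punctuation", "spk_1", "."),
]}}

print(format_transcript(sample))
```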

Set up the Bedrock runtime and create the prompt template

In this step, you’ll set up the Bedrock runtime and create a template for the prompt that will be sent to the model. This part involves reading the formatted transcript, creating a structured prompt that guides the model in generating a summary, and configuring the model response. It’s a crucial step in translating the raw transcript into a format that the model can understand and respond to effectively.

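from jinja2 import Template  # third-party: pip install jinja2

# Bedrock runtime client, in the same region as the earlier examples
bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')

# Load the formatted transcript written in the previous step
with open(f'{job_name}.txt', "r") as file:
    transcript = file.read()

# Load the Jinja2 prompt template and fill in the transcript
with open('prompt_template.txt', "r") as file:
    template_string = file.read()
data = {
    'transcript' : transcript
}
template = Template(template_string)
prompt = template.render(data)
print(prompt)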

###
I need to summarize a conversation. The transcript of the 
conversation is between the <data> XML like tags.

<data>

spk_0: Hi, is this the Crystal Heights Hotel in Singapore? 
spk_1: Yes, it is. Good afternoon. How may I assist you today? 
spk_0: Fantastic, good afternoon. I was looking to book a room for my 10th wedding anniversary. Ive heard your hotel offers exceptional views and services. Could you tell me more? 
spk_1: Absolutely, Alex and congratulations on your upcoming anniversary. Thats a significant milestone and wed be honored to make it a special occasion for you. We have several room types that offer stunning views of the city skyline and the fictional Sapphire Bay. Our special diamond suite even comes with exclusive access to the moonlit pool and star deck. We also have in house spa services, world class dining options and a shopping arcade. 
spk_0: That sounds heavenly. I think my spouse would love the moonlit pool. Can you help me make a reservation for one of your diamond suites with a sapphire bay view? 
spk_1: Of course. May I know the dates you planning to visit? 
spk_0: Sure. It would be from October 10th to 17th. 
spk_1: Excellent. Let me check the availability. Ah It looks like we have a diamond suite available for those dates. Would you like to proceed with the reservation? 
spk_0: Definitely. Whats included in the package? 
spk_1: Wonderful. The package includes breakfast, complimentary access to the moonlit pool and star deck. A one time spa treatment for two and a special romantic dinner at our cloud nine restaurant. 
spk_0: You making it impossible to resist. Lets go ahead with the booking. 
spk_1: Great. I'll need some personal information for the reservation. Can I get your full name, contact details and a credit card for the preauthorizations? 
spk_0: Certainly. My full name is Alexander Thompson. My contact number is 12345678910. And the credit card is, wait, did you say pre authorization? How much would that be? 
spk_1: Ah, I should have mentioned that earlier. My apologies. A pre authorization. A mt of $1000 will be held on your card which would be released upon checkout 
spk_0: $1000. That seems a bit excessive. Don't you think 
spk_1: I understand your concern, Alex. The pre authorization is a standard procedure to cover any incidental expenses you may incur during your stay. However, I assure you its only a hold and not an actual charge. 
spk_0: Thats still a lot. Are there any additional charges that I should know about? 
spk_1: Well, there is a 10% service charge and a 7% fantasy tax applied to the room rate. 
spk_0: Mm. You know what its a special occasion. So lets go ahead. 
spk_1: Thank you, Alex for understanding. Will ensure that your experience at Crystal Heights is well worth it. 
</data>

The summary must contain a one word sentiment analysis, and 
a list of issues, problems or causes of friction
during the conversation. The output must be provided in 
JSON format shown in the following example. 

Example output:
{
    "sentiment": <sentiment>,
    "issues": [
        {
            "topic": <topic>,
            "summary": <issue_summary>,
        }
    ]
}

Write the JSON output and nothing more.

Here is the JSON output:
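The article does not show `prompt_template.txt` itself. Working backwards from the rendered prompt above, the file probably looks something like the text in this sketch, with a `{{transcript}}` placeholder matching the `transcript` key passed to `render`. The exact wording and whitespace of the real file are an assumption. The snippet writes the reconstructed template to disk so the rendering code has something to load.

```python
from pathlib import Path

# Reconstructed from the rendered prompt; the original file is not shown,
# so treat the exact wording and whitespace as an assumption.
TEMPLATE_TEXT = """I need to summarize a conversation. The transcript of the
conversation is between the <data> XML like tags.

<data>
{{transcript}}
</data>

The summary must contain a one word sentiment analysis, and
a list of issues, problems or causes of friction
during the conversation. The output must be provided in
JSON format shown in the following example.

Example output:
{
    "sentiment": <sentiment>,
    "issues": [
        {
            "topic": <topic>,
            "summary": <issue_summary>,
        }
    ]
}

Write the JSON output and nothing more.

Here is the JSON output:"""

# Write the template where the rendering code expects to find it.
Path("prompt_template.txt").write_text(TEMPLATE_TEXT, encoding="utf-8")
```

The single braces in the JSON example are left alone by the template engine; only the double-brace placeholder is substituted.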

Configure the model response

Finally, you will configure the model’s response to your prompt. This includes setting the text generation parameters and invoking the model with your prepared prompt. This last step completes the pipeline: the audio uploaded to S3 and transcribed by Amazon Transcribe is now summarized by Bedrock into a structured JSON result.

kwargs = {
    "modelId": "amazon.titan-text-lite-v1",
    "contentType": "application/json",
    "accept": "*/*",
    "body": json.dumps(
        {
            "inputText": prompt,
            "textGenerationConfig": {
                "maxTokenCount": 512,
                "temperature": 0,
                "topP": 0.9
            }
        }
    )
}
response = bedrock_runtime.invoke_model(**kwargs)
response_body = json.loads(response.get('body').read())
generation = response_body['results'][0]['outputText']
print(generation)

###

{
    "sentiment": "Positive",
    "issues": [
        {
            "topic": "Hotel services",
            "summary": "The hotel offers exceptional views and services."
        },
        {
            "topic": "Room booking",
            "summary": "The hotel has several room types that offer stunning views of the city skyline and the fictional Sapphire Bay."
        },
        {
            "topic": "Diamond suite",
            "summary": "The diamond suite comes with exclusive access to the moonlit pool and star deck."
        },
        {
            "topic": "Spa services",
            "summary": "The hotel has in-house spa services, world-class dining options, and a shopping arcade."
        },
        {
            "topic": "Reservation process",
            "summary": "The reservation process includes breakfast, complimentary access to the moonlit pool and star deck, a one-time spa treatment for two, and a special romantic dinner at the cloud nine restaurant."
        },
        {
            "topic": "Pre-authorization",
            "summary": "A pre-authorization of $1000 is held on the credit card, which is released upon checkout."
        },
        {
            "topic": "Additional charges",
            "summary": "There is a 10% service charge and a 7% fantasy tax applied to the room rate."
        }
    ]
}
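Because the model returns plain text, an application that consumes this result would normally parse it before use. Here is a minimal sketch, assuming the output contains one JSON object that may be surrounded by extra text. `extract_json` is a hypothetical helper, and the sample string is hand-written rather than real model output.

```python
import json


def extract_json(generation: str) -> dict:
    """Parse the first JSON object found in a model's text output."""
    start = generation.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    parsed, _ = json.JSONDecoder().raw_decode(generation[start:])
    return parsed


sample_generation = '''
{
    "sentiment": "Positive",
    "issues": [
        {"topic": "Pre-authorization", "summary": "A $1000 hold is placed on the card."}
    ]
}
Let me know if you need anything else.'''

result = extract_json(sample_generation)
print(result["sentiment"])
for issue in result["issues"]:
    print(f"- {issue['topic']}: {issue['summary']}")
```

If the model emits malformed JSON, `raw_decode` raises `json.JSONDecodeError`, which the caller can catch to retry or log the raw text.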

Conclusion

By now, you should have a solid foundation in using Amazon Bedrock alongside services like Amazon Transcribe and Amazon S3. These skills are not limited to the examples shown; they extend to a wide range of applications, from automated customer support to data analysis. The key takeaway is how easily these AWS services can be combined into serverless applications that understand and interact with human language.
