Leveraging GenAI: Bedrock, Claude, and Amazon Titan for Image Background Replacement

Introduction

Producing visual content has always been a significant challenge for brand marketers. The ability to quickly place a product image against different backgrounds allows for rapid iteration and better communication with designers. In this article, I will demonstrate how to leverage Amazon Bedrock and the foundation models it supports to achieve quick background replacement for marketing images. Specifically, we will focus on:

  1. Locking the product subject in the image.
  2. Using image-to-text capabilities to generate recommended prompts.
  3. Creating an application to replace the background of images.

This implementation builds upon the AWS workshop lab Background Replacement and integrates functions from the Image Understanding lab. We will be using two powerful foundation models, Claude 3 and Amazon Titan Image Generator, both accessible through Amazon Bedrock.

Our application will feature:

  1. Manual Background Description: Users can manually describe the desired background, similar to the original lab.
  2. AI-Generated Background Description: Users can upload a background image, which will be processed by Claude 3 to generate a suggested background description, providing a useful starting point for further modifications.

With these capabilities, marketers can efficiently generate high-quality visual content tailored to their specific needs.

Implementation

In this section, we will walk through the code that enables background replacement using Amazon Bedrock and two powerful foundation models: Claude 3 and Amazon Titan Image Generator.

import boto3
import json
import base64
from io import BytesIO
from random import randint

# Utility functions

# Convert file bytes to a BytesIO object
def get_bytesio_from_bytes(image_bytes):
    image_io = BytesIO(image_bytes)
    return image_io

# Get a base64-encoded string from file bytes
def get_base64_from_bytes(image_bytes):
    image_io = get_bytesio_from_bytes(image_bytes)
    img_str = base64.b64encode(image_io.getvalue()).decode("utf-8")
    return img_str

# Load bytes from a file on disk
def get_bytes_from_file(file_path):
    with open(file_path, "rb") as image_file:
        file_bytes = image_file.read()
    return file_bytes

# Create the JSON payload for the InvokeModel API call for background replacement
def get_titan_image_background_replacement_request_body(prompt, image_bytes, mask_prompt, negative_prompt=None, outpainting_mode="DEFAULT"):
    input_image_base64 = get_base64_from_bytes(image_bytes)
    body = {
        "taskType": "OUTPAINTING",
        "outPaintingParams": {
            "image": input_image_base64,
            "text": prompt,  # Description of the background to generate
            "maskPrompt": mask_prompt,  # The element(s) to keep
            "outPaintingMode": outpainting_mode,  # "DEFAULT" softens the mask. "PRECISE" keeps it sharp.
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,  # Number of variations to generate
            "quality": "premium",  # Allowed values are "standard" and "premium"
            "height": 512,
            "width": 512,
            "cfgScale": 8.0,
            "seed": randint(0, 100000),  # Use a random seed
        },
    }
    if negative_prompt:
        body['outPaintingParams']['negativeText'] = negative_prompt
    return json.dumps(body)

# Extract and decode the image data from the Titan Image Generator response
def get_titan_response_image(response):
    response_body = json.loads(response.get('body').read())
    images = response_body.get('images')
    image_data = base64.b64decode(images[0])
    return BytesIO(image_data)

# Create the JSON payload for the InvokeModel API call for image understanding
def get_image_understanding_request_body(prompt, image_bytes):
    input_image_base64 = get_base64_from_bytes(image_bytes)
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2000,
        "temperature": 0,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/jpeg",
                            "data": input_image_base64,
                        },
                    },
                    {
                        "type": "text",
                        "text": prompt
                    }
                ],
            }
        ],
    }
    return json.dumps(body)

# Use Claude 3 to describe the background image
def interpret_background_image(image_bytes):
    session = boto3.Session()
    bedrock = session.client(service_name='bedrock-runtime')
    prompt_content = "Describe this scene briefly to use as a product background."
    body = get_image_understanding_request_body(prompt_content, image_bytes)
    response = bedrock.invoke_model(body=body, modelId="anthropic.claude-3-sonnet-20240229-v1:0", contentType="application/json", accept="application/json")
    response_body = json.loads(response.get('body').read())
    output = response_body['content'][0]['text']
    if len(output) > 50:
        output = output[:47] + "..."  # Trim to 50 characters total, ending with an ellipsis
    return output

# Generate an image using Amazon Titan Image Generator, with optional background interpretation
def get_image_from_model(prompt_content, image_bytes, background_image_bytes=None, mask_prompt=None, negative_prompt=None, outpainting_mode="DEFAULT"):
    session = boto3.Session()
    bedrock = session.client(service_name='bedrock-runtime')

    if background_image_bytes:
        interpretation = interpret_background_image(background_image_bytes)
        prompt_content = interpretation + " " + prompt_content
        if len(prompt_content) > 50:
            prompt_content = prompt_content[:47] + '...'  # Ensure the combined prompt does not exceed 50 characters

    body = get_titan_image_background_replacement_request_body(prompt_content, image_bytes, mask_prompt=mask_prompt, negative_prompt=negative_prompt, outpainting_mode=outpainting_mode)
    response = bedrock.invoke_model(body=body, modelId="amazon.titan-image-generator-v1", contentType="application/json", accept="application/json")
    output = get_titan_response_image(response)
    return output

Explanation of Our Implementation

In this implementation, we have expanded upon the original lab to include additional functionality that enhances the background replacement process. Here’s a breakdown of the key changes and additions:

  1. Utility Functions:
    get_bytesio_from_bytes, get_base64_from_bytes, and get_bytes_from_file are utility functions to handle image data conversion and reading.
  2. Enhanced Request Body:
    get_titan_image_background_replacement_request_body constructs the JSON payload for the API call to replace the background, similar to the original lab but with additional fields for flexibility.
  3. Response Handling:
    get_titan_response_image extracts and decodes the image data from the API response.
  4. Image Understanding Integration:
    get_image_understanding_request_body prepares the request body to use Claude 3 for understanding the background image.
    interpret_background_image uses Claude 3 to describe the background image, generating a concise description to be used as part of the prompt for the background replacement.
  5. Main Function:
    get_image_from_model integrates the functions to interpret the background image (if provided), generate the appropriate prompt, and call Titan Image Generator to perform the background replacement.
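
To sanity-check the payload these helpers produce without credentials or a Bedrock call, the construction can be exercised standalone. The sketch below mirrors the body built by get_titan_image_background_replacement_request_body, using placeholder bytes and a fixed seed (both hypothetical stand-ins, not part of the library):

```python
import base64
import json

# Placeholder stand-in for real image bytes; no AWS call is made here.
fake_image_bytes = b"\x89PNG placeholder"
input_image_base64 = base64.b64encode(fake_image_bytes).decode("utf-8")

# Same structure the library builds for the OUTPAINTING task.
body = {
    "taskType": "OUTPAINTING",
    "outPaintingParams": {
        "image": input_image_base64,
        "text": "Car at the beach",      # background description
        "maskPrompt": "A car",           # element(s) to keep
        "outPaintingMode": "DEFAULT",
    },
    "imageGenerationConfig": {
        "numberOfImages": 1,
        "quality": "premium",
        "height": 512,
        "width": 512,
        "cfgScale": 8.0,
        "seed": 42,  # fixed here for reproducibility; the library uses randint(0, 100000)
    },
}

payload = json.dumps(body)
parsed = json.loads(payload)
print(sorted(parsed.keys()))  # ['imageGenerationConfig', 'outPaintingParams', 'taskType']
```

Round-tripping through json.dumps/json.loads like this confirms the payload is valid JSON and that the base64 image survives serialization intact.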

Comparison with the Original Code

The original lab provided a basic implementation for background replacement using Amazon Titan Image Generator. Our enhanced implementation integrates an additional step to interpret the background image using Claude 3, allowing for more contextually appropriate prompts. This is achieved by adding the interpret_background_image function and modifying the get_image_from_model function to incorporate the interpreted background description into the prompt. Additionally, utility functions have been refined for better handling of image data conversions.
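
The prompt-combination step in get_image_from_model is easy to get wrong, so it helps to see it in isolation. The sketch below extracts that logic into a pure function (combine_prompts is a hypothetical name introduced here for illustration, not part of the library):

```python
def combine_prompts(interpretation, prompt_content, max_len=50):
    """Prepend the AI interpretation to the user's prompt and cap the length,
    matching the truncation behavior in get_image_from_model."""
    combined = interpretation + " " + prompt_content
    if len(combined) > max_len:
        combined = combined[:max_len - 3] + "..."  # 47 chars + "..." = 50
    return combined

short = combine_prompts("A sunny beach.", "Car at the beach")
print(short)  # "A sunny beach. Car at the beach" (31 chars, unchanged)

long_prompt = combine_prompts("A sunny beach with palm trees and waves.", "Car parked near the shore")
print(len(long_prompt))  # 50 -- truncated to 47 characters plus "..."
```

Isolating the logic this way makes it trivial to unit-test the truncation boundary before wiring it into a live Bedrock call.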

Front-End Implementation

Below is the modified front-end implementation using Streamlit. This code adds the functionality to upload a background template image and uses AI to generate a description for the background.

import streamlit as st
import image_background_lib_1 as glib

st.set_page_config(layout="wide", page_title="Image Background Replacement and Understanding")

st.title("Image Background Replacement and Understanding")

col1, col2, col3 = st.columns(3)

with col1:
    # Upload image for background replacement
    uploaded_file = st.file_uploader("Select an image for background replacement:", type=['png', 'jpg'])
    # Upload background template image
    background_file = st.file_uploader("Upload a background template image for understanding:", type=['png', 'jpg'])

    if uploaded_file:
        uploaded_image_preview = glib.get_bytesio_from_bytes(uploaded_file.getvalue())
        st.image(uploaded_image_preview, caption="Original Image for Background Replacement")

    if background_file:
        background_image_preview = glib.get_bytesio_from_bytes(background_file.getvalue())
        st.image(background_image_preview, caption="Background Template Image")

with col2:
    st.subheader("Image parameters")

    # Input for main object to retain
    mask_prompt = st.text_input("Object to keep:", value="A car", help="Describe the main object to retain.")

    # Manual description for the background
    prompt_text = st.text_area("Manual description for the background:", value="Car at the beach", height=100)
    if background_file:
        background_image_bytes = background_file.getvalue()
        # Use AI to interpret the background image and suggest a description
        interpreted_text = glib.interpret_background_image(background_image_bytes)
        prompt_text = st.text_area("Description of the scene from the background image:", value=interpreted_text, height=100, help="AI-suggested description. You can modify it.")

    # Input for elements to exclude from the background
    negative_prompt = st.text_input("Exclude from the background:", help="Specify any elements to avoid in the background.")

    # Select outpainting mode
    outpainting_mode = st.radio("Outpainting mode:", ["DEFAULT", "PRECISE"], horizontal=True)

    generate_button = st.button("Generate Image")

with col3:
    st.subheader("Generated Result")

    if generate_button and uploaded_file:
        image_bytes = uploaded_file.getvalue()
        background_image_bytes = background_file.getvalue() if background_file else None

        with st.spinner("Generating image..."):
            # Generate the image with the specified parameters
            generated_image = glib.get_image_from_model(
                prompt_content=prompt_text,
                image_bytes=image_bytes,
                background_image_bytes=background_image_bytes,
                mask_prompt=mask_prompt,
                negative_prompt=negative_prompt,
                outpainting_mode=outpainting_mode,
            )

        st.image(generated_image, caption="Generated Image")

Explanation

  1. Upload Image:
    The user can upload two images: one for background replacement and another for the background template. The uploaded images are displayed for preview.
  2. Image Parameters:
    Users can input the object to keep (mask_prompt), provide a manual description for the background (prompt_text), and specify elements to exclude (negative_prompt).
    If a background image is uploaded, the application uses an AI model to generate a suggested description for the background, which the user can modify.
  3. Generate Button:
    When the “Generate Image” button is clicked, the app processes the input images and parameters, generates the new image, and displays it.

Comparison with the Original Code

The original lab code allowed users to upload an image and specify parameters for background replacement. The modified version adds the ability to upload a background template image, uses AI to suggest descriptions for the background, and integrates these descriptions into the prompt for generating the final image. This enhancement provides a more intuitive and context-aware approach to background replacement, making it easier for marketers to quickly adapt images for different contexts.

User Interface Operation Guide

The user interface of our application is designed to provide a seamless experience for background replacement using Amazon Bedrock. Here’s a step-by-step guide on how to use each part of the interface, covering both the original lab functionality and the new feature.

Step 1: Upload Images

In the first column, you can upload two types of images:

  1. Image for Background Replacement: This is the primary image where you want to replace the background.
  2. Background Template Image: This is an optional image that helps generate a contextually appropriate background description using AI.

Operation:

  • Click on the “Select an image for background replacement:” button to upload the main image.
  • Click on the “Upload a background template image for understanding:” button to upload a background template image.

Once uploaded, the images will be displayed for preview.

Scenario 1: Manual Description for the Background

Set Image Parameters

In the second column, you can specify various parameters to fine-tune the background replacement process.

Operation:

  • Object to Keep: Enter a description of the main object in the image that you want to retain (e.g., “A laptop”).
  • Manual Description for the Background: Enter a manual description for the new background (e.g., “laptop on grass”).
  • Exclude from the Background: Specify any elements you want to avoid in the new background.
  • Outpainting Mode: Choose between “DEFAULT” and “PRECISE” modes to control how the masked object blends with the new background.
  • Generate Image: Click the button to initiate the background replacement process.

View Generated Result

In the third column, the generated image will be displayed.

Operation:

  • After you click the “Generate Image” button, provided an image for background replacement has been uploaded, the application processes the inputs and displays the resulting image with the new background.

Scenario 2: Using Background Template Image for AI-Generated Description

Upload Background Template Image

Operation:

  • Upload a background template image that the AI will interpret to generate a contextually appropriate background description.

Set Image Parameters with AI-Generated Description

In the second column, you can specify various parameters with the help of AI-generated suggestions.

Operation:

  • Object to Keep: Enter a description of the main object in the image that you want to retain (e.g., “A car”).
  • AI-Generated Description for the Background: When a background template image is uploaded, Claude 3 will interpret the image and suggest a background description. This description is automatically filled into the prompt text area, providing a useful starting point for marketers. Users can further modify this text to better suit their needs.
  • Exclude from the Background: Specify any elements you want to avoid in the new background.
  • Outpainting Mode: Choose between “DEFAULT” and “PRECISE” modes to control how the masked object blends with the new background.
  • Generate Image: Click the button to initiate the background replacement process.

View Generated Result

In the third column, the generated image will be displayed.

Operation:

  • After you click the “Generate Image” button, provided an image for background replacement has been uploaded, the application processes the inputs and displays the resulting image with the new background.

Conclusion

In this project, we leveraged Amazon Bedrock and foundation models such as Claude 3 and Amazon Titan Image Generator to build an efficient image background replacement application. Building on the AWS workshop labs, we added the ability to generate both manual and AI-assisted background descriptions.

This tool allows marketers to quickly generate high-quality visual content, facilitating better collaboration with designers and faster adaptation to market needs. It streamlines the creation of marketing visuals, making the process more flexible and efficient.
