Background Change with Imagen on Vertex AI: A Step-by-Step Guide
Imagen on Vertex AI brings Google's state-of-the-art generative AI capabilities to application developers. With Imagen on Vertex AI, developers can build next-generation AI products that turn their users' imagination into high-quality visual assets in seconds.
There are two options for editing images:
Mask-free editing: Lets you edit an image without specifying a mask (a targeted area).
Mask-based editing: Lets you specify a targeted area to apply edits to. This method is well suited to edits that affect only part of an image.
In this article, we look at how to do mask-based editing using the Python API.
Changing the background involves the following steps:
- Removing existing background
- Creating a mask and inverted mask image
- Encoding Image to string
- Creating Request payload
Removing existing background
We use rembg to remove the background; other tools can do the same job.
rembg provides a simple Python call for background removal:
output = remove(Image.open(input_image_path))  # remove() takes image data (here a PIL Image), not a file path
Creating a mask and inverted mask image
We also use rembg to extract the mask, passing the output of the previous step as the input:
output = remove(img, only_mask=True)
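The two rembg calls so far can be combined into one helper. This is a minimal sketch, assuming the image is passed as raw encoded bytes (rembg's `remove` also accepts PIL images and NumPy arrays). The name `extract_foreground_and_mask` and the `remove_fn` parameter are hypothetical; `remove_fn` exists only so the function can be tried out with a stand-in when rembg is not installed.

```python
from typing import Callable, Optional, Tuple


def extract_foreground_and_mask(
    image_bytes: bytes,
    remove_fn: Optional[Callable[..., bytes]] = None,
) -> Tuple[bytes, bytes]:
    """Return (foreground_bytes, mask_bytes) for an encoded image.

    remove_fn is a hypothetical hook for illustration; by default the
    function falls back to rembg.remove, imported lazily.
    """
    if remove_fn is None:
        from rembg import remove as remove_fn  # requires `pip install rembg`

    # Step 1: strip the background, keeping only the subject.
    foreground = remove_fn(image_bytes)
    # Step 2: feed the result back in and ask for the mask only.
    mask = remove_fn(foreground, only_mask=True)
    return foreground, mask
```

With rembg installed, calling `extract_foreground_and_mask(open(path, "rb").read())` should give two byte strings that can be written straight to disk as image files.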
We then use OpenCV and NumPy to create the inverted mask:
mask_image = cv2.imread(masked_image_path, cv2.IMREAD_GRAYSCALE)
inverted_mask = np.bitwise_not(mask_image)
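To make the effect of the inversion concrete, here is a small sketch on a synthetic 2x2 mask instead of a file on disk. It assumes an 8-bit grayscale mask in which 255 marks the subject, which is what `cv2.IMREAD_GRAYSCALE` would return for a rembg mask. The helper name `invert_mask` is hypothetical.

```python
import numpy as np


def invert_mask(mask: np.ndarray) -> np.ndarray:
    """Flip an 8-bit mask so that white (255) becomes black (0) and vice versa."""
    if mask.dtype != np.uint8:
        raise ValueError("expected an 8-bit (uint8) grayscale mask")
    # For uint8 data, bitwise NOT is equivalent to 255 - value.
    return np.bitwise_not(mask)


# Synthetic mask: subject in the left column, background in the right column.
mask = np.array([[255, 0], [255, 0]], dtype=np.uint8)
inverted = invert_mask(mask)  # background is now 255, subject is 0
```

After inversion, the background rather than the subject is the white region of the mask. The dtype check matters because `np.bitwise_not` on a signed or wider integer type would not give `255 - value`.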
Encoding Image to string
To make image generation requests, you must send image data as Base64-encoded text:
encoded_inp_img = base64.b64encode(inp_img_file.read()).decode("utf-8")
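The encoding step can be wrapped in a small helper that handles opening the file in binary mode. This is a sketch using only the standard library; the function name `encode_image_file` is hypothetical.

```python
import base64
from pathlib import Path


def encode_image_file(path: str) -> str:
    """Read a file as raw bytes and return its contents as Base64 text."""
    raw = Path(path).read_bytes()
    # b64encode returns bytes; decode to str so the value is JSON-serializable.
    return base64.b64encode(raw).decode("utf-8")
```

The same helper would be used for both the background-removed image and the inverted mask.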
Creating Request payload
The payload consists of:
- Prompt: a description of the background you want to generate
- Image: the image with the background removed, Base64-encoded as described above
- Mask: the inverted mask image, also Base64-encoded
- sampleCount: the number of images to generate
- Mode: backgroundEditing
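The items listed above might be assembled into a request body along these lines. This is a sketch only: the nesting and the key names (`instances`, `parameters`, `bytesBase64Encoded` on the request side, `mode`) are assumptions and should be checked against the current Imagen API reference. The function name `build_request_payload` and the default `sample_count` are hypothetical.

```python
from typing import Any, Dict


def build_request_payload(
    prompt: str,
    encoded_image: str,
    encoded_mask: str,
    sample_count: int = 4,
) -> Dict[str, Any]:
    """Assemble a request body from the prompt, image, mask, count, and mode.

    Key names and nesting are assumptions; verify them against the
    Imagen API reference before use.
    """
    return {
        "instances": [
            {
                "prompt": prompt,
                # Background-removed image as Base64 text.
                "image": {"bytesBase64Encoded": encoded_image},
                # Inverted mask as Base64 text.
                "mask": {"image": {"bytesBase64Encoded": encoded_mask}},
            }
        ],
        "parameters": {
            "sampleCount": sample_count,
            "mode": "backgroundEditing",
        },
    }
```

Keeping the payload in a plain dictionary means it can be passed directly to the `json=` argument of `requests.post`.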
We use the requests module to call the Imagen API:
response = requests.post(endpoint, json=request_payload, headers=headers)
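The `endpoint` and `headers` values are not shown above, so here is one way they might be built. This sketch assumes the standard Vertex AI `:predict` URL pattern for Google publisher models and a bearer access token (for example from `gcloud auth print-access-token`). The helper names, the `timeout` value, and the choice to raise on HTTP errors are all assumptions, not the author's code.

```python
from typing import Any, Dict


def build_endpoint(project_id: str, location: str, model: str) -> str:
    """Build a Vertex AI predict URL for a Google publisher model."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model}:predict"
    )


def build_headers(access_token: str) -> Dict[str, str]:
    """Build request headers carrying an OAuth 2.0 bearer token."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json; charset=utf-8",
    }


def call_imagen(
    endpoint: str,
    request_payload: Dict[str, Any],
    headers: Dict[str, str],
    timeout: float = 120.0,
) -> Dict[str, Any]:
    """POST the payload and return the parsed JSON response."""
    import requests  # imported lazily; requires `pip install requests`

    response = requests.post(
        endpoint, json=request_payload, headers=headers, timeout=timeout
    )
    # Surface HTTP errors (bad token, quota, malformed payload) early.
    response.raise_for_status()
    return response.json()
```

The project ID, region, and model ID are left as parameters because they depend on your own Google Cloud setup.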
The response contains bytesBase64Encoded text for each generated image, which is decoded into an image and saved to disk:
img_bytes = base64.b64decode(image_data["bytesBase64Encoded"])
img = Image.open(BytesIO(img_bytes))
img.save("{}_pic.png".format(idx))
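The decoding step can be extended to loop over every image in a response. This sketch assumes the images arrive in a top-level `predictions` list, which is not confirmed by the text above. As a simplification it writes the decoded bytes straight to disk rather than going through PIL, which only works if the API already returns PNG-encoded data. The function name `save_predictions` is hypothetical.

```python
import base64
from pathlib import Path
from typing import Any, Dict, List


def save_predictions(response_json: Dict[str, Any], out_dir: str = ".") -> List[Path]:
    """Decode each returned image and write it to out_dir as <idx>_pic.png."""
    saved: List[Path] = []
    # "predictions" is an assumed key; adjust to the actual response shape.
    for idx, image_data in enumerate(response_json.get("predictions", [])):
        img_bytes = base64.b64decode(image_data["bytesBase64Encoded"])
        out_path = Path(out_dir) / "{}_pic.png".format(idx)
        # The decoded bytes are written as-is, without re-encoding.
        out_path.write_bytes(img_bytes)
        saved.append(out_path)
    return saved
```

If the response format differs, or you need to convert or resize the result, opening the bytes with `Image.open(BytesIO(img_bytes))` as in the snippet above is the safer route.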
Sample prompts:
Prompt: A garden with lush green grass
Prompt: Snow on the ground
Prompt: Beach with clear skies and blue water
Link to full code.
Thanks for reading.
Your feedback and questions are highly appreciated. You can connect with me via LinkedIn.