Unleashing Creative Power: A Hands-On Guide to Image Generation with Google Cloud’s Vertex AI and Imagen API

Esther Irawati Setiawan
Google Developer Experts
5 min read · Jan 25, 2024

Ever wondered about the magic behind generating unique and captivating images effortlessly? Google’s Vertex AI API for Imagen opens the door to a world of creative possibilities. This guide walks through the basics, from setting up a project to decoding your first generated image.

Requirements:

  • curl
  • Any programming language that can convert base64 into images (preferably Python)

Let’s get familiar with the Vertex AI API for Imagen to kick things off. It’s a versatile tool that lets you create images programmatically. The best part? You can integrate it into your website or software, making the process accessible and efficient. To call the API, you need a command-line shell with the gcloud CLI installed and authenticated with Google Cloud; the steps below cover that setup.

If you haven’t already, the first step is to set up a Google Cloud project so we can access the API. We’ll walk through the setup so that everything the API needs is in place. You can sign in to the Google Cloud Console with your Google account through this link: https://console.cloud.google.com

When successfully logged in, you will be taken to the welcome page, which should look like the one below.

We can see our selected project in the top-left corner of the page, to the left of the search bar. By default, the selected project will be “My First Project”; we don’t need to change it.

Next, click APIs & Services in the Quick Access panel, then click ENABLE APIS AND SERVICES.

Search for Vertex AI in the API Library search bar and select “Vertex AI API.” On its page, click ENABLE to enable the API. Google Cloud APIs can’t be called until they are enabled for the project. Enabling takes a few seconds, after which the API is ready to use.

For this guide, we will be using curl to access the API, but first, we need to install gcloud CLI to get the bearer token for the API authorization. We can read the detailed guide on how to install it here: https://cloud.google.com/sdk/docs/install.

After installing, we need to set it up with our account by running gcloud init in our shell. We can read the guide for it here: https://cloud.google.com/sdk/docs/initializing.

If we did everything correctly, when we run gcloud auth list in our shell, we should see our account marked as active.

Now that all the setup is done, we can try the API by calling its endpoint according to this documentation: https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/imagegeneration.

We should make a folder for this project and put request.json there to make things cleaner.

The contents of the request.json:

{
  "instances": [
    {
      "prompt": "Women Developer"
    }
  ],
  "parameters": {
    "sampleCount": 1
  }
}
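If you would rather generate request.json from code than type it by hand, the same structure can be produced with a few lines of Python. This is only a sketch: build_request and write_request are hypothetical helper names introduced here, and the body mirrors the two fields shown above (prompt and sampleCount) and nothing more.

```python
import json
from pathlib import Path


def build_request(prompt: str, sample_count: int = 1) -> dict:
    """Return a request body shaped like the request.json shown above."""
    return {
        "instances": [{"prompt": prompt}],
        "parameters": {"sampleCount": sample_count},
    }


def write_request(path: str, prompt: str, sample_count: int = 1) -> None:
    """Serialize the request body to a JSON file (hypothetical helper)."""
    body = build_request(prompt, sample_count)
    Path(path).write_text(json.dumps(body, indent=2), encoding="utf-8")
```

Calling write_request("request.json", "Women Developer") would write a file equivalent to the one above.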

Next, navigate in the shell to the folder that contains request.json. Since I’m using Windows PowerShell, I use backticks (`) as the line-continuation character; on Linux, use backslashes (\) instead. Here is the command I used to get our image generation result. It reads the request body from request.json and saves the API response to response.json.

echo $(curl -X POST `
-H "Authorization: Bearer $(gcloud auth print-access-token)" `
-H "Content-Type: application/json; charset=utf-8" `
-d "@request.json" `
"https://us-central1-aiplatform.googleapis.com/v1/projects/[YOUR-PROJECT-ID]/locations/us-central1/publishers/google/models/imagegeneration:predict") > response.json

Don’t forget to replace the placeholder [YOUR-PROJECT-ID] in the endpoint, brackets included, with the ID of the project where you enabled the Vertex AI API.

If you don’t know where to get the project ID, you can find it in the Google Cloud Console. Click on the project dropdown (by default, yours would say “My First Project”).

A modal will show up, and you can get your project ID on the right side of your project name. Copy the project ID where you enabled the Vertex AI API.
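With the project ID in hand, the curl command can also be sketched in Python using only the standard library. Treat this as an illustration under assumptions, not the official client: endpoint_url, access_token, and generate are hypothetical names, the token still comes from the gcloud CLI exactly as in the curl command, and the endpoint is the same us-central1 URL used above.

```python
import json
import shutil
import subprocess
import urllib.request


def endpoint_url(project_id: str) -> str:
    """Build the same predict endpoint that the curl command calls."""
    return (
        "https://us-central1-aiplatform.googleapis.com/v1/projects/"
        f"{project_id}/locations/us-central1/publishers/google/models/"
        "imagegeneration:predict"
    )


def access_token() -> str:
    """Ask the gcloud CLI for a bearer token, as the curl command does."""
    gcloud = shutil.which("gcloud")  # looks gcloud up on PATH
    if gcloud is None:
        raise RuntimeError("gcloud CLI not found on PATH")
    result = subprocess.run(
        [gcloud, "auth", "print-access-token"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def generate(project_id: str, request_path: str = "request.json",
             response_path: str = "response.json") -> dict:
    """POST request.json to the endpoint and save the reply to response.json."""
    with open(request_path, "rb") as f:
        body = f.read()
    req = urllib.request.Request(
        endpoint_url(project_id),
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token()}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
    # urlopen raises urllib.error.HTTPError on a 4xx/5xx reply.
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    with open(response_path, "wb") as f:
        f.write(raw)  # overwrites, so the file holds a single JSON document
    return json.loads(raw)
```

Calling generate("your-project-id") from the project folder would then leave a response.json next to request.json, just as the curl command does.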

When we get our response, it will look like this:
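As a rough sketch of that shape, limited to the keys this guide actually reads, the response can be pictured as the Python dict below. The base64 value is a made-up stand-in (a real one is a very long string), and a real response may carry additional fields that are not shown here.

```python
import base64
import json

# Hypothetical stand-in for the long base64 string the API returns.
fake_image = base64.b64encode(b"not a real image").decode("ascii")

# Only the keys used later in this guide are shown.
sample_response = {
    "predictions": [
        {"bytesBase64Encoded": fake_image},
    ]
}

print(json.dumps(sample_response, indent=2))

# The image data sits inside the first entry of "predictions".
first = sample_response["predictions"][0]
decoded = base64.b64decode(first["bytesBase64Encoded"])
```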

Where’s our image? The image is encoded in base64, which is where the Python mentioned in the requirements comes in handy.

Write the short script below and save it as script.py. It imports PIL, which is provided by the Pillow package (pip install Pillow).

# Import the modules
import json
import base64
from io import BytesIO

from PIL import Image  # provided by the Pillow package

# Open the JSON file
with open("response.json", "r") as f:
    data = json.load(f)

# Get the base64-encoded string from the JSON object
base64_str = data["predictions"][0]["bytesBase64Encoded"]

# Decode the base64 string to bytes
image_bytes = base64.b64decode(base64_str)

# Create an image object from the bytes
image = Image.open(BytesIO(image_bytes))

# Save the image to a file
image.save("output.png")
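If sampleCount in request.json is raised above 1, the response should carry more than one entry under predictions, while the script above only reads the first. A minimal sketch for saving all of them follows. It rests on a few assumptions: save_predictions is a hypothetical helper, the bytes are written as-is without Pillow (so the .png extension assumes the API returned PNG data), and a response without a predictions key is treated as a failed call.

```python
import base64
import json
from pathlib import Path


def save_predictions(response: dict, out_dir: str = ".") -> list:
    """Decode every prediction in the response and write each one to disk."""
    if "predictions" not in response:
        # Assumption: a failed call returns JSON without a "predictions" key.
        snippet = json.dumps(response)[:300]
        raise ValueError(f"No predictions in response: {snippet}")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    saved = []
    for i, prediction in enumerate(response["predictions"]):
        image_bytes = base64.b64decode(prediction["bytesBase64Encoded"])
        target = out / f"output_{i}.png"  # assumes PNG bytes
        target.write_bytes(image_bytes)
        saved.append(target)
    return saved
```

With the response loaded as in the script above, save_predictions(data) would write output_0.png, output_1.png, and so on into the current folder.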

Run the script, and just like that, we have our first generated image!

Congratulations on mastering the art of image generation with the API. This guide serves as a launchpad for your creative exploration. The simplicity and potential of the Imagen API make it an exciting tool for artists, developers, and enthusiasts alike. Now, go forth and let your imagination run wild!
