Spring Boot, Generative AI and Google Cloud: Pose Generator!

Abirami Sukumaran
Google Cloud - Community
9 min read · Oct 30, 2023

Part five and final of the Spring Boot on Google Cloud series

Text prompt based image generation with Imagen

UX development: Saransh Dhingra, Developer Programs Engineer, Google

A Quick Recap

In this series we have been focusing on building Spring Boot applications with Google Cloud services. We’ve seen that Spring Boot offers a powerful framework for efficiently creating and deploying production-ready applications. We discussed the components of the Spring Boot framework and implemented a simple Spring Boot application on Google Cloud. We learnt about Thymeleaf, the server-side Java template engine that sends data to front-end HTML pages, and how it integrates with Spring Boot on Google Cloud. We also learnt about Just In Time and Ahead Of Time compilation and how Spring Native converts applications into native executables. Finally, we did a deep-dive into the database (ORM) layer, integrating a Spring Boot application with Cloud Spanner using the Spring Data Spanner module as the ORM layer that connects the app and the data. We covered all of this with hands-on implementation.

In this final chapter, we’ll dive into AI and ML, specifically Generative AI. It is not only a popular topic, but is also highly relevant for driving business insights, media and content generation, prescriptive analytics and more. However, it’s crucial to recognize that Generative AI cannot exist in isolation. It is just one layer of the intricate full-stack ecosystem, and addressing the other layers, from infrastructure and data management to user interfaces and business logic, remains essential for creating meaningful and functional applications.

Spring Boot and Generative AI

We have already established that using generative AI to create useful information and insights from your data requires attention to every layer of full stack delivery on the journey from data to AI. This holistic view of combining Generative AI with Spring Boot is essential for creating applications that are both functional and intelligent. Spring Boot’s flexibility and extensibility make it a valuable choice for building the infrastructure and user interfaces of Generative AI applications, allowing developers to focus on the AI model’s core functionality rather than low-level application development concerns.

Spring Boot aids Generative AI applications by simplifying web-based interfaces and facilitating integration, scalability, and security. Its rapid development capabilities and support for microservices architecture enhance the efficiency of building and deploying AI models.

Generative AI on Google Cloud

Generative AI is a rapidly growing area of artificial intelligence that handles a variety of tasks such as creating realistic images and videos, generating text, translating languages, and writing code. Google Cloud’s Vertex AI is a fully managed machine learning platform that lets you build, deploy and scale machine learning applications faster and more easily. It has a suite of AI APIs, models and options that can be used to create a wide range of generative AI applications.

In this chapter, you’ll build an AI-powered pose generator using Java, Spring Boot, the Cloud Spanner database and the Vertex AI Imagen API. The user will input a prompt, and the application will generate a pose based on that prompt. It’s a fun and educational way to demonstrate the capabilities of Generative AI with Spring Boot on Google Cloud.

We will use the services, application, database and data we created in the last chapter and build on top of them, so we can see how Generative AI fits into the overall tech stack for delivering media based on a user prompt.

Architecture

High level architecture for the implementation

In the high level flow diagram above, notice that the Spanner data REST API app is deployed in Cloud Run and made available as a service. This endpoint is consumed by the client application we will build now. The Spanner data as a service implementation is already covered in chapter 4 of this series. Make sure you complete that chapter and have the server's REST endpoint ready before you move on to the next step.

Use Case

In this part of the implementation, we are going to build a Pose Generator application. It uses the server REST API created in the previous chapter to fetch data and display it in a user interface, where the user can augment it with a prompt. The application then invokes the Imagen API with the prompt to generate a relevant image. Imagen responds with a Base64 encoded string, which is then displayed as an image on the UI.

Spanner Data as a Service

If you have not already created the server app, go to chapter 4, follow the steps there and create the API that exposes the Spanner data as a service for this app. Once it is ready, test the endpoint to view the data from the Spanner database, as seen in the image below:

Spanner data as a service result

Now that we have the data available, let’s move on to the next part, which is building and deploying the client app for this use case. I hope you have already familiarized yourself with Spring Boot and the Google Cloud console in blog 1 of this series. If not, give that short post a quick read so you know how to launch the Google Cloud Shell Terminal and execute the rest of the steps in this blog.

Build and deploy the client application

  1. By now, you are already familiar with the Spring Boot project structure and its significance, so we will jump straight into cloning the repo into your Cloud Shell machine by running the command below in your Cloud Shell terminal:
git clone https://github.com/AbiramiSukumaran/genai-posegen

The cloned project structure shows up like this in the cloud shell editor:

Project structure after cloning

The PromptController Java class contains the database service invocation, the business logic and the generative AI API invocation of Imagen. This class interacts with the Thymeleaf templates that take care of data integration with the user interface. There are 3 service methods in this class: 1) for getting the prompt input, 2) for processing the request and invoking the Imagen API and 3) for processing the Imagen response.

Prompt and Yoga are the POJO classes that contain the fields, getters and setters to interface with the Imagen API and the Spanner data server API respectively.

The index and getImage HTML files in the templates folder contain the templates for the user interface, and they depend on the JS and CSS files in the respective folders.

Vertex AI Imagen Integration

For the image generation use case, we call Vertex AI’s Imagen API at an endpoint of the following format:

https://<<region>>-aiplatform.googleapis.com/v1/projects/<<your-project-id>>/locations/<<region>>/publishers/google/models/imagegeneration:predict
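As a rough illustration of how this endpoint and its request payload might be assembled in Java, here is a minimal sketch. The class and method names are hypothetical and not taken from the repo, and the body shape (an `instances` array carrying the prompt plus a `parameters` object with `sampleCount`) reflects the Vertex AI predict request as I understand it, so check it against the current Imagen API reference before relying on it.

```java
// Sketch only: class, method and parameter names are illustrative, not from the repo.
class ImagenRequestSketch {

    // Fills the region and project ID into the endpoint format shown above.
    static String endpoint(String region, String projectId) {
        return String.format(
            "https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s"
                + "/publishers/google/models/imagegeneration:predict",
            region, projectId, region);
    }

    // Escapes quotes, backslashes and control characters so the prompt
    // can sit safely inside a JSON string literal.
    static String escapeJson(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds the request body: one instance with the prompt, plus the
    // number of images requested (assumed payload shape, see lead-in).
    static String requestBody(String prompt, int sampleCount) {
        return "{\"instances\":[{\"prompt\":\"" + escapeJson(prompt) + "\"}],"
            + "\"parameters\":{\"sampleCount\":" + sampleCount + "}}";
    }

    public static void main(String[] args) {
        // "us-central1" and "my-project" are placeholder values.
        System.out.println(endpoint("us-central1", "my-project"));
        System.out.println(requestBody("Tree pose, watercolor style", 1));
    }
}
```

In a Spring Boot application you would more likely let Jackson serialize the Prompt POJO instead of concatenating strings; the manual escaping here only keeps the sketch free of dependencies.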

You can read more about Imagen capabilities here. The API returns the generated image as a Base64 encoded string. To turn that string into an image, we use the JavaScript setAttribute method (in the getImage.js file) on the image element defined in the getImage.html file, as follows:

poseImage.setAttribute('src', "data:image/jpg;base64," + baseStr64);
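Before the string reaches the page, the server side has to pull it out of the Imagen JSON response. The sketch below shows one way that could look in plain Java. It assumes the response carries the image in a `bytesBase64Encoded` field inside `predictions`, which is my understanding of the API and worth confirming against the reference; the class and method names are hypothetical and do not mirror the repo's code.

```java
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: names are illustrative, and the regex stands in for a real JSON parser.
class ImagenResponseSketch {

    // Matches "bytesBase64Encoded": "<payload>" and captures the payload
    // (field name assumed from the Imagen predict response).
    private static final Pattern IMAGE_FIELD =
        Pattern.compile("\"bytesBase64Encoded\"\\s*:\\s*\"([A-Za-z0-9+/=]+)\"");

    // Returns the first Base64 image payload found in the response, if any.
    static Optional<String> firstImage(String responseJson) {
        Matcher m = IMAGE_FIELD.matcher(responseJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True when the payload decodes as standard Base64.
    static boolean isValidBase64(String payload) {
        try {
            Base64.getDecoder().decode(payload);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Produces the same string shape that the page script assigns to src.
    static String toDataUri(String mimeType, String payload) {
        return "data:" + mimeType + ";base64," + payload;
    }

    public static void main(String[] args) {
        // Tiny fake response; a real payload would be far longer.
        String sample = "{\"predictions\":[{\"mimeType\":\"image/png\","
            + "\"bytesBase64Encoded\":\"aGVsbG8=\"}]}";
        firstImage(sample)
            .filter(ImagenResponseSketch::isValidBase64)
            .map(p -> toDataUri("image/png", p))
            .ifPresent(System.out::println);
    }
}
```

A production controller would parse the response with a JSON library such as Jackson, which Spring Boot already bundles, rather than a regular expression.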

Authorization

The Imagen API requires bearer token authentication. In this case, we will use the Application Default Credentials JSON approach. You can set it up by running the command below from the Cloud Shell terminal and following the instructions that appear:

gcloud auth application-default login

Enter “Y” to authenticate with your account. Allow access and copy the authorization code that is shown in the pop-up. As soon as you do that, the application default credentials are saved as a JSON file to a location similar to /tmp/tmp.Fh0Gf4yF0V/application_default_credentials.json. Download the file, or copy its contents by running the cat command (cat /tmp/tmp.Fh0Gf4yF0V/application_default_credentials.json), and use it in the callImagen() method of the PromptController.java class. You can read more about authentication here.
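However the access token is obtained, it ends up in the Authorization header of the HTTP call. The following is a minimal sketch of building such a request with the JDK's built-in HTTP client (Java 11+). The class name, the `ACCESS_TOKEN` environment variable and the placeholder URL are assumptions for illustration only; this is not the repo's callImagen() implementation.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: shows where a bearer token goes, not how the repo obtains it.
class AuthorizedRequestSketch {

    // Builds a POST request carrying the token as a bearer credential.
    static HttpRequest build(String url, String accessToken, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(url))
            .header("Authorization", "Bearer " + accessToken)
            .header("Content-Type", "application/json; charset=utf-8")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) {
        // ACCESS_TOKEN is a hypothetical variable name; it could be filled
        // from the output of `gcloud auth print-access-token`, for example.
        String token = System.getenv().getOrDefault("ACCESS_TOKEN", "<token>");
        HttpRequest request = build("https://example.com/v1/model:predict", token, "{}");
        // The request is only built here, never sent.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would be a matter of passing it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.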

User Interface

We have used Thymeleaf as the template engine to parse and render data in the front-end template files and to add an elegant design to the user interface. Thymeleaf templates look like HTML but support additional attributes for working with the rendered data. The index.html file contains the home page design components and lets the user select the pose name and add an overriding prompt to generate the desired image.

If you look at the HTML files, you will notice that besides rendering an elegant web page, the page also passes the input attributes to the controller class through the Prompt POJO.

<form id="formprompt" name="formprompt" action="#" th:action="@{/getimage}" th:object="${prompt}" method="get">

This page also has the “select” dropdown component, which is filled with the list of pose names from the database through the poselist attribute that the controller class passes to it from the extractLabels method.

  <option th:each="p : ${poselist}" th:value="${p}" th:text="${p}"></option>     
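To make the data flow behind the poselist attribute concrete, here is a minimal sketch of how a list of labels could be derived from the Yoga objects returned by the data service. The Yoga stand-in class, its single field and this extractLabels signature are assumptions for illustration; the method in the repo may look different.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Sketch only: a simplified stand-in for the controller's label extraction.
class PoseLabelsSketch {

    // Minimal stand-in for the Yoga POJO; only the name matters here.
    static class Yoga {
        private final String name;

        Yoga(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Collects distinct, non-blank pose names in their original order.
    static List<String> extractLabels(List<Yoga> poses) {
        return poses.stream()
            .map(Yoga::getName)
            .filter(Objects::nonNull)
            .map(String::trim)
            .filter(name -> !name.isEmpty())
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Yoga> poses = List.of(new Yoga("Tree"), new Yoga("Cobra"), new Yoga("Tree"));
        System.out.println(extractLabels(poses));
    }
}
```

In a Spring MVC controller, the resulting list would typically be handed to the template with `model.addAttribute("poselist", labels)`, which is what the `th:each` loop above iterates over.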

Build and Deploy

Now that you have cloned the code and replaced the placeholder values with those for your project, region and authentication credentials, let’s move on to building and deploying the app. Navigate to the project folder in the Cloud Shell terminal using the command:

cd genai-posegen

Run the command below to build the project:

./mvnw package

Once the build succeeds, run the command below to start the app on your local Cloud Shell machine:

./mvnw spring-boot:run

You can test it locally by choosing web preview from the Cloud Shell terminal menu. Alternatively, you can deploy it to Cloud Run by running the following command:

gcloud run deploy --source .

When prompted, provide details such as the service name, the region and whether to allow unauthenticated access.

Demo

Once the app is deployed, you should see the service URL in the terminal. Click the link and see your pose image generation app running serverlessly on Google Cloud!

Deployed app demo

Conclusion

In this blog, we extended the full stack Spring Boot application that stores and handles data in Cloud Spanner so that it generates poses using Google Cloud Vertex AI’s Imagen API, in an interactive client application deployed on Cloud Run. By reusing the Cloud Spanner ORM layer and data from our previous application, we significantly accelerated the development process for this Generative AI image generation app. We began by establishing a robust data infrastructure with Cloud Spanner and the Spring Data Spanner ORM layer, creating a strong foundation of data that is then used to generate images with Vertex AI’s generative AI API.

Wish to contribute to the project?

As we wrap up, I encourage you to apply everything you have learnt in this series about taking data to AI and contribute to this project by completing the 2 methods that are currently provided as placeholder code snippets. Remember the Cloud Functions (Java) component in the architecture diagram section of this blog, which we never got to implement? That is up for grabs if you wish to contribute. You can implement 2 Java Cloud Functions to perform the 2 methods that can be found in the getimage.html file:

  1. Save pose image to database:

The requirement for saving the pose to the database is to save the image generated by the app to Cloud Storage and add the encoded image string to a new Spanner table (say Pose_image, in the image field) with a referential relationship to the name field in the Yoga_Poses table.

  2. Upload image:

The requirement for uploading an image is to upload any image of your choice and save it as an encoded string in the new table (say Pose_image, in the image field) with a referential relationship to the selected name field from the Yoga_Poses table.

You can then invoke these Cloud Functions from the client application in their respective button click actions.
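If you take up either task, the common building block is converting raw image bytes to and from the encoded string that gets stored. A minimal sketch with the JDK's Base64 API follows; the class and method names are hypothetical, and the Cloud Storage and Spanner writes are deliberately left out.

```java
import java.util.Arrays;
import java.util.Base64;

// Sketch only: the encode/decode step, without any storage or database calls.
class ImageEncodingSketch {

    // Encodes raw image bytes as a standard Base64 string for storage.
    static String encode(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    // Restores the original bytes from the stored string.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // The 8-byte PNG file signature stands in for a real uploaded image.
        byte[] pngSignature = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
        String stored = encode(pngSignature);
        System.out.println(stored);
        System.out.println(Arrays.equals(decode(stored), pngSignature));
    }
}
```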

With this we conclude the Java Spring Boot and Google Cloud series! For feedback, comments and ideas, feel free to reach out to me in my socials at https://abirami.dev.

Abirami Sukumaran
Google Cloud - Community

Developer Advocate Google. With 18 years in data and software dev leadership, I’m passionate about addressing real world opportunities with technology.